* [PATCH net-next v7 03/28] zinc: introduce minimal cryptography library
From: Jason A. Donenfeld @ 2018-10-06 2:56 UTC
To: linux-kernel, netdev, davem, gregkh
Cc: Jason A. Donenfeld, Samuel Neves, Jean-Philippe Aumasson,
Andy Lutomirski, Andrew Morton, Linus Torvalds, kernel-hardening,
linux-crypto
Zinc stands for "Zinc Is Neat Crypto" or "Zinc as IN Crypto". It's also
short, easy to type, and plays nicely with the recent trend of naming
crypto libraries after elements. The guiding principle is "don't overdo
it". It's less of a library and more of a directory tree for organizing
well-curated direct implementations of cryptography primitives.
Zinc is a new cryptography API that is much more minimal and lower-level
than the current one. It intends to complement it and provide a basis
upon which the current crypto API might build, as the provider of
software implementations of cryptographic primitives. It is motivated by
three primary observations in crypto API design:
* Highly composable "cipher modes" and related abstractions from
90s cryptographers did not turn out to be as terrific an idea as
hoped, leading to a host of API misuse problems.
* Most programmers are afraid of crypto code, and so prefer to
integrate it into libraries in a highly abstracted manner, so as to
shield themselves from implementation details. Cryptographers, on
the other hand, prefer simple direct implementations, which they're
able to verify for high assurance and optimize in accordance with
their expertise.
* Overly abstracted and flexible cryptography APIs lead to a host of
dangerous problems and performance issues. The kernel is usually in
the business not of coming up with new uses of crypto, but rather of
implementing various constructions, which means it essentially
needs a library of primitives, not a highly abstracted enterprise-ready
pluggable system, with a few particular exceptions.
This last observation has played out over and over again within the
kernel:
* The perennial move of actual primitives away from crypto/ and into
lib/, so that users can actually call these functions directly with
no overhead and without lots of allocations, function pointers,
string specifier parsing, and general clunkiness. For example:
sha256, chacha20, siphash, sha1, and so forth live in lib/ rather
than in crypto/. Zinc intends to stop the cluttering of lib/ and
introduce these direct primitives into their proper place, lib/zinc/.
* An abundance of misuse bugs with the present crypto API that have
been very unpleasant to clean up.
* A hesitance to even use cryptography, because of the overhead and
headaches involved in accessing the routines.
Zinc goes in a rather different direction. Rather than providing a
thoroughly designed and abstracted API, Zinc gives you simple functions,
which implement some primitive, or some particular and specific
construction of primitives. It is not dynamic in the least, though one
could imagine implementing a complex dynamic dispatch mechanism (such as
the current crypto API) on top of these basic functions. After all,
dynamic dispatch is usually needed for applications with cipher agility,
such as IPsec, dm-crypt, AF_ALG, and so forth, and the existing crypto
API will continue to play that role. However, Zinc will provide a
non-haphazard way of directly utilizing crypto routines in applications
that have neither the need nor the desire for abstraction and dynamic
dispatch.
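
To illustrate the layering described above, here is a minimal userspace
sketch. It is not Zinc code: two toy functions stand in for directly
callable primitives, and a small name-based lookup table stands in for a
dynamic dispatch layer that could be built on top of them. All names
(toy_sum32, toy_xor32, toy_lookup) are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-ins for primitives: plain functions, callable with no setup. */
uint32_t toy_sum32(const uint8_t *data, size_t len)
{
	uint32_t acc = 0;

	for (size_t i = 0; i < len; ++i)
		acc += data[i];
	return acc;
}

uint32_t toy_xor32(const uint8_t *data, size_t len)
{
	uint32_t acc = 0;

	for (size_t i = 0; i < len; ++i)
		acc ^= data[i];
	return acc;
}

/* Optional dispatch layer on top: resolve an algorithm by name. */
typedef uint32_t (*toy_fn)(const uint8_t *data, size_t len);

static const struct {
	const char *name;
	toy_fn fn;
} toy_algs[] = {
	{ "sum32", toy_sum32 },
	{ "xor32", toy_xor32 },
};

/* Returns NULL when no algorithm with that name is registered. */
toy_fn toy_lookup(const char *name)
{
	for (size_t i = 0; i < sizeof(toy_algs) / sizeof(toy_algs[0]); ++i)
		if (!strcmp(toy_algs[i].name, name))
			return toy_algs[i].fn;
	return NULL;
}
```

A caller that knows which primitive it wants calls toy_sum32() directly;
only callers that need agility pay for the lookup.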
It also organizes the implementations in a simple, straightforward,
and direct manner, making it enjoyable and intuitive to work on.
Rather than moving optimized assembly implementations into arch/, it
keeps them all together in lib/zinc/, making it simple and obvious to
compare and contrast what's happening. This is, notably, exactly what
the lib/raid6/ tree does, and that seems to work out rather well. It's
also the pattern of most successful crypto libraries. The
architecture-specific glue code is made a part of each translation unit,
rather than being in a separate one, so that generic and
architecture-optimized code are combined at compile time, and
incompatibility branches are compiled out by the optimizer.
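
The glue pattern can be sketched in userspace C roughly as follows. This
is an illustration under assumptions, not Zinc code: HAVE_FAST_PATH,
sum_arch, sum_generic and sum are hypothetical names. The shape mirrors
an arch hook that returns false when no optimized code applies, so the
generic path runs instead.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical switch; a real build would derive it from the target. */
#ifndef HAVE_FAST_PATH
#define HAVE_FAST_PATH 0
#endif

/* Arch hook, living in the same translation unit as the generic code. */
static inline bool sum_arch(const uint8_t *buf, size_t len, uint32_t *out)
{
#if HAVE_FAST_PATH
	/* Stand-in for an optimized routine: four bytes per step. */
	uint32_t acc = 0;
	size_t i = 0;

	for (; i + 4 <= len; i += 4)
		acc += buf[i] + buf[i + 1] + buf[i + 2] + buf[i + 3];
	for (; i < len; ++i)
		acc += buf[i];
	*out = acc;
	return true;
#else
	(void)buf;
	(void)len;
	(void)out;
	return false;	/* constant, so the caller's branch can fold away */
#endif
}

static uint32_t sum_generic(const uint8_t *buf, size_t len)
{
	uint32_t acc = 0;

	for (size_t i = 0; i < len; ++i)
		acc += buf[i];
	return acc;
}

/* Public entry point: try the arch hook, else fall back to generic C. */
uint32_t sum(const uint8_t *buf, size_t len)
{
	uint32_t out = 0;

	if (sum_arch(buf, len, &out))
		return out;
	return sum_generic(buf, len);
}
```

Because sum_arch() is an inline function in the same file, a build
without the fast path sees a constant false and the optimizer can drop
the branch entirely.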
All implementations have been extensively tested and fuzzed, and are
selected for their quality, trustworthiness, and performance. Wherever
possible and performant, formally verified implementations are used,
such as those from HACL* [1] and Fiat-Crypto [2]. The routines also take
special care to zero out secrets using memzero_explicit (and future work
is planned to have gcc do this more reliably and performantly with
compiler plugins). The performance of the selected implementations is
state-of-the-art and unrivaled on a broad array of hardware, though of
course we will continue to fine tune these to the hardware demands
needed by kernel contributors. Each implementation also comes with
extensive self-tests and crafted test vectors, pulled from various
places such as Wycheproof [9].
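
As a rough userspace illustration of zeroing secrets, the sketch below
wipes a stack copy of a key before returning. It does not reproduce the
kernel's memzero_explicit; zero_explicit and toy_keyed_sum are
hypothetical names, and the volatile-pointer loop is just one common way
to discourage the compiler from eliding the stores.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Userspace approximation of an "explicit" zeroing helper. Writing
 * through a volatile pointer asks the compiler not to drop the stores
 * as dead code.
 */
void zero_explicit(void *p, size_t len)
{
	volatile unsigned char *vp = p;

	while (len--)
		*vp++ = 0;
}

/* Toy keyed checksum: copies the key to the stack, then wipes the copy. */
uint32_t toy_keyed_sum(const uint8_t key[4], const uint8_t *msg, size_t len)
{
	uint8_t k[4];
	uint32_t acc = 0;

	for (size_t i = 0; i < sizeof(k); ++i)
		k[i] = key[i];
	for (size_t i = 0; i < len; ++i)
		acc += (uint32_t)(msg[i] ^ k[i % sizeof(k)]);
	zero_explicit(k, sizeof(k));	/* secret copy no longer needed */
	return acc;
}
```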
Regularity of function signatures is important, so that users can easily
"guess" the name of the function they want. However, individual
primitives are oftentimes not trivially interchangeable, having been
designed for different things and requiring different parameters and
semantics, and so the function signatures they provide will directly
reflect the realities of the primitives' usages, rather than hiding them
behind (inevitably leaky) abstractions. Also, in contrast to the current
crypto API, Zinc functions can work on stack buffers, and can be called
with different keys, without requiring allocations or locking.
SIMD is used automatically when available, though some routines may
benefit from either having their SIMD disabled for particular
invocations, or having the SIMD initialization calls amortized over
several invocations of the function, and so Zinc utilizes function
signatures enabling that in conjunction with the recently introduced
simd_context_t.
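
The amortization idea can be sketched with a mock context in userspace
C. Nothing here is the kernel's simd_context_t or its helpers; the
struct and the mock_* functions are hypothetical and only model the
shape of "pass one context through several calls so the costly enable
step runs once".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a SIMD context; not the kernel type. */
struct mock_simd_ctx {
	bool enabled;		/* is the mock SIMD unit currently claimed? */
	unsigned int enables;	/* how often the costly claim step ran */
};

/* Claim the mock SIMD unit lazily, only if it is not already claimed. */
static void mock_simd_use(struct mock_simd_ctx *ctx)
{
	if (!ctx->enabled) {
		ctx->enabled = true;
		ctx->enables++;
	}
}

/* Release the mock SIMD unit once the whole batch is done. */
static void mock_simd_put(struct mock_simd_ctx *ctx)
{
	ctx->enabled = false;
}

/* A toy primitive whose signature accepts the caller's context. */
static void mock_process(unsigned char *buf, size_t len,
			 struct mock_simd_ctx *ctx)
{
	mock_simd_use(ctx);
	for (size_t i = 0; i < len; ++i)
		buf[i] ^= 0xff;
}

/* Process n packets with one shared context; returns the enable count. */
unsigned int mock_process_batch(unsigned char (*pkts)[16], size_t n)
{
	struct mock_simd_ctx ctx = { false, 0 };

	for (size_t i = 0; i < n; ++i)
		mock_process(pkts[i], sizeof(pkts[i]), &ctx);
	mock_simd_put(&ctx);
	return ctx.enables;
}
```

With a fresh context per call, the enable step would run once per
packet; sharing the context across the batch runs it once in total.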
More generally, Zinc provides function signatures that allow just what
is required by the various callers. This isn't to say that users of the
functions will be permitted to pollute the function semantics with weird
particular needs, but we are trying very hard not to overdo it, and that
means looking carefully at what's actually necessary, and doing just that,
and not much more than that. Remember: practicality and cleanliness rather
than over-zealous infrastructure.
Zinc also provides an opening for the best implementers in academia to
contribute their time and effort to the kernel, by being sufficiently
simple and inviting. In discussing this commit with some of the best and
brightest over the last few years, I have found many who are eager to
devote rare talent and energy to this effort.
Following the merging of this, I expect the primitives that
currently exist in lib/ to work their way into lib/zinc/, after intense
scrutiny of each implementation, potentially replacing them with either
formally-verified implementations, or better studied and faster
state-of-the-art implementations.
Also following the merging of this, I expect the old crypto API
implementations to be ported over to use Zinc for their software-based
implementations.
As Zinc is simply library code, its config options are un-menued, with
the exception of CONFIG_ZINC_SELFTEST and CONFIG_ZINC_DEBUG, which enable
various selftests and debugging conditions.
[1] https://github.com/project-everest/hacl-star
[2] https://github.com/mit-plv/fiat-crypto
[3] https://cr.yp.to/ecdh.html
[4] https://cr.yp.to/chacha.html
[5] https://cr.yp.to/snuffle/xsalsa-20081128.pdf
[6] https://cr.yp.to/mac.html
[7] https://blake2.net/
[8] https://tools.ietf.org/html/rfc8439
[9] https://github.com/google/wycheproof
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Samuel Neves <sneves@dei.uc.pt>
Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: kernel-hardening@lists.openwall.com
Cc: linux-crypto@vger.kernel.org
---
MAINTAINERS | 8 ++++++++
lib/Kconfig | 2 ++
lib/Makefile | 2 ++
lib/zinc/Kconfig | 41 +++++++++++++++++++++++++++++++++++++++++
lib/zinc/Makefile | 3 +++
5 files changed, 56 insertions(+)
create mode 100644 lib/zinc/Kconfig
create mode 100644 lib/zinc/Makefile
diff --git a/MAINTAINERS b/MAINTAINERS
index bb5f431043f7..ab349f7e8d53 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -16202,6 +16202,14 @@ Q: https://patchwork.linuxtv.org/project/linux-media/list/
S: Maintained
F: drivers/media/dvb-frontends/zd1301_demod*
+ZINC CRYPTOGRAPHY LIBRARY
+M: Jason A. Donenfeld <Jason@zx2c4.com>
+M: Samuel Neves <sneves@dei.uc.pt>
+S: Maintained
+F: lib/zinc/
+F: include/zinc/
+L: linux-crypto@vger.kernel.org
+
ZPOOL COMPRESSED PAGE STORAGE API
M: Dan Streetman <ddstreet@ieee.org>
L: linux-mm@kvack.org
diff --git a/lib/Kconfig b/lib/Kconfig
index a3928d4438b5..3e6848269c66 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -485,6 +485,8 @@ config GLOB_SELFTEST
module load) by a small amount, so you're welcome to play with
it, but you probably don't need it.
+source "lib/zinc/Kconfig"
+
#
# Netlink attribute parsing support is select'ed if needed
#
diff --git a/lib/Makefile b/lib/Makefile
index ca3f7ebb900d..571d28d3f143 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -214,6 +214,8 @@ obj-$(CONFIG_PERCPU_TEST) += percpu_test.o
obj-$(CONFIG_ASN1) += asn1_decoder.o
+obj-y += zinc/
+
obj-$(CONFIG_FONT_SUPPORT) += fonts/
obj-$(CONFIG_PRIME_NUMBERS) += prime_numbers.o
diff --git a/lib/zinc/Kconfig b/lib/zinc/Kconfig
new file mode 100644
index 000000000000..90e066ea93a0
--- /dev/null
+++ b/lib/zinc/Kconfig
@@ -0,0 +1,41 @@
+config ZINC_SELFTEST
+ bool "Zinc cryptography library self-tests"
+ help
+ This builds a series of self-tests for the Zinc crypto library, which
+ help diagnose any cryptographic algorithm implementation issues that
+ might be at the root cause of potential bugs. It also adds various
+ traps for incorrect usage.
+
+ Unless you are optimizing for machines without much disk space or for
+ very slow machines, it is probably a good idea to say Y here, so that
+ any potential cryptographic bugs translate into easy bug reports
+ rather than long-lasting security issues.
+
+config ZINC_DEBUG
+ bool "Zinc cryptography library debugging"
+ help
+ This turns on a series of additional checks and debugging options
+ that are useful for developers but probably will not provide much
+ benefit to end users.
+
+ Most people should say N here.
+
+config ZINC_ARCH_ARM
+ def_bool y
+ depends on ARM
+
+config ZINC_ARCH_ARM64
+ def_bool y
+ depends on ARM64
+
+config ZINC_ARCH_X86_64
+ def_bool y
+ depends on X86_64 && !UML
+
+config ZINC_ARCH_MIPS
+ def_bool y
+ depends on MIPS && CPU_MIPS32_R2 && !64BIT
+
+config ZINC_ARCH_MIPS64
+ def_bool y
+ depends on MIPS && 64BIT
diff --git a/lib/zinc/Makefile b/lib/zinc/Makefile
new file mode 100644
index 000000000000..a61c80d676cb
--- /dev/null
+++ b/lib/zinc/Makefile
@@ -0,0 +1,3 @@
+ccflags-y := -O2
+ccflags-y += -D'pr_fmt(fmt)="zinc: " fmt'
+ccflags-$(CONFIG_ZINC_DEBUG) += -DDEBUG
--
2.19.0
* [PATCH net-next v7 04/28] zinc: ChaCha20 generic C implementation and selftest
From: Jason A. Donenfeld @ 2018-10-06 2:56 UTC
To: linux-kernel, netdev, davem, gregkh
Cc: Jason A. Donenfeld, Samuel Neves, Jean-Philippe Aumasson,
Andy Lutomirski, Andrew Morton, Linus Torvalds, kernel-hardening,
linux-crypto
This implements the ChaCha20 permutation as a single C statement, by way
of the comma operator, which the compiler is able to simplify
terrifically.
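
As a standalone illustration of the comma-operator style, the userspace
sketch below writes one quarter round as a single expression and checks
it against the quarter-round test vector from RFC 8439, section 2.1.1.
The rol32 helper here is a local stand-in for the kernel's, and
quarter_round_selftest is a hypothetical name.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for the kernel's rol32(); s must be in 1..31. */
static inline uint32_t rol32(uint32_t v, unsigned int s)
{
	return (v << s) | (v >> (32 - s));
}

/* One ChaCha quarter round as a single comma-separated expression. */
#define QUARTER_ROUND(x, a, b, c, d) ( \
	x[a] += x[b], \
	x[d] = rol32((x[d] ^ x[a]), 16), \
	x[c] += x[d], \
	x[b] = rol32((x[b] ^ x[c]), 12), \
	x[a] += x[b], \
	x[d] = rol32((x[d] ^ x[a]), 8), \
	x[c] += x[d], \
	x[b] = rol32((x[b] ^ x[c]), 7) \
)

/* Checks the quarter round against RFC 8439, section 2.1.1. */
bool quarter_round_selftest(void)
{
	uint32_t x[4] = { 0x11111111, 0x01020304, 0x9b8d6f43, 0x01234567 };

	QUARTER_ROUND(x, 0, 1, 2, 3);
	return x[0] == 0xea2a92f4 && x[1] == 0xcb1cf8ce &&
	       x[2] == 0x4581472e && x[3] == 0x5881c4bb;
}
```

Since the whole round is one expression, larger macros can chain rounds
with further commas, leaving the compiler free to schedule the
operations as it sees fit.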
Information: https://cr.yp.to/chacha.html
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Samuel Neves <sneves@dei.uc.pt>
Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: kernel-hardening@lists.openwall.com
Cc: linux-crypto@vger.kernel.org
---
include/zinc/chacha20.h | 70 +
lib/zinc/Kconfig | 4 +
lib/zinc/Makefile | 3 +
lib/zinc/chacha20/chacha20.c | 179 +++
lib/zinc/selftest/chacha20.c | 2698 ++++++++++++++++++++++++++++++++++
lib/zinc/selftest/run.h | 49 +
6 files changed, 3003 insertions(+)
create mode 100644 include/zinc/chacha20.h
create mode 100644 lib/zinc/chacha20/chacha20.c
create mode 100644 lib/zinc/selftest/chacha20.c
create mode 100644 lib/zinc/selftest/run.h
diff --git a/include/zinc/chacha20.h b/include/zinc/chacha20.h
new file mode 100644
index 000000000000..0b98bd6946ae
--- /dev/null
+++ b/include/zinc/chacha20.h
@@ -0,0 +1,70 @@
+/* SPDX-License-Identifier: GPL-2.0 OR MIT */
+/*
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#ifndef _ZINC_CHACHA20_H
+#define _ZINC_CHACHA20_H
+
+#include <asm/unaligned.h>
+#include <linux/simd.h>
+#include <linux/kernel.h>
+#include <linux/types.h>
+
+enum {
+ CHACHA20_NONCE_SIZE = 16,
+ CHACHA20_KEY_SIZE = 32,
+ CHACHA20_KEY_WORDS = CHACHA20_KEY_SIZE / sizeof(u32),
+ CHACHA20_BLOCK_SIZE = 64,
+ CHACHA20_BLOCK_WORDS = CHACHA20_BLOCK_SIZE / sizeof(u32),
+ HCHACHA20_NONCE_SIZE = CHACHA20_NONCE_SIZE,
+ HCHACHA20_KEY_SIZE = CHACHA20_KEY_SIZE
+};
+
+enum { /* expand 32-byte k */
+ CHACHA20_CONSTANT_EXPA = 0x61707865U,
+ CHACHA20_CONSTANT_ND_3 = 0x3320646eU,
+ CHACHA20_CONSTANT_2_BY = 0x79622d32U,
+ CHACHA20_CONSTANT_TE_K = 0x6b206574U
+};
+
+struct chacha20_ctx {
+ union {
+ u32 state[16];
+ struct {
+ u32 constant[4];
+ u32 key[8];
+ u32 counter[4];
+ };
+ };
+};
+
+static inline void chacha20_init(struct chacha20_ctx *ctx,
+ const u8 key[CHACHA20_KEY_SIZE],
+ const u64 nonce)
+{
+ ctx->constant[0] = CHACHA20_CONSTANT_EXPA;
+ ctx->constant[1] = CHACHA20_CONSTANT_ND_3;
+ ctx->constant[2] = CHACHA20_CONSTANT_2_BY;
+ ctx->constant[3] = CHACHA20_CONSTANT_TE_K;
+ ctx->key[0] = get_unaligned_le32(key + 0);
+ ctx->key[1] = get_unaligned_le32(key + 4);
+ ctx->key[2] = get_unaligned_le32(key + 8);
+ ctx->key[3] = get_unaligned_le32(key + 12);
+ ctx->key[4] = get_unaligned_le32(key + 16);
+ ctx->key[5] = get_unaligned_le32(key + 20);
+ ctx->key[6] = get_unaligned_le32(key + 24);
+ ctx->key[7] = get_unaligned_le32(key + 28);
+ ctx->counter[0] = 0;
+ ctx->counter[1] = 0;
+ ctx->counter[2] = nonce & U32_MAX;
+ ctx->counter[3] = nonce >> 32;
+}
+void chacha20(struct chacha20_ctx *ctx, u8 *dst, const u8 *src, u32 len,
+ simd_context_t *simd_context);
+
+void hchacha20(u32 derived_key[CHACHA20_KEY_WORDS],
+ const u8 nonce[HCHACHA20_NONCE_SIZE],
+ const u8 key[HCHACHA20_KEY_SIZE], simd_context_t *simd_context);
+
+#endif /* _ZINC_CHACHA20_H */
diff --git a/lib/zinc/Kconfig b/lib/zinc/Kconfig
index 90e066ea93a0..d271be37cecb 100644
--- a/lib/zinc/Kconfig
+++ b/lib/zinc/Kconfig
@@ -1,3 +1,7 @@
+config ZINC_CHACHA20
+ tristate
+ select CRYPTO_ALGAPI
+
config ZINC_SELFTEST
bool "Zinc cryptography library self-tests"
help
diff --git a/lib/zinc/Makefile b/lib/zinc/Makefile
index a61c80d676cb..3d80144d55a6 100644
--- a/lib/zinc/Makefile
+++ b/lib/zinc/Makefile
@@ -1,3 +1,6 @@
ccflags-y := -O2
ccflags-y += -D'pr_fmt(fmt)="zinc: " fmt'
ccflags-$(CONFIG_ZINC_DEBUG) += -DDEBUG
+
+zinc_chacha20-y := chacha20/chacha20.o
+obj-$(CONFIG_ZINC_CHACHA20) += zinc_chacha20.o
diff --git a/lib/zinc/chacha20/chacha20.c b/lib/zinc/chacha20/chacha20.c
new file mode 100644
index 000000000000..03209c15d1ca
--- /dev/null
+++ b/lib/zinc/chacha20/chacha20.c
@@ -0,0 +1,179 @@
+// SPDX-License-Identifier: GPL-2.0 OR MIT
+/*
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ *
+ * Implementation of the ChaCha20 stream cipher.
+ *
+ * Information: https://cr.yp.to/chacha.html
+ */
+
+#include <zinc/chacha20.h>
+#include "../selftest/run.h"
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/vmalloc.h>
+#include <crypto/algapi.h> // For crypto_xor_cpy.
+
+static bool *const chacha20_nobs[] __initconst = { };
+static void __init chacha20_fpu_init(void)
+{
+}
+static inline bool chacha20_arch(struct chacha20_ctx *ctx, u8 *dst,
+ const u8 *src, size_t len,
+ simd_context_t *simd_context)
+{
+ return false;
+}
+static inline bool hchacha20_arch(u32 derived_key[CHACHA20_KEY_WORDS],
+ const u8 nonce[HCHACHA20_NONCE_SIZE],
+ const u8 key[HCHACHA20_KEY_SIZE],
+ simd_context_t *simd_context)
+{
+ return false;
+}
+
+#define QUARTER_ROUND(x, a, b, c, d) ( \
+ x[a] += x[b], \
+ x[d] = rol32((x[d] ^ x[a]), 16), \
+ x[c] += x[d], \
+ x[b] = rol32((x[b] ^ x[c]), 12), \
+ x[a] += x[b], \
+ x[d] = rol32((x[d] ^ x[a]), 8), \
+ x[c] += x[d], \
+ x[b] = rol32((x[b] ^ x[c]), 7) \
+)
+
+#define C(i, j) (i * 4 + j)
+
+#define DOUBLE_ROUND(x) ( \
+ /* Column Round */ \
+ QUARTER_ROUND(x, C(0, 0), C(1, 0), C(2, 0), C(3, 0)), \
+ QUARTER_ROUND(x, C(0, 1), C(1, 1), C(2, 1), C(3, 1)), \
+ QUARTER_ROUND(x, C(0, 2), C(1, 2), C(2, 2), C(3, 2)), \
+ QUARTER_ROUND(x, C(0, 3), C(1, 3), C(2, 3), C(3, 3)), \
+ /* Diagonal Round */ \
+ QUARTER_ROUND(x, C(0, 0), C(1, 1), C(2, 2), C(3, 3)), \
+ QUARTER_ROUND(x, C(0, 1), C(1, 2), C(2, 3), C(3, 0)), \
+ QUARTER_ROUND(x, C(0, 2), C(1, 3), C(2, 0), C(3, 1)), \
+ QUARTER_ROUND(x, C(0, 3), C(1, 0), C(2, 1), C(3, 2)) \
+)
+
+#define TWENTY_ROUNDS(x) ( \
+ DOUBLE_ROUND(x), \
+ DOUBLE_ROUND(x), \
+ DOUBLE_ROUND(x), \
+ DOUBLE_ROUND(x), \
+ DOUBLE_ROUND(x), \
+ DOUBLE_ROUND(x), \
+ DOUBLE_ROUND(x), \
+ DOUBLE_ROUND(x), \
+ DOUBLE_ROUND(x), \
+ DOUBLE_ROUND(x) \
+)
+
+static void chacha20_block_generic(struct chacha20_ctx *ctx, __le32 *stream)
+{
+ u32 x[CHACHA20_BLOCK_WORDS];
+ int i;
+
+ for (i = 0; i < ARRAY_SIZE(x); ++i)
+ x[i] = ctx->state[i];
+
+ TWENTY_ROUNDS(x);
+
+ for (i = 0; i < ARRAY_SIZE(x); ++i)
+ stream[i] = cpu_to_le32(x[i] + ctx->state[i]);
+
+ ctx->counter[0] += 1;
+}
+
+static void chacha20_generic(struct chacha20_ctx *ctx, u8 *out, const u8 *in,
+ u32 len)
+{
+ __le32 buf[CHACHA20_BLOCK_WORDS];
+
+ while (len >= CHACHA20_BLOCK_SIZE) {
+ chacha20_block_generic(ctx, buf);
+ crypto_xor_cpy(out, in, (u8 *)buf, CHACHA20_BLOCK_SIZE);
+ len -= CHACHA20_BLOCK_SIZE;
+ out += CHACHA20_BLOCK_SIZE;
+ in += CHACHA20_BLOCK_SIZE;
+ }
+ if (len) {
+ chacha20_block_generic(ctx, buf);
+ crypto_xor_cpy(out, in, (u8 *)buf, len);
+ }
+}
+
+void chacha20(struct chacha20_ctx *ctx, u8 *dst, const u8 *src, u32 len,
+ simd_context_t *simd_context)
+{
+ if (!chacha20_arch(ctx, dst, src, len, simd_context))
+ chacha20_generic(ctx, dst, src, len);
+}
+EXPORT_SYMBOL(chacha20);
+
+static void hchacha20_generic(u32 derived_key[CHACHA20_KEY_WORDS],
+ const u8 nonce[HCHACHA20_NONCE_SIZE],
+ const u8 key[HCHACHA20_KEY_SIZE])
+{
+ u32 x[] = { CHACHA20_CONSTANT_EXPA,
+ CHACHA20_CONSTANT_ND_3,
+ CHACHA20_CONSTANT_2_BY,
+ CHACHA20_CONSTANT_TE_K,
+ get_unaligned_le32(key + 0),
+ get_unaligned_le32(key + 4),
+ get_unaligned_le32(key + 8),
+ get_unaligned_le32(key + 12),
+ get_unaligned_le32(key + 16),
+ get_unaligned_le32(key + 20),
+ get_unaligned_le32(key + 24),
+ get_unaligned_le32(key + 28),
+ get_unaligned_le32(nonce + 0),
+ get_unaligned_le32(nonce + 4),
+ get_unaligned_le32(nonce + 8),
+ get_unaligned_le32(nonce + 12)
+ };
+
+ TWENTY_ROUNDS(x);
+
+ memcpy(derived_key + 0, x + 0, sizeof(u32) * 4);
+ memcpy(derived_key + 4, x + 12, sizeof(u32) * 4);
+}
+
+/* Derived key should be 32-bit aligned */
+void hchacha20(u32 derived_key[CHACHA20_KEY_WORDS],
+ const u8 nonce[HCHACHA20_NONCE_SIZE],
+ const u8 key[HCHACHA20_KEY_SIZE], simd_context_t *simd_context)
+{
+ if (!hchacha20_arch(derived_key, nonce, key, simd_context))
+ hchacha20_generic(derived_key, nonce, key);
+}
+EXPORT_SYMBOL(hchacha20);
+
+#include "../selftest/chacha20.c"
+
+static bool nosimd __initdata = false;
+
+static int __init mod_init(void)
+{
+ if (!nosimd)
+ chacha20_fpu_init();
+ if (!selftest_run("chacha20", chacha20_selftest, chacha20_nobs,
+ ARRAY_SIZE(chacha20_nobs)))
+ return -ENOTRECOVERABLE;
+ return 0;
+}
+
+static void __exit mod_exit(void)
+{
+}
+
+module_param(nosimd, bool, 0);
+module_init(mod_init);
+module_exit(mod_exit);
+MODULE_LICENSE("GPL v2");
+MODULE_DESCRIPTION("ChaCha20 stream cipher");
+MODULE_AUTHOR("Jason A. Donenfeld <Jason@zx2c4.com>");
diff --git a/lib/zinc/selftest/chacha20.c b/lib/zinc/selftest/chacha20.c
new file mode 100644
index 000000000000..b8c9c709071d
--- /dev/null
+++ b/lib/zinc/selftest/chacha20.c
@@ -0,0 +1,2698 @@
+// SPDX-License-Identifier: GPL-2.0 OR MIT
+/*
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+struct chacha20_testvec {
+ const u8 *input, *output, *key;
+ u64 nonce;
+ size_t ilen;
+};
+
+struct hchacha20_testvec {
+ u8 key[HCHACHA20_KEY_SIZE];
+ u8 nonce[HCHACHA20_NONCE_SIZE];
+ u8 output[CHACHA20_KEY_SIZE];
+};
+
+/* Some of these test vectors were generated by reference implementations
+ * and are designed to check chacha20 implementation block handling; the
+ * others are taken from the draft-arciszewski-xchacha-01 document.
+ */
+
+static const u8 input01[] __initconst = { };
+static const u8 output01[] __initconst = { };
+static const u8 key01[] __initconst = {
+ 0x09, 0xf4, 0xe8, 0x57, 0x10, 0xf2, 0x12, 0xc3,
+ 0xc6, 0x91, 0xc4, 0x09, 0x97, 0x46, 0xef, 0xfe,
+ 0x02, 0x00, 0xe4, 0x5c, 0x82, 0xed, 0x16, 0xf3,
+ 0x32, 0xbe, 0xec, 0x7a, 0xe6, 0x68, 0x12, 0x26
+};
+enum { nonce01 = 0x3834e2afca3c66d3ULL };
+
+static const u8 input02[] __initconst = {
+ 0x9d
+};
+static const u8 output02[] __initconst = {
+ 0x94
+};
+static const u8 key02[] __initconst = {
+ 0x8c, 0x01, 0xac, 0xaf, 0x62, 0x63, 0x56, 0x7a,
+ 0xad, 0x23, 0x4c, 0x58, 0x29, 0x29, 0xbe, 0xab,
+ 0xe9, 0xf8, 0xdf, 0x6c, 0x8c, 0x74, 0x4d, 0x7d,
+ 0x13, 0x94, 0x10, 0x02, 0x3d, 0x8e, 0x9f, 0x94
+};
+enum { nonce02 = 0x5d1b3bfdedd9f73aULL };
+
+static const u8 input03[] __initconst = {
+ 0x04, 0x16
+};
+static const u8 output03[] __initconst = {
+ 0x92, 0x07
+};
+static const u8 key03[] __initconst = {
+ 0x22, 0x0c, 0x79, 0x2c, 0x38, 0x51, 0xbe, 0x99,
+ 0xa9, 0x59, 0x24, 0x50, 0xef, 0x87, 0x38, 0xa6,
+ 0xa0, 0x97, 0x20, 0xcb, 0xb4, 0x0c, 0x94, 0x67,
+ 0x1f, 0x98, 0xdc, 0xc4, 0x83, 0xbc, 0x35, 0x4d
+};
+enum { nonce03 = 0x7a3353ad720a3e2eULL };
+
+static const u8 input04[] __initconst = {
+ 0xc7, 0xcc, 0xd0
+};
+static const u8 output04[] __initconst = {
+ 0xd8, 0x41, 0x80
+};
+static const u8 key04[] __initconst = {
+ 0x81, 0x5e, 0x12, 0x01, 0xc4, 0x36, 0x15, 0x03,
+ 0x11, 0xa0, 0xe9, 0x86, 0xbb, 0x5a, 0xdc, 0x45,
+ 0x7d, 0x5e, 0x98, 0xf8, 0x06, 0x76, 0x1c, 0xec,
+ 0xc0, 0xf7, 0xca, 0x4e, 0x99, 0xd9, 0x42, 0x38
+};
+enum { nonce04 = 0x6816e2fc66176da2ULL };
+
+static const u8 input05[] __initconst = {
+ 0x48, 0xf1, 0x31, 0x5f
+};
+static const u8 output05[] __initconst = {
+ 0x48, 0xf7, 0x13, 0x67
+};
+static const u8 key05[] __initconst = {
+ 0x3f, 0xd6, 0xb6, 0x5e, 0x2f, 0xda, 0x82, 0x39,
+ 0x97, 0x06, 0xd3, 0x62, 0x4f, 0xbd, 0xcb, 0x9b,
+ 0x1d, 0xe6, 0x4a, 0x76, 0xab, 0xdd, 0x14, 0x50,
+ 0x59, 0x21, 0xe3, 0xb2, 0xc7, 0x95, 0xbc, 0x45
+};
+enum { nonce05 = 0xc41a7490e228cc42ULL };
+
+static const u8 input06[] __initconst = {
+ 0xae, 0xa2, 0x85, 0x1d, 0xc8
+};
+static const u8 output06[] __initconst = {
+ 0xfa, 0xff, 0x45, 0x6b, 0x6f
+};
+static const u8 key06[] __initconst = {
+ 0x04, 0x8d, 0xea, 0x67, 0x20, 0x78, 0xfb, 0x8f,
+ 0x49, 0x80, 0x35, 0xb5, 0x7b, 0xe4, 0x31, 0x74,
+ 0x57, 0x43, 0x3a, 0x64, 0x64, 0xb9, 0xe6, 0x23,
+ 0x4d, 0xfe, 0xb8, 0x7b, 0x71, 0x4d, 0x9d, 0x21
+};
+enum { nonce06 = 0x251366db50b10903ULL };
+
+static const u8 input07[] __initconst = {
+ 0x1a, 0x32, 0x85, 0xb6, 0xe8, 0x52
+};
+static const u8 output07[] __initconst = {
+ 0xd3, 0x5f, 0xf0, 0x07, 0x69, 0xec
+};
+static const u8 key07[] __initconst = {
+ 0xbf, 0x2d, 0x42, 0x99, 0x97, 0x76, 0x04, 0xad,
+ 0xd3, 0x8f, 0x6e, 0x6a, 0x34, 0x85, 0xaf, 0x81,
+ 0xef, 0x36, 0x33, 0xd5, 0x43, 0xa2, 0xaa, 0x08,
+ 0x0f, 0x77, 0x42, 0x83, 0x58, 0xc5, 0x42, 0x2a
+};
+enum { nonce07 = 0xe0796da17dba9b58ULL };
+
+static const u8 input08[] __initconst = {
+ 0x40, 0xae, 0xcd, 0xe4, 0x3d, 0x22, 0xe0
+};
+static const u8 output08[] __initconst = {
+ 0xfd, 0x8a, 0x9f, 0x3d, 0x05, 0xc9, 0xd3
+};
+static const u8 key08[] __initconst = {
+ 0xdc, 0x3f, 0x41, 0xe3, 0x23, 0x2a, 0x8d, 0xf6,
+ 0x41, 0x2a, 0xa7, 0x66, 0x05, 0x68, 0xe4, 0x7b,
+ 0xc4, 0x58, 0xd6, 0xcc, 0xdf, 0x0d, 0xc6, 0x25,
+ 0x1b, 0x61, 0x32, 0x12, 0x4e, 0xf1, 0xe6, 0x29
+};
+enum { nonce08 = 0xb1d2536d9e159832ULL };
+
+static const u8 input09[] __initconst = {
+ 0xba, 0x1d, 0x14, 0x16, 0x9f, 0x83, 0x67, 0x24
+};
+static const u8 output09[] __initconst = {
+ 0x7c, 0xe3, 0x78, 0x1d, 0xa2, 0xe7, 0xe9, 0x39
+};
+static const u8 key09[] __initconst = {
+ 0x17, 0x55, 0x90, 0x52, 0xa4, 0xce, 0x12, 0xae,
+ 0xd4, 0xfd, 0xd4, 0xfb, 0xd5, 0x18, 0x59, 0x50,
+ 0x4e, 0x51, 0x99, 0x32, 0x09, 0x31, 0xfc, 0xf7,
+ 0x27, 0x10, 0x8e, 0xa2, 0x4b, 0xa5, 0xf5, 0x62
+};
+enum { nonce09 = 0x495fc269536d003ULL };
+
+static const u8 input10[] __initconst = {
+ 0x09, 0xfd, 0x3c, 0x0b, 0x3d, 0x0e, 0xf3, 0x9d,
+ 0x27
+};
+static const u8 output10[] __initconst = {
+ 0xdc, 0xe4, 0x33, 0x60, 0x0c, 0x07, 0xcb, 0x51,
+ 0x6b
+};
+static const u8 key10[] __initconst = {
+ 0x4e, 0x00, 0x72, 0x37, 0x0f, 0x52, 0x4d, 0x6f,
+ 0x37, 0x50, 0x3c, 0xb3, 0x51, 0x81, 0x49, 0x16,
+ 0x7e, 0xfd, 0xb1, 0x51, 0x72, 0x2e, 0xe4, 0x16,
+ 0x68, 0x5c, 0x5b, 0x8a, 0xc3, 0x90, 0x70, 0x04
+};
+enum { nonce10 = 0x1ad9d1114d88cbbdULL };
+
+static const u8 input11[] __initconst = {
+ 0x70, 0x18, 0x52, 0x85, 0xba, 0x66, 0xff, 0x2c,
+ 0x9a, 0x46
+};
+static const u8 output11[] __initconst = {
+ 0xf5, 0x2a, 0x7a, 0xfd, 0x31, 0x7c, 0x91, 0x41,
+ 0xb1, 0xcf
+};
+static const u8 key11[] __initconst = {
+ 0x48, 0xb4, 0xd0, 0x7c, 0x88, 0xd1, 0x96, 0x0d,
+ 0x80, 0x33, 0xb4, 0xd5, 0x31, 0x9a, 0x88, 0xca,
+ 0x14, 0xdc, 0xf0, 0xa8, 0xf3, 0xac, 0xb8, 0x47,
+ 0x75, 0x86, 0x7c, 0x88, 0x50, 0x11, 0x43, 0x40
+};
+enum { nonce11 = 0x47c35dd1f4f8aa4fULL };
+
+static const u8 input12[] __initconst = {
+ 0x9e, 0x8e, 0x3d, 0x2a, 0x05, 0xfd, 0xe4, 0x90,
+ 0x24, 0x1c, 0xd3
+};
+static const u8 output12[] __initconst = {
+ 0x97, 0x72, 0x40, 0x9f, 0xc0, 0x6b, 0x05, 0x33,
+ 0x42, 0x7e, 0x28
+};
+static const u8 key12[] __initconst = {
+ 0xee, 0xff, 0x33, 0x33, 0xe0, 0x28, 0xdf, 0xa2,
+ 0xb6, 0x5e, 0x25, 0x09, 0x52, 0xde, 0xa5, 0x9c,
+ 0x8f, 0x95, 0xa9, 0x03, 0x77, 0x0f, 0xbe, 0xa1,
+ 0xd0, 0x7d, 0x73, 0x2f, 0xf8, 0x7e, 0x51, 0x44
+};
+enum { nonce12 = 0xc22d044dc6ea4af3ULL };
+
+static const u8 input13[] __initconst = {
+ 0x9c, 0x16, 0xa2, 0x22, 0x4d, 0xbe, 0x04, 0x9a,
+ 0xb3, 0xb5, 0xc6, 0x58
+};
+static const u8 output13[] __initconst = {
+ 0xf0, 0x81, 0xdb, 0x6d, 0xa3, 0xe9, 0xb2, 0xc6,
+ 0x32, 0x50, 0x16, 0x9f
+};
+static const u8 key13[] __initconst = {
+ 0x96, 0xb3, 0x01, 0xd2, 0x7a, 0x8c, 0x94, 0x09,
+ 0x4f, 0x58, 0xbe, 0x80, 0xcc, 0xa9, 0x7e, 0x2d,
+ 0xad, 0x58, 0x3b, 0x63, 0xb8, 0x5c, 0x17, 0xce,
+ 0xbf, 0x43, 0x33, 0x7a, 0x7b, 0x82, 0x28, 0x2f
+};
+enum { nonce13 = 0x2a5d05d88cd7b0daULL };
+
+static const u8 input14[] __initconst = {
+ 0x57, 0x4f, 0xaa, 0x30, 0xe6, 0x23, 0x50, 0x86,
+ 0x91, 0xa5, 0x60, 0x96, 0x2b
+};
+static const u8 output14[] __initconst = {
+ 0x6c, 0x1f, 0x3b, 0x42, 0xb6, 0x2f, 0xf0, 0xbd,
+ 0x76, 0x60, 0xc7, 0x7e, 0x8d
+};
+static const u8 key14[] __initconst = {
+ 0x22, 0x85, 0xaf, 0x8f, 0xa3, 0x53, 0xa0, 0xc4,
+ 0xb5, 0x75, 0xc0, 0xba, 0x30, 0x92, 0xc3, 0x32,
+ 0x20, 0x5a, 0x8f, 0x7e, 0x93, 0xda, 0x65, 0x18,
+ 0xd1, 0xf6, 0x9a, 0x9b, 0x8f, 0x85, 0x30, 0xe6
+};
+enum { nonce14 = 0xf9946c166aa4475fULL };
+
+static const u8 input15[] __initconst = {
+ 0x89, 0x81, 0xc7, 0xe2, 0x00, 0xac, 0x52, 0x70,
+ 0xa4, 0x79, 0xab, 0xeb, 0x74, 0xf7
+};
+static const u8 output15[] __initconst = {
+ 0xb4, 0xd0, 0xa9, 0x9d, 0x15, 0x5f, 0x48, 0xd6,
+ 0x00, 0x7e, 0x4c, 0x77, 0x5a, 0x46
+};
+static const u8 key15[] __initconst = {
+ 0x0a, 0x66, 0x36, 0xca, 0x5d, 0x82, 0x23, 0xb6,
+ 0xe4, 0x9b, 0xad, 0x5e, 0xd0, 0x7f, 0xf6, 0x7a,
+ 0x7b, 0x03, 0xa7, 0x4c, 0xfd, 0xec, 0xd5, 0xa1,
+ 0xfc, 0x25, 0x54, 0xda, 0x5a, 0x5c, 0xf0, 0x2c
+};
+enum { nonce15 = 0x9ab2b87a35e772c8ULL };
+
+static const u8 input16[] __initconst = {
+ 0x5f, 0x09, 0xc0, 0x8b, 0x1e, 0xde, 0xca, 0xd9,
+ 0xb7, 0x5c, 0x23, 0xc9, 0x55, 0x1e, 0xcf
+};
+static const u8 output16[] __initconst = {
+ 0x76, 0x9b, 0x53, 0xf3, 0x66, 0x88, 0x28, 0x60,
+ 0x98, 0x80, 0x2c, 0xa8, 0x80, 0xa6, 0x48
+};
+static const u8 key16[] __initconst = {
+ 0x80, 0xb5, 0x51, 0xdf, 0x17, 0x5b, 0xb0, 0xef,
+ 0x8b, 0x5b, 0x2e, 0x3e, 0xc5, 0xe3, 0xa5, 0x86,
+ 0xac, 0x0d, 0x8e, 0x32, 0x90, 0x9d, 0x82, 0x27,
+ 0xf1, 0x23, 0x26, 0xc3, 0xea, 0x55, 0xb6, 0x63
+};
+enum { nonce16 = 0xa82e9d39e4d02ef5ULL };
+
+static const u8 input17[] __initconst = {
+ 0x87, 0x0b, 0x36, 0x71, 0x7c, 0xb9, 0x0b, 0x80,
+ 0x4d, 0x77, 0x5c, 0x4f, 0xf5, 0x51, 0x0e, 0x1a
+};
+static const u8 output17[] __initconst = {
+ 0xf1, 0x12, 0x4a, 0x8a, 0xd9, 0xd0, 0x08, 0x67,
+ 0x66, 0xd7, 0x34, 0xea, 0x32, 0x3b, 0x54, 0x0e
+};
+static const u8 key17[] __initconst = {
+ 0xfb, 0x71, 0x5f, 0x3f, 0x7a, 0xc0, 0x9a, 0xc8,
+ 0xc8, 0xcf, 0xe8, 0xbc, 0xfb, 0x09, 0xbf, 0x89,
+ 0x6a, 0xef, 0xd5, 0xe5, 0x36, 0x87, 0x14, 0x76,
+ 0x00, 0xb9, 0x32, 0x28, 0xb2, 0x00, 0x42, 0x53
+};
+enum { nonce17 = 0x229b87e73d557b96ULL };
+
+static const u8 input18[] __initconst = {
+ 0x38, 0x42, 0xb5, 0x37, 0xb4, 0x3d, 0xfe, 0x59,
+ 0x38, 0x68, 0x88, 0xfa, 0x89, 0x8a, 0x5f, 0x90,
+ 0x3c
+};
+static const u8 output18[] __initconst = {
+ 0xac, 0xad, 0x14, 0xe8, 0x7e, 0xd7, 0xce, 0x96,
+ 0x3d, 0xb3, 0x78, 0x85, 0x22, 0x5a, 0xcb, 0x39,
+ 0xd4
+};
+static const u8 key18[] __initconst = {
+ 0xe1, 0xc1, 0xa8, 0xe0, 0x91, 0xe7, 0x38, 0x66,
+ 0x80, 0x17, 0x12, 0x3c, 0x5e, 0x2d, 0xbb, 0xea,
+ 0xeb, 0x6c, 0x8b, 0xc8, 0x1b, 0x6f, 0x7c, 0xea,
+ 0x50, 0x57, 0x23, 0x1e, 0x65, 0x6f, 0x6d, 0x81
+};
+enum { nonce18 = 0xfaf5fcf8f30e57a9ULL };
+
+static const u8 input19[] __initconst = {
+ 0x1c, 0x4a, 0x30, 0x26, 0xef, 0x9a, 0x32, 0xa7,
+ 0x8f, 0xe5, 0xc0, 0x0f, 0x30, 0x3a, 0xbf, 0x38,
+ 0x54, 0xba
+};
+static const u8 output19[] __initconst = {
+ 0x57, 0x67, 0x54, 0x4f, 0x31, 0xd6, 0xef, 0x35,
+ 0x0b, 0xd9, 0x52, 0xa7, 0x46, 0x7d, 0x12, 0x17,
+ 0x1e, 0xe3
+};
+static const u8 key19[] __initconst = {
+ 0x5a, 0x79, 0xc1, 0xea, 0x33, 0xb3, 0xc7, 0x21,
+ 0xec, 0xf8, 0xcb, 0xd2, 0x58, 0x96, 0x23, 0xd6,
+ 0x4d, 0xed, 0x2f, 0xdf, 0x8a, 0x79, 0xe6, 0x8b,
+ 0x38, 0xa3, 0xc3, 0x7a, 0x33, 0xda, 0x02, 0xc7
+};
+enum { nonce19 = 0x2b23b61840429604ULL };
+
+static const u8 input20[] __initconst = {
+ 0xab, 0xe9, 0x32, 0xbb, 0x35, 0x17, 0xe0, 0x60,
+ 0x80, 0xb1, 0x27, 0xdc, 0xe6, 0x62, 0x9e, 0x0c,
+ 0x77, 0xf4, 0x50
+};
+static const u8 output20[] __initconst = {
+ 0x54, 0x6d, 0xaa, 0xfc, 0x08, 0xfb, 0x71, 0xa8,
+ 0xd6, 0x1d, 0x7d, 0xf3, 0x45, 0x10, 0xb5, 0x4c,
+ 0xcc, 0x4b, 0x45
+};
+static const u8 key20[] __initconst = {
+ 0xa3, 0xfd, 0x3d, 0xa9, 0xeb, 0xea, 0x2c, 0x69,
+ 0xcf, 0x59, 0x38, 0x13, 0x5b, 0xa7, 0x53, 0x8f,
+ 0x5e, 0xa2, 0x33, 0x86, 0x4c, 0x75, 0x26, 0xaf,
+ 0x35, 0x12, 0x09, 0x71, 0x81, 0xea, 0x88, 0x66
+};
+enum { nonce20 = 0x7459667a8fadff58ULL };
+
+static const u8 input21[] __initconst = {
+ 0xa6, 0x82, 0x21, 0x23, 0xad, 0x27, 0x3f, 0xc6,
+ 0xd7, 0x16, 0x0d, 0x6d, 0x24, 0x15, 0x54, 0xc5,
+ 0x96, 0x72, 0x59, 0x8a
+};
+static const u8 output21[] __initconst = {
+ 0x5f, 0x34, 0x32, 0xea, 0x06, 0xd4, 0x9e, 0x01,
+ 0xdc, 0x32, 0x32, 0x40, 0x66, 0x73, 0x6d, 0x4a,
+ 0x6b, 0x12, 0x20, 0xe8
+};
+static const u8 key21[] __initconst = {
+ 0x96, 0xfd, 0x13, 0x23, 0xa9, 0x89, 0x04, 0xe6,
+ 0x31, 0xa5, 0x2c, 0xc1, 0x40, 0xd5, 0x69, 0x5c,
+ 0x32, 0x79, 0x56, 0xe0, 0x29, 0x93, 0x8f, 0xe8,
+ 0x5f, 0x65, 0x53, 0x7f, 0xc1, 0xe9, 0xaf, 0xaf
+};
+enum { nonce21 = 0xba8defee9d8e13b5ULL };
+
+static const u8 input22[] __initconst = {
+ 0xb8, 0x32, 0x1a, 0x81, 0xd8, 0x38, 0x89, 0x5a,
+ 0xb0, 0x05, 0xbe, 0xf4, 0xd2, 0x08, 0xc6, 0xee,
+ 0x79, 0x7b, 0x3a, 0x76, 0x59
+};
+static const u8 output22[] __initconst = {
+ 0xb7, 0xba, 0xae, 0x80, 0xe4, 0x9f, 0x79, 0x84,
+ 0x5a, 0x48, 0x50, 0x6d, 0xcb, 0xd0, 0x06, 0x0c,
+ 0x15, 0x63, 0xa7, 0x5e, 0xbd
+};
+static const u8 key22[] __initconst = {
+ 0x0f, 0x35, 0x3d, 0xeb, 0x5f, 0x0a, 0x82, 0x0d,
+ 0x24, 0x59, 0x71, 0xd8, 0xe6, 0x2d, 0x5f, 0xe1,
+ 0x7e, 0x0c, 0xae, 0xf6, 0xdc, 0x2c, 0xc5, 0x4a,
+ 0x38, 0x88, 0xf2, 0xde, 0xd9, 0x5f, 0x76, 0x7c
+};
+enum { nonce22 = 0xe77f1760e9f5e192ULL };
+
+static const u8 input23[] __initconst = {
+ 0x4b, 0x1e, 0x79, 0x99, 0xcf, 0xef, 0x64, 0x4b,
+ 0xb0, 0x66, 0xae, 0x99, 0x2e, 0x68, 0x97, 0xf5,
+ 0x5d, 0x9b, 0x3f, 0x7a, 0xa9, 0xd9
+};
+static const u8 output23[] __initconst = {
+ 0x5f, 0xa4, 0x08, 0x39, 0xca, 0xfa, 0x2b, 0x83,
+ 0x5d, 0x95, 0x70, 0x7c, 0x2e, 0xd4, 0xae, 0xfa,
+ 0x45, 0x4a, 0x77, 0x7f, 0xa7, 0x65
+};
+static const u8 key23[] __initconst = {
+ 0x4a, 0x06, 0x83, 0x64, 0xaa, 0xe3, 0x38, 0x32,
+ 0x28, 0x5d, 0xa4, 0xb2, 0x5a, 0xee, 0xcf, 0x8e,
+ 0x19, 0x67, 0xf1, 0x09, 0xe8, 0xc9, 0xf6, 0x40,
+ 0x02, 0x6d, 0x0b, 0xde, 0xfa, 0x81, 0x03, 0xb1
+};
+enum { nonce23 = 0x9b3f349158709849ULL };
+
+static const u8 input24[] __initconst = {
+ 0xc6, 0xfc, 0x47, 0x5e, 0xd8, 0xed, 0xa9, 0xe5,
+ 0x4f, 0x82, 0x79, 0x35, 0xee, 0x3e, 0x7e, 0x3e,
+ 0x35, 0x70, 0x6e, 0xfa, 0x6d, 0x08, 0xe8
+};
+static const u8 output24[] __initconst = {
+ 0x3b, 0xc5, 0xf8, 0xc2, 0xbf, 0x2b, 0x90, 0x33,
+ 0xa6, 0xae, 0xf5, 0x5a, 0x65, 0xb3, 0x3d, 0xe1,
+ 0xcd, 0x5f, 0x55, 0xfa, 0xe7, 0xa5, 0x4a
+};
+static const u8 key24[] __initconst = {
+ 0x00, 0x24, 0xc3, 0x65, 0x5f, 0xe6, 0x31, 0xbb,
+ 0x6d, 0xfc, 0x20, 0x7b, 0x1b, 0xa8, 0x96, 0x26,
+ 0x55, 0x21, 0x62, 0x25, 0x7e, 0xba, 0x23, 0x97,
+ 0xc9, 0xb8, 0x53, 0xa8, 0xef, 0xab, 0xad, 0x61
+};
+enum { nonce24 = 0x13ee0b8f526177c3ULL };
+
+static const u8 input25[] __initconst = {
+ 0x33, 0x07, 0x16, 0xb1, 0x34, 0x33, 0x67, 0x04,
+ 0x9b, 0x0a, 0xce, 0x1b, 0xe9, 0xde, 0x1a, 0xec,
+ 0xd0, 0x55, 0xfb, 0xc6, 0x33, 0xaf, 0x2d, 0xe3
+};
+static const u8 output25[] __initconst = {
+ 0x05, 0x93, 0x10, 0xd1, 0x58, 0x6f, 0x68, 0x62,
+ 0x45, 0xdb, 0x91, 0xae, 0x70, 0xcf, 0xd4, 0x5f,
+ 0xee, 0xdf, 0xd5, 0xba, 0x9e, 0xde, 0x68, 0xe6
+};
+static const u8 key25[] __initconst = {
+ 0x83, 0xa9, 0x4f, 0x5d, 0x74, 0xd5, 0x91, 0xb3,
+ 0xc9, 0x97, 0x19, 0x15, 0xdb, 0x0d, 0x0b, 0x4a,
+ 0x3d, 0x55, 0xcf, 0xab, 0xb2, 0x05, 0x21, 0x35,
+ 0x45, 0x50, 0xeb, 0xf8, 0xf5, 0xbf, 0x36, 0x35
+};
+enum { nonce25 = 0x7c6f459e49ebfebcULL };
+
+static const u8 input26[] __initconst = {
+ 0xc2, 0xd4, 0x7a, 0xa3, 0x92, 0xe1, 0xac, 0x46,
+ 0x1a, 0x15, 0x38, 0xc9, 0xb5, 0xfd, 0xdf, 0x84,
+ 0x38, 0xbc, 0x6b, 0x1d, 0xb0, 0x83, 0x43, 0x04,
+ 0x39
+};
+static const u8 output26[] __initconst = {
+ 0x7f, 0xde, 0xd6, 0x87, 0xcc, 0x34, 0xf4, 0x12,
+ 0xae, 0x55, 0xa5, 0x89, 0x95, 0x29, 0xfc, 0x18,
+ 0xd8, 0xc7, 0x7c, 0xd3, 0xcb, 0x85, 0x95, 0x21,
+ 0xd2
+};
+static const u8 key26[] __initconst = {
+ 0xe4, 0xd0, 0x54, 0x1d, 0x7d, 0x47, 0xa8, 0xc1,
+ 0x08, 0xca, 0xe2, 0x42, 0x52, 0x95, 0x16, 0x43,
+ 0xa3, 0x01, 0x23, 0x03, 0xcc, 0x3b, 0x81, 0x78,
+ 0x23, 0xcc, 0xa7, 0x36, 0xd7, 0xa0, 0x97, 0x8d
+};
+enum { nonce26 = 0x524401012231683ULL };
+
+static const u8 input27[] __initconst = {
+ 0x0d, 0xb0, 0xcf, 0xec, 0xfc, 0x38, 0x9d, 0x9d,
+ 0x89, 0x00, 0x96, 0xf2, 0x79, 0x8a, 0xa1, 0x8d,
+ 0x32, 0x5e, 0xc6, 0x12, 0x22, 0xec, 0xf6, 0x52,
+ 0xc1, 0x0b
+};
+static const u8 output27[] __initconst = {
+ 0xef, 0xe1, 0xf2, 0x67, 0x8e, 0x2c, 0x00, 0x9f,
+ 0x1d, 0x4c, 0x66, 0x1f, 0x94, 0x58, 0xdc, 0xbb,
+ 0xb9, 0x11, 0x8f, 0x74, 0xfd, 0x0e, 0x14, 0x01,
+ 0xa8, 0x21
+};
+static const u8 key27[] __initconst = {
+ 0x78, 0x71, 0xa4, 0xe6, 0xb2, 0x95, 0x44, 0x12,
+ 0x81, 0xaa, 0x7e, 0x94, 0xa7, 0x8d, 0x44, 0xea,
+ 0xc4, 0xbc, 0x01, 0xb7, 0x9e, 0xf7, 0x82, 0x9e,
+ 0x3b, 0x23, 0x9f, 0x31, 0xdd, 0xb8, 0x0d, 0x18
+};
+enum { nonce27 = 0xd58fe0e58fb254d6ULL };
+
+static const u8 input28[] __initconst = {
+ 0xaa, 0xb7, 0xaa, 0xd9, 0xa8, 0x91, 0xd7, 0x8a,
+ 0x97, 0x9b, 0xdb, 0x7c, 0x47, 0x2b, 0xdb, 0xd2,
+ 0xda, 0x77, 0xb1, 0xfa, 0x2d, 0x12, 0xe3, 0xe9,
+ 0xc4, 0x7f, 0x54
+};
+static const u8 output28[] __initconst = {
+ 0x87, 0x84, 0xa9, 0xa6, 0xad, 0x8f, 0xe6, 0x0f,
+ 0x69, 0xf8, 0x21, 0xc3, 0x54, 0x95, 0x0f, 0xb0,
+ 0x4e, 0xc7, 0x02, 0xe4, 0x04, 0xb0, 0x6c, 0x42,
+ 0x8c, 0x63, 0xe3
+};
+static const u8 key28[] __initconst = {
+ 0x12, 0x23, 0x37, 0x95, 0x04, 0xb4, 0x21, 0xe8,
+ 0xbc, 0x65, 0x46, 0x7a, 0xf4, 0x01, 0x05, 0x3f,
+ 0xb1, 0x34, 0x73, 0xd2, 0x49, 0xbf, 0x6f, 0x20,
+ 0xbd, 0x23, 0x58, 0x5f, 0xd1, 0x73, 0x57, 0xa6
+};
+enum { nonce28 = 0x3a04d51491eb4e07ULL };
+
+static const u8 input29[] __initconst = {
+ 0x55, 0xd0, 0xd4, 0x4b, 0x17, 0xc8, 0xc4, 0x2b,
+ 0xc0, 0x28, 0xbd, 0x9d, 0x65, 0x4d, 0xaf, 0x77,
+ 0x72, 0x7c, 0x36, 0x68, 0xa7, 0xb6, 0x87, 0x4d,
+ 0xb9, 0x27, 0x25, 0x6c
+};
+static const u8 output29[] __initconst = {
+ 0x0e, 0xac, 0x4c, 0xf5, 0x12, 0xb5, 0x56, 0xa5,
+ 0x00, 0x9a, 0xd6, 0xe5, 0x1a, 0x59, 0x2c, 0xf6,
+ 0x42, 0x22, 0xcf, 0x23, 0x98, 0x34, 0x29, 0xac,
+ 0x6e, 0xe3, 0x37, 0x6d
+};
+static const u8 key29[] __initconst = {
+ 0xda, 0x9d, 0x05, 0x0c, 0x0c, 0xba, 0x75, 0xb9,
+ 0x9e, 0xb1, 0x8d, 0xd9, 0x73, 0x26, 0x2c, 0xa9,
+ 0x3a, 0xb5, 0xcb, 0x19, 0x49, 0xa7, 0x4f, 0xf7,
+ 0x64, 0x35, 0x23, 0x20, 0x2a, 0x45, 0x78, 0xc7
+};
+enum { nonce29 = 0xc25ac9982431cbfULL };
+
+static const u8 input30[] __initconst = {
+ 0x4e, 0xd6, 0x85, 0xbb, 0xe7, 0x99, 0xfa, 0x04,
+ 0x33, 0x24, 0xfd, 0x75, 0x18, 0xe3, 0xd3, 0x25,
+ 0xcd, 0xca, 0xae, 0x00, 0xbe, 0x52, 0x56, 0x4a,
+ 0x31, 0xe9, 0x4f, 0xae, 0x8a
+};
+static const u8 output30[] __initconst = {
+ 0x30, 0x36, 0x32, 0xa2, 0x3c, 0xb6, 0xf9, 0xf9,
+ 0x76, 0x70, 0xad, 0xa6, 0x10, 0x41, 0x00, 0x4a,
+ 0xfa, 0xce, 0x1b, 0x86, 0x05, 0xdb, 0x77, 0x96,
+ 0xb3, 0xb7, 0x8f, 0x61, 0x24
+};
+static const u8 key30[] __initconst = {
+ 0x49, 0x35, 0x4c, 0x15, 0x98, 0xfb, 0xc6, 0x57,
+ 0x62, 0x6d, 0x06, 0xc3, 0xd4, 0x79, 0x20, 0x96,
+ 0x05, 0x2a, 0x31, 0x63, 0xc0, 0x44, 0x42, 0x09,
+ 0x13, 0x13, 0xff, 0x1b, 0xc8, 0x63, 0x1f, 0x0b
+};
+enum { nonce30 = 0x4967f9c08e41568bULL };
+
+static const u8 input31[] __initconst = {
+ 0x91, 0x04, 0x20, 0x47, 0x59, 0xee, 0xa6, 0x0f,
+ 0x04, 0x75, 0xc8, 0x18, 0x95, 0x44, 0x01, 0x28,
+ 0x20, 0x6f, 0x73, 0x68, 0x66, 0xb5, 0x03, 0xb3,
+ 0x58, 0x27, 0x6e, 0x7a, 0x76, 0xb8
+};
+static const u8 output31[] __initconst = {
+ 0xe8, 0x03, 0x78, 0x9d, 0x13, 0x15, 0x98, 0xef,
+ 0x64, 0x68, 0x12, 0x41, 0xb0, 0x29, 0x94, 0x0c,
+ 0x83, 0x35, 0x46, 0xa9, 0x74, 0xe1, 0x75, 0xf0,
+ 0xb6, 0x96, 0xc3, 0x6f, 0xd7, 0x70
+};
+static const u8 key31[] __initconst = {
+ 0xef, 0xcd, 0x5a, 0x4a, 0xf4, 0x7e, 0x6a, 0x3a,
+ 0x11, 0x88, 0x72, 0x94, 0xb8, 0xae, 0x84, 0xc3,
+ 0x66, 0xe0, 0xde, 0x4b, 0x00, 0xa5, 0xd6, 0x2d,
+ 0x50, 0xb7, 0x28, 0xff, 0x76, 0x57, 0x18, 0x1f
+};
+enum { nonce31 = 0xcb6f428fa4192e19ULL };
+
+static const u8 input32[] __initconst = {
+ 0x90, 0x06, 0x50, 0x4b, 0x98, 0x14, 0x30, 0xf1,
+ 0xb8, 0xd7, 0xf0, 0xa4, 0x3e, 0x4e, 0xd8, 0x00,
+ 0xea, 0xdb, 0x4f, 0x93, 0x05, 0xef, 0x02, 0x71,
+ 0x1a, 0xcd, 0xa3, 0xb1, 0xae, 0xd3, 0x18
+};
+static const u8 output32[] __initconst = {
+ 0xcb, 0x4a, 0x37, 0x3f, 0xea, 0x40, 0xab, 0x86,
+ 0xfe, 0xcc, 0x07, 0xd5, 0xdc, 0xb2, 0x25, 0xb6,
+ 0xfd, 0x2a, 0x72, 0xbc, 0x5e, 0xd4, 0x75, 0xff,
+ 0x71, 0xfc, 0xce, 0x1e, 0x6f, 0x22, 0xc1
+};
+static const u8 key32[] __initconst = {
+ 0xfc, 0x6d, 0xc3, 0x80, 0xce, 0xa4, 0x31, 0xa1,
+ 0xcc, 0xfa, 0x9d, 0x10, 0x0b, 0xc9, 0x11, 0x77,
+ 0x34, 0xdb, 0xad, 0x1b, 0xc4, 0xfc, 0xeb, 0x79,
+ 0x91, 0xda, 0x59, 0x3b, 0x0d, 0xb1, 0x19, 0x3b
+};
+enum { nonce32 = 0x88551bf050059467ULL };
+
+static const u8 input33[] __initconst = {
+ 0x88, 0x94, 0x71, 0x92, 0xe8, 0xd7, 0xf9, 0xbd,
+ 0x55, 0xe3, 0x22, 0xdb, 0x99, 0x51, 0xfb, 0x50,
+ 0xbf, 0x82, 0xb5, 0x70, 0x8b, 0x2b, 0x6a, 0x03,
+ 0x37, 0xa0, 0xc6, 0x19, 0x5d, 0xc9, 0xbc, 0xcc
+};
+static const u8 output33[] __initconst = {
+ 0xb6, 0x17, 0x51, 0xc8, 0xea, 0x8a, 0x14, 0xdc,
+ 0x23, 0x1b, 0xd4, 0xed, 0xbf, 0x50, 0xb9, 0x38,
+ 0x00, 0xc2, 0x3f, 0x78, 0x3d, 0xbf, 0xa0, 0x84,
+ 0xef, 0x45, 0xb2, 0x7d, 0x48, 0x7b, 0x62, 0xa7
+};
+static const u8 key33[] __initconst = {
+ 0xb9, 0x8f, 0x6a, 0xad, 0xb4, 0x6f, 0xb5, 0xdc,
+ 0x48, 0xfa, 0x43, 0x57, 0x62, 0x97, 0xef, 0x89,
+ 0x4c, 0x5a, 0x7b, 0x67, 0xb8, 0x9d, 0xf0, 0x42,
+ 0x2b, 0x8f, 0xf3, 0x18, 0x05, 0x2e, 0x48, 0xd0
+};
+enum { nonce33 = 0x31f16488fe8447f5ULL };
+
+static const u8 input34[] __initconst = {
+ 0xda, 0x2b, 0x3d, 0x63, 0x9e, 0x4f, 0xc2, 0xb8,
+ 0x7f, 0xc2, 0x1a, 0x8b, 0x0d, 0x95, 0x65, 0x55,
+ 0x52, 0xba, 0x51, 0x51, 0xc0, 0x61, 0x9f, 0x0a,
+ 0x5d, 0xb0, 0x59, 0x8c, 0x64, 0x6a, 0xab, 0xf5,
+ 0x57
+};
+static const u8 output34[] __initconst = {
+ 0x5c, 0xf6, 0x62, 0x24, 0x8c, 0x45, 0xa3, 0x26,
+ 0xd0, 0xe4, 0x88, 0x1c, 0xed, 0xc4, 0x26, 0x58,
+ 0xb5, 0x5d, 0x92, 0xc4, 0x17, 0x44, 0x1c, 0xb8,
+ 0x2c, 0xf3, 0x55, 0x7e, 0xd6, 0xe5, 0xb3, 0x65,
+ 0xa8
+};
+static const u8 key34[] __initconst = {
+ 0xde, 0xd1, 0x27, 0xb7, 0x7c, 0xfa, 0xa6, 0x78,
+ 0x39, 0x80, 0xdf, 0xb7, 0x46, 0xac, 0x71, 0x26,
+ 0xd0, 0x2a, 0x56, 0x79, 0x12, 0xeb, 0x26, 0x37,
+ 0x01, 0x0d, 0x30, 0xe0, 0xe3, 0x66, 0xb2, 0xf4
+};
+enum { nonce34 = 0x92d0d9b252c24149ULL };
+
+static const u8 input35[] __initconst = {
+ 0x3a, 0x15, 0x5b, 0x75, 0x6e, 0xd0, 0x52, 0x20,
+ 0x6c, 0x82, 0xfa, 0xce, 0x5b, 0xea, 0xf5, 0x43,
+ 0xc1, 0x81, 0x7c, 0xb2, 0xac, 0x16, 0x3f, 0xd3,
+ 0x5a, 0xaf, 0x55, 0x98, 0xf4, 0xc6, 0xba, 0x71,
+ 0x25, 0x8b
+};
+static const u8 output35[] __initconst = {
+ 0xb3, 0xaf, 0xac, 0x6d, 0x4d, 0xc7, 0x68, 0x56,
+ 0x50, 0x5b, 0x69, 0x2a, 0xe5, 0x90, 0xf9, 0x5f,
+ 0x99, 0x88, 0xff, 0x0c, 0xa6, 0xb1, 0x83, 0xd6,
+ 0x80, 0xa6, 0x1b, 0xde, 0x94, 0xa4, 0x2c, 0xc3,
+ 0x74, 0xfa
+};
+static const u8 key35[] __initconst = {
+ 0xd8, 0x24, 0xe2, 0x06, 0xd7, 0x7a, 0xce, 0x81,
+ 0x52, 0x72, 0x02, 0x69, 0x89, 0xc4, 0xe9, 0x53,
+ 0x3b, 0x08, 0x5f, 0x98, 0x1e, 0x1b, 0x99, 0x6e,
+ 0x28, 0x17, 0x6d, 0xba, 0xc0, 0x96, 0xf9, 0x3c
+};
+enum { nonce35 = 0x7baf968c4c8e3a37ULL };
+
+static const u8 input36[] __initconst = {
+ 0x31, 0x5d, 0x4f, 0xe3, 0xac, 0xad, 0x17, 0xa6,
+ 0xb5, 0x01, 0xe2, 0xc6, 0xd4, 0x7e, 0xc4, 0x80,
+ 0xc0, 0x59, 0x72, 0xbb, 0x4b, 0x74, 0x6a, 0x41,
+ 0x0f, 0x9c, 0xf6, 0xca, 0x20, 0xb3, 0x73, 0x07,
+ 0x6b, 0x02, 0x2a
+};
+static const u8 output36[] __initconst = {
+ 0xf9, 0x09, 0x92, 0x94, 0x7e, 0x31, 0xf7, 0x53,
+ 0xe8, 0x8a, 0x5b, 0x20, 0xef, 0x9b, 0x45, 0x81,
+ 0xba, 0x5e, 0x45, 0x63, 0xc1, 0xc7, 0x9e, 0x06,
+ 0x0e, 0xd9, 0x62, 0x8e, 0x96, 0xf9, 0xfa, 0x43,
+ 0x4d, 0xd4, 0x28
+};
+static const u8 key36[] __initconst = {
+ 0x13, 0x30, 0x4c, 0x06, 0xae, 0x18, 0xde, 0x03,
+ 0x1d, 0x02, 0x40, 0xf5, 0xbb, 0x19, 0xe3, 0x88,
+ 0x41, 0xb1, 0x29, 0x15, 0x97, 0xc2, 0x69, 0x3f,
+ 0x32, 0x2a, 0x0c, 0x8b, 0xcf, 0x83, 0x8b, 0x6c
+};
+enum { nonce36 = 0x226d251d475075a0ULL };
+
+static const u8 input37[] __initconst = {
+ 0x10, 0x18, 0xbe, 0xfd, 0x66, 0xc9, 0x77, 0xcc,
+ 0x43, 0xe5, 0x46, 0x0b, 0x08, 0x8b, 0xae, 0x11,
+ 0x86, 0x15, 0xc2, 0xf6, 0x45, 0xd4, 0x5f, 0xd6,
+ 0xb6, 0x5f, 0x9f, 0x3e, 0x97, 0xb7, 0xd4, 0xad,
+ 0x0b, 0xe8, 0x31, 0x94
+};
+static const u8 output37[] __initconst = {
+ 0x03, 0x2c, 0x1c, 0xee, 0xc6, 0xdd, 0xed, 0x38,
+ 0x80, 0x6d, 0x84, 0x16, 0xc3, 0xc2, 0x04, 0x63,
+ 0xcd, 0xa7, 0x6e, 0x36, 0x8b, 0xed, 0x78, 0x63,
+ 0x95, 0xfc, 0x69, 0x7a, 0x3f, 0x8d, 0x75, 0x6b,
+ 0x6c, 0x26, 0x56, 0x4d
+};
+static const u8 key37[] __initconst = {
+ 0xac, 0x84, 0x4d, 0xa9, 0x29, 0x49, 0x3c, 0x39,
+ 0x7f, 0xd9, 0xa6, 0x01, 0xf3, 0x7e, 0xfa, 0x4a,
+ 0x14, 0x80, 0x22, 0x74, 0xf0, 0x29, 0x30, 0x2d,
+ 0x07, 0x21, 0xda, 0xc0, 0x4d, 0x70, 0x56, 0xa2
+};
+enum { nonce37 = 0x167823ce3b64925aULL };
+
+static const u8 input38[] __initconst = {
+ 0x30, 0x8f, 0xfa, 0x24, 0x29, 0xb1, 0xfb, 0xce,
+ 0x31, 0x62, 0xdc, 0xd0, 0x46, 0xab, 0xe1, 0x31,
+ 0xd9, 0xae, 0x60, 0x0d, 0xca, 0x0a, 0x49, 0x12,
+ 0x3d, 0x92, 0xe9, 0x91, 0x67, 0x12, 0x62, 0x18,
+ 0x89, 0xe2, 0xf9, 0x1c, 0xcc
+};
+static const u8 output38[] __initconst = {
+ 0x56, 0x9c, 0xc8, 0x7a, 0xc5, 0x98, 0xa3, 0x0f,
+ 0xba, 0xd5, 0x3e, 0xe1, 0xc9, 0x33, 0x64, 0x33,
+ 0xf0, 0xd5, 0xf7, 0x43, 0x66, 0x0e, 0x08, 0x9a,
+ 0x6e, 0x09, 0xe4, 0x01, 0x0d, 0x1e, 0x2f, 0x4b,
+ 0xed, 0x9c, 0x08, 0x8c, 0x03
+};
+static const u8 key38[] __initconst = {
+ 0x77, 0x52, 0x2a, 0x23, 0xf1, 0xc5, 0x96, 0x2b,
+ 0x89, 0x4f, 0x3e, 0xf3, 0xff, 0x0e, 0x94, 0xce,
+ 0xf1, 0xbd, 0x53, 0xf5, 0x77, 0xd6, 0x9e, 0x47,
+ 0x49, 0x3d, 0x16, 0x64, 0xff, 0x95, 0x42, 0x42
+};
+enum { nonce38 = 0xff629d7b82cef357ULL };
+
+static const u8 input39[] __initconst = {
+ 0x38, 0x26, 0x27, 0xd0, 0xc2, 0xf5, 0x34, 0xba,
+ 0xda, 0x0f, 0x1c, 0x1c, 0x9a, 0x70, 0xe5, 0x8a,
+ 0x78, 0x2d, 0x8f, 0x9a, 0xbf, 0x89, 0x6a, 0xfd,
+ 0xd4, 0x9c, 0x33, 0xf1, 0xb6, 0x89, 0x16, 0xe3,
+ 0x6a, 0x00, 0xfa, 0x3a, 0x0f, 0x26
+};
+static const u8 output39[] __initconst = {
+ 0x0f, 0xaf, 0x91, 0x6d, 0x9c, 0x99, 0xa4, 0xf7,
+ 0x3b, 0x9d, 0x9a, 0x98, 0xca, 0xbb, 0x50, 0x48,
+ 0xee, 0xcb, 0x5d, 0xa1, 0x37, 0x2d, 0x36, 0x09,
+ 0x2a, 0xe2, 0x1c, 0x3d, 0x98, 0x40, 0x1c, 0x16,
+ 0x56, 0xa7, 0x98, 0xe9, 0x7d, 0x2b
+};
+static const u8 key39[] __initconst = {
+ 0x6e, 0x83, 0x15, 0x4d, 0xf8, 0x78, 0xa8, 0x0e,
+ 0x71, 0x37, 0xd4, 0x6e, 0x28, 0x5c, 0x06, 0xa1,
+ 0x2d, 0x6c, 0x72, 0x7a, 0xfd, 0xf8, 0x65, 0x1a,
+ 0xb8, 0xe6, 0x29, 0x7b, 0xe5, 0xb3, 0x23, 0x79
+};
+enum { nonce39 = 0xa4d8c491cf093e9dULL };
+
+static const u8 input40[] __initconst = {
+ 0x8f, 0x32, 0x7c, 0x40, 0x37, 0x95, 0x08, 0x00,
+ 0x00, 0xfe, 0x2f, 0x95, 0x20, 0x12, 0x40, 0x18,
+ 0x5e, 0x7e, 0x5e, 0x99, 0xee, 0x8d, 0x91, 0x7d,
+ 0x50, 0x7d, 0x21, 0x45, 0x27, 0xe1, 0x7f, 0xd4,
+ 0x73, 0x10, 0xe1, 0x33, 0xbc, 0xf8, 0xdd
+};
+static const u8 output40[] __initconst = {
+ 0x78, 0x7c, 0xdc, 0x55, 0x2b, 0xd9, 0x2b, 0x3a,
+ 0xdd, 0x56, 0x11, 0x52, 0xd3, 0x2e, 0xe0, 0x0d,
+ 0x23, 0x20, 0x8a, 0xf1, 0x4f, 0xee, 0xf1, 0x68,
+ 0xf6, 0xdc, 0x53, 0xcf, 0x17, 0xd4, 0xf0, 0x6c,
+ 0xdc, 0x80, 0x5f, 0x1c, 0xa4, 0x91, 0x05
+};
+static const u8 key40[] __initconst = {
+ 0x0d, 0x86, 0xbf, 0x8a, 0xba, 0x9e, 0x39, 0x91,
+ 0xa8, 0xe7, 0x22, 0xf0, 0x0c, 0x43, 0x18, 0xe4,
+ 0x1f, 0xb0, 0xaf, 0x8a, 0x34, 0x31, 0xf4, 0x41,
+ 0xf0, 0x89, 0x85, 0xca, 0x5d, 0x05, 0x3b, 0x94
+};
+enum { nonce40 = 0xae7acc4f5986439eULL };
+
+static const u8 input41[] __initconst = {
+ 0x20, 0x5f, 0xc1, 0x83, 0x36, 0x02, 0x76, 0x96,
+ 0xf0, 0xbf, 0x8e, 0x0e, 0x1a, 0xd1, 0xc7, 0x88,
+ 0x18, 0xc7, 0x09, 0xc4, 0x15, 0xd9, 0x4f, 0x5e,
+ 0x1f, 0xb3, 0xb4, 0x6d, 0xcb, 0xa0, 0xd6, 0x8a,
+ 0x3b, 0x40, 0x8e, 0x80, 0xf1, 0xe8, 0x8f, 0x5f
+};
+static const u8 output41[] __initconst = {
+ 0x0b, 0xd1, 0x49, 0x9a, 0x9d, 0xe8, 0x97, 0xb8,
+ 0xd1, 0xeb, 0x90, 0x62, 0x37, 0xd2, 0x99, 0x15,
+ 0x67, 0x6d, 0x27, 0x93, 0xce, 0x37, 0x65, 0xa2,
+ 0x94, 0x88, 0xd6, 0x17, 0xbc, 0x1c, 0x6e, 0xa2,
+ 0xcc, 0xfb, 0x81, 0x0e, 0x30, 0x60, 0x5a, 0x6f
+};
+static const u8 key41[] __initconst = {
+ 0x36, 0x27, 0x57, 0x01, 0x21, 0x68, 0x97, 0xc7,
+ 0x00, 0x67, 0x7b, 0xe9, 0x0f, 0x55, 0x49, 0xbb,
+ 0x92, 0x18, 0x98, 0xf5, 0x5e, 0xbc, 0xe7, 0x5a,
+ 0x9d, 0x3d, 0xc7, 0xbd, 0x59, 0xec, 0x82, 0x8e
+};
+enum { nonce41 = 0x5da05e4c8dfab464ULL };
+
+static const u8 input42[] __initconst = {
+ 0xca, 0x30, 0xcd, 0x63, 0xf0, 0x2d, 0xf1, 0x03,
+ 0x4d, 0x0d, 0xf2, 0xf7, 0x6f, 0xae, 0xd6, 0x34,
+ 0xea, 0xf6, 0x13, 0xcf, 0x1c, 0xa0, 0xd0, 0xe8,
+ 0xa4, 0x78, 0x80, 0x3b, 0x1e, 0xa5, 0x32, 0x4c,
+ 0x73, 0x12, 0xd4, 0x6a, 0x94, 0xbc, 0xba, 0x80,
+ 0x5e
+};
+static const u8 output42[] __initconst = {
+ 0xec, 0x3f, 0x18, 0x31, 0xc0, 0x7b, 0xb5, 0xe2,
+ 0xad, 0xf3, 0xec, 0xa0, 0x16, 0x9d, 0xef, 0xce,
+ 0x05, 0x65, 0x59, 0x9d, 0x5a, 0xca, 0x3e, 0x13,
+ 0xb9, 0x5d, 0x5d, 0xb5, 0xeb, 0xae, 0xc0, 0x87,
+ 0xbb, 0xfd, 0xe7, 0xe4, 0x89, 0x5b, 0xd2, 0x6c,
+ 0x56
+};
+static const u8 key42[] __initconst = {
+ 0x7c, 0x6b, 0x7e, 0x77, 0xcc, 0x8c, 0x1b, 0x03,
+ 0x8b, 0x2a, 0xb3, 0x7c, 0x5a, 0x73, 0xcc, 0xac,
+ 0xdd, 0x53, 0x54, 0x0c, 0x85, 0xed, 0xcd, 0x47,
+ 0x24, 0xc1, 0xb8, 0x9b, 0x2e, 0x41, 0x92, 0x36
+};
+enum { nonce42 = 0xe4d7348b09682c9cULL };
+
+static const u8 input43[] __initconst = {
+ 0x52, 0xf2, 0x4b, 0x7c, 0xe5, 0x58, 0xe8, 0xd2,
+ 0xb7, 0xf3, 0xa1, 0x29, 0x68, 0xa2, 0x50, 0x50,
+ 0xae, 0x9c, 0x1b, 0xe2, 0x67, 0x77, 0xe2, 0xdb,
+ 0x85, 0x55, 0x7e, 0x84, 0x8a, 0x12, 0x3c, 0xb6,
+ 0x2e, 0xed, 0xd3, 0xec, 0x47, 0x68, 0xfa, 0x52,
+ 0x46, 0x9d
+};
+static const u8 output43[] __initconst = {
+ 0x1b, 0xf0, 0x05, 0xe4, 0x1c, 0xd8, 0x74, 0x9a,
+ 0xf0, 0xee, 0x00, 0x54, 0xce, 0x02, 0x83, 0x15,
+ 0xfb, 0x23, 0x35, 0x78, 0xc3, 0xda, 0x98, 0xd8,
+ 0x9d, 0x1b, 0xb2, 0x51, 0x82, 0xb0, 0xff, 0xbe,
+ 0x05, 0xa9, 0xa4, 0x04, 0xba, 0xea, 0x4b, 0x73,
+ 0x47, 0x6e
+};
+static const u8 key43[] __initconst = {
+ 0xeb, 0xec, 0x0e, 0xa1, 0x65, 0xe2, 0x99, 0x46,
+ 0xd8, 0x54, 0x8c, 0x4a, 0x93, 0xdf, 0x6d, 0xbf,
+ 0x93, 0x34, 0x94, 0x57, 0xc9, 0x12, 0x9d, 0x68,
+ 0x05, 0xc5, 0x05, 0xad, 0x5a, 0xc9, 0x2a, 0x3b
+};
+enum { nonce43 = 0xe14f6a902b7827fULL };
+
+static const u8 input44[] __initconst = {
+ 0x3e, 0x22, 0x3e, 0x8e, 0xcd, 0x18, 0xe2, 0xa3,
+ 0x8d, 0x8b, 0x38, 0xc3, 0x02, 0xa3, 0x31, 0x48,
+ 0xc6, 0x0e, 0xec, 0x99, 0x51, 0x11, 0x6d, 0x8b,
+ 0x32, 0x35, 0x3b, 0x08, 0x58, 0x76, 0x25, 0x30,
+ 0xe2, 0xfc, 0xa2, 0x46, 0x7d, 0x6e, 0x34, 0x87,
+ 0xac, 0x42, 0xbf
+};
+static const u8 output44[] __initconst = {
+ 0x08, 0x92, 0x58, 0x02, 0x1a, 0xf4, 0x1f, 0x3d,
+ 0x38, 0x7b, 0x6b, 0xf6, 0x84, 0x07, 0xa3, 0x19,
+ 0x17, 0x2a, 0xed, 0x57, 0x1c, 0xf9, 0x55, 0x37,
+ 0x4e, 0xf4, 0x68, 0x68, 0x82, 0x02, 0x4f, 0xca,
+ 0x21, 0x00, 0xc6, 0x66, 0x79, 0x53, 0x19, 0xef,
+ 0x7f, 0xdd, 0x74
+};
+static const u8 key44[] __initconst = {
+ 0x73, 0xb6, 0x3e, 0xf4, 0x57, 0x52, 0xa6, 0x43,
+ 0x51, 0xd8, 0x25, 0x00, 0xdb, 0xb4, 0x52, 0x69,
+ 0xd6, 0x27, 0x49, 0xeb, 0x9b, 0xf1, 0x7b, 0xa0,
+ 0xd6, 0x7c, 0x9c, 0xd8, 0x95, 0x03, 0x69, 0x26
+};
+enum { nonce44 = 0xf5e6dc4f35ce24e5ULL };
+
+static const u8 input45[] __initconst = {
+ 0x55, 0x76, 0xc0, 0xf1, 0x74, 0x03, 0x7a, 0x6d,
+ 0x14, 0xd8, 0x36, 0x2c, 0x9f, 0x9a, 0x59, 0x7a,
+ 0x2a, 0xf5, 0x77, 0x84, 0x70, 0x7c, 0x1d, 0x04,
+ 0x90, 0x45, 0xa4, 0xc1, 0x5e, 0xdd, 0x2e, 0x07,
+ 0x18, 0x34, 0xa6, 0x85, 0x56, 0x4f, 0x09, 0xaf,
+ 0x2f, 0x83, 0xe1, 0xc6
+};
+static const u8 output45[] __initconst = {
+ 0x22, 0x46, 0xe4, 0x0b, 0x3a, 0x55, 0xcc, 0x9b,
+ 0xf0, 0xc0, 0x53, 0xcd, 0x95, 0xc7, 0x57, 0x6c,
+ 0x77, 0x46, 0x41, 0x72, 0x07, 0xbf, 0xa8, 0xe5,
+ 0x68, 0x69, 0xd8, 0x1e, 0x45, 0xc1, 0xa2, 0x50,
+ 0xa5, 0xd1, 0x62, 0xc9, 0x5a, 0x7d, 0x08, 0x14,
+ 0xae, 0x44, 0x16, 0xb9
+};
+static const u8 key45[] __initconst = {
+ 0x41, 0xf3, 0x88, 0xb2, 0x51, 0x25, 0x47, 0x02,
+ 0x39, 0xe8, 0x15, 0x3a, 0x22, 0x78, 0x86, 0x0b,
+ 0xf9, 0x1e, 0x8d, 0x98, 0xb2, 0x22, 0x82, 0xac,
+ 0x42, 0x94, 0xde, 0x64, 0xf0, 0xfd, 0xb3, 0x6c
+};
+enum { nonce45 = 0xf51a582daf4aa01aULL };
+
+static const u8 input46[] __initconst = {
+ 0xf6, 0xff, 0x20, 0xf9, 0x26, 0x7e, 0x0f, 0xa8,
+ 0x6a, 0x45, 0x5a, 0x91, 0x73, 0xc4, 0x4c, 0x63,
+ 0xe5, 0x61, 0x59, 0xca, 0xec, 0xc0, 0x20, 0x35,
+ 0xbc, 0x9f, 0x58, 0x9c, 0x5e, 0xa1, 0x17, 0x46,
+ 0xcc, 0xab, 0x6e, 0xd0, 0x4f, 0x24, 0xeb, 0x05,
+ 0x4d, 0x40, 0x41, 0xe0, 0x9d
+};
+static const u8 output46[] __initconst = {
+ 0x31, 0x6e, 0x63, 0x3f, 0x9c, 0xe6, 0xb1, 0xb7,
+ 0xef, 0x47, 0x46, 0xd7, 0xb1, 0x53, 0x42, 0x2f,
+ 0x2c, 0xc8, 0x01, 0xae, 0x8b, 0xec, 0x42, 0x2c,
+ 0x6b, 0x2c, 0x9c, 0xb2, 0xf0, 0x29, 0x06, 0xa5,
+ 0xcd, 0x7e, 0xc7, 0x3a, 0x38, 0x98, 0x8a, 0xde,
+ 0x03, 0x29, 0x14, 0x8f, 0xf9
+};
+static const u8 key46[] __initconst = {
+ 0xac, 0xa6, 0x44, 0x4a, 0x0d, 0x42, 0x10, 0xbc,
+ 0xd3, 0xc9, 0x8e, 0x9e, 0x71, 0xa3, 0x1c, 0x14,
+ 0x9d, 0x65, 0x0d, 0x49, 0x4d, 0x8c, 0xec, 0x46,
+ 0xe1, 0x41, 0xcd, 0xf5, 0xfc, 0x82, 0x75, 0x34
+};
+enum { nonce46 = 0x25f85182df84dec5ULL };
+
+static const u8 input47[] __initconst = {
+ 0xa1, 0xd2, 0xf2, 0x52, 0x2f, 0x79, 0x50, 0xb2,
+ 0x42, 0x29, 0x5b, 0x44, 0x20, 0xf9, 0xbd, 0x85,
+ 0xb7, 0x65, 0x77, 0x86, 0xce, 0x3e, 0x1c, 0xe4,
+ 0x70, 0x80, 0xdd, 0x72, 0x07, 0x48, 0x0f, 0x84,
+ 0x0d, 0xfd, 0x97, 0xc0, 0xb7, 0x48, 0x9b, 0xb4,
+ 0xec, 0xff, 0x73, 0x14, 0x99, 0xe4
+};
+static const u8 output47[] __initconst = {
+ 0xe5, 0x3c, 0x78, 0x66, 0x31, 0x1e, 0xd6, 0xc4,
+ 0x9e, 0x71, 0xb3, 0xd7, 0xd5, 0xad, 0x84, 0xf2,
+ 0x78, 0x61, 0x77, 0xf8, 0x31, 0xf0, 0x13, 0xad,
+ 0x66, 0xf5, 0x31, 0x7d, 0xeb, 0xdf, 0xaf, 0xcb,
+ 0xac, 0x28, 0x6c, 0xc2, 0x9e, 0xe7, 0x78, 0xa2,
+ 0xa2, 0x58, 0xce, 0x84, 0x76, 0x70
+};
+static const u8 key47[] __initconst = {
+ 0x05, 0x7f, 0xc0, 0x7f, 0x37, 0x20, 0x71, 0x02,
+ 0x3a, 0xe7, 0x20, 0x5a, 0x0a, 0x8f, 0x79, 0x5a,
+ 0xfe, 0xbb, 0x43, 0x4d, 0x2f, 0xcb, 0xf6, 0x9e,
+ 0xa2, 0x97, 0x00, 0xad, 0x0d, 0x51, 0x7e, 0x17
+};
+enum { nonce47 = 0xae707c60f54de32bULL };
+
+static const u8 input48[] __initconst = {
+ 0x80, 0x93, 0x77, 0x2e, 0x8d, 0xe8, 0xe6, 0xc1,
+ 0x27, 0xe6, 0xf2, 0x89, 0x5b, 0x33, 0x62, 0x18,
+ 0x80, 0x6e, 0x17, 0x22, 0x8e, 0x83, 0x31, 0x40,
+ 0x8f, 0xc9, 0x5c, 0x52, 0x6c, 0x0e, 0xa5, 0xe9,
+ 0x6c, 0x7f, 0xd4, 0x6a, 0x27, 0x56, 0x99, 0xce,
+ 0x8d, 0x37, 0x59, 0xaf, 0xc0, 0x0e, 0xe1
+};
+static const u8 output48[] __initconst = {
+ 0x02, 0xa4, 0x2e, 0x33, 0xb7, 0x7c, 0x2b, 0x9a,
+ 0x18, 0x5a, 0xba, 0x53, 0x38, 0xaf, 0x00, 0xeb,
+ 0xd8, 0x3d, 0x02, 0x77, 0x43, 0x45, 0x03, 0x91,
+ 0xe2, 0x5e, 0x4e, 0xeb, 0x50, 0xd5, 0x5b, 0xe0,
+ 0xf3, 0x33, 0xa7, 0xa2, 0xac, 0x07, 0x6f, 0xeb,
+ 0x3f, 0x6c, 0xcd, 0xf2, 0x6c, 0x61, 0x64
+};
+static const u8 key48[] __initconst = {
+ 0xf3, 0x79, 0xe7, 0xf8, 0x0e, 0x02, 0x05, 0x6b,
+ 0x83, 0x1a, 0xe7, 0x86, 0x6b, 0xe6, 0x8f, 0x3f,
+ 0xd3, 0xa3, 0xe4, 0x6e, 0x29, 0x06, 0xad, 0xbc,
+ 0xe8, 0x33, 0x56, 0x39, 0xdf, 0xb0, 0xe2, 0xfe
+};
+enum { nonce48 = 0xd849b938c6569da0ULL };
+
+static const u8 input49[] __initconst = {
+ 0x89, 0x3b, 0x88, 0x9e, 0x7b, 0x38, 0x16, 0x9f,
+ 0xa1, 0x28, 0xf6, 0xf5, 0x23, 0x74, 0x28, 0xb0,
+ 0xdf, 0x6c, 0x9e, 0x8a, 0x71, 0xaf, 0xed, 0x7a,
+ 0x39, 0x21, 0x57, 0x7d, 0x31, 0x6c, 0xee, 0x0d,
+ 0x11, 0x8d, 0x41, 0x9a, 0x5f, 0xb7, 0x27, 0x40,
+ 0x08, 0xad, 0xc6, 0xe0, 0x00, 0x43, 0x9e, 0xae
+};
+static const u8 output49[] __initconst = {
+ 0x4d, 0xfd, 0xdb, 0x4c, 0x77, 0xc1, 0x05, 0x07,
+ 0x4d, 0x6d, 0x32, 0xcb, 0x2e, 0x0e, 0xff, 0x65,
+ 0xc9, 0x27, 0xeb, 0xa9, 0x46, 0x5b, 0xab, 0x06,
+ 0xe6, 0xb6, 0x5a, 0x1e, 0x00, 0xfb, 0xcf, 0xe4,
+ 0xb9, 0x71, 0x40, 0x10, 0xef, 0x12, 0x39, 0xf0,
+ 0xea, 0x40, 0xb8, 0x9a, 0xa2, 0x85, 0x38, 0x48
+};
+static const u8 key49[] __initconst = {
+ 0xe7, 0x10, 0x40, 0xd9, 0x66, 0xc0, 0xa8, 0x6d,
+ 0xa3, 0xcc, 0x8b, 0xdd, 0x93, 0xf2, 0x6e, 0xe0,
+ 0x90, 0x7f, 0xd0, 0xf4, 0x37, 0x0c, 0x8b, 0x9b,
+ 0x4c, 0x4d, 0xe6, 0xf2, 0x1f, 0xe9, 0x95, 0x24
+};
+enum { nonce49 = 0xf269817bdae01bc0ULL };
+
+static const u8 input50[] __initconst = {
+ 0xda, 0x5b, 0x60, 0xcd, 0xed, 0x58, 0x8e, 0x7f,
+ 0xae, 0xdd, 0xc8, 0x2e, 0x16, 0x90, 0xea, 0x4b,
+ 0x0c, 0x74, 0x14, 0x35, 0xeb, 0xee, 0x2c, 0xff,
+ 0x46, 0x99, 0x97, 0x6e, 0xae, 0xa7, 0x8e, 0x6e,
+ 0x38, 0xfe, 0x63, 0xe7, 0x51, 0xd9, 0xaa, 0xce,
+ 0x7b, 0x1e, 0x7e, 0x5d, 0xc0, 0xe8, 0x10, 0x06,
+ 0x14
+};
+static const u8 output50[] __initconst = {
+ 0xe4, 0xe5, 0x86, 0x1b, 0x66, 0x19, 0xac, 0x49,
+ 0x1c, 0xbd, 0xee, 0x03, 0xaf, 0x11, 0xfc, 0x1f,
+ 0x6a, 0xd2, 0x50, 0x5c, 0xea, 0x2c, 0xa5, 0x75,
+ 0xfd, 0xb7, 0x0e, 0x80, 0x8f, 0xed, 0x3f, 0x31,
+ 0x47, 0xac, 0x67, 0x43, 0xb8, 0x2e, 0xb4, 0x81,
+ 0x6d, 0xe4, 0x1e, 0xb7, 0x8b, 0x0c, 0x53, 0xa9,
+ 0x26
+};
+static const u8 key50[] __initconst = {
+ 0xd7, 0xb2, 0x04, 0x76, 0x30, 0xcc, 0x38, 0x45,
+ 0xef, 0xdb, 0xc5, 0x86, 0x08, 0x61, 0xf0, 0xee,
+ 0x6d, 0xd8, 0x22, 0x04, 0x8c, 0xfb, 0xcb, 0x37,
+ 0xa6, 0xfb, 0x95, 0x22, 0xe1, 0x87, 0xb7, 0x6f
+};
+enum { nonce50 = 0x3b44d09c45607d38ULL };
+
+static const u8 input51[] __initconst = {
+ 0xa9, 0x41, 0x02, 0x4b, 0xd7, 0xd5, 0xd1, 0xf1,
+ 0x21, 0x55, 0xb2, 0x75, 0x6d, 0x77, 0x1b, 0x86,
+ 0xa9, 0xc8, 0x90, 0xfd, 0xed, 0x4a, 0x7b, 0x6c,
+ 0xb2, 0x5f, 0x9b, 0x5f, 0x16, 0xa1, 0x54, 0xdb,
+ 0xd6, 0x3f, 0x6a, 0x7f, 0x2e, 0x51, 0x9d, 0x49,
+ 0x5b, 0xa5, 0x0e, 0xf9, 0xfb, 0x2a, 0x38, 0xff,
+ 0x20, 0x8c
+};
+static const u8 output51[] __initconst = {
+ 0x18, 0xf7, 0x88, 0xc1, 0x72, 0xfd, 0x90, 0x4b,
+ 0xa9, 0x2d, 0xdb, 0x47, 0xb0, 0xa5, 0xc4, 0x37,
+ 0x01, 0x95, 0xc4, 0xb1, 0xab, 0xc5, 0x5b, 0xcd,
+ 0xe1, 0x97, 0x78, 0x13, 0xde, 0x6a, 0xff, 0x36,
+ 0xce, 0xa4, 0x67, 0xc5, 0x4a, 0x45, 0x2b, 0xd9,
+ 0xff, 0x8f, 0x06, 0x7c, 0x63, 0xbb, 0x83, 0x17,
+ 0xb4, 0x6b
+};
+static const u8 key51[] __initconst = {
+ 0x82, 0x1a, 0x79, 0xab, 0x9a, 0xb5, 0x49, 0x6a,
+ 0x30, 0x6b, 0x99, 0x19, 0x11, 0xc7, 0xa2, 0xf4,
+ 0xca, 0x55, 0xb9, 0xdd, 0xe7, 0x2f, 0xe7, 0xc1,
+ 0xdd, 0x27, 0xad, 0x80, 0xf2, 0x56, 0xad, 0xf3
+};
+enum { nonce51 = 0xe93aff94ca71a4a6ULL };
+
+static const u8 input52[] __initconst = {
+ 0x89, 0xdd, 0xf3, 0xfa, 0xb6, 0xc1, 0xaa, 0x9a,
+ 0xc8, 0xad, 0x6b, 0x00, 0xa1, 0x65, 0xea, 0x14,
+ 0x55, 0x54, 0x31, 0x8f, 0xf0, 0x03, 0x84, 0x51,
+ 0x17, 0x1e, 0x0a, 0x93, 0x6e, 0x79, 0x96, 0xa3,
+ 0x2a, 0x85, 0x9c, 0x89, 0xf8, 0xd1, 0xe2, 0x15,
+ 0x95, 0x05, 0xf4, 0x43, 0x4d, 0x6b, 0xf0, 0x71,
+ 0x3b, 0x3e, 0xba
+};
+static const u8 output52[] __initconst = {
+ 0x0c, 0x42, 0x6a, 0xb3, 0x66, 0x63, 0x5d, 0x2c,
+ 0x9f, 0x3d, 0xa6, 0x6e, 0xc7, 0x5f, 0x79, 0x2f,
+ 0x50, 0xe3, 0xd6, 0x07, 0x56, 0xa4, 0x2b, 0x2d,
+ 0x8d, 0x10, 0xc0, 0x6c, 0xa2, 0xfc, 0x97, 0xec,
+ 0x3f, 0x5c, 0x8d, 0x59, 0xbe, 0x84, 0xf1, 0x3e,
+ 0x38, 0x47, 0x4f, 0x75, 0x25, 0x66, 0x88, 0x14,
+ 0x03, 0xdd, 0xde
+};
+static const u8 key52[] __initconst = {
+ 0x4f, 0xb0, 0x27, 0xb6, 0xdd, 0x24, 0x0c, 0xdb,
+ 0x6b, 0x71, 0x2e, 0xac, 0xfc, 0x3f, 0xa6, 0x48,
+ 0x5d, 0xd5, 0xff, 0x53, 0xb5, 0x62, 0xf1, 0xe0,
+ 0x93, 0xfe, 0x39, 0x4c, 0x9f, 0x03, 0x11, 0xa7
+};
+enum { nonce52 = 0xed8becec3bdf6f25ULL };
+
+static const u8 input53[] __initconst = {
+ 0x68, 0xd1, 0xc7, 0x74, 0x44, 0x1c, 0x84, 0xde,
+ 0x27, 0x27, 0x35, 0xf0, 0x18, 0x0b, 0x57, 0xaa,
+ 0xd0, 0x1a, 0xd3, 0x3b, 0x5e, 0x5c, 0x62, 0x93,
+ 0xd7, 0x6b, 0x84, 0x3b, 0x71, 0x83, 0x77, 0x01,
+ 0x3e, 0x59, 0x45, 0xf4, 0x77, 0x6c, 0x6b, 0xcb,
+ 0x88, 0x45, 0x09, 0x1d, 0xc6, 0x45, 0x6e, 0xdc,
+ 0x6e, 0x51, 0xb8, 0x28
+};
+static const u8 output53[] __initconst = {
+ 0xc5, 0x90, 0x96, 0x78, 0x02, 0xf5, 0xc4, 0x3c,
+ 0xde, 0xd4, 0xd4, 0xc6, 0xa7, 0xad, 0x12, 0x47,
+ 0x45, 0xce, 0xcd, 0x8c, 0x35, 0xcc, 0xa6, 0x9e,
+ 0x5a, 0xc6, 0x60, 0xbb, 0xe3, 0xed, 0xec, 0x68,
+ 0x3f, 0x64, 0xf7, 0x06, 0x63, 0x9c, 0x8c, 0xc8,
+ 0x05, 0x3a, 0xad, 0x32, 0x79, 0x8b, 0x45, 0x96,
+ 0x93, 0x73, 0x4c, 0xe0
+};
+static const u8 key53[] __initconst = {
+ 0x42, 0x4b, 0x20, 0x81, 0x49, 0x50, 0xe9, 0xc2,
+ 0x43, 0x69, 0x36, 0xe7, 0x68, 0xae, 0xd5, 0x7e,
+ 0x42, 0x1a, 0x1b, 0xb4, 0x06, 0x4d, 0xa7, 0x17,
+ 0xb5, 0x31, 0xd6, 0x0c, 0xb0, 0x5c, 0x41, 0x0b
+};
+enum { nonce53 = 0xf44ce1931fbda3d7ULL };
+
+static const u8 input54[] __initconst = {
+ 0x7b, 0xf6, 0x8b, 0xae, 0xc0, 0xcb, 0x10, 0x8e,
+ 0xe8, 0xd8, 0x2e, 0x3b, 0x14, 0xba, 0xb4, 0xd2,
+ 0x58, 0x6b, 0x2c, 0xec, 0xc1, 0x81, 0x71, 0xb4,
+ 0xc6, 0xea, 0x08, 0xc5, 0xc9, 0x78, 0xdb, 0xa2,
+ 0xfa, 0x44, 0x50, 0x9b, 0xc8, 0x53, 0x8d, 0x45,
+ 0x42, 0xe7, 0x09, 0xc4, 0x29, 0xd8, 0x75, 0x02,
+ 0xbb, 0xb2, 0x78, 0xcf, 0xe7
+};
+static const u8 output54[] __initconst = {
+ 0xaf, 0x2c, 0x83, 0x26, 0x6e, 0x7f, 0xa6, 0xe9,
+ 0x03, 0x75, 0xfe, 0xfe, 0x87, 0x58, 0xcf, 0xb5,
+ 0xbc, 0x3c, 0x9d, 0xa1, 0x6e, 0x13, 0xf1, 0x0f,
+ 0x9e, 0xbc, 0xe0, 0x54, 0x24, 0x32, 0xce, 0x95,
+ 0xe6, 0xa5, 0x59, 0x3d, 0x24, 0x1d, 0x8f, 0xb1,
+ 0x74, 0x6c, 0x56, 0xe7, 0x96, 0xc1, 0x91, 0xc8,
+ 0x2d, 0x0e, 0xb7, 0x51, 0x10
+};
+static const u8 key54[] __initconst = {
+ 0x00, 0x68, 0x74, 0xdc, 0x30, 0x9e, 0xe3, 0x52,
+ 0xa9, 0xae, 0xb6, 0x7c, 0xa1, 0xdc, 0x12, 0x2d,
+ 0x98, 0x32, 0x7a, 0x77, 0xe1, 0xdd, 0xa3, 0x76,
+ 0x72, 0x34, 0x83, 0xd8, 0xb7, 0x69, 0xba, 0x77
+};
+enum { nonce54 = 0xbea57d79b798b63aULL };
+
+static const u8 input55[] __initconst = {
+ 0xb5, 0xf4, 0x2f, 0xc1, 0x5e, 0x10, 0xa7, 0x4e,
+ 0x74, 0x3d, 0xa3, 0x96, 0xc0, 0x4d, 0x7b, 0x92,
+ 0x8f, 0xdb, 0x2d, 0x15, 0x52, 0x6a, 0x95, 0x5e,
+ 0x40, 0x81, 0x4f, 0x70, 0x73, 0xea, 0x84, 0x65,
+ 0x3d, 0x9a, 0x4e, 0x03, 0x95, 0xf8, 0x5d, 0x2f,
+ 0x07, 0x02, 0x13, 0x13, 0xdd, 0x82, 0xe6, 0x3b,
+ 0xe1, 0x5f, 0xb3, 0x37, 0x9b, 0x88
+};
+static const u8 output55[] __initconst = {
+ 0xc1, 0x88, 0xbd, 0x92, 0x77, 0xad, 0x7c, 0x5f,
+ 0xaf, 0xa8, 0x57, 0x0e, 0x40, 0x0a, 0xdc, 0x70,
+ 0xfb, 0xc6, 0x71, 0xfd, 0xc4, 0x74, 0x60, 0xcc,
+ 0xa0, 0x89, 0x8e, 0x99, 0xf0, 0x06, 0xa6, 0x7c,
+ 0x97, 0x42, 0x21, 0x81, 0x6a, 0x07, 0xe7, 0xb3,
+ 0xf7, 0xa5, 0x03, 0x71, 0x50, 0x05, 0x63, 0x17,
+ 0xa9, 0x46, 0x0b, 0xff, 0x30, 0x78
+};
+static const u8 key55[] __initconst = {
+ 0x19, 0x8f, 0xe7, 0xd7, 0x6b, 0x7f, 0x6f, 0x69,
+ 0x86, 0x91, 0x0f, 0xa7, 0x4a, 0x69, 0x8e, 0x34,
+ 0xf3, 0xdb, 0xde, 0xaf, 0xf2, 0x66, 0x1d, 0x64,
+ 0x97, 0x0c, 0xcf, 0xfa, 0x33, 0x84, 0xfd, 0x0c
+};
+enum { nonce55 = 0x80aa3d3e2c51ef06ULL };
+
+static const u8 input56[] __initconst = {
+ 0x6b, 0xe9, 0x73, 0x42, 0x27, 0x5e, 0x12, 0xcd,
+ 0xaa, 0x45, 0x12, 0x8b, 0xb3, 0xe6, 0x54, 0x33,
+ 0x31, 0x7d, 0xe2, 0x25, 0xc6, 0x86, 0x47, 0x67,
+ 0x86, 0x83, 0xe4, 0x46, 0xb5, 0x8f, 0x2c, 0xbb,
+ 0xe4, 0xb8, 0x9f, 0xa2, 0xa4, 0xe8, 0x75, 0x96,
+ 0x92, 0x51, 0x51, 0xac, 0x8e, 0x2e, 0x6f, 0xfc,
+ 0xbd, 0x0d, 0xa3, 0x9f, 0x16, 0x55, 0x3e
+};
+static const u8 output56[] __initconst = {
+ 0x42, 0x99, 0x73, 0x6c, 0xd9, 0x4b, 0x16, 0xe5,
+ 0x18, 0x63, 0x1a, 0xd9, 0x0e, 0xf1, 0x15, 0x2e,
+ 0x0f, 0x4b, 0xe4, 0x5f, 0xa0, 0x4d, 0xde, 0x9f,
+ 0xa7, 0x18, 0xc1, 0x0c, 0x0b, 0xae, 0x55, 0xe4,
+ 0x89, 0x18, 0xa4, 0x78, 0x9d, 0x25, 0x0d, 0xd5,
+ 0x94, 0x0f, 0xf9, 0x78, 0xa3, 0xa6, 0xe9, 0x9e,
+ 0x2c, 0x73, 0xf0, 0xf7, 0x35, 0xf3, 0x2b
+};
+static const u8 key56[] __initconst = {
+ 0x7d, 0x12, 0xad, 0x51, 0xd5, 0x6f, 0x8f, 0x96,
+ 0xc0, 0x5d, 0x9a, 0xd1, 0x7e, 0x20, 0x98, 0x0e,
+ 0x3c, 0x0a, 0x67, 0x6b, 0x1b, 0x88, 0x69, 0xd4,
+ 0x07, 0x8c, 0xaf, 0x0f, 0x3a, 0x28, 0xe4, 0x5d
+};
+enum { nonce56 = 0x70f4c372fb8b5984ULL };
+
+static const u8 input57[] __initconst = {
+ 0x28, 0xa3, 0x06, 0xe8, 0xe7, 0x08, 0xb9, 0xef,
+ 0x0d, 0x63, 0x15, 0x99, 0xb2, 0x78, 0x7e, 0xaf,
+ 0x30, 0x50, 0xcf, 0xea, 0xc9, 0x91, 0x41, 0x2f,
+ 0x3b, 0x38, 0x70, 0xc4, 0x87, 0xb0, 0x3a, 0xee,
+ 0x4a, 0xea, 0xe3, 0x83, 0x68, 0x8b, 0xcf, 0xda,
+ 0x04, 0xa5, 0xbd, 0xb2, 0xde, 0x3c, 0x55, 0x13,
+ 0xfe, 0x96, 0xad, 0xc1, 0x61, 0x1b, 0x98, 0xde
+};
+static const u8 output57[] __initconst = {
+ 0xf4, 0x44, 0xe9, 0xd2, 0x6d, 0xc2, 0x5a, 0xe9,
+ 0xfd, 0x7e, 0x41, 0x54, 0x3f, 0xf4, 0x12, 0xd8,
+ 0x55, 0x0d, 0x12, 0x9b, 0xd5, 0x2e, 0x95, 0xe5,
+ 0x77, 0x42, 0x3f, 0x2c, 0xfb, 0x28, 0x9d, 0x72,
+ 0x6d, 0x89, 0x82, 0x27, 0x64, 0x6f, 0x0d, 0x57,
+ 0xa1, 0x25, 0xa3, 0x6b, 0x88, 0x9a, 0xac, 0x0c,
+ 0x76, 0x19, 0x90, 0xe2, 0x50, 0x5a, 0xf8, 0x12
+};
+static const u8 key57[] __initconst = {
+ 0x08, 0x26, 0xb8, 0xac, 0xf3, 0xa5, 0xc6, 0xa3,
+ 0x7f, 0x09, 0x87, 0xf5, 0x6c, 0x5a, 0x85, 0x6c,
+ 0x3d, 0xbd, 0xde, 0xd5, 0x87, 0xa3, 0x98, 0x7a,
+ 0xaa, 0x40, 0x3e, 0xf7, 0xff, 0x44, 0x5d, 0xee
+};
+enum { nonce57 = 0xc03a6130bf06b089ULL };
+
+static const u8 input58[] __initconst = {
+ 0x82, 0xa5, 0x38, 0x6f, 0xaa, 0xb4, 0xaf, 0xb2,
+ 0x42, 0x01, 0xa8, 0x39, 0x3f, 0x15, 0x51, 0xa8,
+ 0x11, 0x1b, 0x93, 0xca, 0x9c, 0xa0, 0x57, 0x68,
+ 0x8f, 0xdb, 0x68, 0x53, 0x51, 0x6d, 0x13, 0x22,
+ 0x12, 0x9b, 0xbd, 0x33, 0xa8, 0x52, 0x40, 0x57,
+ 0x80, 0x9b, 0x98, 0xef, 0x56, 0x70, 0x11, 0xfa,
+ 0x36, 0x69, 0x7d, 0x15, 0x48, 0xf9, 0x3b, 0xeb,
+ 0x42
+};
+static const u8 output58[] __initconst = {
+ 0xff, 0x3a, 0x74, 0xc3, 0x3e, 0x44, 0x64, 0x4d,
+ 0x0e, 0x5f, 0x9d, 0xa8, 0xdb, 0xbe, 0x12, 0xef,
+ 0xba, 0x56, 0x65, 0x50, 0x76, 0xaf, 0xa4, 0x4e,
+ 0x01, 0xc1, 0xd3, 0x31, 0x14, 0xe2, 0xbe, 0x7b,
+ 0xa5, 0x67, 0xb4, 0xe3, 0x68, 0x40, 0x9c, 0xb0,
+ 0xb1, 0x78, 0xef, 0x49, 0x03, 0x0f, 0x2d, 0x56,
+ 0xb4, 0x37, 0xdb, 0xbc, 0x2d, 0x68, 0x1c, 0x3c,
+ 0xf1
+};
+static const u8 key58[] __initconst = {
+ 0x7e, 0xf1, 0x7c, 0x20, 0x65, 0xed, 0xcd, 0xd7,
+ 0x57, 0xe8, 0xdb, 0x90, 0x87, 0xdb, 0x5f, 0x63,
+ 0x3d, 0xdd, 0xb8, 0x2b, 0x75, 0x8e, 0x04, 0xb5,
+ 0xf4, 0x12, 0x79, 0xa9, 0x4d, 0x42, 0x16, 0x7f
+};
+enum { nonce58 = 0x92838183f80d2f7fULL };
+
+static const u8 input59[] __initconst = {
+ 0x37, 0xf1, 0x9d, 0xdd, 0xd7, 0x08, 0x9f, 0x13,
+ 0xc5, 0x21, 0x82, 0x75, 0x08, 0x9e, 0x25, 0x16,
+ 0xb1, 0xd1, 0x71, 0x42, 0x28, 0x63, 0xac, 0x47,
+ 0x71, 0x54, 0xb1, 0xfc, 0x39, 0xf0, 0x61, 0x4f,
+ 0x7c, 0x6d, 0x4f, 0xc8, 0x33, 0xef, 0x7e, 0xc8,
+ 0xc0, 0x97, 0xfc, 0x1a, 0x61, 0xb4, 0x87, 0x6f,
+ 0xdd, 0x5a, 0x15, 0x7b, 0x1b, 0x95, 0x50, 0x94,
+ 0x1d, 0xba
+};
+static const u8 output59[] __initconst = {
+ 0x73, 0x67, 0xc5, 0x07, 0xbb, 0x57, 0x79, 0xd5,
+ 0xc9, 0x04, 0xdd, 0x88, 0xf3, 0x86, 0xe5, 0x70,
+ 0x49, 0x31, 0xe0, 0xcc, 0x3b, 0x1d, 0xdf, 0xb0,
+ 0xaf, 0xf4, 0x2d, 0xe0, 0x06, 0x10, 0x91, 0x8d,
+ 0x1c, 0xcf, 0x31, 0x0b, 0xf6, 0x73, 0xda, 0x1c,
+ 0xf0, 0x17, 0x52, 0x9e, 0x20, 0x2e, 0x9f, 0x8c,
+ 0xb3, 0x59, 0xce, 0xd4, 0xd3, 0xc1, 0x81, 0xe9,
+ 0x11, 0x36
+};
+static const u8 key59[] __initconst = {
+ 0xbd, 0x07, 0xd0, 0x53, 0x2c, 0xb3, 0xcc, 0x3f,
+ 0xc4, 0x95, 0xfd, 0xe7, 0x81, 0xb3, 0x29, 0x99,
+ 0x05, 0x45, 0xd6, 0x95, 0x25, 0x0b, 0x72, 0xd3,
+ 0xcd, 0xbb, 0x73, 0xf8, 0xfa, 0xc0, 0x9b, 0x7a
+};
+enum { nonce59 = 0x4a0db819b0d519e2ULL };
+
+static const u8 input60[] __initconst = {
+ 0x58, 0x4e, 0xdf, 0x94, 0x3c, 0x76, 0x0a, 0x79,
+ 0x47, 0xf1, 0xbe, 0x88, 0xd3, 0xba, 0x94, 0xd8,
+ 0xe2, 0x8f, 0xe3, 0x2f, 0x2f, 0x74, 0x82, 0x55,
+ 0xc3, 0xda, 0xe2, 0x4e, 0x2c, 0x8c, 0x45, 0x1d,
+ 0x72, 0x8f, 0x54, 0x41, 0xb5, 0xb7, 0x69, 0xe4,
+ 0xdc, 0xd2, 0x36, 0x21, 0x5c, 0x28, 0x52, 0xf7,
+ 0x98, 0x8e, 0x72, 0xa7, 0x6d, 0x57, 0xed, 0xdc,
+ 0x3c, 0xe6, 0x6a
+};
+static const u8 output60[] __initconst = {
+ 0xda, 0xaf, 0xb5, 0xe3, 0x30, 0x65, 0x5c, 0xb1,
+ 0x48, 0x08, 0x43, 0x7b, 0x9e, 0xd2, 0x6a, 0x62,
+ 0x56, 0x7c, 0xad, 0xd9, 0xe5, 0xf6, 0x09, 0x71,
+ 0xcd, 0xe6, 0x05, 0x6b, 0x3f, 0x44, 0x3a, 0x5c,
+ 0xf6, 0xf8, 0xd7, 0xce, 0x7d, 0xd1, 0xe0, 0x4f,
+ 0x88, 0x15, 0x04, 0xd8, 0x20, 0xf0, 0x3e, 0xef,
+ 0xae, 0xa6, 0x27, 0xa3, 0x0e, 0xfc, 0x18, 0x90,
+ 0x33, 0xcd, 0xd3
+};
+static const u8 key60[] __initconst = {
+ 0xbf, 0xfd, 0x25, 0xb5, 0xb2, 0xfc, 0x78, 0x0c,
+ 0x8e, 0xb9, 0x57, 0x2f, 0x26, 0x4a, 0x7e, 0x71,
+ 0xcc, 0xf2, 0xe0, 0xfd, 0x24, 0x11, 0x20, 0x23,
+ 0x57, 0x00, 0xff, 0x80, 0x11, 0x0c, 0x1e, 0xff
+};
+enum { nonce60 = 0xf18df56fdb7954adULL };
+
+static const u8 input61[] __initconst = {
+ 0xb0, 0xf3, 0x06, 0xbc, 0x22, 0xae, 0x49, 0x40,
+ 0xae, 0xff, 0x1b, 0x31, 0xa7, 0x98, 0xab, 0x1d,
+ 0xe7, 0x40, 0x23, 0x18, 0x4f, 0xab, 0x8e, 0x93,
+ 0x82, 0xf4, 0x56, 0x61, 0xfd, 0x2b, 0xcf, 0xa7,
+ 0xc4, 0xb4, 0x0a, 0xf4, 0xcb, 0xc7, 0x8c, 0x40,
+ 0x57, 0xac, 0x0b, 0x3e, 0x2a, 0x0a, 0x67, 0x83,
+ 0x50, 0xbf, 0xec, 0xb0, 0xc7, 0xf1, 0x32, 0x26,
+ 0x98, 0x80, 0x33, 0xb4
+};
+static const u8 output61[] __initconst = {
+ 0x9d, 0x23, 0x0e, 0xff, 0xcc, 0x7c, 0xd5, 0xcf,
+ 0x1a, 0xb8, 0x59, 0x1e, 0x92, 0xfd, 0x7f, 0xca,
+ 0xca, 0x3c, 0x18, 0x81, 0xde, 0xfa, 0x59, 0xc8,
+ 0x6f, 0x9c, 0x24, 0x3f, 0x3a, 0xe6, 0x0b, 0xb4,
+ 0x34, 0x48, 0x69, 0xfc, 0xb6, 0xea, 0xb2, 0xde,
+ 0x9f, 0xfd, 0x92, 0x36, 0x18, 0x98, 0x99, 0xaa,
+ 0x65, 0xe2, 0xea, 0xf4, 0xb1, 0x47, 0x8e, 0xb0,
+ 0xe7, 0xd4, 0x7a, 0x2c
+};
+static const u8 key61[] __initconst = {
+ 0xd7, 0xfd, 0x9b, 0xbd, 0x8f, 0x65, 0x0d, 0x00,
+ 0xca, 0xa1, 0x6c, 0x85, 0x85, 0xa4, 0x6d, 0xf1,
+ 0xb1, 0x68, 0x0c, 0x8b, 0x5d, 0x37, 0x72, 0xd0,
+ 0xd8, 0xd2, 0x25, 0xab, 0x9f, 0x7b, 0x7d, 0x95
+};
+enum { nonce61 = 0xd82caf72a9c4864fULL };
+
+static const u8 input62[] __initconst = {
+ 0x10, 0x77, 0xf3, 0x2f, 0xc2, 0x50, 0xd6, 0x0c,
+ 0xba, 0xa8, 0x8d, 0xce, 0x0d, 0x58, 0x9e, 0x87,
+ 0xb1, 0x59, 0x66, 0x0a, 0x4a, 0xb3, 0xd8, 0xca,
+ 0x0a, 0x6b, 0xf8, 0xc6, 0x2b, 0x3f, 0x8e, 0x09,
+ 0xe0, 0x0a, 0x15, 0x85, 0xfe, 0xaa, 0xc6, 0xbd,
+ 0x30, 0xef, 0xe4, 0x10, 0x78, 0x03, 0xc1, 0xc7,
+ 0x8a, 0xd9, 0xde, 0x0b, 0x51, 0x07, 0xc4, 0x7b,
+ 0xe2, 0x2e, 0x36, 0x3a, 0xc2
+};
+static const u8 output62[] __initconst = {
+ 0xa0, 0x0c, 0xfc, 0xc1, 0xf6, 0xaf, 0xc2, 0xb8,
+ 0x5c, 0xef, 0x6e, 0xf3, 0xce, 0x15, 0x48, 0x05,
+ 0xb5, 0x78, 0x49, 0x51, 0x1f, 0x9d, 0xf4, 0xbf,
+ 0x2f, 0x53, 0xa2, 0xd1, 0x15, 0x20, 0x82, 0x6b,
+ 0xd2, 0x22, 0x6c, 0x4e, 0x14, 0x87, 0xe3, 0xd7,
+ 0x49, 0x45, 0x84, 0xdb, 0x5f, 0x68, 0x60, 0xc4,
+ 0xb3, 0xe6, 0x3f, 0xd1, 0xfc, 0xa5, 0x73, 0xf3,
+ 0xfc, 0xbb, 0xbe, 0xc8, 0x9d
+};
+static const u8 key62[] __initconst = {
+ 0x6e, 0xc9, 0xaf, 0xce, 0x35, 0xb9, 0x86, 0xd1,
+ 0xce, 0x5f, 0xd9, 0xbb, 0xd5, 0x1f, 0x7c, 0xcd,
+ 0xfe, 0x19, 0xaa, 0x3d, 0xea, 0x64, 0xc1, 0x28,
+ 0x40, 0xba, 0xa1, 0x28, 0xcd, 0x40, 0xb6, 0xf2
+};
+enum { nonce62 = 0xa1c0c265f900cde8ULL };
+
+static const u8 input63[] __initconst = {
+ 0x7a, 0x70, 0x21, 0x2c, 0xef, 0xa6, 0x36, 0xd4,
+ 0xe0, 0xab, 0x8c, 0x25, 0x73, 0x34, 0xc8, 0x94,
+ 0x6c, 0x81, 0xcb, 0x19, 0x8d, 0x5a, 0x49, 0xaa,
+ 0x6f, 0xba, 0x83, 0x72, 0x02, 0x5e, 0xf5, 0x89,
+ 0xce, 0x79, 0x7e, 0x13, 0x3d, 0x5b, 0x98, 0x60,
+ 0x5d, 0xd9, 0xfb, 0x15, 0x93, 0x4c, 0xf3, 0x51,
+ 0x49, 0x55, 0xd1, 0x58, 0xdd, 0x7e, 0x6d, 0xfe,
+ 0xdd, 0x84, 0x23, 0x05, 0xba, 0xe9
+};
+static const u8 output63[] __initconst = {
+ 0x20, 0xb3, 0x5c, 0x03, 0x03, 0x78, 0x17, 0xfc,
+ 0x3b, 0x35, 0x30, 0x9a, 0x00, 0x18, 0xf5, 0xc5,
+ 0x06, 0x53, 0xf5, 0x04, 0x24, 0x9d, 0xd1, 0xb2,
+ 0xac, 0x5a, 0xb6, 0x2a, 0xa5, 0xda, 0x50, 0x00,
+ 0xec, 0xff, 0xa0, 0x7a, 0x14, 0x7b, 0xe4, 0x6b,
+ 0x63, 0xe8, 0x66, 0x86, 0x34, 0xfd, 0x74, 0x44,
+ 0xa2, 0x50, 0x97, 0x0d, 0xdc, 0xc3, 0x84, 0xf8,
+ 0x71, 0x02, 0x31, 0x95, 0xed, 0x54
+};
+static const u8 key63[] __initconst = {
+ 0x7d, 0x64, 0xb4, 0x12, 0x81, 0xe4, 0xe6, 0x8f,
+ 0xcc, 0xe7, 0xd1, 0x1f, 0x70, 0x20, 0xfd, 0xb8,
+ 0x3a, 0x7d, 0xa6, 0x53, 0x65, 0x30, 0x5d, 0xe3,
+ 0x1a, 0x44, 0xbe, 0x62, 0xed, 0x90, 0xc4, 0xd1
+};
+enum { nonce63 = 0xe8e849596c942276ULL };
+
+static const u8 input64[] __initconst = {
+ 0x84, 0xf8, 0xda, 0x87, 0x23, 0x39, 0x60, 0xcf,
+ 0xc5, 0x50, 0x7e, 0xc5, 0x47, 0x29, 0x7c, 0x05,
+ 0xc2, 0xb4, 0xf4, 0xb2, 0xec, 0x5d, 0x48, 0x36,
+ 0xbf, 0xfc, 0x06, 0x8c, 0xf2, 0x0e, 0x88, 0xe7,
+ 0xc9, 0xc5, 0xa4, 0xa2, 0x83, 0x20, 0xa1, 0x6f,
+ 0x37, 0xe5, 0x2d, 0xa1, 0x72, 0xa1, 0x19, 0xef,
+ 0x05, 0x42, 0x08, 0xf2, 0x57, 0x47, 0x31, 0x1e,
+ 0x17, 0x76, 0x13, 0xd3, 0xcc, 0x75, 0x2c
+};
+static const u8 output64[] __initconst = {
+ 0xcb, 0xec, 0x90, 0x88, 0xeb, 0x31, 0x69, 0x20,
+ 0xa6, 0xdc, 0xff, 0x76, 0x98, 0xb0, 0x24, 0x49,
+ 0x7b, 0x20, 0xd9, 0xd1, 0x1b, 0xe3, 0x61, 0xdc,
+ 0xcf, 0x51, 0xf6, 0x70, 0x72, 0x33, 0x28, 0x94,
+ 0xac, 0x73, 0x18, 0xcf, 0x93, 0xfd, 0xca, 0x08,
+ 0x0d, 0xa2, 0xb9, 0x57, 0x1e, 0x51, 0xb6, 0x07,
+ 0x5c, 0xc1, 0x13, 0x64, 0x1d, 0x18, 0x6f, 0xe6,
+ 0x0b, 0xb7, 0x14, 0x03, 0x43, 0xb6, 0xaf
+};
+static const u8 key64[] __initconst = {
+ 0xbf, 0x82, 0x65, 0xe4, 0x50, 0xf9, 0x5e, 0xea,
+ 0x28, 0x91, 0xd1, 0xd2, 0x17, 0x7c, 0x13, 0x7e,
+ 0xf5, 0xd5, 0x6b, 0x06, 0x1c, 0x20, 0xc2, 0x82,
+ 0xa1, 0x7a, 0xa2, 0x14, 0xa1, 0xb0, 0x54, 0x58
+};
+enum { nonce64 = 0xe57c5095aa5723c9ULL };
+
+static const u8 input65[] __initconst = {
+ 0x1c, 0xfb, 0xd3, 0x3f, 0x85, 0xd7, 0xba, 0x7b,
+ 0xae, 0xb1, 0xa5, 0xd2, 0xe5, 0x40, 0xce, 0x4d,
+ 0x3e, 0xab, 0x17, 0x9d, 0x7d, 0x9f, 0x03, 0x98,
+ 0x3f, 0x9f, 0xc8, 0xdd, 0x36, 0x17, 0x43, 0x5c,
+ 0x34, 0xd1, 0x23, 0xe0, 0x77, 0xbf, 0x35, 0x5d,
+ 0x8f, 0xb1, 0xcb, 0x82, 0xbb, 0x39, 0x69, 0xd8,
+ 0x90, 0x45, 0x37, 0xfd, 0x98, 0x25, 0xf7, 0x5b,
+ 0xce, 0x06, 0x43, 0xba, 0x61, 0xa8, 0x47, 0xb9
+};
+static const u8 output65[] __initconst = {
+ 0x73, 0xa5, 0x68, 0xab, 0x8b, 0xa5, 0xc3, 0x7e,
+ 0x74, 0xf8, 0x9d, 0xf5, 0x93, 0x6e, 0xf2, 0x71,
+ 0x6d, 0xde, 0x82, 0xc5, 0x40, 0xa0, 0x46, 0xb3,
+ 0x9a, 0x78, 0xa8, 0xf7, 0xdf, 0xb1, 0xc3, 0xdd,
+ 0x8d, 0x90, 0x00, 0x68, 0x21, 0x48, 0xe8, 0xba,
+ 0x56, 0x9f, 0x8f, 0xe7, 0xa4, 0x4d, 0x36, 0x55,
+ 0xd0, 0x34, 0x99, 0xa6, 0x1c, 0x4c, 0xc1, 0xe2,
+ 0x65, 0x98, 0x14, 0x8e, 0x6a, 0x05, 0xb1, 0x2b
+};
+static const u8 key65[] __initconst = {
+ 0xbd, 0x5c, 0x8a, 0xb0, 0x11, 0x29, 0xf3, 0x00,
+ 0x7a, 0x78, 0x32, 0x63, 0x34, 0x00, 0xe6, 0x7d,
+ 0x30, 0x54, 0xde, 0x37, 0xda, 0xc2, 0xc4, 0x3d,
+ 0x92, 0x6b, 0x4c, 0xc2, 0x92, 0xe9, 0x9e, 0x2a
+};
+enum { nonce65 = 0xf654a3031de746f2ULL };
+
+static const u8 input66[] __initconst = {
+ 0x4b, 0x27, 0x30, 0x8f, 0x28, 0xd8, 0x60, 0x46,
+ 0x39, 0x06, 0x49, 0xea, 0x1b, 0x71, 0x26, 0xe0,
+ 0x99, 0x2b, 0xd4, 0x8f, 0x64, 0x64, 0xcd, 0xac,
+ 0x1d, 0x78, 0x88, 0x90, 0xe1, 0x5c, 0x24, 0x4b,
+ 0xdc, 0x2d, 0xb7, 0xee, 0x3a, 0xe6, 0x86, 0x2c,
+ 0x21, 0xe4, 0x2b, 0xfc, 0xe8, 0x19, 0xca, 0x65,
+ 0xe7, 0xdd, 0x6f, 0x52, 0xb3, 0x11, 0xe1, 0xe2,
+ 0xbf, 0xe8, 0x70, 0xe3, 0x0d, 0x45, 0xb8, 0xa5,
+ 0x20, 0xb7, 0xb5, 0xaf, 0xff, 0x08, 0xcf, 0x23,
+ 0x65, 0xdf, 0x8d, 0xc3, 0x31, 0xf3, 0x1e, 0x6a,
+ 0x58, 0x8d, 0xcc, 0x45, 0x16, 0x86, 0x1f, 0x31,
+ 0x5c, 0x27, 0xcd, 0xc8, 0x6b, 0x19, 0x1e, 0xec,
+ 0x44, 0x75, 0x63, 0x97, 0xfd, 0x79, 0xf6, 0x62,
+ 0xc5, 0xba, 0x17, 0xc7, 0xab, 0x8f, 0xbb, 0xed,
+ 0x85, 0x2a, 0x98, 0x79, 0x21, 0xec, 0x6e, 0x4d,
+ 0xdc, 0xfa, 0x72, 0x52, 0xba, 0xc8, 0x4c
+};
+static const u8 output66[] __initconst = {
+ 0x76, 0x5b, 0x2c, 0xa7, 0x62, 0xb9, 0x08, 0x4a,
+ 0xc6, 0x4a, 0x92, 0xc3, 0xbb, 0x10, 0xb3, 0xee,
+ 0xff, 0xb9, 0x07, 0xc7, 0x27, 0xcb, 0x1e, 0xcf,
+ 0x58, 0x6f, 0xa1, 0x64, 0xe8, 0xf1, 0x4e, 0xe1,
+ 0xef, 0x18, 0x96, 0xab, 0x97, 0x28, 0xd1, 0x7c,
+ 0x71, 0x6c, 0xd1, 0xe2, 0xfa, 0xd9, 0x75, 0xcb,
+ 0xeb, 0xea, 0x0c, 0x86, 0x82, 0xd8, 0xf4, 0xcc,
+ 0xea, 0xa3, 0x00, 0xfa, 0x82, 0xd2, 0xcd, 0xcb,
+ 0xdb, 0x63, 0x28, 0xe2, 0x82, 0xe9, 0x01, 0xed,
+ 0x31, 0xe6, 0x71, 0x45, 0x08, 0x89, 0x8a, 0x23,
+ 0xa8, 0xb5, 0xc2, 0xe2, 0x9f, 0xe9, 0xb8, 0x9a,
+ 0xc4, 0x79, 0x6d, 0x71, 0x52, 0x61, 0x74, 0x6c,
+ 0x1b, 0xd7, 0x65, 0x6d, 0x03, 0xc4, 0x1a, 0xc0,
+ 0x50, 0xba, 0xd6, 0xc9, 0x43, 0x50, 0xbe, 0x09,
+ 0x09, 0x8a, 0xdb, 0xaa, 0x76, 0x4e, 0x3b, 0x61,
+ 0x3c, 0x7c, 0x44, 0xe7, 0xdb, 0x10, 0xa7
+};
+static const u8 key66[] __initconst = {
+ 0x88, 0xdf, 0xca, 0x68, 0xaf, 0x4f, 0xb3, 0xfd,
+ 0x6e, 0xa7, 0x95, 0x35, 0x8a, 0xe8, 0x37, 0xe8,
+ 0xc8, 0x55, 0xa2, 0x2a, 0x6d, 0x77, 0xf8, 0x93,
+ 0x7a, 0x41, 0xf3, 0x7b, 0x95, 0xdf, 0x89, 0xf5
+};
+enum { nonce66 = 0x1024b4fdd415cf82ULL };
+
+static const u8 input67[] __initconst = {
+ 0xd4, 0x2e, 0xfa, 0x92, 0xe9, 0x29, 0x68, 0xb7,
+ 0x54, 0x2c, 0xf7, 0xa4, 0x2d, 0xb7, 0x50, 0xb5,
+ 0xc5, 0xb2, 0x9d, 0x17, 0x5e, 0x0a, 0xca, 0x37,
+ 0xbf, 0x60, 0xae, 0xd2, 0x98, 0xe9, 0xfa, 0x59,
+ 0x67, 0x62, 0xe6, 0x43, 0x0c, 0x77, 0x80, 0x82,
+ 0x33, 0x61, 0xa3, 0xff, 0xc1, 0xa0, 0x8f, 0x56,
+ 0xbc, 0xec, 0x65, 0x43, 0x88, 0xa5, 0xff, 0x51,
+ 0x64, 0x30, 0xee, 0x34, 0xb7, 0x5c, 0x28, 0x68,
+ 0xc3, 0x52, 0xd2, 0xac, 0x78, 0x2a, 0xa6, 0x10,
+ 0xb8, 0xb2, 0x4c, 0x80, 0x4f, 0x99, 0xb2, 0x36,
+ 0x94, 0x8f, 0x66, 0xcb, 0xa1, 0x91, 0xed, 0x06,
+ 0x42, 0x6d, 0xc1, 0xae, 0x55, 0x93, 0xdd, 0x93,
+ 0x9e, 0x88, 0x34, 0x7f, 0x98, 0xeb, 0xbe, 0x61,
+ 0xf9, 0xa9, 0x0f, 0xd9, 0xc4, 0x87, 0xd5, 0xef,
+ 0xcc, 0x71, 0x8c, 0x0e, 0xce, 0xad, 0x02, 0xcf,
+ 0xa2, 0x61, 0xdf, 0xb1, 0xfe, 0x3b, 0xdc, 0xc0,
+ 0x58, 0xb5, 0x71, 0xa1, 0x83, 0xc9, 0xb4, 0xaf,
+ 0x9d, 0x54, 0x12, 0xcd, 0xea, 0x06, 0xd6, 0x4e,
+ 0xe5, 0x27, 0x0c, 0xc3, 0xbb, 0xa8, 0x0a, 0x81,
+ 0x75, 0xc3, 0xc9, 0xd4, 0x35, 0x3e, 0x53, 0x9f,
+ 0xaa, 0x20, 0xc0, 0x68, 0x39, 0x2c, 0x96, 0x39,
+ 0x53, 0x81, 0xda, 0x07, 0x0f, 0x44, 0xa5, 0x47,
+ 0x0e, 0xb3, 0x87, 0x0d, 0x1b, 0xc1, 0xe5, 0x41,
+ 0x35, 0x12, 0x58, 0x96, 0x69, 0x8a, 0x1a, 0xa3,
+ 0x9d, 0x3d, 0xd4, 0xb1, 0x8e, 0x1f, 0x96, 0x87,
+ 0xda, 0xd3, 0x19, 0xe2, 0xb1, 0x3a, 0x19, 0x74,
+ 0xa0, 0x00, 0x9f, 0x4d, 0xbc, 0xcb, 0x0c, 0xe9,
+ 0xec, 0x10, 0xdf, 0x2a, 0x88, 0xdc, 0x30, 0x51,
+ 0x46, 0x56, 0x53, 0x98, 0x6a, 0x26, 0x14, 0x05,
+ 0x54, 0x81, 0x55, 0x0b, 0x3c, 0x85, 0xdd, 0x33,
+ 0x81, 0x11, 0x29, 0x82, 0x46, 0x35, 0xe1, 0xdb,
+ 0x59, 0x7b
+};
+static const u8 output67[] __initconst = {
+ 0x64, 0x6c, 0xda, 0x7f, 0xd4, 0xa9, 0x2a, 0x5e,
+ 0x22, 0xae, 0x8d, 0x67, 0xdb, 0xee, 0xfd, 0xd0,
+ 0x44, 0x80, 0x17, 0xb2, 0xe3, 0x87, 0xad, 0x57,
+ 0x15, 0xcb, 0x88, 0x64, 0xc0, 0xf1, 0x49, 0x3d,
+ 0xfa, 0xbe, 0xa8, 0x9f, 0x12, 0xc3, 0x57, 0x56,
+ 0x70, 0xa5, 0xc5, 0x6b, 0xf1, 0xab, 0xd5, 0xde,
+ 0x77, 0x92, 0x6a, 0x56, 0x03, 0xf5, 0x21, 0x0d,
+ 0xb6, 0xc4, 0xcc, 0x62, 0x44, 0x3f, 0xb1, 0xc1,
+ 0x61, 0x41, 0x90, 0xb2, 0xd5, 0xb8, 0xf3, 0x57,
+ 0xfb, 0xc2, 0x6b, 0x25, 0x58, 0xc8, 0x45, 0x20,
+ 0x72, 0x29, 0x6f, 0x9d, 0xb5, 0x81, 0x4d, 0x2b,
+ 0xb2, 0x89, 0x9e, 0x91, 0x53, 0x97, 0x1c, 0xd9,
+ 0x3d, 0x79, 0xdc, 0x14, 0xae, 0x01, 0x73, 0x75,
+ 0xf0, 0xca, 0xd5, 0xab, 0x62, 0x5c, 0x7a, 0x7d,
+ 0x3f, 0xfe, 0x22, 0x7d, 0xee, 0xe2, 0xcb, 0x76,
+ 0x55, 0xec, 0x06, 0xdd, 0x41, 0x47, 0x18, 0x62,
+ 0x1d, 0x57, 0xd0, 0xd6, 0xb6, 0x0f, 0x4b, 0xfc,
+ 0x79, 0x19, 0xf4, 0xd6, 0x37, 0x86, 0x18, 0x1f,
+ 0x98, 0x0d, 0x9e, 0x15, 0x2d, 0xb6, 0x9a, 0x8a,
+ 0x8c, 0x80, 0x22, 0x2f, 0x82, 0xc4, 0xc7, 0x36,
+ 0xfa, 0xfa, 0x07, 0xbd, 0xc2, 0x2a, 0xe2, 0xea,
+ 0x93, 0xc8, 0xb2, 0x90, 0x33, 0xf2, 0xee, 0x4b,
+ 0x1b, 0xf4, 0x37, 0x92, 0x13, 0xbb, 0xe2, 0xce,
+ 0xe3, 0x03, 0xcf, 0x07, 0x94, 0xab, 0x9a, 0xc9,
+ 0xff, 0x83, 0x69, 0x3a, 0xda, 0x2c, 0xd0, 0x47,
+ 0x3d, 0x6c, 0x1a, 0x60, 0x68, 0x47, 0xb9, 0x36,
+ 0x52, 0xdd, 0x16, 0xef, 0x6c, 0xbf, 0x54, 0x11,
+ 0x72, 0x62, 0xce, 0x8c, 0x9d, 0x90, 0xa0, 0x25,
+ 0x06, 0x92, 0x3e, 0x12, 0x7e, 0x1a, 0x1d, 0xe5,
+ 0xa2, 0x71, 0xce, 0x1c, 0x4c, 0x6a, 0x7c, 0xdc,
+ 0x3d, 0xe3, 0x6e, 0x48, 0x9d, 0xb3, 0x64, 0x7d,
+ 0x78, 0x40
+};
+static const u8 key67[] __initconst = {
+ 0xa9, 0x20, 0x75, 0x89, 0x7e, 0x37, 0x85, 0x48,
+ 0xa3, 0xfb, 0x7b, 0xe8, 0x30, 0xa7, 0xe3, 0x6e,
+ 0xa6, 0xc1, 0x71, 0x17, 0xc1, 0x6c, 0x9b, 0xc2,
+ 0xde, 0xf0, 0xa7, 0x19, 0xec, 0xce, 0xc6, 0x53
+};
+enum { nonce67 = 0x4adc4d1f968c8a10ULL };
+
+static const u8 input68[] __initconst = {
+ 0x99, 0xae, 0x72, 0xfb, 0x16, 0xe1, 0xf1, 0x59,
+ 0x43, 0x15, 0x4e, 0x33, 0xa0, 0x95, 0xe7, 0x6c,
+ 0x74, 0x24, 0x31, 0xca, 0x3b, 0x2e, 0xeb, 0xd7,
+ 0x11, 0xd8, 0xe0, 0x56, 0x92, 0x91, 0x61, 0x57,
+ 0xe2, 0x82, 0x9f, 0x8f, 0x37, 0xf5, 0x3d, 0x24,
+ 0x92, 0x9d, 0x87, 0x00, 0x8d, 0x89, 0xe0, 0x25,
+ 0x8b, 0xe4, 0x20, 0x5b, 0x8a, 0x26, 0x2c, 0x61,
+ 0x78, 0xb0, 0xa6, 0x3e, 0x82, 0x18, 0xcf, 0xdc,
+ 0x2d, 0x24, 0xdd, 0x81, 0x42, 0xc4, 0x95, 0xf0,
+ 0x48, 0x60, 0x71, 0xe3, 0xe3, 0xac, 0xec, 0xbe,
+ 0x98, 0x6b, 0x0c, 0xb5, 0x6a, 0xa9, 0xc8, 0x79,
+ 0x23, 0x2e, 0x38, 0x0b, 0x72, 0x88, 0x8c, 0xe7,
+ 0x71, 0x8b, 0x36, 0xe3, 0x58, 0x3d, 0x9c, 0xa0,
+ 0xa2, 0xea, 0xcf, 0x0c, 0x6a, 0x6c, 0x64, 0xdf,
+ 0x97, 0x21, 0x8f, 0x93, 0xfb, 0xba, 0xf3, 0x5a,
+ 0xd7, 0x8f, 0xa6, 0x37, 0x15, 0x50, 0x43, 0x02,
+ 0x46, 0x7f, 0x93, 0x46, 0x86, 0x31, 0xe2, 0xaa,
+ 0x24, 0xa8, 0x26, 0xae, 0xe6, 0xc0, 0x05, 0x73,
+ 0x0b, 0x4f, 0x7e, 0xed, 0x65, 0xeb, 0x56, 0x1e,
+ 0xb6, 0xb3, 0x0b, 0xc3, 0x0e, 0x31, 0x95, 0xa9,
+ 0x18, 0x4d, 0xaf, 0x38, 0xd7, 0xec, 0xc6, 0x44,
+ 0x72, 0x77, 0x4e, 0x25, 0x4b, 0x25, 0xdd, 0x1e,
+ 0x8c, 0xa2, 0xdf, 0xf6, 0x2a, 0x97, 0x1a, 0x88,
+ 0x2c, 0x8a, 0x5d, 0xfe, 0xe8, 0xfb, 0x35, 0xe8,
+ 0x0f, 0x2b, 0x7a, 0x18, 0x69, 0x43, 0x31, 0x1d,
+ 0x38, 0x6a, 0x62, 0x95, 0x0f, 0x20, 0x4b, 0xbb,
+ 0x97, 0x3c, 0xe0, 0x64, 0x2f, 0x52, 0xc9, 0x2d,
+ 0x4d, 0x9d, 0x54, 0x04, 0x3d, 0xc9, 0xea, 0xeb,
+ 0xd0, 0x86, 0x52, 0xff, 0x42, 0xe1, 0x0d, 0x7a,
+ 0xad, 0x88, 0xf9, 0x9b, 0x1e, 0x5e, 0x12, 0x27,
+ 0x95, 0x3e, 0x0c, 0x2c, 0x13, 0x00, 0x6f, 0x8e,
+ 0x93, 0x69, 0x0e, 0x01, 0x8c, 0xc1, 0xfd, 0xb3
+};
+static const u8 output68[] __initconst = {
+ 0x26, 0x3e, 0xf2, 0xb1, 0xf5, 0xef, 0x81, 0xa4,
+ 0xb7, 0x42, 0xd4, 0x26, 0x18, 0x4b, 0xdd, 0x6a,
+ 0x47, 0x15, 0xcb, 0x0e, 0x57, 0xdb, 0xa7, 0x29,
+ 0x7e, 0x7b, 0x3f, 0x47, 0x89, 0x57, 0xab, 0xea,
+ 0x14, 0x7b, 0xcf, 0x37, 0xdb, 0x1c, 0xe1, 0x11,
+ 0x77, 0xae, 0x2e, 0x4c, 0xd2, 0x08, 0x3f, 0xa6,
+ 0x62, 0x86, 0xa6, 0xb2, 0x07, 0xd5, 0x3f, 0x9b,
+ 0xdc, 0xc8, 0x50, 0x4b, 0x7b, 0xb9, 0x06, 0xe6,
+ 0xeb, 0xac, 0x98, 0x8c, 0x36, 0x0c, 0x1e, 0xb2,
+ 0xc8, 0xfb, 0x24, 0x60, 0x2c, 0x08, 0x17, 0x26,
+ 0x5b, 0xc8, 0xc2, 0xdf, 0x9c, 0x73, 0x67, 0x4a,
+ 0xdb, 0xcf, 0xd5, 0x2c, 0x2b, 0xca, 0x24, 0xcc,
+ 0xdb, 0xc9, 0xa8, 0xf2, 0x5d, 0x67, 0xdf, 0x5c,
+ 0x62, 0x0b, 0x58, 0xc0, 0x83, 0xde, 0x8b, 0xf6,
+ 0x15, 0x0a, 0xd6, 0x32, 0xd8, 0xf5, 0xf2, 0x5f,
+ 0x33, 0xce, 0x7e, 0xab, 0x76, 0xcd, 0x14, 0x91,
+ 0xd8, 0x41, 0x90, 0x93, 0xa1, 0xaf, 0xf3, 0x45,
+ 0x6c, 0x1b, 0x25, 0xbd, 0x48, 0x51, 0x6d, 0x15,
+ 0x47, 0xe6, 0x23, 0x50, 0x32, 0x69, 0x1e, 0xb5,
+ 0x94, 0xd3, 0x97, 0xba, 0xd7, 0x37, 0x4a, 0xba,
+ 0xb9, 0xcd, 0xfb, 0x96, 0x9a, 0x90, 0xe0, 0x37,
+ 0xf8, 0xdf, 0x91, 0x6c, 0x62, 0x13, 0x19, 0x21,
+ 0x4b, 0xa9, 0xf1, 0x12, 0x66, 0xe2, 0x74, 0xd7,
+ 0x81, 0xa0, 0x74, 0x8d, 0x7e, 0x7e, 0xc9, 0xb1,
+ 0x69, 0x8f, 0xed, 0xb3, 0xf6, 0x97, 0xcd, 0x72,
+ 0x78, 0x93, 0xd3, 0x54, 0x6b, 0x43, 0xac, 0x29,
+ 0xb4, 0xbc, 0x7d, 0xa4, 0x26, 0x4b, 0x7b, 0xab,
+ 0xd6, 0x67, 0x22, 0xff, 0x03, 0x92, 0xb6, 0xd4,
+ 0x96, 0x94, 0x5a, 0xe5, 0x02, 0x35, 0x77, 0xfa,
+ 0x3f, 0x54, 0x1d, 0xdd, 0x35, 0x39, 0xfe, 0x03,
+ 0xdd, 0x8e, 0x3c, 0x8c, 0xc2, 0x69, 0x2a, 0xb1,
+ 0xb7, 0xb3, 0xa1, 0x89, 0x84, 0xea, 0x16, 0xe2
+};
+static const u8 key68[] __initconst = {
+ 0xd2, 0x49, 0x7f, 0xd7, 0x49, 0x66, 0x0d, 0xb3,
+ 0x5a, 0x7e, 0x3c, 0xfc, 0x37, 0x83, 0x0e, 0xf7,
+ 0x96, 0xd8, 0xd6, 0x33, 0x79, 0x2b, 0x84, 0x53,
+ 0x06, 0xbc, 0x6c, 0x0a, 0x55, 0x84, 0xfe, 0xab
+};
+enum { nonce68 = 0x6a6df7ff0a20de06ULL };
+
+static const u8 input69[] __initconst = {
+ 0xf9, 0x18, 0x4c, 0xd2, 0x3f, 0xf7, 0x22, 0xd9,
+ 0x58, 0xb6, 0x3b, 0x38, 0x69, 0x79, 0xf4, 0x71,
+ 0x5f, 0x38, 0x52, 0x1f, 0x17, 0x6f, 0x6f, 0xd9,
+ 0x09, 0x2b, 0xfb, 0x67, 0xdc, 0xc9, 0xe8, 0x4a,
+ 0x70, 0x9f, 0x2e, 0x3c, 0x06, 0xe5, 0x12, 0x20,
+ 0x25, 0x29, 0xd0, 0xdc, 0x81, 0xc5, 0xc6, 0x0f,
+ 0xd2, 0xa8, 0x81, 0x15, 0x98, 0xb2, 0x71, 0x5a,
+ 0x9a, 0xe9, 0xfb, 0xaf, 0x0e, 0x5f, 0x8a, 0xf3,
+ 0x16, 0x4a, 0x47, 0xf2, 0x5c, 0xbf, 0xda, 0x52,
+ 0x9a, 0xa6, 0x36, 0xfd, 0xc6, 0xf7, 0x66, 0x00,
+ 0xcc, 0x6c, 0xd4, 0xb3, 0x07, 0x6d, 0xeb, 0xfe,
+ 0x92, 0x71, 0x25, 0xd0, 0xcf, 0x9c, 0xe8, 0x65,
+ 0x45, 0x10, 0xcf, 0x62, 0x74, 0x7d, 0xf2, 0x1b,
+ 0x57, 0xa0, 0xf1, 0x6b, 0xa4, 0xd5, 0xfa, 0x12,
+ 0x27, 0x5a, 0xf7, 0x99, 0xfc, 0xca, 0xf3, 0xb8,
+ 0x2c, 0x8b, 0xba, 0x28, 0x74, 0xde, 0x8f, 0x78,
+ 0xa2, 0x8c, 0xaf, 0x89, 0x4b, 0x05, 0xe2, 0xf3,
+ 0xf8, 0xd2, 0xef, 0xac, 0xa4, 0xc4, 0xe2, 0xe2,
+ 0x36, 0xbb, 0x5e, 0xae, 0xe6, 0x87, 0x3d, 0x88,
+ 0x9f, 0xb8, 0x11, 0xbb, 0xcf, 0x57, 0xce, 0xd0,
+ 0xba, 0x62, 0xf4, 0xf8, 0x9b, 0x95, 0x04, 0xc9,
+ 0xcf, 0x01, 0xe9, 0xf1, 0xc8, 0xc6, 0x22, 0xa4,
+ 0xf2, 0x8b, 0x2f, 0x24, 0x0a, 0xf5, 0x6e, 0xb7,
+ 0xd4, 0x2c, 0xb6, 0xf7, 0x5c, 0x97, 0x61, 0x0b,
+ 0xd9, 0xb5, 0x06, 0xcd, 0xed, 0x3e, 0x1f, 0xc5,
+ 0xb2, 0x6c, 0xa3, 0xea, 0xb8, 0xad, 0xa6, 0x42,
+ 0x88, 0x7a, 0x52, 0xd5, 0x64, 0xba, 0xb5, 0x20,
+ 0x10, 0xa0, 0x0f, 0x0d, 0xea, 0xef, 0x5a, 0x9b,
+ 0x27, 0xb8, 0xca, 0x20, 0x19, 0x6d, 0xa8, 0xc4,
+ 0x46, 0x04, 0xb3, 0xe8, 0xf8, 0x66, 0x1b, 0x0a,
+ 0xce, 0x76, 0x5d, 0x59, 0x58, 0x05, 0xee, 0x3e,
+ 0x3c, 0x86, 0x5b, 0x49, 0x1c, 0x72, 0x18, 0x01,
+ 0x62, 0x92, 0x0f, 0x3e, 0xd1, 0x57, 0x5e, 0x20,
+ 0x7b, 0xfb, 0x4d, 0x3c, 0xc5, 0x35, 0x43, 0x2f,
+ 0xb0, 0xc5, 0x7c, 0xe4, 0xa2, 0x84, 0x13, 0x77
+};
+static const u8 output69[] __initconst = {
+ 0xbb, 0x4a, 0x7f, 0x7c, 0xd5, 0x2f, 0x89, 0x06,
+ 0xec, 0x20, 0xf1, 0x9a, 0x11, 0x09, 0x14, 0x2e,
+ 0x17, 0x50, 0xf9, 0xd5, 0xf5, 0x48, 0x7c, 0x7a,
+ 0x55, 0xc0, 0x57, 0x03, 0xe3, 0xc4, 0xb2, 0xb7,
+ 0x18, 0x47, 0x95, 0xde, 0xaf, 0x80, 0x06, 0x3c,
+ 0x5a, 0xf2, 0xc3, 0x53, 0xe3, 0x29, 0x92, 0xf8,
+ 0xff, 0x64, 0x85, 0xb9, 0xf7, 0xd3, 0x80, 0xd2,
+ 0x0c, 0x5d, 0x7b, 0x57, 0x0c, 0x51, 0x79, 0x86,
+ 0xf3, 0x20, 0xd2, 0xb8, 0x6e, 0x0c, 0x5a, 0xce,
+ 0xeb, 0x88, 0x02, 0x8b, 0x82, 0x1b, 0x7f, 0xf5,
+ 0xde, 0x7f, 0x48, 0x48, 0xdf, 0xa0, 0x55, 0xc6,
+ 0x0c, 0x22, 0xa1, 0x80, 0x8d, 0x3b, 0xcb, 0x40,
+ 0x2d, 0x3d, 0x0b, 0xf2, 0xe0, 0x22, 0x13, 0x99,
+ 0xe1, 0xa7, 0x27, 0x68, 0x31, 0xe1, 0x24, 0x5d,
+ 0xd2, 0xee, 0x16, 0xc1, 0xd7, 0xa8, 0x14, 0x19,
+ 0x23, 0x72, 0x67, 0x27, 0xdc, 0x5e, 0xb9, 0xc7,
+ 0xd8, 0xe3, 0x55, 0x50, 0x40, 0x98, 0x7b, 0xe7,
+ 0x34, 0x1c, 0x3b, 0x18, 0x14, 0xd8, 0x62, 0xc1,
+ 0x93, 0x84, 0xf3, 0x5b, 0xdd, 0x9e, 0x1f, 0x3b,
+ 0x0b, 0xbc, 0x4e, 0x5b, 0x79, 0xa3, 0xca, 0x74,
+ 0x2a, 0x98, 0xe8, 0x04, 0x39, 0xef, 0xc6, 0x76,
+ 0x6d, 0xee, 0x9f, 0x67, 0x5b, 0x59, 0x3a, 0xe5,
+ 0xf2, 0x3b, 0xca, 0x89, 0xe8, 0x9b, 0x03, 0x3d,
+ 0x11, 0xd2, 0x4a, 0x70, 0xaf, 0x88, 0xb0, 0x94,
+ 0x96, 0x26, 0xab, 0x3c, 0xc1, 0xb8, 0xe4, 0xe7,
+ 0x14, 0x61, 0x64, 0x3a, 0x61, 0x08, 0x0f, 0xa9,
+ 0xce, 0x64, 0xb2, 0x40, 0xf8, 0x20, 0x3a, 0xa9,
+ 0x31, 0xbd, 0x7e, 0x16, 0xca, 0xf5, 0x62, 0x0f,
+ 0x91, 0x9f, 0x8e, 0x1d, 0xa4, 0x77, 0xf3, 0x87,
+ 0x61, 0xe8, 0x14, 0xde, 0x18, 0x68, 0x4e, 0x9d,
+ 0x73, 0xcd, 0x8a, 0xe4, 0x80, 0x84, 0x23, 0xaa,
+ 0x9d, 0x64, 0x1c, 0x80, 0x41, 0xca, 0x82, 0x40,
+ 0x94, 0x55, 0xe3, 0x28, 0xa1, 0x97, 0x71, 0xba,
+ 0xf2, 0x2c, 0x39, 0x62, 0x29, 0x56, 0xd0, 0xff,
+ 0xb2, 0x82, 0x20, 0x59, 0x1f, 0xc3, 0x64, 0x57
+};
+static const u8 key69[] __initconst = {
+ 0x19, 0x09, 0xe9, 0x7c, 0xd9, 0x02, 0x4a, 0x0c,
+ 0x52, 0x25, 0xad, 0x5c, 0x2e, 0x8d, 0x86, 0x10,
+ 0x85, 0x2b, 0xba, 0xa4, 0x44, 0x5b, 0x39, 0x3e,
+ 0x18, 0xaa, 0xce, 0x0e, 0xe2, 0x69, 0x3c, 0xcf
+};
+enum { nonce69 = 0xdb925a1948f0f060ULL };
+
+static const u8 input70[] __initconst = {
+ 0x10, 0xe7, 0x83, 0xcf, 0x42, 0x9f, 0xf2, 0x41,
+ 0xc7, 0xe4, 0xdb, 0xf9, 0xa3, 0x02, 0x1d, 0x8d,
+ 0x50, 0x81, 0x2c, 0x6b, 0x92, 0xe0, 0x4e, 0xea,
+ 0x26, 0x83, 0x2a, 0xd0, 0x31, 0xf1, 0x23, 0xf3,
+ 0x0e, 0x88, 0x14, 0x31, 0xf9, 0x01, 0x63, 0x59,
+ 0x21, 0xd1, 0x8b, 0xdd, 0x06, 0xd0, 0xc6, 0xab,
+ 0x91, 0x71, 0x82, 0x4d, 0xd4, 0x62, 0x37, 0x17,
+ 0xf9, 0x50, 0xf9, 0xb5, 0x74, 0xce, 0x39, 0x80,
+ 0x80, 0x78, 0xf8, 0xdc, 0x1c, 0xdb, 0x7c, 0x3d,
+ 0xd4, 0x86, 0x31, 0x00, 0x75, 0x7b, 0xd1, 0x42,
+ 0x9f, 0x1b, 0x97, 0x88, 0x0e, 0x14, 0x0e, 0x1e,
+ 0x7d, 0x7b, 0xc4, 0xd2, 0xf3, 0xc1, 0x6d, 0x17,
+ 0x5d, 0xc4, 0x75, 0x54, 0x0f, 0x38, 0x65, 0x89,
+ 0xd8, 0x7d, 0xab, 0xc9, 0xa7, 0x0a, 0x21, 0x0b,
+ 0x37, 0x12, 0x05, 0x07, 0xb5, 0x68, 0x32, 0x32,
+ 0xb9, 0xf8, 0x97, 0x17, 0x03, 0xed, 0x51, 0x8f,
+ 0x3d, 0x5a, 0xd0, 0x12, 0x01, 0x6e, 0x2e, 0x91,
+ 0x1c, 0xbe, 0x6b, 0xa3, 0xcc, 0x75, 0x62, 0x06,
+ 0x8e, 0x65, 0xbb, 0xe2, 0x29, 0x71, 0x4b, 0x89,
+ 0x6a, 0x9d, 0x85, 0x8c, 0x8c, 0xdf, 0x94, 0x95,
+ 0x23, 0x66, 0xf8, 0x92, 0xee, 0x56, 0xeb, 0xb3,
+ 0xeb, 0xd2, 0x4a, 0x3b, 0x77, 0x8a, 0x6e, 0xf6,
+ 0xca, 0xd2, 0x34, 0x00, 0xde, 0xbe, 0x1d, 0x7a,
+ 0x73, 0xef, 0x2b, 0x80, 0x56, 0x16, 0x29, 0xbf,
+ 0x6e, 0x33, 0xed, 0x0d, 0xe2, 0x02, 0x60, 0x74,
+ 0xe9, 0x0a, 0xbc, 0xd1, 0xc5, 0xe8, 0x53, 0x02,
+ 0x79, 0x0f, 0x25, 0x0c, 0xef, 0xab, 0xd3, 0xbc,
+ 0xb7, 0xfc, 0xf3, 0xb0, 0x34, 0xd1, 0x07, 0xd2,
+ 0x5a, 0x31, 0x1f, 0xec, 0x1f, 0x87, 0xed, 0xdd,
+ 0x6a, 0xc1, 0xe8, 0xb3, 0x25, 0x4c, 0xc6, 0x9b,
+ 0x91, 0x73, 0xec, 0x06, 0x73, 0x9e, 0x57, 0x65,
+ 0x32, 0x75, 0x11, 0x74, 0x6e, 0xa4, 0x7d, 0x0d,
+ 0x74, 0x9f, 0x51, 0x10, 0x10, 0x47, 0xc9, 0x71,
+ 0x6e, 0x97, 0xae, 0x44, 0x41, 0xef, 0x98, 0x78,
+ 0xf4, 0xc5, 0xbd, 0x5e, 0x00, 0xe5, 0xfd, 0xe2,
+ 0xbe, 0x8c, 0xc2, 0xae, 0xc2, 0xee, 0x59, 0xf6,
+ 0xcb, 0x20, 0x54, 0x84, 0xc3, 0x31, 0x7e, 0x67,
+ 0x71, 0xb6, 0x76, 0xbe, 0x81, 0x8f, 0x82, 0xad,
+ 0x01, 0x8f, 0xc4, 0x00, 0x04, 0x3d, 0x8d, 0x34,
+ 0xaa, 0xea, 0xc0, 0xea, 0x91, 0x42, 0xb6, 0xb8,
+ 0x43, 0xf3, 0x17, 0xb2, 0x73, 0x64, 0x82, 0x97,
+ 0xd5, 0xc9, 0x07, 0x77, 0xb1, 0x26, 0xe2, 0x00,
+ 0x6a, 0xae, 0x70, 0x0b, 0xbe, 0xe6, 0xb8, 0x42,
+ 0x81, 0x55, 0xf7, 0xb8, 0x96, 0x41, 0x9d, 0xd4,
+ 0x2c, 0x27, 0x00, 0xcc, 0x91, 0x28, 0x22, 0xa4,
+ 0x7b, 0x42, 0x51, 0x9e, 0xd6, 0xec, 0xf3, 0x6b,
+ 0x00, 0xff, 0x5c, 0xa2, 0xac, 0x47, 0x33, 0x2d,
+ 0xf8, 0x11, 0x65, 0x5f, 0x4d, 0x79, 0x8b, 0x4f,
+ 0xad, 0xf0, 0x9d, 0xcd, 0xb9, 0x7b, 0x08, 0xf7,
+ 0x32, 0x51, 0xfa, 0x39, 0xaa, 0x78, 0x05, 0xb1,
+ 0xf3, 0x5d, 0xe8, 0x7c, 0x8e, 0x4f, 0xa2, 0xe0,
+ 0x98, 0x0c, 0xb2, 0xa7, 0xf0, 0x35, 0x8e, 0x70,
+ 0x7c, 0x82, 0xf3, 0x1b, 0x26, 0x28, 0x12, 0xe5,
+ 0x23, 0x57, 0xe4, 0xb4, 0x9b, 0x00, 0x39, 0x97,
+ 0xef, 0x7c, 0x46, 0x9b, 0x34, 0x6b, 0xe7, 0x0e,
+ 0xa3, 0x2a, 0x18, 0x11, 0x64, 0xc6, 0x7c, 0x8b,
+ 0x06, 0x02, 0xf5, 0x69, 0x76, 0xf9, 0xaa, 0x09,
+ 0x5f, 0x68, 0xf8, 0x4a, 0x79, 0x58, 0xec, 0x37,
+ 0xcf, 0x3a, 0xcc, 0x97, 0x70, 0x1d, 0x3e, 0x52,
+ 0x18, 0x0a, 0xad, 0x28, 0x5b, 0x3b, 0xe9, 0x03,
+ 0x84, 0xe9, 0x68, 0x50, 0xce, 0xc4, 0xbc, 0x3e,
+ 0x21, 0xad, 0x63, 0xfe, 0xc6, 0xfd, 0x6e, 0x69,
+ 0x84, 0xa9, 0x30, 0xb1, 0x7a, 0xc4, 0x31, 0x10,
+ 0xc1, 0x1f, 0x6e, 0xeb, 0xa5, 0xa6, 0x01
+};
+static const u8 output70[] __initconst = {
+ 0x0f, 0x93, 0x2a, 0x20, 0xb3, 0x87, 0x2d, 0xce,
+ 0xd1, 0x3b, 0x30, 0xfd, 0x06, 0x6d, 0x0a, 0xaa,
+ 0x3e, 0xc4, 0x29, 0x02, 0x8a, 0xde, 0xa6, 0x4b,
+ 0x45, 0x1b, 0x4f, 0x25, 0x59, 0xd5, 0x56, 0x6a,
+ 0x3b, 0x37, 0xbd, 0x3e, 0x47, 0x12, 0x2c, 0x4e,
+ 0x60, 0x5f, 0x05, 0x75, 0x61, 0x23, 0x05, 0x74,
+ 0xcb, 0xfc, 0x5a, 0xb3, 0xac, 0x5c, 0x3d, 0xab,
+ 0x52, 0x5f, 0x05, 0xbc, 0x57, 0xc0, 0x7e, 0xcf,
+ 0x34, 0x5d, 0x7f, 0x41, 0xa3, 0x17, 0x78, 0xd5,
+ 0x9f, 0xec, 0x0f, 0x1e, 0xf9, 0xfe, 0xa3, 0xbd,
+ 0x28, 0xb0, 0xba, 0x4d, 0x84, 0xdb, 0xae, 0x8f,
+ 0x1d, 0x98, 0xb7, 0xdc, 0xf9, 0xad, 0x55, 0x9c,
+ 0x89, 0xfe, 0x9b, 0x9c, 0xa9, 0x89, 0xf6, 0x97,
+ 0x9c, 0x3f, 0x09, 0x3e, 0xc6, 0x02, 0xc2, 0x55,
+ 0x58, 0x09, 0x54, 0x66, 0xe4, 0x36, 0x81, 0x35,
+ 0xca, 0x88, 0x17, 0x89, 0x80, 0x24, 0x2b, 0x21,
+ 0x89, 0xee, 0x45, 0x5a, 0xe7, 0x1f, 0xd5, 0xa5,
+ 0x16, 0xa4, 0xda, 0x70, 0x7e, 0xe9, 0x4f, 0x24,
+ 0x61, 0x97, 0xab, 0xa0, 0xe0, 0xe7, 0xb8, 0x5c,
+ 0x0f, 0x25, 0x17, 0x37, 0x75, 0x12, 0xb5, 0x40,
+ 0xde, 0x1c, 0x0d, 0x8a, 0x77, 0x62, 0x3c, 0x86,
+ 0xd9, 0x70, 0x2e, 0x96, 0x30, 0xd2, 0x55, 0xb3,
+ 0x6b, 0xc3, 0xf2, 0x9c, 0x47, 0xf3, 0x3a, 0x24,
+ 0x52, 0xc6, 0x38, 0xd8, 0x22, 0xb3, 0x0c, 0xfd,
+ 0x2f, 0xa3, 0x3c, 0xb5, 0xe8, 0x26, 0xe1, 0xa3,
+ 0xad, 0xb0, 0x82, 0x17, 0xc1, 0x53, 0xb8, 0x34,
+ 0x48, 0xee, 0x39, 0xae, 0x51, 0x43, 0xec, 0x82,
+ 0xce, 0x87, 0xc6, 0x76, 0xb9, 0x76, 0xd3, 0x53,
+ 0xfe, 0x49, 0x24, 0x7d, 0x02, 0x42, 0x2b, 0x72,
+ 0xfb, 0xcb, 0xd8, 0x96, 0x02, 0xc6, 0x9a, 0x20,
+ 0xf3, 0x5a, 0x67, 0xe8, 0x13, 0xf8, 0xb2, 0xcb,
+ 0xa2, 0xec, 0x18, 0x20, 0x4a, 0xb0, 0x73, 0x53,
+ 0x21, 0xb0, 0x77, 0x53, 0xd8, 0x76, 0xa1, 0x30,
+ 0x17, 0x72, 0x2e, 0x33, 0x5f, 0x33, 0x6b, 0x28,
+ 0xfb, 0xb0, 0xf4, 0xec, 0x8e, 0xed, 0x20, 0x7d,
+ 0x57, 0x8c, 0x74, 0x28, 0x64, 0x8b, 0xeb, 0x59,
+ 0x38, 0x3f, 0xe7, 0x83, 0x2e, 0xe5, 0x64, 0x4d,
+ 0x5c, 0x1f, 0xe1, 0x3b, 0xd9, 0x84, 0xdb, 0xc9,
+ 0xec, 0xd8, 0xc1, 0x7c, 0x1f, 0x1b, 0x68, 0x35,
+ 0xc6, 0x34, 0x10, 0xef, 0x19, 0xc9, 0x0a, 0xd6,
+ 0x43, 0x7f, 0xa6, 0xcb, 0x9d, 0xf4, 0xf0, 0x16,
+ 0xb1, 0xb1, 0x96, 0x64, 0xec, 0x8d, 0x22, 0x4c,
+ 0x4b, 0xe8, 0x1a, 0xba, 0x6f, 0xb7, 0xfc, 0xa5,
+ 0x69, 0x3e, 0xad, 0x78, 0x79, 0x19, 0xb5, 0x04,
+ 0x69, 0xe5, 0x3f, 0xff, 0x60, 0x8c, 0xda, 0x0b,
+ 0x7b, 0xf7, 0xe7, 0xe6, 0x29, 0x3a, 0x85, 0xba,
+ 0xb5, 0xb0, 0x35, 0xbd, 0x38, 0xce, 0x34, 0x5e,
+ 0xf2, 0xdc, 0xd1, 0x8f, 0xc3, 0x03, 0x24, 0xa2,
+ 0x03, 0xf7, 0x4e, 0x49, 0x5b, 0xcf, 0x6d, 0xb0,
+ 0xeb, 0xe3, 0x30, 0x28, 0xd5, 0x5b, 0x82, 0x5f,
+ 0xe4, 0x7c, 0x1e, 0xec, 0xd2, 0x39, 0xf9, 0x6f,
+ 0x2e, 0xb3, 0xcd, 0x01, 0xb1, 0x67, 0xaa, 0xea,
+ 0xaa, 0xb3, 0x63, 0xaf, 0xd9, 0xb2, 0x1f, 0xba,
+ 0x05, 0x20, 0xeb, 0x19, 0x32, 0xf0, 0x6c, 0x3f,
+ 0x40, 0xcc, 0x93, 0xb3, 0xd8, 0x25, 0xa6, 0xe4,
+ 0xce, 0xd7, 0x7e, 0x48, 0x99, 0x65, 0x7f, 0x86,
+ 0xc5, 0xd4, 0x79, 0x6b, 0xab, 0x43, 0xb8, 0x6b,
+ 0xf1, 0x2f, 0xea, 0x4c, 0x5e, 0xf0, 0x3b, 0xb4,
+ 0xb8, 0xb0, 0x94, 0x0c, 0x6b, 0xe7, 0x22, 0x93,
+ 0xaa, 0x01, 0xcb, 0xf1, 0x11, 0x60, 0xf6, 0x69,
+ 0xcf, 0x14, 0xde, 0xfb, 0x90, 0x05, 0x27, 0x0c,
+ 0x1a, 0x9e, 0xf0, 0xb4, 0xc6, 0xa1, 0xe8, 0xdd,
+ 0xd0, 0x4c, 0x25, 0x4f, 0x9c, 0xb7, 0xb1, 0xb0,
+ 0x21, 0xdb, 0x87, 0x09, 0x03, 0xf2, 0xb3
+};
+static const u8 key70[] __initconst = {
+ 0x3b, 0x5b, 0x59, 0x36, 0x44, 0xd1, 0xba, 0x71,
+ 0x55, 0x87, 0x4d, 0x62, 0x3d, 0xc2, 0xfc, 0xaa,
+ 0x3f, 0x4e, 0x1a, 0xe4, 0xca, 0x09, 0xfc, 0x6a,
+ 0xb2, 0xd6, 0x5d, 0x79, 0xf9, 0x1a, 0x91, 0xa7
+};
+enum { nonce70 = 0x3fd6786dd147a85ULL };
+
+static const u8 input71[] __initconst = {
+ 0x18, 0x78, 0xd6, 0x79, 0xe4, 0x9a, 0x6c, 0x73,
+ 0x17, 0xd4, 0x05, 0x0f, 0x1e, 0x9f, 0xd9, 0x2b,
+ 0x86, 0x48, 0x7d, 0xf4, 0xd9, 0x1c, 0x76, 0xfc,
+ 0x8e, 0x22, 0x34, 0xe1, 0x48, 0x4a, 0x8d, 0x79,
+ 0xb7, 0xbb, 0x88, 0xab, 0x90, 0xde, 0xc5, 0xb4,
+ 0xb4, 0xe7, 0x85, 0x49, 0xda, 0x57, 0xeb, 0xc9,
+ 0xcd, 0x21, 0xfc, 0x45, 0x6e, 0x32, 0x67, 0xf2,
+ 0x4f, 0xa6, 0x54, 0xe5, 0x20, 0xed, 0xcf, 0xc6,
+ 0x62, 0x25, 0x8e, 0x00, 0xf8, 0x6b, 0xa2, 0x80,
+ 0xac, 0x88, 0xa6, 0x59, 0x27, 0x83, 0x95, 0x11,
+ 0x3f, 0x70, 0x5e, 0x3f, 0x11, 0xfb, 0x26, 0xbf,
+ 0xe1, 0x48, 0x75, 0xf9, 0x86, 0xbf, 0xa6, 0x5d,
+ 0x15, 0x61, 0x66, 0xbf, 0x78, 0x8f, 0x6b, 0x9b,
+ 0xda, 0x98, 0xb7, 0x19, 0xe2, 0xf2, 0xa3, 0x9c,
+ 0x7c, 0x6a, 0x9a, 0xd8, 0x3d, 0x4c, 0x2c, 0xe1,
+ 0x09, 0xb4, 0x28, 0x82, 0x4e, 0xab, 0x0c, 0x75,
+ 0x63, 0xeb, 0xbc, 0xd0, 0x71, 0xa2, 0x73, 0x85,
+ 0xed, 0x53, 0x7a, 0x3f, 0x68, 0x9f, 0xd0, 0xa9,
+ 0x00, 0x5a, 0x9e, 0x80, 0x55, 0x00, 0xe6, 0xae,
+ 0x0c, 0x03, 0x40, 0xed, 0xfc, 0x68, 0x4a, 0xb7,
+ 0x1e, 0x09, 0x65, 0x30, 0x5a, 0x3d, 0x97, 0x4d,
+ 0x5e, 0x51, 0x8e, 0xda, 0xc3, 0x55, 0x8c, 0xfb,
+ 0xcf, 0x83, 0x05, 0x35, 0x0d, 0x08, 0x1b, 0xf3,
+ 0x3a, 0x57, 0x96, 0xac, 0x58, 0x8b, 0xfa, 0x00,
+ 0x49, 0x15, 0x78, 0xd2, 0x4b, 0xed, 0xb8, 0x59,
+ 0x78, 0x9b, 0x7f, 0xaa, 0xfc, 0xe7, 0x46, 0xdc,
+ 0x7b, 0x34, 0xd0, 0x34, 0xe5, 0x10, 0xff, 0x4d,
+ 0x5a, 0x4d, 0x60, 0xa7, 0x16, 0x54, 0xc4, 0xfd,
+ 0xca, 0x5d, 0x68, 0xc7, 0x4a, 0x01, 0x8d, 0x7f,
+ 0x74, 0x5d, 0xff, 0xb8, 0x37, 0x15, 0x62, 0xfa,
+ 0x44, 0x45, 0xcf, 0x77, 0x3b, 0x1d, 0xb2, 0xd2,
+ 0x0d, 0x42, 0x00, 0x39, 0x68, 0x1f, 0xcc, 0x89,
+ 0x73, 0x5d, 0xa9, 0x2e, 0xfd, 0x58, 0x62, 0xca,
+ 0x35, 0x8e, 0x70, 0x70, 0xaa, 0x6e, 0x14, 0xe9,
+ 0xa4, 0xe2, 0x10, 0x66, 0x71, 0xdc, 0x4c, 0xfc,
+ 0xa9, 0xdc, 0x8f, 0x57, 0x4d, 0xc5, 0xac, 0xd7,
+ 0xa9, 0xf3, 0xf3, 0xa1, 0xff, 0x62, 0xa0, 0x8f,
+ 0xe4, 0x96, 0x3e, 0xcb, 0x9f, 0x76, 0x42, 0x39,
+ 0x1f, 0x24, 0xfd, 0xfd, 0x79, 0xe8, 0x27, 0xdf,
+ 0xa8, 0xf6, 0x33, 0x8b, 0x31, 0x59, 0x69, 0xcf,
+ 0x6a, 0xef, 0x89, 0x4d, 0xa7, 0xf6, 0x7e, 0x97,
+ 0x14, 0xbd, 0xda, 0xdd, 0xb4, 0x84, 0x04, 0x24,
+ 0xe0, 0x17, 0xe1, 0x0f, 0x1f, 0x8a, 0x6a, 0x71,
+ 0x74, 0x41, 0xdc, 0x59, 0x5c, 0x8f, 0x01, 0x25,
+ 0x92, 0xf0, 0x2e, 0x15, 0x62, 0x71, 0x9a, 0x9f,
+ 0x87, 0xdf, 0x62, 0x49, 0x7f, 0x86, 0x62, 0xfc,
+ 0x20, 0x84, 0xd7, 0xe3, 0x3a, 0xd9, 0x37, 0x85,
+ 0xb7, 0x84, 0x5a, 0xf9, 0xed, 0x21, 0x32, 0x94,
+ 0x3e, 0x04, 0xe7, 0x8c, 0x46, 0x76, 0x21, 0x67,
+ 0xf6, 0x95, 0x64, 0x92, 0xb7, 0x15, 0xf6, 0xe3,
+ 0x41, 0x27, 0x9d, 0xd7, 0xe3, 0x79, 0x75, 0x92,
+ 0xd0, 0xc1, 0xf3, 0x40, 0x92, 0x08, 0xde, 0x90,
+ 0x22, 0x82, 0xb2, 0x69, 0xae, 0x1a, 0x35, 0x11,
+ 0x89, 0xc8, 0x06, 0x82, 0x95, 0x23, 0x44, 0x08,
+ 0x22, 0xf2, 0x71, 0x73, 0x1b, 0x88, 0x11, 0xcf,
+ 0x1c, 0x7e, 0x8a, 0x2e, 0xdc, 0x79, 0x57, 0xce,
+ 0x1f, 0xe7, 0x6c, 0x07, 0xd8, 0x06, 0xbe, 0xec,
+ 0xa3, 0xcf, 0xf9, 0x68, 0xa5, 0xb8, 0xf0, 0xe3,
+ 0x3f, 0x01, 0x92, 0xda, 0xf1, 0xa0, 0x2d, 0x7b,
+ 0xab, 0x57, 0x58, 0x2a, 0xaf, 0xab, 0xbd, 0xf2,
+ 0xe5, 0xaf, 0x7e, 0x1f, 0x46, 0x24, 0x9e, 0x20,
+ 0x22, 0x0f, 0x84, 0x4c, 0xb7, 0xd8, 0x03, 0xe8,
+ 0x09, 0x73, 0x6c, 0xc6, 0x9b, 0x90, 0xe0, 0xdb,
+ 0xf2, 0x71, 0xba, 0xad, 0xb3, 0xec, 0xda, 0x7a
+};
+static const u8 output71[] __initconst = {
+ 0x28, 0xc5, 0x9b, 0x92, 0xf9, 0x21, 0x4f, 0xbb,
+ 0xef, 0x3b, 0xf0, 0xf5, 0x3a, 0x6d, 0x7f, 0xd6,
+ 0x6a, 0x8d, 0xa1, 0x01, 0x5c, 0x62, 0x20, 0x8b,
+ 0x5b, 0x39, 0xd5, 0xd3, 0xc2, 0xf6, 0x9d, 0x5e,
+ 0xcc, 0xe1, 0xa2, 0x61, 0x16, 0xe2, 0xce, 0xe9,
+ 0x86, 0xd0, 0xfc, 0xce, 0x9a, 0x28, 0x27, 0xc4,
+ 0x0c, 0xb9, 0xaa, 0x8d, 0x48, 0xdb, 0xbf, 0x82,
+ 0x7d, 0xd0, 0x35, 0xc4, 0x06, 0x34, 0xb4, 0x19,
+ 0x51, 0x73, 0xf4, 0x7a, 0xf4, 0xfd, 0xe9, 0x1d,
+ 0xdc, 0x0f, 0x7e, 0xf7, 0x96, 0x03, 0xe3, 0xb1,
+ 0x2e, 0x22, 0x59, 0xb7, 0x6d, 0x1c, 0x97, 0x8c,
+ 0xd7, 0x31, 0x08, 0x26, 0x4c, 0x6d, 0xc6, 0x14,
+ 0xa5, 0xeb, 0x45, 0x6a, 0x88, 0xa3, 0xa2, 0x36,
+ 0xc4, 0x35, 0xb1, 0x5a, 0xa0, 0xad, 0xf7, 0x06,
+ 0x9b, 0x5d, 0xc1, 0x15, 0xc1, 0xce, 0x0a, 0xb0,
+ 0x57, 0x2e, 0x3f, 0x6f, 0x0d, 0x10, 0xd9, 0x11,
+ 0x2c, 0x9c, 0xad, 0x2d, 0xa5, 0x81, 0xfb, 0x4e,
+ 0x8f, 0xd5, 0x32, 0x4e, 0xaf, 0x5c, 0xc1, 0x86,
+ 0xde, 0x56, 0x5a, 0x33, 0x29, 0xf7, 0x67, 0xc6,
+ 0x37, 0x6f, 0xb2, 0x37, 0x4e, 0xd4, 0x69, 0x79,
+ 0xaf, 0xd5, 0x17, 0x79, 0xe0, 0xba, 0x62, 0xa3,
+ 0x68, 0xa4, 0x87, 0x93, 0x8d, 0x7e, 0x8f, 0xa3,
+ 0x9c, 0xef, 0xda, 0xe3, 0xa5, 0x1f, 0xcd, 0x30,
+ 0xa6, 0x55, 0xac, 0x4c, 0x69, 0x74, 0x02, 0xc7,
+ 0x5d, 0x95, 0x81, 0x4a, 0x68, 0x11, 0xd3, 0xa9,
+ 0x98, 0xb1, 0x0b, 0x0d, 0xae, 0x40, 0x86, 0x65,
+ 0xbf, 0xcc, 0x2d, 0xef, 0x57, 0xca, 0x1f, 0xe4,
+ 0x34, 0x4e, 0xa6, 0x5e, 0x82, 0x6e, 0x61, 0xad,
+ 0x0b, 0x3c, 0xf8, 0xeb, 0x01, 0x43, 0x7f, 0x87,
+ 0xa2, 0xa7, 0x6a, 0xe9, 0x62, 0x23, 0x24, 0x61,
+ 0xf1, 0xf7, 0x36, 0xdb, 0x10, 0xe5, 0x57, 0x72,
+ 0x3a, 0xc2, 0xae, 0xcc, 0x75, 0xc7, 0x80, 0x05,
+ 0x0a, 0x5c, 0x4c, 0x95, 0xda, 0x02, 0x01, 0x14,
+ 0x06, 0x6b, 0x5c, 0x65, 0xc2, 0xb8, 0x4a, 0xd6,
+ 0xd3, 0xb4, 0xd8, 0x12, 0x52, 0xb5, 0x60, 0xd3,
+ 0x8e, 0x5f, 0x5c, 0x76, 0x33, 0x7a, 0x05, 0xe5,
+ 0xcb, 0xef, 0x4f, 0x89, 0xf1, 0xba, 0x32, 0x6f,
+ 0x33, 0xcd, 0x15, 0x8d, 0xa3, 0x0c, 0x3f, 0x63,
+ 0x11, 0xe7, 0x0e, 0xe0, 0x00, 0x01, 0xe9, 0xe8,
+ 0x8e, 0x36, 0x34, 0x8d, 0x96, 0xb5, 0x03, 0xcf,
+ 0x55, 0x62, 0x49, 0x7a, 0x34, 0x44, 0xa5, 0xee,
+ 0x8c, 0x46, 0x06, 0x22, 0xab, 0x1d, 0x53, 0x9c,
+ 0xa1, 0xf9, 0x67, 0x18, 0x57, 0x89, 0xf9, 0xc2,
+ 0xd1, 0x7e, 0xbe, 0x36, 0x40, 0xcb, 0xe9, 0x04,
+ 0xde, 0xb1, 0x3b, 0x29, 0x52, 0xc5, 0x9a, 0xb5,
+ 0xa2, 0x7c, 0x7b, 0xfe, 0xe5, 0x92, 0x73, 0xea,
+ 0xea, 0x7b, 0xba, 0x0a, 0x8c, 0x88, 0x15, 0xe6,
+ 0x53, 0xbf, 0x1c, 0x33, 0xf4, 0x9b, 0x9a, 0x5e,
+ 0x8d, 0xae, 0x60, 0xdc, 0xcb, 0x5d, 0xfa, 0xbe,
+ 0x06, 0xc3, 0x3f, 0x06, 0xe7, 0x00, 0x40, 0x7b,
+ 0xaa, 0x94, 0xfa, 0x6d, 0x1f, 0xe4, 0xc5, 0xa9,
+ 0x1b, 0x5f, 0x36, 0xea, 0x5a, 0xdd, 0xa5, 0x48,
+ 0x6a, 0x55, 0xd2, 0x47, 0x28, 0xbf, 0x96, 0xf1,
+ 0x9f, 0xb6, 0x11, 0x4b, 0xd3, 0x44, 0x7d, 0x48,
+ 0x41, 0x61, 0xdb, 0x12, 0xd4, 0xc2, 0x59, 0x82,
+ 0x4c, 0x47, 0x5c, 0x04, 0xf6, 0x7b, 0xd3, 0x92,
+ 0x2e, 0xe8, 0x40, 0xef, 0x15, 0x32, 0x97, 0xdc,
+ 0x35, 0x4c, 0x6e, 0xa4, 0x97, 0xe9, 0x24, 0xde,
+ 0x63, 0x8b, 0xb1, 0x6b, 0x48, 0xbb, 0x46, 0x1f,
+ 0x84, 0xd6, 0x17, 0xb0, 0x5a, 0x4a, 0x4e, 0xd5,
+ 0x31, 0xd7, 0xcf, 0xa0, 0x39, 0xc6, 0x2e, 0xfc,
+ 0xa6, 0xa3, 0xd3, 0x0f, 0xa4, 0x28, 0xac, 0xb2,
+ 0xf4, 0x48, 0x8d, 0x50, 0xa5, 0x1c, 0x44, 0x5d,
+ 0x6e, 0x38, 0xb7, 0x2b, 0x8a, 0x45, 0xa7, 0x3d
+};
+static const u8 key71[] __initconst = {
+ 0x8b, 0x68, 0xc4, 0xb7, 0x0d, 0x81, 0xef, 0x52,
+ 0x1e, 0x05, 0x96, 0x72, 0x62, 0x89, 0x27, 0x83,
+ 0xd0, 0xc7, 0x33, 0x6d, 0xf2, 0xcc, 0x69, 0xf9,
+ 0x23, 0xae, 0x99, 0xb1, 0xd1, 0x05, 0x4e, 0x54
+};
+enum { nonce71 = 0x983f03656d64b5f6ULL };
+
+static const u8 input72[] __initconst = {
+ 0x6b, 0x09, 0xc9, 0x57, 0x3d, 0x79, 0x04, 0x8c,
+ 0x65, 0xad, 0x4a, 0x0f, 0xa1, 0x31, 0x3a, 0xdd,
+ 0x14, 0x8e, 0xe8, 0xfe, 0xbf, 0x42, 0x87, 0x98,
+ 0x2e, 0x8d, 0x83, 0xa3, 0xf8, 0x55, 0x3d, 0x84,
+ 0x1e, 0x0e, 0x05, 0x4a, 0x38, 0x9e, 0xe7, 0xfe,
+ 0xd0, 0x4d, 0x79, 0x74, 0x3a, 0x0b, 0x9b, 0xe1,
+ 0xfd, 0x51, 0x84, 0x4e, 0xb2, 0x25, 0xe4, 0x64,
+ 0x4c, 0xda, 0xcf, 0x46, 0xec, 0xba, 0x12, 0xeb,
+ 0x5a, 0x33, 0x09, 0x6e, 0x78, 0x77, 0x8f, 0x30,
+ 0xb1, 0x7d, 0x3f, 0x60, 0x8c, 0xf2, 0x1d, 0x8e,
+ 0xb4, 0x70, 0xa2, 0x90, 0x7c, 0x79, 0x1a, 0x2c,
+ 0xf6, 0x28, 0x79, 0x7c, 0x53, 0xc5, 0xfa, 0xcc,
+ 0x65, 0x9b, 0xe1, 0x51, 0xd1, 0x7f, 0x1d, 0xc4,
+ 0xdb, 0xd4, 0xd9, 0x04, 0x61, 0x7d, 0xbe, 0x12,
+ 0xfc, 0xcd, 0xaf, 0xe4, 0x0f, 0x9c, 0x20, 0xb5,
+ 0x22, 0x40, 0x18, 0xda, 0xe4, 0xda, 0x8c, 0x2d,
+ 0x84, 0xe3, 0x5f, 0x53, 0x17, 0xed, 0x78, 0xdc,
+ 0x2f, 0xe8, 0x31, 0xc7, 0xe6, 0x39, 0x71, 0x40,
+ 0xb4, 0x0f, 0xc9, 0xa9, 0x7e, 0x78, 0x87, 0xc1,
+ 0x05, 0x78, 0xbb, 0x01, 0xf2, 0x8f, 0x33, 0xb0,
+ 0x6e, 0x84, 0xcd, 0x36, 0x33, 0x5c, 0x5b, 0x8e,
+ 0xf1, 0xac, 0x30, 0xfe, 0x33, 0xec, 0x08, 0xf3,
+ 0x7e, 0xf2, 0xf0, 0x4c, 0xf2, 0xad, 0xd8, 0xc1,
+ 0xd4, 0x4e, 0x87, 0x06, 0xd4, 0x75, 0xe7, 0xe3,
+ 0x09, 0xd3, 0x4d, 0xe3, 0x21, 0x32, 0xba, 0xb4,
+ 0x68, 0x68, 0xcb, 0x4c, 0xa3, 0x1e, 0xb3, 0x87,
+ 0x7b, 0xd3, 0x0c, 0x63, 0x37, 0x71, 0x79, 0xfb,
+ 0x58, 0x36, 0x57, 0x0f, 0x34, 0x1d, 0xc1, 0x42,
+ 0x02, 0x17, 0xe7, 0xed, 0xe8, 0xe7, 0x76, 0xcb,
+ 0x42, 0xc4, 0x4b, 0xe2, 0xb2, 0x5e, 0x42, 0xd5,
+ 0xec, 0x9d, 0xc1, 0x32, 0x71, 0xe4, 0xeb, 0x10,
+ 0x68, 0x1a, 0x6e, 0x99, 0x8e, 0x73, 0x12, 0x1f,
+ 0x97, 0x0c, 0x9e, 0xcd, 0x02, 0x3e, 0x4c, 0xa0,
+ 0xf2, 0x8d, 0xe5, 0x44, 0xca, 0x6d, 0xfe, 0x07,
+ 0xe3, 0xe8, 0x9b, 0x76, 0xc1, 0x6d, 0xb7, 0x6e,
+ 0x0d, 0x14, 0x00, 0x6f, 0x8a, 0xfd, 0x43, 0xc6,
+ 0x43, 0xa5, 0x9c, 0x02, 0x47, 0x10, 0xd4, 0xb4,
+ 0x9b, 0x55, 0x67, 0xc8, 0x7f, 0xc1, 0x8a, 0x1f,
+ 0x1e, 0xd1, 0xbc, 0x99, 0x5d, 0x50, 0x4f, 0x89,
+ 0xf1, 0xe6, 0x5d, 0x91, 0x40, 0xdc, 0x20, 0x67,
+ 0x56, 0xc2, 0xef, 0xbd, 0x2c, 0xa2, 0x99, 0x38,
+ 0xe0, 0x45, 0xec, 0x44, 0x05, 0x52, 0x65, 0x11,
+ 0xfc, 0x3b, 0x19, 0xcb, 0x71, 0xc2, 0x8e, 0x0e,
+ 0x03, 0x2a, 0x03, 0x3b, 0x63, 0x06, 0x31, 0x9a,
+ 0xac, 0x53, 0x04, 0x14, 0xd4, 0x80, 0x9d, 0x6b,
+ 0x42, 0x7e, 0x7e, 0x4e, 0xdc, 0xc7, 0x01, 0x49,
+ 0x9f, 0xf5, 0x19, 0x86, 0x13, 0x28, 0x2b, 0xa6,
+ 0xa6, 0xbe, 0xa1, 0x7e, 0x71, 0x05, 0x00, 0xff,
+ 0x59, 0x2d, 0xb6, 0x63, 0xf0, 0x1e, 0x2e, 0x69,
+ 0x9b, 0x85, 0xf1, 0x1e, 0x8a, 0x64, 0x39, 0xab,
+ 0x00, 0x12, 0xe4, 0x33, 0x4b, 0xb5, 0xd8, 0xb3,
+ 0x6b, 0x5b, 0x8b, 0x5c, 0xd7, 0x6f, 0x23, 0xcf,
+ 0x3f, 0x2e, 0x5e, 0x47, 0xb9, 0xb8, 0x1f, 0xf0,
+ 0x1d, 0xda, 0xe7, 0x4f, 0x6e, 0xab, 0xc3, 0x36,
+ 0xb4, 0x74, 0x6b, 0xeb, 0xc7, 0x5d, 0x91, 0xe5,
+ 0xda, 0xf2, 0xc2, 0x11, 0x17, 0x48, 0xf8, 0x9c,
+ 0xc9, 0x8b, 0xc1, 0xa2, 0xf4, 0xcd, 0x16, 0xf8,
+ 0x27, 0xd9, 0x6c, 0x6f, 0xb5, 0x8f, 0x77, 0xca,
+ 0x1b, 0xd8, 0xef, 0x84, 0x68, 0x71, 0x53, 0xc1,
+ 0x43, 0x0f, 0x9f, 0x98, 0xae, 0x7e, 0x31, 0xd2,
+ 0x98, 0xfb, 0x20, 0xa2, 0xad, 0x00, 0x10, 0x83,
+ 0x00, 0x8b, 0xeb, 0x56, 0xd2, 0xc4, 0xcc, 0x7f,
+ 0x2f, 0x4e, 0xfa, 0x88, 0x13, 0xa4, 0x2c, 0xde,
+ 0x6b, 0x77, 0x86, 0x10, 0x6a, 0xab, 0x43, 0x0a,
+ 0x02
+};
+static const u8 output72[] __initconst = {
+ 0x42, 0x89, 0xa4, 0x80, 0xd2, 0xcb, 0x5f, 0x7f,
+ 0x2a, 0x1a, 0x23, 0x00, 0xa5, 0x6a, 0x95, 0xa3,
+ 0x9a, 0x41, 0xa1, 0xd0, 0x2d, 0x1e, 0xd6, 0x13,
+ 0x34, 0x40, 0x4e, 0x7f, 0x1a, 0xbe, 0xa0, 0x3d,
+ 0x33, 0x9c, 0x56, 0x2e, 0x89, 0x25, 0x45, 0xf9,
+ 0xf0, 0xba, 0x9c, 0x6d, 0xd1, 0xd1, 0xde, 0x51,
+ 0x47, 0x63, 0xc9, 0xbd, 0xfa, 0xa2, 0x9e, 0xad,
+ 0x6a, 0x7b, 0x21, 0x1a, 0x6c, 0x3e, 0xff, 0x46,
+ 0xbe, 0xf3, 0x35, 0x7a, 0x6e, 0xb3, 0xb9, 0xf7,
+ 0xda, 0x5e, 0xf0, 0x14, 0xb5, 0x70, 0xa4, 0x2b,
+ 0xdb, 0xbb, 0xc7, 0x31, 0x4b, 0x69, 0x5a, 0x83,
+ 0x70, 0xd9, 0x58, 0xd4, 0x33, 0x84, 0x23, 0xf0,
+ 0xae, 0xbb, 0x6d, 0x26, 0x7c, 0xc8, 0x30, 0xf7,
+ 0x24, 0xad, 0xbd, 0xe4, 0x2c, 0x38, 0x38, 0xac,
+ 0xe1, 0x4a, 0x9b, 0xac, 0x33, 0x0e, 0x4a, 0xf4,
+ 0x93, 0xed, 0x07, 0x82, 0x81, 0x4f, 0x8f, 0xb1,
+ 0xdd, 0x73, 0xd5, 0x50, 0x6d, 0x44, 0x1e, 0xbe,
+ 0xa7, 0xcd, 0x17, 0x57, 0xd5, 0x3b, 0x62, 0x36,
+ 0xcf, 0x7d, 0xc8, 0xd8, 0xd1, 0x78, 0xd7, 0x85,
+ 0x46, 0x76, 0x5d, 0xcc, 0xfe, 0xe8, 0x94, 0xc5,
+ 0xad, 0xbc, 0x5e, 0xbc, 0x8d, 0x1d, 0xdf, 0x03,
+ 0xc9, 0x6b, 0x1b, 0x81, 0xd1, 0xb6, 0x5a, 0x24,
+ 0xe3, 0xdc, 0x3f, 0x20, 0xc9, 0x07, 0x73, 0x4c,
+ 0x43, 0x13, 0x87, 0x58, 0x34, 0x0d, 0x14, 0x63,
+ 0x0f, 0x6f, 0xad, 0x8d, 0xac, 0x7c, 0x67, 0x68,
+ 0xa3, 0x9d, 0x7f, 0x00, 0xdf, 0x28, 0xee, 0x67,
+ 0xf4, 0x5c, 0x26, 0xcb, 0xef, 0x56, 0x71, 0xc8,
+ 0xc6, 0x67, 0x5f, 0x38, 0xbb, 0xa0, 0xb1, 0x5c,
+ 0x1f, 0xb3, 0x08, 0xd9, 0x38, 0xcf, 0x74, 0x54,
+ 0xc6, 0xa4, 0xc4, 0xc0, 0x9f, 0xb3, 0xd0, 0xda,
+ 0x62, 0x67, 0x8b, 0x81, 0x33, 0xf0, 0xa9, 0x73,
+ 0xa4, 0xd1, 0x46, 0x88, 0x8d, 0x85, 0x12, 0x40,
+ 0xba, 0x1a, 0xcd, 0x82, 0xd8, 0x8d, 0xc4, 0x52,
+ 0xe7, 0x01, 0x94, 0x2e, 0x0e, 0xd0, 0xaf, 0xe7,
+ 0x2d, 0x3f, 0x3c, 0xaa, 0xf4, 0xf5, 0xa7, 0x01,
+ 0x4c, 0x14, 0xe2, 0xc2, 0x96, 0x76, 0xbe, 0x05,
+ 0xaa, 0x19, 0xb1, 0xbd, 0x95, 0xbb, 0x5a, 0xf9,
+ 0xa5, 0xa7, 0xe6, 0x16, 0x38, 0x34, 0xf7, 0x9d,
+ 0x19, 0x66, 0x16, 0x8e, 0x7f, 0x2b, 0x5a, 0xfb,
+ 0xb5, 0x29, 0x79, 0xbf, 0x52, 0xae, 0x30, 0x95,
+ 0x3f, 0x31, 0x33, 0x28, 0xde, 0xc5, 0x0d, 0x55,
+ 0x89, 0xec, 0x21, 0x11, 0x0f, 0x8b, 0xfe, 0x63,
+ 0x3a, 0xf1, 0x95, 0x5c, 0xcd, 0x50, 0xe4, 0x5d,
+ 0x8f, 0xa7, 0xc8, 0xca, 0x93, 0xa0, 0x67, 0x82,
+ 0x63, 0x5c, 0xd0, 0xed, 0xe7, 0x08, 0xc5, 0x60,
+ 0xf8, 0xb4, 0x47, 0xf0, 0x1a, 0x65, 0x4e, 0xa3,
+ 0x51, 0x68, 0xc7, 0x14, 0xa1, 0xd9, 0x39, 0x72,
+ 0xa8, 0x6f, 0x7c, 0x7e, 0xf6, 0x03, 0x0b, 0x25,
+ 0x9b, 0xf2, 0xca, 0x49, 0xae, 0x5b, 0xf8, 0x0f,
+ 0x71, 0x51, 0x01, 0xa6, 0x23, 0xa9, 0xdf, 0xd0,
+ 0x7a, 0x39, 0x19, 0xf5, 0xc5, 0x26, 0x44, 0x7b,
+ 0x0a, 0x4a, 0x41, 0xbf, 0xf2, 0x8e, 0x83, 0x50,
+ 0x91, 0x96, 0x72, 0x02, 0xf6, 0x80, 0xbf, 0x95,
+ 0x41, 0xac, 0xda, 0xb0, 0xba, 0xe3, 0x76, 0xb1,
+ 0x9d, 0xff, 0x1f, 0x33, 0x02, 0x85, 0xfc, 0x2a,
+ 0x29, 0xe6, 0xe3, 0x9d, 0xd0, 0xef, 0xc2, 0xd6,
+ 0x9c, 0x4a, 0x62, 0xac, 0xcb, 0xea, 0x8b, 0xc3,
+ 0x08, 0x6e, 0x49, 0x09, 0x26, 0x19, 0xc1, 0x30,
+ 0xcc, 0x27, 0xaa, 0xc6, 0x45, 0x88, 0xbd, 0xae,
+ 0xd6, 0x79, 0xff, 0x4e, 0xfc, 0x66, 0x4d, 0x02,
+ 0xa5, 0xee, 0x8e, 0xa5, 0xb6, 0x15, 0x72, 0x24,
+ 0xb1, 0xbf, 0xbf, 0x64, 0xcf, 0xcc, 0x93, 0xe9,
+ 0xb6, 0xfd, 0xb4, 0xb6, 0x21, 0xb5, 0x48, 0x08,
+ 0x0f, 0x11, 0x65, 0xe1, 0x47, 0xee, 0x93, 0x29,
+ 0xad
+};
+static const u8 key72[] __initconst = {
+ 0xb9, 0xa2, 0xfc, 0x59, 0x06, 0x3f, 0x77, 0xa5,
+ 0x66, 0xd0, 0x2b, 0x22, 0x74, 0x22, 0x4c, 0x1e,
+ 0x6a, 0x39, 0xdf, 0xe1, 0x0d, 0x4c, 0x64, 0x99,
+ 0x54, 0x8a, 0xba, 0x1d, 0x2c, 0x21, 0x5f, 0xc3
+};
+enum { nonce72 = 0x3d069308fa3db04bULL };
+
+static const u8 input73[] __initconst = {
+ 0xe4, 0xdd, 0x36, 0xd4, 0xf5, 0x70, 0x51, 0x73,
+ 0x97, 0x1d, 0x45, 0x05, 0x92, 0xe7, 0xeb, 0xb7,
+ 0x09, 0x82, 0x6e, 0x25, 0x6c, 0x50, 0xf5, 0x40,
+ 0x19, 0xba, 0xbc, 0xf4, 0x39, 0x14, 0xc5, 0x15,
+ 0x83, 0x40, 0xbd, 0x26, 0xe0, 0xff, 0x3b, 0x22,
+ 0x7c, 0x7c, 0xd7, 0x0b, 0xe9, 0x25, 0x0c, 0x3d,
+ 0x92, 0x38, 0xbe, 0xe4, 0x22, 0x75, 0x65, 0xf1,
+ 0x03, 0x85, 0x34, 0x09, 0xb8, 0x77, 0xfb, 0x48,
+ 0xb1, 0x2e, 0x21, 0x67, 0x9b, 0x9d, 0xad, 0x18,
+ 0x82, 0x0d, 0x6b, 0xc3, 0xcf, 0x00, 0x61, 0x6e,
+ 0xda, 0xdc, 0xa7, 0x0b, 0x5c, 0x02, 0x1d, 0xa6,
+ 0x4e, 0x0d, 0x7f, 0x37, 0x01, 0x5a, 0x37, 0xf3,
+ 0x2b, 0xbf, 0xba, 0xe2, 0x1c, 0xb3, 0xa3, 0xbc,
+ 0x1c, 0x93, 0x1a, 0xb1, 0x71, 0xaf, 0xe2, 0xdd,
+ 0x17, 0xee, 0x53, 0xfa, 0xfb, 0x02, 0x40, 0x3e,
+ 0x03, 0xca, 0xe7, 0xc3, 0x51, 0x81, 0xcc, 0x8c,
+ 0xca, 0xcf, 0x4e, 0xc5, 0x78, 0x99, 0xfd, 0xbf,
+ 0xea, 0xab, 0x38, 0x81, 0xfc, 0xd1, 0x9e, 0x41,
+ 0x0b, 0x84, 0x25, 0xf1, 0x6b, 0x3c, 0xf5, 0x40,
+ 0x0d, 0xc4, 0x3e, 0xb3, 0x6a, 0xec, 0x6e, 0x75,
+ 0xdc, 0x9b, 0xdf, 0x08, 0x21, 0x16, 0xfb, 0x7a,
+ 0x8e, 0x19, 0x13, 0x02, 0xa7, 0xfc, 0x58, 0x21,
+ 0xc3, 0xb3, 0x59, 0x5a, 0x9c, 0xef, 0x38, 0xbd,
+ 0x87, 0x55, 0xd7, 0x0d, 0x1f, 0x84, 0xdc, 0x98,
+ 0x22, 0xca, 0x87, 0x96, 0x71, 0x6d, 0x68, 0x00,
+ 0xcb, 0x4f, 0x2f, 0xc4, 0x64, 0x0c, 0xc1, 0x53,
+ 0x0c, 0x90, 0xe7, 0x3c, 0x88, 0xca, 0xc5, 0x85,
+ 0xa3, 0x2a, 0x96, 0x7c, 0x82, 0x6d, 0x45, 0xf5,
+ 0xb7, 0x8d, 0x17, 0x69, 0xd6, 0xcd, 0x3c, 0xd3,
+ 0xe7, 0x1c, 0xce, 0x93, 0x50, 0xd4, 0x59, 0xa2,
+ 0xd8, 0x8b, 0x72, 0x60, 0x5b, 0x25, 0x14, 0xcd,
+ 0x5a, 0xe8, 0x8c, 0xdb, 0x23, 0x8d, 0x2b, 0x59,
+ 0x12, 0x13, 0x10, 0x47, 0xa4, 0xc8, 0x3c, 0xc1,
+ 0x81, 0x89, 0x6c, 0x98, 0xec, 0x8f, 0x7b, 0x32,
+ 0xf2, 0x87, 0xd9, 0xa2, 0x0d, 0xc2, 0x08, 0xf9,
+ 0xd5, 0xf3, 0x91, 0xe7, 0xb3, 0x87, 0xa7, 0x0b,
+ 0x64, 0x8f, 0xb9, 0x55, 0x1c, 0x81, 0x96, 0x6c,
+ 0xa1, 0xc9, 0x6e, 0x3b, 0xcd, 0x17, 0x1b, 0xfc,
+ 0xa6, 0x05, 0xba, 0x4a, 0x7d, 0x03, 0x3c, 0x59,
+ 0xc8, 0xee, 0x50, 0xb2, 0x5b, 0xe1, 0x4d, 0x6a,
+ 0x1f, 0x09, 0xdc, 0xa2, 0x51, 0xd1, 0x93, 0x3a,
+ 0x5f, 0x72, 0x1d, 0x26, 0x14, 0x62, 0xa2, 0x41,
+ 0x3d, 0x08, 0x70, 0x7b, 0x27, 0x3d, 0xbc, 0xdf,
+ 0x15, 0xfa, 0xb9, 0x5f, 0xb5, 0x38, 0x84, 0x0b,
+ 0x58, 0x3d, 0xee, 0x3f, 0x32, 0x65, 0x6d, 0xd7,
+ 0xce, 0x97, 0x3c, 0x8d, 0xfb, 0x63, 0xb9, 0xb0,
+ 0xa8, 0x4a, 0x72, 0x99, 0x97, 0x58, 0xc8, 0xa7,
+ 0xf9, 0x4c, 0xae, 0xc1, 0x63, 0xb9, 0x57, 0x18,
+ 0x8a, 0xfa, 0xab, 0xe9, 0xf3, 0x67, 0xe6, 0xfd,
+ 0xd2, 0x9d, 0x5c, 0xa9, 0x8e, 0x11, 0x0a, 0xf4,
+ 0x4b, 0xf1, 0xec, 0x1a, 0xaf, 0x50, 0x5d, 0x16,
+ 0x13, 0x69, 0x2e, 0xbd, 0x0d, 0xe6, 0xf0, 0xb2,
+ 0xed, 0xb4, 0x4c, 0x59, 0x77, 0x37, 0x00, 0x0b,
+ 0xc7, 0xa7, 0x9e, 0x37, 0xf3, 0x60, 0x70, 0xef,
+ 0xf3, 0xc1, 0x74, 0x52, 0x87, 0xc6, 0xa1, 0x81,
+ 0xbd, 0x0a, 0x2c, 0x5d, 0x2c, 0x0c, 0x6a, 0x81,
+ 0xa1, 0xfe, 0x26, 0x78, 0x6c, 0x03, 0x06, 0x07,
+ 0x34, 0xaa, 0xd1, 0x1b, 0x40, 0x03, 0x39, 0x56,
+ 0xcf, 0x2a, 0x92, 0xc1, 0x4e, 0xdf, 0x29, 0x24,
+ 0x83, 0x22, 0x7a, 0xea, 0x67, 0x1e, 0xe7, 0x54,
+ 0x64, 0xd3, 0xbd, 0x3a, 0x5d, 0xae, 0xca, 0xf0,
+ 0x9c, 0xd6, 0x5a, 0x9a, 0x62, 0xc8, 0xc7, 0x83,
+ 0xf9, 0x89, 0xde, 0x2d, 0x53, 0x64, 0x61, 0xf7,
+ 0xa3, 0xa7, 0x31, 0x38, 0xc6, 0x22, 0x9c, 0xb4,
+ 0x87, 0xe0
+};
+static const u8 output73[] __initconst = {
+ 0x34, 0xed, 0x05, 0xb0, 0x14, 0xbc, 0x8c, 0xcc,
+ 0x95, 0xbd, 0x99, 0x0f, 0xb1, 0x98, 0x17, 0x10,
+ 0xae, 0xe0, 0x08, 0x53, 0xa3, 0x69, 0xd2, 0xed,
+ 0x66, 0xdb, 0x2a, 0x34, 0x8d, 0x0c, 0x6e, 0xce,
+ 0x63, 0x69, 0xc9, 0xe4, 0x57, 0xc3, 0x0c, 0x8b,
+ 0xa6, 0x2c, 0xa7, 0xd2, 0x08, 0xff, 0x4f, 0xec,
+ 0x61, 0x8c, 0xee, 0x0d, 0xfa, 0x6b, 0xe0, 0xe8,
+ 0x71, 0xbc, 0x41, 0x46, 0xd7, 0x33, 0x1d, 0xc0,
+ 0xfd, 0xad, 0xca, 0x8b, 0x34, 0x56, 0xa4, 0x86,
+ 0x71, 0x62, 0xae, 0x5e, 0x3d, 0x2b, 0x66, 0x3e,
+ 0xae, 0xd8, 0xc0, 0xe1, 0x21, 0x3b, 0xca, 0xd2,
+ 0x6b, 0xa2, 0xb8, 0xc7, 0x98, 0x4a, 0xf3, 0xcf,
+ 0xb8, 0x62, 0xd8, 0x33, 0xe6, 0x80, 0xdb, 0x2f,
+ 0x0a, 0xaf, 0x90, 0x3c, 0xe1, 0xec, 0xe9, 0x21,
+ 0x29, 0x42, 0x9e, 0xa5, 0x50, 0xe9, 0x93, 0xd3,
+ 0x53, 0x1f, 0xac, 0x2a, 0x24, 0x07, 0xb8, 0xed,
+ 0xed, 0x38, 0x2c, 0xc4, 0xa1, 0x2b, 0x31, 0x5d,
+ 0x9c, 0x24, 0x7b, 0xbf, 0xd9, 0xbb, 0x4e, 0x87,
+ 0x8f, 0x32, 0x30, 0xf1, 0x11, 0x29, 0x54, 0x94,
+ 0x00, 0x95, 0x1d, 0x1d, 0x24, 0xc0, 0xd4, 0x34,
+ 0x49, 0x1d, 0xd5, 0xe3, 0xa6, 0xde, 0x8b, 0xbf,
+ 0x5a, 0x9f, 0x58, 0x5a, 0x9b, 0x70, 0xe5, 0x9b,
+ 0xb3, 0xdb, 0xe8, 0xb8, 0xca, 0x1b, 0x43, 0xe3,
+ 0xc6, 0x6f, 0x0a, 0xd6, 0x32, 0x11, 0xd4, 0x04,
+ 0xef, 0xa3, 0xe4, 0x3f, 0x12, 0xd8, 0xc1, 0x73,
+ 0x51, 0x87, 0x03, 0xbd, 0xba, 0x60, 0x79, 0xee,
+ 0x08, 0xcc, 0xf7, 0xc0, 0xaa, 0x4c, 0x33, 0xc4,
+ 0xc7, 0x09, 0xf5, 0x91, 0xcb, 0x74, 0x57, 0x08,
+ 0x1b, 0x90, 0xa9, 0x1b, 0x60, 0x02, 0xd2, 0x3f,
+ 0x7a, 0xbb, 0xfd, 0x78, 0xf0, 0x15, 0xf9, 0x29,
+ 0x82, 0x8f, 0xc4, 0xb2, 0x88, 0x1f, 0xbc, 0xcc,
+ 0x53, 0x27, 0x8b, 0x07, 0x5f, 0xfc, 0x91, 0x29,
+ 0x82, 0x80, 0x59, 0x0a, 0x3c, 0xea, 0xc4, 0x7e,
+ 0xad, 0xd2, 0x70, 0x46, 0xbd, 0x9e, 0x3b, 0x1c,
+ 0x8a, 0x62, 0xea, 0x69, 0xbd, 0xf6, 0x96, 0x15,
+ 0xb5, 0x57, 0xe8, 0x63, 0x5f, 0x65, 0x46, 0x84,
+ 0x58, 0x50, 0x87, 0x4b, 0x0e, 0x5b, 0x52, 0x90,
+ 0xb0, 0xae, 0x37, 0x0f, 0xdd, 0x7e, 0xa2, 0xa0,
+ 0x8b, 0x78, 0xc8, 0x5a, 0x1f, 0x53, 0xdb, 0xc5,
+ 0xbf, 0x73, 0x20, 0xa9, 0x44, 0xfb, 0x1e, 0xc7,
+ 0x97, 0xb2, 0x3a, 0x5a, 0x17, 0xe6, 0x8b, 0x9b,
+ 0xe8, 0xf8, 0x2a, 0x01, 0x27, 0xa3, 0x71, 0x28,
+ 0xe3, 0x19, 0xc6, 0xaf, 0xf5, 0x3a, 0x26, 0xc0,
+ 0x5c, 0x69, 0x30, 0x78, 0x75, 0x27, 0xf2, 0x0c,
+ 0x22, 0x71, 0x65, 0xc6, 0x8e, 0x7b, 0x47, 0xe3,
+ 0x31, 0xaf, 0x7b, 0xc6, 0xc2, 0x55, 0x68, 0x81,
+ 0xaa, 0x1b, 0x21, 0x65, 0xfb, 0x18, 0x35, 0x45,
+ 0x36, 0x9a, 0x44, 0xba, 0x5c, 0xff, 0x06, 0xde,
+ 0x3a, 0xc8, 0x44, 0x0b, 0xaa, 0x8e, 0x34, 0xe2,
+ 0x84, 0xac, 0x18, 0xfe, 0x9b, 0xe1, 0x4f, 0xaa,
+ 0xb6, 0x90, 0x0b, 0x1c, 0x2c, 0xd9, 0x9a, 0x10,
+ 0x18, 0xf9, 0x49, 0x41, 0x42, 0x1b, 0xb5, 0xe1,
+ 0x26, 0xac, 0x2d, 0x38, 0x00, 0x00, 0xe4, 0xb4,
+ 0x50, 0x6f, 0x14, 0x18, 0xd6, 0x3d, 0x00, 0x59,
+ 0x3c, 0x45, 0xf3, 0x42, 0x13, 0x44, 0xb8, 0x57,
+ 0xd4, 0x43, 0x5c, 0x8a, 0x2a, 0xb4, 0xfc, 0x0a,
+ 0x25, 0x5a, 0xdc, 0x8f, 0x11, 0x0b, 0x11, 0x44,
+ 0xc7, 0x0e, 0x54, 0x8b, 0x22, 0x01, 0x7e, 0x67,
+ 0x2e, 0x15, 0x3a, 0xb9, 0xee, 0x84, 0x10, 0xd4,
+ 0x80, 0x57, 0xd7, 0x75, 0xcf, 0x8b, 0xcb, 0x03,
+ 0xc9, 0x92, 0x2b, 0x69, 0xd8, 0x5a, 0x9b, 0x06,
+ 0x85, 0x47, 0xaa, 0x4c, 0x28, 0xde, 0x49, 0x58,
+ 0xe6, 0x11, 0x1e, 0x5e, 0x64, 0x8e, 0x3b, 0xe0,
+ 0x40, 0x2e, 0xac, 0x96, 0x97, 0x15, 0x37, 0x1e,
+ 0x30, 0xdd
+};
+static const u8 key73[] __initconst = {
+ 0x96, 0x06, 0x1e, 0xc1, 0x6d, 0xba, 0x49, 0x5b,
+ 0x65, 0x80, 0x79, 0xdd, 0xf3, 0x67, 0xa8, 0x6e,
+ 0x2d, 0x9c, 0x54, 0x46, 0xd8, 0x4a, 0xeb, 0x7e,
+ 0x23, 0x86, 0x51, 0xd8, 0x49, 0x49, 0x56, 0xe0
+};
+enum { nonce73 = 0xbefb83cb67e11ffdULL };
+
+static const u8 input74[] __initconst = {
+ 0x47, 0x22, 0x70, 0xe5, 0x2f, 0x41, 0x18, 0x45,
+ 0x07, 0xd3, 0x6d, 0x32, 0x0d, 0x43, 0x92, 0x2b,
+ 0x9b, 0x65, 0x73, 0x13, 0x1a, 0x4f, 0x49, 0x8f,
+ 0xff, 0xf8, 0xcc, 0xae, 0x15, 0xab, 0x9d, 0x7d,
+ 0xee, 0x22, 0x5d, 0x8b, 0xde, 0x81, 0x5b, 0x81,
+ 0x83, 0x49, 0x35, 0x9b, 0xb4, 0xbc, 0x4e, 0x01,
+ 0xc2, 0x29, 0xa7, 0xf1, 0xca, 0x3a, 0xce, 0x3f,
+ 0xf5, 0x31, 0x93, 0xa8, 0xe2, 0xc9, 0x7d, 0x03,
+ 0x26, 0xa4, 0xbc, 0xa8, 0x9c, 0xb9, 0x68, 0xf3,
+ 0xb3, 0x91, 0xe8, 0xe6, 0xc7, 0x2b, 0x1a, 0xce,
+ 0xd2, 0x41, 0x53, 0xbd, 0xa3, 0x2c, 0x54, 0x94,
+ 0x21, 0xa1, 0x40, 0xae, 0xc9, 0x0c, 0x11, 0x92,
+ 0xfd, 0x91, 0xa9, 0x40, 0xca, 0xde, 0x21, 0x4e,
+ 0x1e, 0x3d, 0xcc, 0x2c, 0x87, 0x11, 0xef, 0x46,
+ 0xed, 0x52, 0x03, 0x11, 0x19, 0x43, 0x25, 0xc7,
+ 0x0d, 0xc3, 0x37, 0x5f, 0xd3, 0x6f, 0x0c, 0x6a,
+ 0x45, 0x30, 0x88, 0xec, 0xf0, 0x21, 0xef, 0x1d,
+ 0x7b, 0x38, 0x63, 0x4b, 0x49, 0x0c, 0x72, 0xf6,
+ 0x4c, 0x40, 0xc3, 0xcc, 0x03, 0xa7, 0xae, 0xa8,
+ 0x8c, 0x37, 0x03, 0x1c, 0x11, 0xae, 0x0d, 0x1b,
+ 0x62, 0x97, 0x27, 0xfc, 0x56, 0x4b, 0xb7, 0xfd,
+ 0xbc, 0xfb, 0x0e, 0xfc, 0x61, 0xad, 0xc6, 0xb5,
+ 0x9c, 0x8c, 0xc6, 0x38, 0x27, 0x91, 0x29, 0x3d,
+ 0x29, 0xc8, 0x37, 0xc9, 0x96, 0x69, 0xe3, 0xdc,
+ 0x3e, 0x61, 0x35, 0x9b, 0x99, 0x4f, 0xb9, 0x4e,
+ 0x5a, 0x29, 0x1c, 0x2e, 0xcf, 0x16, 0xcb, 0x69,
+ 0x87, 0xe4, 0x1a, 0xc4, 0x6e, 0x78, 0x43, 0x00,
+ 0x03, 0xb2, 0x8b, 0x03, 0xd0, 0xb4, 0xf1, 0xd2,
+ 0x7d, 0x2d, 0x7e, 0xfc, 0x19, 0x66, 0x5b, 0xa3,
+ 0x60, 0x3f, 0x9d, 0xbd, 0xfa, 0x3e, 0xca, 0x7b,
+ 0x26, 0x08, 0x19, 0x16, 0x93, 0x5d, 0x83, 0xfd,
+ 0xf9, 0x21, 0xc6, 0x31, 0x34, 0x6f, 0x0c, 0xaa,
+ 0x28, 0xf9, 0x18, 0xa2, 0xc4, 0x78, 0x3b, 0x56,
+ 0xc0, 0x88, 0x16, 0xba, 0x22, 0x2c, 0x07, 0x2f,
+ 0x70, 0xd0, 0xb0, 0x46, 0x35, 0xc7, 0x14, 0xdc,
+ 0xbb, 0x56, 0x23, 0x1e, 0x36, 0x36, 0x2d, 0x73,
+ 0x78, 0xc7, 0xce, 0xf3, 0x58, 0xf7, 0x58, 0xb5,
+ 0x51, 0xff, 0x33, 0x86, 0x0e, 0x3b, 0x39, 0xfb,
+ 0x1a, 0xfd, 0xf8, 0x8b, 0x09, 0x33, 0x1b, 0x83,
+ 0xf2, 0xe6, 0x38, 0x37, 0xef, 0x47, 0x84, 0xd9,
+ 0x82, 0x77, 0x2b, 0x82, 0xcc, 0xf9, 0xee, 0x94,
+ 0x71, 0x78, 0x81, 0xc8, 0x4d, 0x91, 0xd7, 0x35,
+ 0x29, 0x31, 0x30, 0x5c, 0x4a, 0x23, 0x23, 0xb1,
+ 0x38, 0x6b, 0xac, 0x22, 0x3f, 0x80, 0xc7, 0xe0,
+ 0x7d, 0xfa, 0x76, 0x47, 0xd4, 0x6f, 0x93, 0xa0,
+ 0xa0, 0x93, 0x5d, 0x68, 0xf7, 0x43, 0x25, 0x8f,
+ 0x1b, 0xc7, 0x87, 0xea, 0x59, 0x0c, 0xa2, 0xfa,
+ 0xdb, 0x2f, 0x72, 0x43, 0xcf, 0x90, 0xf1, 0xd6,
+ 0x58, 0xf3, 0x17, 0x6a, 0xdf, 0xb3, 0x4e, 0x0e,
+ 0x38, 0x24, 0x48, 0x1f, 0xb7, 0x01, 0xec, 0x81,
+ 0xb1, 0x87, 0x5b, 0xec, 0x9c, 0x11, 0x1a, 0xff,
+ 0xa5, 0xca, 0x5a, 0x63, 0x31, 0xb2, 0xe4, 0xc6,
+ 0x3c, 0x1d, 0xaf, 0x27, 0xb2, 0xd4, 0x19, 0xa2,
+ 0xcc, 0x04, 0x92, 0x42, 0xd2, 0xc1, 0x8c, 0x3b,
+ 0xce, 0xf5, 0x74, 0xc1, 0x81, 0xf8, 0x20, 0x23,
+ 0x6f, 0x20, 0x6d, 0x78, 0x36, 0x72, 0x2c, 0x52,
+ 0xdf, 0x5e, 0xe8, 0x75, 0xce, 0x1c, 0x49, 0x9d,
+ 0x93, 0x6f, 0x65, 0xeb, 0xb1, 0xbd, 0x8e, 0x5e,
+ 0xe5, 0x89, 0xc4, 0x8a, 0x81, 0x3d, 0x9a, 0xa7,
+ 0x11, 0x82, 0x8e, 0x38, 0x5b, 0x5b, 0xca, 0x7d,
+ 0x4b, 0x72, 0xc2, 0x9c, 0x30, 0x5e, 0x7f, 0xc0,
+ 0x6f, 0x91, 0xd5, 0x67, 0x8c, 0x3e, 0xae, 0xda,
+ 0x2b, 0x3c, 0x53, 0xcc, 0x50, 0x97, 0x36, 0x0b,
+ 0x79, 0xd6, 0x73, 0x6e, 0x7d, 0x42, 0x56, 0xe1,
+ 0xaa, 0xfc, 0xb3, 0xa7, 0xc8, 0x01, 0xaa, 0xc1,
+ 0xfc, 0x5c, 0x72, 0x8e, 0x63, 0xa8, 0x46, 0x18,
+ 0xee, 0x11, 0xe7, 0x30, 0x09, 0x83, 0x6c, 0xd9,
+ 0xf4, 0x7a, 0x7b, 0xb5, 0x1f, 0x6d, 0xc7, 0xbc,
+ 0xcb, 0x55, 0xea, 0x40, 0x58, 0x7a, 0x00, 0x00,
+ 0x90, 0x60, 0xc5, 0x64, 0x69, 0x05, 0x99, 0xd2,
+ 0x49, 0x62, 0x4f, 0xcb, 0x97, 0xdf, 0xdd, 0x6b,
+ 0x60, 0x75, 0xe2, 0xe0, 0x6f, 0x76, 0xd0, 0x37,
+ 0x67, 0x0a, 0xcf, 0xff, 0xc8, 0x61, 0x84, 0x14,
+ 0x80, 0x7c, 0x1d, 0x31, 0x8d, 0x90, 0xde, 0x0b,
+ 0x1c, 0x74, 0x9f, 0x82, 0x96, 0x80, 0xda, 0xaf,
+ 0x8d, 0x99, 0x86, 0x9f, 0x24, 0x99, 0x28, 0x3e,
+ 0xe0, 0xa3, 0xc3, 0x90, 0x2d, 0x14, 0x65, 0x1e,
+ 0x3b, 0xb9, 0xba, 0x13, 0xa5, 0x77, 0x73, 0x63,
+ 0x9a, 0x06, 0x3d, 0xa9, 0x28, 0x9b, 0xba, 0x25,
+ 0x61, 0xc9, 0xcd, 0xcf, 0x7a, 0x4d, 0x96, 0x09,
+ 0xcb, 0xca, 0x03, 0x9c, 0x54, 0x34, 0x31, 0x85,
+ 0xa0, 0x3d, 0xe5, 0xbc, 0xa5, 0x5f, 0x1b, 0xd3,
+ 0x10, 0x63, 0x74, 0x9d, 0x01, 0x92, 0x88, 0xf0,
+ 0x27, 0x9c, 0x28, 0xd9, 0xfd, 0xe2, 0x4e, 0x01,
+ 0x8d, 0x61, 0x79, 0x60, 0x61, 0x5b, 0x76, 0xab,
+ 0x06, 0xd3, 0x44, 0x87, 0x43, 0x52, 0xcd, 0x06,
+ 0x68, 0x1e, 0x2d, 0xc5, 0xb0, 0x07, 0x25, 0xdf,
+ 0x0a, 0x50, 0xd7, 0xd9, 0x08, 0x53, 0x65, 0xf1,
+ 0x0c, 0x2c, 0xde, 0x3f, 0x9d, 0x03, 0x1f, 0xe1,
+ 0x49, 0x43, 0x3c, 0x83, 0x81, 0x37, 0xf8, 0xa2,
+ 0x0b, 0xf9, 0x61, 0x1c, 0xc1, 0xdb, 0x79, 0xbc,
+ 0x64, 0xce, 0x06, 0x4e, 0x87, 0x89, 0x62, 0x73,
+ 0x51, 0xbc, 0xa4, 0x32, 0xd4, 0x18, 0x62, 0xab,
+ 0x65, 0x7e, 0xad, 0x1e, 0x91, 0xa3, 0xfa, 0x2d,
+ 0x58, 0x9e, 0x2a, 0xe9, 0x74, 0x44, 0x64, 0x11,
+ 0xe6, 0xb6, 0xb3, 0x00, 0x7e, 0xa3, 0x16, 0xef,
+ 0x72
+};
+static const u8 output74[] __initconst = {
+ 0xf5, 0xca, 0x45, 0x65, 0x50, 0x35, 0x47, 0x67,
+ 0x6f, 0x4f, 0x67, 0xff, 0x34, 0xd9, 0xc3, 0x37,
+ 0x2a, 0x26, 0xb0, 0x4f, 0x08, 0x1e, 0x45, 0x13,
+ 0xc7, 0x2c, 0x14, 0x75, 0x33, 0xd8, 0x8e, 0x1e,
+ 0x1b, 0x11, 0x0d, 0x97, 0x04, 0x33, 0x8a, 0xe4,
+ 0xd8, 0x8d, 0x0e, 0x12, 0x8d, 0xdb, 0x6e, 0x02,
+ 0xfa, 0xe5, 0xbd, 0x3a, 0xb5, 0x28, 0x07, 0x7d,
+ 0x20, 0xf0, 0x12, 0x64, 0x83, 0x2f, 0x59, 0x79,
+ 0x17, 0x88, 0x3c, 0x2d, 0x08, 0x2f, 0x55, 0xda,
+ 0xcc, 0x02, 0x3a, 0x82, 0xcd, 0x03, 0x94, 0xdf,
+ 0xdf, 0xab, 0x8a, 0x13, 0xf5, 0xe6, 0x74, 0xdf,
+ 0x7b, 0xe2, 0xab, 0x34, 0xbc, 0x00, 0x85, 0xbf,
+ 0x5a, 0x48, 0xc8, 0xff, 0x8d, 0x6c, 0x27, 0x48,
+ 0x19, 0x2d, 0x08, 0xfa, 0x82, 0x62, 0x39, 0x55,
+ 0x32, 0x11, 0xa8, 0xd7, 0xb9, 0x08, 0x2c, 0xd6,
+ 0x7a, 0xd9, 0x83, 0x9f, 0x9b, 0xfb, 0xec, 0x3a,
+ 0xd1, 0x08, 0xc7, 0xad, 0xdc, 0x98, 0x4c, 0xbc,
+ 0x98, 0xeb, 0x36, 0xb0, 0x39, 0xf4, 0x3a, 0xd6,
+ 0x53, 0x02, 0xa0, 0xa9, 0x73, 0xa1, 0xca, 0xef,
+ 0xd8, 0xd2, 0xec, 0x0e, 0xf8, 0xf5, 0xac, 0x8d,
+ 0x34, 0x41, 0x06, 0xa8, 0xc6, 0xc3, 0x31, 0xbc,
+ 0xe5, 0xcc, 0x7e, 0x72, 0x63, 0x59, 0x3e, 0x63,
+ 0xc2, 0x8d, 0x2b, 0xd5, 0xb9, 0xfd, 0x1e, 0x31,
+ 0x69, 0x32, 0x05, 0xd6, 0xde, 0xc9, 0xe6, 0x4c,
+ 0xac, 0x68, 0xf7, 0x1f, 0x9d, 0xcd, 0x0e, 0xa2,
+ 0x15, 0x3d, 0xd6, 0x47, 0x99, 0xab, 0x08, 0x5f,
+ 0x28, 0xc3, 0x4c, 0xc2, 0xd5, 0xdd, 0x10, 0xb7,
+ 0xbd, 0xdb, 0x9b, 0xcf, 0x85, 0x27, 0x29, 0x76,
+ 0x98, 0xeb, 0xad, 0x31, 0x64, 0xe7, 0xfb, 0x61,
+ 0xe0, 0xd8, 0x1a, 0xa6, 0xe2, 0xe7, 0x43, 0x42,
+ 0x77, 0xc9, 0x82, 0x00, 0xac, 0x85, 0xe0, 0xa2,
+ 0xd4, 0x62, 0xe3, 0xb7, 0x17, 0x6e, 0xb2, 0x9e,
+ 0x21, 0x58, 0x73, 0xa9, 0x53, 0x2d, 0x3c, 0xe1,
+ 0xdd, 0xd6, 0x6e, 0x92, 0xf2, 0x1d, 0xc2, 0x22,
+ 0x5f, 0x9a, 0x7e, 0xd0, 0x52, 0xbf, 0x54, 0x19,
+ 0xd7, 0x80, 0x63, 0x3e, 0xd0, 0x08, 0x2d, 0x37,
+ 0x0c, 0x15, 0xf7, 0xde, 0xab, 0x2b, 0xe3, 0x16,
+ 0x21, 0x3a, 0xee, 0xa5, 0xdc, 0xdf, 0xde, 0xa3,
+ 0x69, 0xcb, 0xfd, 0x92, 0x89, 0x75, 0xcf, 0xc9,
+ 0x8a, 0xa4, 0xc8, 0xdd, 0xcc, 0x21, 0xe6, 0xfe,
+ 0x9e, 0x43, 0x76, 0xb2, 0x45, 0x22, 0xb9, 0xb5,
+ 0xac, 0x7e, 0x3d, 0x26, 0xb0, 0x53, 0xc8, 0xab,
+ 0xfd, 0xea, 0x2c, 0xd1, 0x44, 0xc5, 0x60, 0x1b,
+ 0x8a, 0x99, 0x0d, 0xa5, 0x0e, 0x67, 0x6e, 0x3a,
+ 0x96, 0x55, 0xec, 0xe8, 0xcc, 0xbe, 0x49, 0xd9,
+ 0xf2, 0x72, 0x9f, 0x30, 0x21, 0x97, 0x57, 0x19,
+ 0xbe, 0x5e, 0x33, 0x0c, 0xee, 0xc0, 0x72, 0x0d,
+ 0x2e, 0xd1, 0xe1, 0x52, 0xc2, 0xea, 0x41, 0xbb,
+ 0xe1, 0x6d, 0xd4, 0x17, 0xa9, 0x8d, 0x89, 0xa9,
+ 0xd6, 0x4b, 0xc6, 0x4c, 0xf2, 0x88, 0x97, 0x54,
+ 0x3f, 0x4f, 0x57, 0xb7, 0x37, 0xf0, 0x2c, 0x11,
+ 0x15, 0x56, 0xdb, 0x28, 0xb5, 0x16, 0x84, 0x66,
+ 0xce, 0x45, 0x3f, 0x61, 0x75, 0xb6, 0xbe, 0x00,
+ 0xd1, 0xe4, 0xf5, 0x27, 0x54, 0x7f, 0xc2, 0xf1,
+ 0xb3, 0x32, 0x9a, 0xe8, 0x07, 0x02, 0xf3, 0xdb,
+ 0xa9, 0xd1, 0xc2, 0xdf, 0xee, 0xad, 0xe5, 0x8a,
+ 0x3c, 0xfa, 0x67, 0xec, 0x6b, 0xa4, 0x08, 0xfe,
+ 0xba, 0x5a, 0x58, 0x0b, 0x78, 0x11, 0x91, 0x76,
+ 0xe3, 0x1a, 0x28, 0x54, 0x5e, 0xbd, 0x71, 0x1b,
+ 0x8b, 0xdc, 0x6c, 0xf4, 0x6f, 0xd7, 0xf4, 0xf3,
+ 0xe1, 0x03, 0xa4, 0x3c, 0x8d, 0x91, 0x2e, 0xba,
+ 0x5f, 0x7f, 0x8c, 0xaf, 0x69, 0x89, 0x29, 0x0a,
+ 0x5b, 0x25, 0x13, 0xc4, 0x2e, 0x16, 0xc2, 0x15,
+ 0x07, 0x5d, 0x58, 0x33, 0x7c, 0xe0, 0xf0, 0x55,
+ 0x5f, 0xbf, 0x5e, 0xf0, 0x71, 0x48, 0x8f, 0xf7,
+ 0x48, 0xb3, 0xf7, 0x0d, 0xa1, 0xd0, 0x63, 0xb1,
+ 0xad, 0xae, 0xb5, 0xb0, 0x5f, 0x71, 0xaf, 0x24,
+ 0x8b, 0xb9, 0x1c, 0x44, 0xd2, 0x1a, 0x53, 0xd1,
+ 0xd5, 0xb4, 0xa9, 0xff, 0x88, 0x73, 0xb5, 0xaa,
+ 0x15, 0x32, 0x5f, 0x59, 0x9d, 0x2e, 0xb5, 0xcb,
+ 0xde, 0x21, 0x2e, 0xe9, 0x35, 0xed, 0xfd, 0x0f,
+ 0xb6, 0xbb, 0xe6, 0x4b, 0x16, 0xf1, 0x45, 0x1e,
+ 0xb4, 0x84, 0xe9, 0x58, 0x1c, 0x0c, 0x95, 0xc0,
+ 0xcf, 0x49, 0x8b, 0x59, 0xa1, 0x78, 0xe6, 0x80,
+ 0x12, 0x49, 0x7a, 0xd4, 0x66, 0x62, 0xdf, 0x9c,
+ 0x18, 0xc8, 0x8c, 0xda, 0xc1, 0xa6, 0xbc, 0x65,
+ 0x28, 0xd2, 0xa4, 0xe8, 0xf1, 0x35, 0xdb, 0x5a,
+ 0x75, 0x1f, 0x73, 0x60, 0xec, 0xa8, 0xda, 0x5a,
+ 0x43, 0x15, 0x83, 0x9b, 0xe7, 0xb1, 0xa6, 0x81,
+ 0xbb, 0xef, 0xf3, 0x8f, 0x0f, 0xd3, 0x79, 0xa2,
+ 0xe5, 0xaa, 0x42, 0xef, 0xa0, 0x13, 0x4e, 0x91,
+ 0x2d, 0xcb, 0x61, 0x7a, 0x9a, 0x33, 0x14, 0x50,
+ 0x77, 0x4a, 0xd0, 0x91, 0x48, 0xe0, 0x0c, 0xe0,
+ 0x11, 0xcb, 0xdf, 0xb0, 0xce, 0x06, 0xd2, 0x79,
+ 0x4d, 0x69, 0xb9, 0xc9, 0x36, 0x74, 0x8f, 0x81,
+ 0x72, 0x73, 0xf3, 0x17, 0xb7, 0x13, 0xcb, 0x5b,
+ 0xd2, 0x5c, 0x33, 0x61, 0xb7, 0x61, 0x79, 0xb0,
+ 0xc0, 0x4d, 0xa1, 0xc7, 0x5d, 0x98, 0xc9, 0xe1,
+ 0x98, 0xbd, 0x78, 0x5a, 0x2c, 0x64, 0x53, 0xaf,
+ 0xaf, 0x66, 0x51, 0x47, 0xe4, 0x48, 0x66, 0x8b,
+ 0x07, 0x52, 0xa3, 0x03, 0x93, 0x28, 0xad, 0xcc,
+ 0xa3, 0x86, 0xad, 0x63, 0x04, 0x35, 0x6c, 0x49,
+ 0xd5, 0x28, 0x0e, 0x00, 0x47, 0xf4, 0xd4, 0x32,
+ 0x27, 0x19, 0xb3, 0x29, 0xe7, 0xbc, 0xbb, 0xce,
+ 0x3e, 0x3e, 0xd5, 0x67, 0x20, 0xe4, 0x0b, 0x75,
+ 0x95, 0x24, 0xe0, 0x6c, 0xb6, 0x29, 0x0c, 0x14,
+ 0xfd
+};
+static const u8 key74[] __initconst = {
+ 0xf0, 0x41, 0x5b, 0x00, 0x56, 0xc4, 0xac, 0xf6,
+ 0xa2, 0x4c, 0x33, 0x41, 0x16, 0x09, 0x1b, 0x8e,
+ 0x4d, 0xe8, 0x8c, 0xd9, 0x48, 0xab, 0x3e, 0x60,
+ 0xcb, 0x49, 0x3e, 0xaf, 0x2b, 0x8b, 0xc8, 0xf0
+};
+enum { nonce74 = 0xcbdb0ffd0e923384ULL };
+
+static const struct chacha20_testvec chacha20_testvecs[] __initconst = {
+ { input01, output01, key01, nonce01, sizeof(input01) },
+ { input02, output02, key02, nonce02, sizeof(input02) },
+ { input03, output03, key03, nonce03, sizeof(input03) },
+ { input04, output04, key04, nonce04, sizeof(input04) },
+ { input05, output05, key05, nonce05, sizeof(input05) },
+ { input06, output06, key06, nonce06, sizeof(input06) },
+ { input07, output07, key07, nonce07, sizeof(input07) },
+ { input08, output08, key08, nonce08, sizeof(input08) },
+ { input09, output09, key09, nonce09, sizeof(input09) },
+ { input10, output10, key10, nonce10, sizeof(input10) },
+ { input11, output11, key11, nonce11, sizeof(input11) },
+ { input12, output12, key12, nonce12, sizeof(input12) },
+ { input13, output13, key13, nonce13, sizeof(input13) },
+ { input14, output14, key14, nonce14, sizeof(input14) },
+ { input15, output15, key15, nonce15, sizeof(input15) },
+ { input16, output16, key16, nonce16, sizeof(input16) },
+ { input17, output17, key17, nonce17, sizeof(input17) },
+ { input18, output18, key18, nonce18, sizeof(input18) },
+ { input19, output19, key19, nonce19, sizeof(input19) },
+ { input20, output20, key20, nonce20, sizeof(input20) },
+ { input21, output21, key21, nonce21, sizeof(input21) },
+ { input22, output22, key22, nonce22, sizeof(input22) },
+ { input23, output23, key23, nonce23, sizeof(input23) },
+ { input24, output24, key24, nonce24, sizeof(input24) },
+ { input25, output25, key25, nonce25, sizeof(input25) },
+ { input26, output26, key26, nonce26, sizeof(input26) },
+ { input27, output27, key27, nonce27, sizeof(input27) },
+ { input28, output28, key28, nonce28, sizeof(input28) },
+ { input29, output29, key29, nonce29, sizeof(input29) },
+ { input30, output30, key30, nonce30, sizeof(input30) },
+ { input31, output31, key31, nonce31, sizeof(input31) },
+ { input32, output32, key32, nonce32, sizeof(input32) },
+ { input33, output33, key33, nonce33, sizeof(input33) },
+ { input34, output34, key34, nonce34, sizeof(input34) },
+ { input35, output35, key35, nonce35, sizeof(input35) },
+ { input36, output36, key36, nonce36, sizeof(input36) },
+ { input37, output37, key37, nonce37, sizeof(input37) },
+ { input38, output38, key38, nonce38, sizeof(input38) },
+ { input39, output39, key39, nonce39, sizeof(input39) },
+ { input40, output40, key40, nonce40, sizeof(input40) },
+ { input41, output41, key41, nonce41, sizeof(input41) },
+ { input42, output42, key42, nonce42, sizeof(input42) },
+ { input43, output43, key43, nonce43, sizeof(input43) },
+ { input44, output44, key44, nonce44, sizeof(input44) },
+ { input45, output45, key45, nonce45, sizeof(input45) },
+ { input46, output46, key46, nonce46, sizeof(input46) },
+ { input47, output47, key47, nonce47, sizeof(input47) },
+ { input48, output48, key48, nonce48, sizeof(input48) },
+ { input49, output49, key49, nonce49, sizeof(input49) },
+ { input50, output50, key50, nonce50, sizeof(input50) },
+ { input51, output51, key51, nonce51, sizeof(input51) },
+ { input52, output52, key52, nonce52, sizeof(input52) },
+ { input53, output53, key53, nonce53, sizeof(input53) },
+ { input54, output54, key54, nonce54, sizeof(input54) },
+ { input55, output55, key55, nonce55, sizeof(input55) },
+ { input56, output56, key56, nonce56, sizeof(input56) },
+ { input57, output57, key57, nonce57, sizeof(input57) },
+ { input58, output58, key58, nonce58, sizeof(input58) },
+ { input59, output59, key59, nonce59, sizeof(input59) },
+ { input60, output60, key60, nonce60, sizeof(input60) },
+ { input61, output61, key61, nonce61, sizeof(input61) },
+ { input62, output62, key62, nonce62, sizeof(input62) },
+ { input63, output63, key63, nonce63, sizeof(input63) },
+ { input64, output64, key64, nonce64, sizeof(input64) },
+ { input65, output65, key65, nonce65, sizeof(input65) },
+ { input66, output66, key66, nonce66, sizeof(input66) },
+ { input67, output67, key67, nonce67, sizeof(input67) },
+ { input68, output68, key68, nonce68, sizeof(input68) },
+ { input69, output69, key69, nonce69, sizeof(input69) },
+ { input70, output70, key70, nonce70, sizeof(input70) },
+ { input71, output71, key71, nonce71, sizeof(input71) },
+ { input72, output72, key72, nonce72, sizeof(input72) },
+ { input73, output73, key73, nonce73, sizeof(input73) },
+ { input74, output74, key74, nonce74, sizeof(input74) }
+};
+
+static const struct hchacha20_testvec hchacha20_testvecs[] __initconst = {{
+ .key = { 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
+ 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f,
+ 0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17,
+ 0x18, 0x19, 0x1a, 0x1b, 0x1c, 0x1d, 0x1e, 0x1f },
+ .nonce = { 0x00, 0x00, 0x00, 0x09, 0x00, 0x00, 0x00, 0x4a,
+ 0x00, 0x00, 0x00, 0x00, 0x31, 0x41, 0x59, 0x27 },
+ .output = { 0x82, 0x41, 0x3b, 0x42, 0x27, 0xb2, 0x7b, 0xfe,
+ 0xd3, 0x0e, 0x42, 0x50, 0x8a, 0x87, 0x7d, 0x73,
+ 0xa0, 0xf9, 0xe4, 0xd5, 0x8a, 0x74, 0xa8, 0x53,
+ 0xc1, 0x2e, 0xc4, 0x13, 0x26, 0xd3, 0xec, 0xdc }
+}};
+
+static bool __init chacha20_selftest(void)
+{
+ enum {
+ MAXIMUM_TEST_BUFFER_LEN = 1UL << 10,
+ OUTRAGEOUSLY_HUGE_BUFFER_LEN = PAGE_SIZE * 35 + 17 /* 143k */
+ };
+ size_t i, j, k;
+ u32 derived_key[CHACHA20_KEY_WORDS];
+ u8 *offset_input = NULL, *computed_output = NULL, *massive_input = NULL;
+ u8 offset_key[CHACHA20_KEY_SIZE + 1]
+ __aligned(__alignof__(unsigned long));
+ struct chacha20_ctx state;
+ bool success = true;
+ simd_context_t simd_context;
+
+ offset_input = kmalloc(MAXIMUM_TEST_BUFFER_LEN + 1, GFP_KERNEL);
+ computed_output = kmalloc(MAXIMUM_TEST_BUFFER_LEN + 1, GFP_KERNEL);
+ massive_input = vzalloc(OUTRAGEOUSLY_HUGE_BUFFER_LEN);
+ if (!computed_output || !offset_input || !massive_input) {
+ pr_err("chacha20 self-test malloc: FAIL\n");
+ success = false;
+ goto out;
+ }
+
+ simd_get(&simd_context);
+ for (i = 0; i < ARRAY_SIZE(chacha20_testvecs); ++i) {
+ /* Boring case */
+ memset(computed_output, 0, MAXIMUM_TEST_BUFFER_LEN + 1);
+ memset(&state, 0, sizeof(state));
+ chacha20_init(&state, chacha20_testvecs[i].key,
+ chacha20_testvecs[i].nonce);
+ chacha20(&state, computed_output, chacha20_testvecs[i].input,
+ chacha20_testvecs[i].ilen, &simd_context);
+ if (memcmp(computed_output, chacha20_testvecs[i].output,
+ chacha20_testvecs[i].ilen)) {
+ pr_err("chacha20 self-test %zu: FAIL\n", i + 1);
+ success = false;
+ }
+ for (k = chacha20_testvecs[i].ilen;
+ k < MAXIMUM_TEST_BUFFER_LEN + 1; ++k) {
+ if (computed_output[k]) {
+ pr_err("chacha20 self-test %zu (zero check): FAIL\n",
+ i + 1);
+ success = false;
+ break;
+ }
+ }
+
+ /* Unaligned case */
+ memset(computed_output, 0, MAXIMUM_TEST_BUFFER_LEN + 1);
+ memset(&state, 0, sizeof(state));
+ memcpy(offset_input + 1, chacha20_testvecs[i].input,
+ chacha20_testvecs[i].ilen);
+ memcpy(offset_key + 1, chacha20_testvecs[i].key,
+ CHACHA20_KEY_SIZE);
+ chacha20_init(&state, offset_key + 1, chacha20_testvecs[i].nonce);
+ chacha20(&state, computed_output + 1, offset_input + 1,
+ chacha20_testvecs[i].ilen, &simd_context);
+ if (memcmp(computed_output + 1, chacha20_testvecs[i].output,
+ chacha20_testvecs[i].ilen)) {
+ pr_err("chacha20 self-test %zu (unaligned): FAIL\n",
+ i + 1);
+ success = false;
+ }
+ if (computed_output[0]) {
+ pr_err("chacha20 self-test %zu (unaligned, zero check): FAIL\n",
+ i + 1);
+ success = false;
+ }
+ for (k = chacha20_testvecs[i].ilen + 1;
+ k < MAXIMUM_TEST_BUFFER_LEN + 1; ++k) {
+ if (computed_output[k]) {
+ pr_err("chacha20 self-test %zu (unaligned, zero check): FAIL\n",
+ i + 1);
+ success = false;
+ break;
+ }
+ }
+
+ /* Chunked case */
+ if (chacha20_testvecs[i].ilen <= CHACHA20_BLOCK_SIZE)
+ goto next_test;
+ memset(computed_output, 0, MAXIMUM_TEST_BUFFER_LEN + 1);
+ memset(&state, 0, sizeof(state));
+ chacha20_init(&state, chacha20_testvecs[i].key,
+ chacha20_testvecs[i].nonce);
+ chacha20(&state, computed_output, chacha20_testvecs[i].input,
+ CHACHA20_BLOCK_SIZE, &simd_context);
+ chacha20(&state, computed_output + CHACHA20_BLOCK_SIZE,
+ chacha20_testvecs[i].input + CHACHA20_BLOCK_SIZE,
+ chacha20_testvecs[i].ilen - CHACHA20_BLOCK_SIZE,
+ &simd_context);
+ if (memcmp(computed_output, chacha20_testvecs[i].output,
+ chacha20_testvecs[i].ilen)) {
+ pr_err("chacha20 self-test %zu (chunked): FAIL\n",
+ i + 1);
+ success = false;
+ }
+ for (k = chacha20_testvecs[i].ilen;
+ k < MAXIMUM_TEST_BUFFER_LEN + 1; ++k) {
+ if (computed_output[k]) {
+ pr_err("chacha20 self-test %zu (chunked, zero check): FAIL\n",
+ i + 1);
+ success = false;
+ break;
+ }
+ }
+
+next_test:
+ /* Sliding unaligned case */
+ if (chacha20_testvecs[i].ilen > CHACHA20_BLOCK_SIZE + 1 ||
+ !chacha20_testvecs[i].ilen)
+ continue;
+ for (j = 1; j < CHACHA20_BLOCK_SIZE; ++j) {
+ memset(computed_output, 0, MAXIMUM_TEST_BUFFER_LEN + 1);
+ memset(&state, 0, sizeof(state));
+ memcpy(offset_input + j, chacha20_testvecs[i].input,
+ chacha20_testvecs[i].ilen);
+ chacha20_init(&state, chacha20_testvecs[i].key,
+ chacha20_testvecs[i].nonce);
+ chacha20(&state, computed_output + j, offset_input + j,
+ chacha20_testvecs[i].ilen, &simd_context);
+ if (memcmp(computed_output + j,
+ chacha20_testvecs[i].output,
+ chacha20_testvecs[i].ilen)) {
+ pr_err("chacha20 self-test %zu (unaligned, slide %zu): FAIL\n",
+ i + 1, j);
+ success = false;
+ }
+ for (k = 0; k < j; ++k) {
+ if (computed_output[k]) {
+ pr_err("chacha20 self-test %zu (unaligned, slide %zu, zero check): FAIL\n",
+ i + 1, j);
+ success = false;
+ break;
+ }
+ }
+ for (k = chacha20_testvecs[i].ilen + j;
+ k < MAXIMUM_TEST_BUFFER_LEN + 1; ++k) {
+ if (computed_output[k]) {
+ pr_err("chacha20 self-test %zu (unaligned, slide %zu, zero check): FAIL\n",
+ i + 1, j);
+ success = false;
+ break;
+ }
+ }
+ }
+ }
+ for (i = 0; i < ARRAY_SIZE(hchacha20_testvecs); ++i) {
+ memset(&derived_key, 0, sizeof(derived_key));
+ hchacha20(derived_key, hchacha20_testvecs[i].nonce,
+ hchacha20_testvecs[i].key, &simd_context);
+ cpu_to_le32_array(derived_key, ARRAY_SIZE(derived_key));
+ if (memcmp(derived_key, hchacha20_testvecs[i].output,
+ CHACHA20_KEY_SIZE)) {
+ pr_err("hchacha20 self-test %zu: FAIL\n", i + 1);
+ success = false;
+ }
+ }
+ memset(&state, 0, sizeof(state));
+ chacha20_init(&state, chacha20_testvecs[0].key,
+ chacha20_testvecs[0].nonce);
+ chacha20(&state, massive_input, massive_input,
+ OUTRAGEOUSLY_HUGE_BUFFER_LEN, &simd_context);
+ chacha20_init(&state, chacha20_testvecs[0].key,
+ chacha20_testvecs[0].nonce);
+ chacha20(&state, massive_input, massive_input,
+ OUTRAGEOUSLY_HUGE_BUFFER_LEN, DONT_USE_SIMD);
+ for (k = 0; k < OUTRAGEOUSLY_HUGE_BUFFER_LEN; ++k) {
+ if (massive_input[k]) {
+ pr_err("chacha20 self-test massive: FAIL\n");
+ success = false;
+ break;
+ }
+ }
+
+ simd_put(&simd_context);
+
+out:
+ kfree(offset_input);
+ kfree(computed_output);
+ vfree(massive_input);
+ return success;
+}
diff --git a/lib/zinc/selftest/run.h b/lib/zinc/selftest/run.h
new file mode 100644
index 000000000000..4cbafe2b2565
--- /dev/null
+++ b/lib/zinc/selftest/run.h
@@ -0,0 +1,49 @@
+/* SPDX-License-Identifier: GPL-2.0 OR MIT */
+/*
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#ifndef _ZINC_SELFTEST_RUN_H
+#define _ZINC_SELFTEST_RUN_H
+
+#include <linux/kernel.h>
+#include <linux/printk.h>
+#include <linux/bug.h>
+
+static inline bool selftest_run(const char *name, bool (*selftest)(void),
+ bool *const nobs[], unsigned int nobs_len)
+{
+ unsigned long subset = 0, set = 0;
+ unsigned int i;
+ bool ret = true;
+
+ BUILD_BUG_ON(!__builtin_constant_p(nobs_len) ||
+ nobs_len >= BITS_PER_LONG);
+
+ if (!IS_ENABLED(CONFIG_ZINC_SELFTEST))
+ return true;
+
+ for (i = 0; i < nobs_len; ++i)
+ set |= ((unsigned long)*nobs[i]) << i;
+
+ do {
+ for (i = 0; i < nobs_len; ++i)
+ *nobs[i] = (subset >> i) & 1;
+ if (!selftest()) {
+ pr_err("%s self-test combo 0x%lx: FAIL\n", name,
+ subset);
+ ret = false;
+ }
+ subset = (subset - set) & set;
+ } while (subset);
+
+ for (i = 0; i < nobs_len; ++i)
+ *nobs[i] = (set >> i) & 1;
+
+ if (ret)
+ pr_info("%s self-tests: pass\n", name);
+
+ return !WARN_ON(!ret);
+}
+
+#endif
--
2.19.0
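The subset walk in selftest_run() above relies on the update
`subset = (subset - set) & set`, which visits every submask of `set` in
increasing order and wraps back to zero after the full mask. The following
is a minimal standalone sketch of that enumeration only. The name
enumerate_submasks() and its output-array interface are hypothetical and
are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only: record every submask of
 * `set` in the order the selftest_run() loop would visit them. Returns
 * the number of entries written, which is at most `cap`.
 */
size_t enumerate_submasks(unsigned long set, unsigned long *out, size_t cap)
{
	unsigned long subset = 0;
	size_t n = 0;

	assert(out != NULL);

	do {
		if (n == cap)
			break;
		out[n++] = subset;
		/* Step to the next larger submask; yields 0 after `set`. */
		subset = (subset - set) & set;
	} while (subset);

	return n;
}
```

For a mask with two bits set, such as 0x5, the walk visits 0x0, 0x1, 0x4
and 0x5, so a self-test guarded by two feature flags would run four times,
once per combination of flags.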
* [PATCH net-next v7 05/28] zinc: import Andy Polyakov's ChaCha20 x86_64 implementation
From: Jason A. Donenfeld @ 2018-10-06 2:56 UTC (permalink / raw)
To: linux-kernel, netdev, davem, gregkh
Cc: Jason A. Donenfeld, Andy Polyakov, Thomas Gleixner, Ingo Molnar,
x86, Samuel Neves, Jean-Philippe Aumasson, Andy Lutomirski,
Andrew Morton, Linus Torvalds, kernel-hardening, linux-crypto
These x86_64 vectorized implementations come from Andy Polyakov's
CRYPTOGAMS implementation and are included here in raw form, without
modification, so that the subsequent commits that adapt them for the
kernel show exactly what has changed.
While this is CRYPTOGAMS code, the originating code happens to be the
same as that in OpenSSL commit cded951378069a478391843f5f8653c1eb5128da.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Based-on-code-from: Andy Polyakov <appro@openssl.org>
Cc: Andy Polyakov <appro@openssl.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: x86@kernel.org
Cc: Samuel Neves <sneves@dei.uc.pt>
Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: kernel-hardening@lists.openwall.com
Cc: linux-crypto@vger.kernel.org
---
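Reading aid, not part of the commit: the scalar .Loop path in the assembly
below performs ten iterations, and each iteration applies the ChaCha
quarter round (rotations by 16, 12, 8 and 7) first to the four columns and
then to the four diagonals of the 16-word state. The following is a plain
C sketch of that structure for reference. The function names are
hypothetical and do not correspond to any symbol in this patch or in
CRYPTOGAMS.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit word left by n bits; n must be in 1..31. */
static inline uint32_t rotl32(uint32_t v, unsigned int n)
{
	assert(n > 0 && n < 32);
	return (v << n) | (v >> (32 - n));
}

/* One ChaCha quarter round on words a, b, c, d of the 16-word state x. */
void chacha_quarter_round(uint32_t x[16], int a, int b, int c, int d)
{
	x[a] += x[b]; x[d] = rotl32(x[d] ^ x[a], 16);
	x[c] += x[d]; x[b] = rotl32(x[b] ^ x[c], 12);
	x[a] += x[b]; x[d] = rotl32(x[d] ^ x[a], 8);
	x[c] += x[d]; x[b] = rotl32(x[b] ^ x[c], 7);
}

/* One double round: four column rounds, then four diagonal rounds. */
void chacha_double_round(uint32_t x[16])
{
	chacha_quarter_round(x, 0, 4,  8, 12);
	chacha_quarter_round(x, 1, 5,  9, 13);
	chacha_quarter_round(x, 2, 6, 10, 14);
	chacha_quarter_round(x, 3, 7, 11, 15);

	chacha_quarter_round(x, 0, 5, 10, 15);
	chacha_quarter_round(x, 1, 6, 11, 12);
	chacha_quarter_round(x, 2, 7,  8, 13);
	chacha_quarter_round(x, 3, 4,  9, 14);
}
```

ChaCha20 runs chacha_double_round() ten times and then adds the original
input state word by word, which corresponds to the addl/paddd sequence
that follows .Loop in the scalar path.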
.../chacha20/chacha20-x86_64-cryptogams.S | 3433 +++++++++++++++++
1 file changed, 3433 insertions(+)
create mode 100644 lib/zinc/chacha20/chacha20-x86_64-cryptogams.S
diff --git a/lib/zinc/chacha20/chacha20-x86_64-cryptogams.S b/lib/zinc/chacha20/chacha20-x86_64-cryptogams.S
new file mode 100644
index 000000000000..2bfc76f7e01f
--- /dev/null
+++ b/lib/zinc/chacha20/chacha20-x86_64-cryptogams.S
@@ -0,0 +1,3433 @@
+/* SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause */
+/*
+ * Copyright (C) 2006-2017 CRYPTOGAMS by <appro@openssl.org>. All Rights Reserved.
+ */
+
+.text
+
+
+
+.align 64
+.Lzero:
+.long 0,0,0,0
+.Lone:
+.long 1,0,0,0
+.Linc:
+.long 0,1,2,3
+.Lfour:
+.long 4,4,4,4
+.Lincy:
+.long 0,2,4,6,1,3,5,7
+.Leight:
+.long 8,8,8,8,8,8,8,8
+.Lrot16:
+.byte 0x2,0x3,0x0,0x1, 0x6,0x7,0x4,0x5, 0xa,0xb,0x8,0x9, 0xe,0xf,0xc,0xd
+.Lrot24:
+.byte 0x3,0x0,0x1,0x2, 0x7,0x4,0x5,0x6, 0xb,0x8,0x9,0xa, 0xf,0xc,0xd,0xe
+.Ltwoy:
+.long 2,0,0,0, 2,0,0,0
+.align 64
+.Lzeroz:
+.long 0,0,0,0, 1,0,0,0, 2,0,0,0, 3,0,0,0
+.Lfourz:
+.long 4,0,0,0, 4,0,0,0, 4,0,0,0, 4,0,0,0
+.Lincz:
+.long 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
+.Lsixteen:
+.long 16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16
+.Lsigma:
+.byte 101,120,112,97,110,100,32,51,50,45,98,121,116,101,32,107,0
+.byte 67,104,97,67,104,97,50,48,32,102,111,114,32,120,56,54,95,54,52,44,32,67,82,89,80,84,79,71,65,77,83,32,98,121,32,60,97,112,112,114,111,64,111,112,101,110,115,115,108,46,111,114,103,62,0
+.globl ChaCha20_ctr32
+.type ChaCha20_ctr32,@function
+.align 64
+ChaCha20_ctr32:
+.cfi_startproc
+ cmpq $0,%rdx
+ je .Lno_data
+ movq OPENSSL_ia32cap_P+4(%rip),%r10
+ btq $48,%r10
+ jc .LChaCha20_avx512
+ testq %r10,%r10
+ js .LChaCha20_avx512vl
+ testl $512,%r10d
+ jnz .LChaCha20_ssse3
+
+ pushq %rbx
+.cfi_adjust_cfa_offset 8
+.cfi_offset %rbx,-16
+ pushq %rbp
+.cfi_adjust_cfa_offset 8
+.cfi_offset %rbp,-24
+ pushq %r12
+.cfi_adjust_cfa_offset 8
+.cfi_offset %r12,-32
+ pushq %r13
+.cfi_adjust_cfa_offset 8
+.cfi_offset %r13,-40
+ pushq %r14
+.cfi_adjust_cfa_offset 8
+.cfi_offset %r14,-48
+ pushq %r15
+.cfi_adjust_cfa_offset 8
+.cfi_offset %r15,-56
+ subq $64+24,%rsp
+.cfi_adjust_cfa_offset 64+24
+.Lctr32_body:
+
+
+ movdqu (%rcx),%xmm1
+ movdqu 16(%rcx),%xmm2
+ movdqu (%r8),%xmm3
+ movdqa .Lone(%rip),%xmm4
+
+
+ movdqa %xmm1,16(%rsp)
+ movdqa %xmm2,32(%rsp)
+ movdqa %xmm3,48(%rsp)
+ movq %rdx,%rbp
+ jmp .Loop_outer
+
+.align 32
+.Loop_outer:
+ movl $0x61707865,%eax
+ movl $0x3320646e,%ebx
+ movl $0x79622d32,%ecx
+ movl $0x6b206574,%edx
+ movl 16(%rsp),%r8d
+ movl 20(%rsp),%r9d
+ movl 24(%rsp),%r10d
+ movl 28(%rsp),%r11d
+ movd %xmm3,%r12d
+ movl 52(%rsp),%r13d
+ movl 56(%rsp),%r14d
+ movl 60(%rsp),%r15d
+
+ movq %rbp,64+0(%rsp)
+ movl $10,%ebp
+ movq %rsi,64+8(%rsp)
+.byte 102,72,15,126,214
+ movq %rdi,64+16(%rsp)
+ movq %rsi,%rdi
+ shrq $32,%rdi
+ jmp .Loop
+
+.align 32
+.Loop:
+ addl %r8d,%eax
+ xorl %eax,%r12d
+ roll $16,%r12d
+ addl %r9d,%ebx
+ xorl %ebx,%r13d
+ roll $16,%r13d
+ addl %r12d,%esi
+ xorl %esi,%r8d
+ roll $12,%r8d
+ addl %r13d,%edi
+ xorl %edi,%r9d
+ roll $12,%r9d
+ addl %r8d,%eax
+ xorl %eax,%r12d
+ roll $8,%r12d
+ addl %r9d,%ebx
+ xorl %ebx,%r13d
+ roll $8,%r13d
+ addl %r12d,%esi
+ xorl %esi,%r8d
+ roll $7,%r8d
+ addl %r13d,%edi
+ xorl %edi,%r9d
+ roll $7,%r9d
+ movl %esi,32(%rsp)
+ movl %edi,36(%rsp)
+ movl 40(%rsp),%esi
+ movl 44(%rsp),%edi
+ addl %r10d,%ecx
+ xorl %ecx,%r14d
+ roll $16,%r14d
+ addl %r11d,%edx
+ xorl %edx,%r15d
+ roll $16,%r15d
+ addl %r14d,%esi
+ xorl %esi,%r10d
+ roll $12,%r10d
+ addl %r15d,%edi
+ xorl %edi,%r11d
+ roll $12,%r11d
+ addl %r10d,%ecx
+ xorl %ecx,%r14d
+ roll $8,%r14d
+ addl %r11d,%edx
+ xorl %edx,%r15d
+ roll $8,%r15d
+ addl %r14d,%esi
+ xorl %esi,%r10d
+ roll $7,%r10d
+ addl %r15d,%edi
+ xorl %edi,%r11d
+ roll $7,%r11d
+ addl %r9d,%eax
+ xorl %eax,%r15d
+ roll $16,%r15d
+ addl %r10d,%ebx
+ xorl %ebx,%r12d
+ roll $16,%r12d
+ addl %r15d,%esi
+ xorl %esi,%r9d
+ roll $12,%r9d
+ addl %r12d,%edi
+ xorl %edi,%r10d
+ roll $12,%r10d
+ addl %r9d,%eax
+ xorl %eax,%r15d
+ roll $8,%r15d
+ addl %r10d,%ebx
+ xorl %ebx,%r12d
+ roll $8,%r12d
+ addl %r15d,%esi
+ xorl %esi,%r9d
+ roll $7,%r9d
+ addl %r12d,%edi
+ xorl %edi,%r10d
+ roll $7,%r10d
+ movl %esi,40(%rsp)
+ movl %edi,44(%rsp)
+ movl 32(%rsp),%esi
+ movl 36(%rsp),%edi
+ addl %r11d,%ecx
+ xorl %ecx,%r13d
+ roll $16,%r13d
+ addl %r8d,%edx
+ xorl %edx,%r14d
+ roll $16,%r14d
+ addl %r13d,%esi
+ xorl %esi,%r11d
+ roll $12,%r11d
+ addl %r14d,%edi
+ xorl %edi,%r8d
+ roll $12,%r8d
+ addl %r11d,%ecx
+ xorl %ecx,%r13d
+ roll $8,%r13d
+ addl %r8d,%edx
+ xorl %edx,%r14d
+ roll $8,%r14d
+ addl %r13d,%esi
+ xorl %esi,%r11d
+ roll $7,%r11d
+ addl %r14d,%edi
+ xorl %edi,%r8d
+ roll $7,%r8d
+ decl %ebp
+ jnz .Loop
+ movl %edi,36(%rsp)
+ movl %esi,32(%rsp)
+ movq 64(%rsp),%rbp
+ movdqa %xmm2,%xmm1
+ movq 64+8(%rsp),%rsi
+ paddd %xmm4,%xmm3
+ movq 64+16(%rsp),%rdi
+
+ addl $0x61707865,%eax
+ addl $0x3320646e,%ebx
+ addl $0x79622d32,%ecx
+ addl $0x6b206574,%edx
+ addl 16(%rsp),%r8d
+ addl 20(%rsp),%r9d
+ addl 24(%rsp),%r10d
+ addl 28(%rsp),%r11d
+ addl 48(%rsp),%r12d
+ addl 52(%rsp),%r13d
+ addl 56(%rsp),%r14d
+ addl 60(%rsp),%r15d
+ paddd 32(%rsp),%xmm1
+
+ cmpq $64,%rbp
+ jb .Ltail
+
+ xorl 0(%rsi),%eax
+ xorl 4(%rsi),%ebx
+ xorl 8(%rsi),%ecx
+ xorl 12(%rsi),%edx
+ xorl 16(%rsi),%r8d
+ xorl 20(%rsi),%r9d
+ xorl 24(%rsi),%r10d
+ xorl 28(%rsi),%r11d
+ movdqu 32(%rsi),%xmm0
+ xorl 48(%rsi),%r12d
+ xorl 52(%rsi),%r13d
+ xorl 56(%rsi),%r14d
+ xorl 60(%rsi),%r15d
+ leaq 64(%rsi),%rsi
+ pxor %xmm1,%xmm0
+
+ movdqa %xmm2,32(%rsp)
+ movd %xmm3,48(%rsp)
+
+ movl %eax,0(%rdi)
+ movl %ebx,4(%rdi)
+ movl %ecx,8(%rdi)
+ movl %edx,12(%rdi)
+ movl %r8d,16(%rdi)
+ movl %r9d,20(%rdi)
+ movl %r10d,24(%rdi)
+ movl %r11d,28(%rdi)
+ movdqu %xmm0,32(%rdi)
+ movl %r12d,48(%rdi)
+ movl %r13d,52(%rdi)
+ movl %r14d,56(%rdi)
+ movl %r15d,60(%rdi)
+ leaq 64(%rdi),%rdi
+
+ subq $64,%rbp
+ jnz .Loop_outer
+
+ jmp .Ldone
+
+.align 16
+.Ltail:
+ movl %eax,0(%rsp)
+ movl %ebx,4(%rsp)
+ xorq %rbx,%rbx
+ movl %ecx,8(%rsp)
+ movl %edx,12(%rsp)
+ movl %r8d,16(%rsp)
+ movl %r9d,20(%rsp)
+ movl %r10d,24(%rsp)
+ movl %r11d,28(%rsp)
+ movdqa %xmm1,32(%rsp)
+ movl %r12d,48(%rsp)
+ movl %r13d,52(%rsp)
+ movl %r14d,56(%rsp)
+ movl %r15d,60(%rsp)
+
+.Loop_tail:
+ movzbl (%rsi,%rbx,1),%eax
+ movzbl (%rsp,%rbx,1),%edx
+ leaq 1(%rbx),%rbx
+ xorl %edx,%eax
+ movb %al,-1(%rdi,%rbx,1)
+ decq %rbp
+ jnz .Loop_tail
+
+.Ldone:
+ leaq 64+24+48(%rsp),%rsi
+.cfi_def_cfa %rsi,8
+ movq -48(%rsi),%r15
+.cfi_restore %r15
+ movq -40(%rsi),%r14
+.cfi_restore %r14
+ movq -32(%rsi),%r13
+.cfi_restore %r13
+ movq -24(%rsi),%r12
+.cfi_restore %r12
+ movq -16(%rsi),%rbp
+.cfi_restore %rbp
+ movq -8(%rsi),%rbx
+.cfi_restore %rbx
+ leaq (%rsi),%rsp
+.cfi_def_cfa_register %rsp
+.Lno_data:
+ .byte 0xf3,0xc3
+.cfi_endproc
+.size ChaCha20_ctr32,.-ChaCha20_ctr32
+.type ChaCha20_ssse3,@function
+.align 32
+ChaCha20_ssse3:
+.cfi_startproc
+.LChaCha20_ssse3:
+ movq %rsp,%r9
+.cfi_def_cfa_register %r9
+ testl $2048,%r10d
+ jnz .LChaCha20_4xop
+ cmpq $128,%rdx
+ je .LChaCha20_128
+ ja .LChaCha20_4x
+
+.Ldo_sse3_after_all:
+ subq $64+8,%rsp
+ movdqa .Lsigma(%rip),%xmm0
+ movdqu (%rcx),%xmm1
+ movdqu 16(%rcx),%xmm2
+ movdqu (%r8),%xmm3
+ movdqa .Lrot16(%rip),%xmm6
+ movdqa .Lrot24(%rip),%xmm7
+
+ movdqa %xmm0,0(%rsp)
+ movdqa %xmm1,16(%rsp)
+ movdqa %xmm2,32(%rsp)
+ movdqa %xmm3,48(%rsp)
+ movq $10,%r8
+ jmp .Loop_ssse3
+
+.align 32
+.Loop_outer_ssse3:
+ movdqa .Lone(%rip),%xmm3
+ movdqa 0(%rsp),%xmm0
+ movdqa 16(%rsp),%xmm1
+ movdqa 32(%rsp),%xmm2
+ paddd 48(%rsp),%xmm3
+ movq $10,%r8
+ movdqa %xmm3,48(%rsp)
+ jmp .Loop_ssse3
+
+.align 32
+.Loop_ssse3:
+ paddd %xmm1,%xmm0
+ pxor %xmm0,%xmm3
+.byte 102,15,56,0,222
+ paddd %xmm3,%xmm2
+ pxor %xmm2,%xmm1
+ movdqa %xmm1,%xmm4
+ psrld $20,%xmm1
+ pslld $12,%xmm4
+ por %xmm4,%xmm1
+ paddd %xmm1,%xmm0
+ pxor %xmm0,%xmm3
+.byte 102,15,56,0,223
+ paddd %xmm3,%xmm2
+ pxor %xmm2,%xmm1
+ movdqa %xmm1,%xmm4
+ psrld $25,%xmm1
+ pslld $7,%xmm4
+ por %xmm4,%xmm1
+ pshufd $78,%xmm2,%xmm2
+ pshufd $57,%xmm1,%xmm1
+ pshufd $147,%xmm3,%xmm3
+ nop
+ paddd %xmm1,%xmm0
+ pxor %xmm0,%xmm3
+.byte 102,15,56,0,222
+ paddd %xmm3,%xmm2
+ pxor %xmm2,%xmm1
+ movdqa %xmm1,%xmm4
+ psrld $20,%xmm1
+ pslld $12,%xmm4
+ por %xmm4,%xmm1
+ paddd %xmm1,%xmm0
+ pxor %xmm0,%xmm3
+.byte 102,15,56,0,223
+ paddd %xmm3,%xmm2
+ pxor %xmm2,%xmm1
+ movdqa %xmm1,%xmm4
+ psrld $25,%xmm1
+ pslld $7,%xmm4
+ por %xmm4,%xmm1
+ pshufd $78,%xmm2,%xmm2
+ pshufd $147,%xmm1,%xmm1
+ pshufd $57,%xmm3,%xmm3
+ decq %r8
+ jnz .Loop_ssse3
+ paddd 0(%rsp),%xmm0
+ paddd 16(%rsp),%xmm1
+ paddd 32(%rsp),%xmm2
+ paddd 48(%rsp),%xmm3
+
+ cmpq $64,%rdx
+ jb .Ltail_ssse3
+
+ movdqu 0(%rsi),%xmm4
+ movdqu 16(%rsi),%xmm5
+ pxor %xmm4,%xmm0
+ movdqu 32(%rsi),%xmm4
+ pxor %xmm5,%xmm1
+ movdqu 48(%rsi),%xmm5
+ leaq 64(%rsi),%rsi
+ pxor %xmm4,%xmm2
+ pxor %xmm5,%xmm3
+
+ movdqu %xmm0,0(%rdi)
+ movdqu %xmm1,16(%rdi)
+ movdqu %xmm2,32(%rdi)
+ movdqu %xmm3,48(%rdi)
+ leaq 64(%rdi),%rdi
+
+ subq $64,%rdx
+ jnz .Loop_outer_ssse3
+
+ jmp .Ldone_ssse3
+
+.align 16
+.Ltail_ssse3:
+ movdqa %xmm0,0(%rsp)
+ movdqa %xmm1,16(%rsp)
+ movdqa %xmm2,32(%rsp)
+ movdqa %xmm3,48(%rsp)
+ xorq %r8,%r8
+
+.Loop_tail_ssse3:
+ movzbl (%rsi,%r8,1),%eax
+ movzbl (%rsp,%r8,1),%ecx
+ leaq 1(%r8),%r8
+ xorl %ecx,%eax
+ movb %al,-1(%rdi,%r8,1)
+ decq %rdx
+ jnz .Loop_tail_ssse3
+
+.Ldone_ssse3:
+ leaq (%r9),%rsp
+.cfi_def_cfa_register %rsp
+.Lssse3_epilogue:
+ .byte 0xf3,0xc3
+.cfi_endproc
+.size ChaCha20_ssse3,.-ChaCha20_ssse3
+.type ChaCha20_128,@function
+.align 32
+ChaCha20_128:
+.cfi_startproc
+.LChaCha20_128:
+ movq %rsp,%r9
+.cfi_def_cfa_register %r9
+ subq $64+8,%rsp
+ movdqa .Lsigma(%rip),%xmm8
+ movdqu (%rcx),%xmm9
+ movdqu 16(%rcx),%xmm2
+ movdqu (%r8),%xmm3
+ movdqa .Lone(%rip),%xmm1
+ movdqa .Lrot16(%rip),%xmm6
+ movdqa .Lrot24(%rip),%xmm7
+
+ movdqa %xmm8,%xmm10
+ movdqa %xmm8,0(%rsp)
+ movdqa %xmm9,%xmm11
+ movdqa %xmm9,16(%rsp)
+ movdqa %xmm2,%xmm0
+ movdqa %xmm2,32(%rsp)
+ paddd %xmm3,%xmm1
+ movdqa %xmm3,48(%rsp)
+ movq $10,%r8
+ jmp .Loop_128
+
+.align 32
+.Loop_128:
+ paddd %xmm9,%xmm8
+ pxor %xmm8,%xmm3
+ paddd %xmm11,%xmm10
+ pxor %xmm10,%xmm1
+.byte 102,15,56,0,222
+.byte 102,15,56,0,206
+ paddd %xmm3,%xmm2
+ paddd %xmm1,%xmm0
+ pxor %xmm2,%xmm9
+ pxor %xmm0,%xmm11
+ movdqa %xmm9,%xmm4
+ psrld $20,%xmm9
+ movdqa %xmm11,%xmm5
+ pslld $12,%xmm4
+ psrld $20,%xmm11
+ por %xmm4,%xmm9
+ pslld $12,%xmm5
+ por %xmm5,%xmm11
+ paddd %xmm9,%xmm8
+ pxor %xmm8,%xmm3
+ paddd %xmm11,%xmm10
+ pxor %xmm10,%xmm1
+.byte 102,15,56,0,223
+.byte 102,15,56,0,207
+ paddd %xmm3,%xmm2
+ paddd %xmm1,%xmm0
+ pxor %xmm2,%xmm9
+ pxor %xmm0,%xmm11
+ movdqa %xmm9,%xmm4
+ psrld $25,%xmm9
+ movdqa %xmm11,%xmm5
+ pslld $7,%xmm4
+ psrld $25,%xmm11
+ por %xmm4,%xmm9
+ pslld $7,%xmm5
+ por %xmm5,%xmm11
+ pshufd $78,%xmm2,%xmm2
+ pshufd $57,%xmm9,%xmm9
+ pshufd $147,%xmm3,%xmm3
+ pshufd $78,%xmm0,%xmm0
+ pshufd $57,%xmm11,%xmm11
+ pshufd $147,%xmm1,%xmm1
+ paddd %xmm9,%xmm8
+ pxor %xmm8,%xmm3
+ paddd %xmm11,%xmm10
+ pxor %xmm10,%xmm1
+.byte 102,15,56,0,222
+.byte 102,15,56,0,206
+ paddd %xmm3,%xmm2
+ paddd %xmm1,%xmm0
+ pxor %xmm2,%xmm9
+ pxor %xmm0,%xmm11
+ movdqa %xmm9,%xmm4
+ psrld $20,%xmm9
+ movdqa %xmm11,%xmm5
+ pslld $12,%xmm4
+ psrld $20,%xmm11
+ por %xmm4,%xmm9
+ pslld $12,%xmm5
+ por %xmm5,%xmm11
+ paddd %xmm9,%xmm8
+ pxor %xmm8,%xmm3
+ paddd %xmm11,%xmm10
+ pxor %xmm10,%xmm1
+.byte 102,15,56,0,223
+.byte 102,15,56,0,207
+ paddd %xmm3,%xmm2
+ paddd %xmm1,%xmm0
+ pxor %xmm2,%xmm9
+ pxor %xmm0,%xmm11
+ movdqa %xmm9,%xmm4
+ psrld $25,%xmm9
+ movdqa %xmm11,%xmm5
+ pslld $7,%xmm4
+ psrld $25,%xmm11
+ por %xmm4,%xmm9
+ pslld $7,%xmm5
+ por %xmm5,%xmm11
+ pshufd $78,%xmm2,%xmm2
+ pshufd $147,%xmm9,%xmm9
+ pshufd $57,%xmm3,%xmm3
+ pshufd $78,%xmm0,%xmm0
+ pshufd $147,%xmm11,%xmm11
+ pshufd $57,%xmm1,%xmm1
+ decq %r8
+ jnz .Loop_128
+ paddd 0(%rsp),%xmm8
+ paddd 16(%rsp),%xmm9
+ paddd 32(%rsp),%xmm2
+ paddd 48(%rsp),%xmm3
+ paddd .Lone(%rip),%xmm1
+ paddd 0(%rsp),%xmm10
+ paddd 16(%rsp),%xmm11
+ paddd 32(%rsp),%xmm0
+ paddd 48(%rsp),%xmm1
+
+ movdqu 0(%rsi),%xmm4
+ movdqu 16(%rsi),%xmm5
+ pxor %xmm4,%xmm8
+ movdqu 32(%rsi),%xmm4
+ pxor %xmm5,%xmm9
+ movdqu 48(%rsi),%xmm5
+ pxor %xmm4,%xmm2
+ movdqu 64(%rsi),%xmm4
+ pxor %xmm5,%xmm3
+ movdqu 80(%rsi),%xmm5
+ pxor %xmm4,%xmm10
+ movdqu 96(%rsi),%xmm4
+ pxor %xmm5,%xmm11
+ movdqu 112(%rsi),%xmm5
+ pxor %xmm4,%xmm0
+ pxor %xmm5,%xmm1
+
+ movdqu %xmm8,0(%rdi)
+ movdqu %xmm9,16(%rdi)
+ movdqu %xmm2,32(%rdi)
+ movdqu %xmm3,48(%rdi)
+ movdqu %xmm10,64(%rdi)
+ movdqu %xmm11,80(%rdi)
+ movdqu %xmm0,96(%rdi)
+ movdqu %xmm1,112(%rdi)
+ leaq (%r9),%rsp
+.cfi_def_cfa_register %rsp
+.L128_epilogue:
+ .byte 0xf3,0xc3
+.cfi_endproc
+.size ChaCha20_128,.-ChaCha20_128
+.type ChaCha20_4x,@function
+.align 32
+ChaCha20_4x:
+.cfi_startproc
+.LChaCha20_4x:
+ movq %rsp,%r9
+.cfi_def_cfa_register %r9
+ movq %r10,%r11
+ shrq $32,%r10
+ testq $32,%r10
+ jnz .LChaCha20_8x
+ cmpq $192,%rdx
+ ja .Lproceed4x
+
+ andq $71303168,%r11
+ cmpq $4194304,%r11
+ je .Ldo_sse3_after_all
+
+.Lproceed4x:
+ subq $0x140+8,%rsp
+ movdqa .Lsigma(%rip),%xmm11
+ movdqu (%rcx),%xmm15
+ movdqu 16(%rcx),%xmm7
+ movdqu (%r8),%xmm3
+ leaq 256(%rsp),%rcx
+ leaq .Lrot16(%rip),%r10
+ leaq .Lrot24(%rip),%r11
+
+ pshufd $0x00,%xmm11,%xmm8
+ pshufd $0x55,%xmm11,%xmm9
+ movdqa %xmm8,64(%rsp)
+ pshufd $0xaa,%xmm11,%xmm10
+ movdqa %xmm9,80(%rsp)
+ pshufd $0xff,%xmm11,%xmm11
+ movdqa %xmm10,96(%rsp)
+ movdqa %xmm11,112(%rsp)
+
+ pshufd $0x00,%xmm15,%xmm12
+ pshufd $0x55,%xmm15,%xmm13
+ movdqa %xmm12,128-256(%rcx)
+ pshufd $0xaa,%xmm15,%xmm14
+ movdqa %xmm13,144-256(%rcx)
+ pshufd $0xff,%xmm15,%xmm15
+ movdqa %xmm14,160-256(%rcx)
+ movdqa %xmm15,176-256(%rcx)
+
+ pshufd $0x00,%xmm7,%xmm4
+ pshufd $0x55,%xmm7,%xmm5
+ movdqa %xmm4,192-256(%rcx)
+ pshufd $0xaa,%xmm7,%xmm6
+ movdqa %xmm5,208-256(%rcx)
+ pshufd $0xff,%xmm7,%xmm7
+ movdqa %xmm6,224-256(%rcx)
+ movdqa %xmm7,240-256(%rcx)
+
+ pshufd $0x00,%xmm3,%xmm0
+ pshufd $0x55,%xmm3,%xmm1
+ paddd .Linc(%rip),%xmm0
+ pshufd $0xaa,%xmm3,%xmm2
+ movdqa %xmm1,272-256(%rcx)
+ pshufd $0xff,%xmm3,%xmm3
+ movdqa %xmm2,288-256(%rcx)
+ movdqa %xmm3,304-256(%rcx)
+
+ jmp .Loop_enter4x
+
+.align 32
+.Loop_outer4x:
+ movdqa 64(%rsp),%xmm8
+ movdqa 80(%rsp),%xmm9
+ movdqa 96(%rsp),%xmm10
+ movdqa 112(%rsp),%xmm11
+ movdqa 128-256(%rcx),%xmm12
+ movdqa 144-256(%rcx),%xmm13
+ movdqa 160-256(%rcx),%xmm14
+ movdqa 176-256(%rcx),%xmm15
+ movdqa 192-256(%rcx),%xmm4
+ movdqa 208-256(%rcx),%xmm5
+ movdqa 224-256(%rcx),%xmm6
+ movdqa 240-256(%rcx),%xmm7
+ movdqa 256-256(%rcx),%xmm0
+ movdqa 272-256(%rcx),%xmm1
+ movdqa 288-256(%rcx),%xmm2
+ movdqa 304-256(%rcx),%xmm3
+ paddd .Lfour(%rip),%xmm0
+
+.Loop_enter4x:
+ movdqa %xmm6,32(%rsp)
+ movdqa %xmm7,48(%rsp)
+ movdqa (%r10),%xmm7
+ movl $10,%eax
+ movdqa %xmm0,256-256(%rcx)
+ jmp .Loop4x
+
+.align 32
+.Loop4x:
+ paddd %xmm12,%xmm8
+ paddd %xmm13,%xmm9
+ pxor %xmm8,%xmm0
+ pxor %xmm9,%xmm1
+.byte 102,15,56,0,199
+.byte 102,15,56,0,207
+ paddd %xmm0,%xmm4
+ paddd %xmm1,%xmm5
+ pxor %xmm4,%xmm12
+ pxor %xmm5,%xmm13
+ movdqa %xmm12,%xmm6
+ pslld $12,%xmm12
+ psrld $20,%xmm6
+ movdqa %xmm13,%xmm7
+ pslld $12,%xmm13
+ por %xmm6,%xmm12
+ psrld $20,%xmm7
+ movdqa (%r11),%xmm6
+ por %xmm7,%xmm13
+ paddd %xmm12,%xmm8
+ paddd %xmm13,%xmm9
+ pxor %xmm8,%xmm0
+ pxor %xmm9,%xmm1
+.byte 102,15,56,0,198
+.byte 102,15,56,0,206
+ paddd %xmm0,%xmm4
+ paddd %xmm1,%xmm5
+ pxor %xmm4,%xmm12
+ pxor %xmm5,%xmm13
+ movdqa %xmm12,%xmm7
+ pslld $7,%xmm12
+ psrld $25,%xmm7
+ movdqa %xmm13,%xmm6
+ pslld $7,%xmm13
+ por %xmm7,%xmm12
+ psrld $25,%xmm6
+ movdqa (%r10),%xmm7
+ por %xmm6,%xmm13
+ movdqa %xmm4,0(%rsp)
+ movdqa %xmm5,16(%rsp)
+ movdqa 32(%rsp),%xmm4
+ movdqa 48(%rsp),%xmm5
+ paddd %xmm14,%xmm10
+ paddd %xmm15,%xmm11
+ pxor %xmm10,%xmm2
+ pxor %xmm11,%xmm3
+.byte 102,15,56,0,215
+.byte 102,15,56,0,223
+ paddd %xmm2,%xmm4
+ paddd %xmm3,%xmm5
+ pxor %xmm4,%xmm14
+ pxor %xmm5,%xmm15
+ movdqa %xmm14,%xmm6
+ pslld $12,%xmm14
+ psrld $20,%xmm6
+ movdqa %xmm15,%xmm7
+ pslld $12,%xmm15
+ por %xmm6,%xmm14
+ psrld $20,%xmm7
+ movdqa (%r11),%xmm6
+ por %xmm7,%xmm15
+ paddd %xmm14,%xmm10
+ paddd %xmm15,%xmm11
+ pxor %xmm10,%xmm2
+ pxor %xmm11,%xmm3
+.byte 102,15,56,0,214
+.byte 102,15,56,0,222
+ paddd %xmm2,%xmm4
+ paddd %xmm3,%xmm5
+ pxor %xmm4,%xmm14
+ pxor %xmm5,%xmm15
+ movdqa %xmm14,%xmm7
+ pslld $7,%xmm14
+ psrld $25,%xmm7
+ movdqa %xmm15,%xmm6
+ pslld $7,%xmm15
+ por %xmm7,%xmm14
+ psrld $25,%xmm6
+ movdqa (%r10),%xmm7
+ por %xmm6,%xmm15
+ paddd %xmm13,%xmm8
+ paddd %xmm14,%xmm9
+ pxor %xmm8,%xmm3
+ pxor %xmm9,%xmm0
+.byte 102,15,56,0,223
+.byte 102,15,56,0,199
+ paddd %xmm3,%xmm4
+ paddd %xmm0,%xmm5
+ pxor %xmm4,%xmm13
+ pxor %xmm5,%xmm14
+ movdqa %xmm13,%xmm6
+ pslld $12,%xmm13
+ psrld $20,%xmm6
+ movdqa %xmm14,%xmm7
+ pslld $12,%xmm14
+ por %xmm6,%xmm13
+ psrld $20,%xmm7
+ movdqa (%r11),%xmm6
+ por %xmm7,%xmm14
+ paddd %xmm13,%xmm8
+ paddd %xmm14,%xmm9
+ pxor %xmm8,%xmm3
+ pxor %xmm9,%xmm0
+.byte 102,15,56,0,222
+.byte 102,15,56,0,198
+ paddd %xmm3,%xmm4
+ paddd %xmm0,%xmm5
+ pxor %xmm4,%xmm13
+ pxor %xmm5,%xmm14
+ movdqa %xmm13,%xmm7
+ pslld $7,%xmm13
+ psrld $25,%xmm7
+ movdqa %xmm14,%xmm6
+ pslld $7,%xmm14
+ por %xmm7,%xmm13
+ psrld $25,%xmm6
+ movdqa (%r10),%xmm7
+ por %xmm6,%xmm14
+ movdqa %xmm4,32(%rsp)
+ movdqa %xmm5,48(%rsp)
+ movdqa 0(%rsp),%xmm4
+ movdqa 16(%rsp),%xmm5
+ paddd %xmm15,%xmm10
+ paddd %xmm12,%xmm11
+ pxor %xmm10,%xmm1
+ pxor %xmm11,%xmm2
+.byte 102,15,56,0,207
+.byte 102,15,56,0,215
+ paddd %xmm1,%xmm4
+ paddd %xmm2,%xmm5
+ pxor %xmm4,%xmm15
+ pxor %xmm5,%xmm12
+ movdqa %xmm15,%xmm6
+ pslld $12,%xmm15
+ psrld $20,%xmm6
+ movdqa %xmm12,%xmm7
+ pslld $12,%xmm12
+ por %xmm6,%xmm15
+ psrld $20,%xmm7
+ movdqa (%r11),%xmm6
+ por %xmm7,%xmm12
+ paddd %xmm15,%xmm10
+ paddd %xmm12,%xmm11
+ pxor %xmm10,%xmm1
+ pxor %xmm11,%xmm2
+.byte 102,15,56,0,206
+.byte 102,15,56,0,214
+ paddd %xmm1,%xmm4
+ paddd %xmm2,%xmm5
+ pxor %xmm4,%xmm15
+ pxor %xmm5,%xmm12
+ movdqa %xmm15,%xmm7
+ pslld $7,%xmm15
+ psrld $25,%xmm7
+ movdqa %xmm12,%xmm6
+ pslld $7,%xmm12
+ por %xmm7,%xmm15
+ psrld $25,%xmm6
+ movdqa (%r10),%xmm7
+ por %xmm6,%xmm12
+ decl %eax
+ jnz .Loop4x
+
+ paddd 64(%rsp),%xmm8
+ paddd 80(%rsp),%xmm9
+ paddd 96(%rsp),%xmm10
+ paddd 112(%rsp),%xmm11
+
+ movdqa %xmm8,%xmm6
+ punpckldq %xmm9,%xmm8
+ movdqa %xmm10,%xmm7
+ punpckldq %xmm11,%xmm10
+ punpckhdq %xmm9,%xmm6
+ punpckhdq %xmm11,%xmm7
+ movdqa %xmm8,%xmm9
+ punpcklqdq %xmm10,%xmm8
+ movdqa %xmm6,%xmm11
+ punpcklqdq %xmm7,%xmm6
+ punpckhqdq %xmm10,%xmm9
+ punpckhqdq %xmm7,%xmm11
+ paddd 128-256(%rcx),%xmm12
+ paddd 144-256(%rcx),%xmm13
+ paddd 160-256(%rcx),%xmm14
+ paddd 176-256(%rcx),%xmm15
+
+ movdqa %xmm8,0(%rsp)
+ movdqa %xmm9,16(%rsp)
+ movdqa 32(%rsp),%xmm8
+ movdqa 48(%rsp),%xmm9
+
+ movdqa %xmm12,%xmm10
+ punpckldq %xmm13,%xmm12
+ movdqa %xmm14,%xmm7
+ punpckldq %xmm15,%xmm14
+ punpckhdq %xmm13,%xmm10
+ punpckhdq %xmm15,%xmm7
+ movdqa %xmm12,%xmm13
+ punpcklqdq %xmm14,%xmm12
+ movdqa %xmm10,%xmm15
+ punpcklqdq %xmm7,%xmm10
+ punpckhqdq %xmm14,%xmm13
+ punpckhqdq %xmm7,%xmm15
+ paddd 192-256(%rcx),%xmm4
+ paddd 208-256(%rcx),%xmm5
+ paddd 224-256(%rcx),%xmm8
+ paddd 240-256(%rcx),%xmm9
+
+ movdqa %xmm6,32(%rsp)
+ movdqa %xmm11,48(%rsp)
+
+ movdqa %xmm4,%xmm14
+ punpckldq %xmm5,%xmm4
+ movdqa %xmm8,%xmm7
+ punpckldq %xmm9,%xmm8
+ punpckhdq %xmm5,%xmm14
+ punpckhdq %xmm9,%xmm7
+ movdqa %xmm4,%xmm5
+ punpcklqdq %xmm8,%xmm4
+ movdqa %xmm14,%xmm9
+ punpcklqdq %xmm7,%xmm14
+ punpckhqdq %xmm8,%xmm5
+ punpckhqdq %xmm7,%xmm9
+ paddd 256-256(%rcx),%xmm0
+ paddd 272-256(%rcx),%xmm1
+ paddd 288-256(%rcx),%xmm2
+ paddd 304-256(%rcx),%xmm3
+
+ movdqa %xmm0,%xmm8
+ punpckldq %xmm1,%xmm0
+ movdqa %xmm2,%xmm7
+ punpckldq %xmm3,%xmm2
+ punpckhdq %xmm1,%xmm8
+ punpckhdq %xmm3,%xmm7
+ movdqa %xmm0,%xmm1
+ punpcklqdq %xmm2,%xmm0
+ movdqa %xmm8,%xmm3
+ punpcklqdq %xmm7,%xmm8
+ punpckhqdq %xmm2,%xmm1
+ punpckhqdq %xmm7,%xmm3
+ cmpq $256,%rdx
+ jb .Ltail4x
+
+ movdqu 0(%rsi),%xmm6
+ movdqu 16(%rsi),%xmm11
+ movdqu 32(%rsi),%xmm2
+ movdqu 48(%rsi),%xmm7
+ pxor 0(%rsp),%xmm6
+ pxor %xmm12,%xmm11
+ pxor %xmm4,%xmm2
+ pxor %xmm0,%xmm7
+
+ movdqu %xmm6,0(%rdi)
+ movdqu 64(%rsi),%xmm6
+ movdqu %xmm11,16(%rdi)
+ movdqu 80(%rsi),%xmm11
+ movdqu %xmm2,32(%rdi)
+ movdqu 96(%rsi),%xmm2
+ movdqu %xmm7,48(%rdi)
+ movdqu 112(%rsi),%xmm7
+ leaq 128(%rsi),%rsi
+ pxor 16(%rsp),%xmm6
+ pxor %xmm13,%xmm11
+ pxor %xmm5,%xmm2
+ pxor %xmm1,%xmm7
+
+ movdqu %xmm6,64(%rdi)
+ movdqu 0(%rsi),%xmm6
+ movdqu %xmm11,80(%rdi)
+ movdqu 16(%rsi),%xmm11
+ movdqu %xmm2,96(%rdi)
+ movdqu 32(%rsi),%xmm2
+ movdqu %xmm7,112(%rdi)
+ leaq 128(%rdi),%rdi
+ movdqu 48(%rsi),%xmm7
+ pxor 32(%rsp),%xmm6
+ pxor %xmm10,%xmm11
+ pxor %xmm14,%xmm2
+ pxor %xmm8,%xmm7
+
+ movdqu %xmm6,0(%rdi)
+ movdqu 64(%rsi),%xmm6
+ movdqu %xmm11,16(%rdi)
+ movdqu 80(%rsi),%xmm11
+ movdqu %xmm2,32(%rdi)
+ movdqu 96(%rsi),%xmm2
+ movdqu %xmm7,48(%rdi)
+ movdqu 112(%rsi),%xmm7
+ leaq 128(%rsi),%rsi
+ pxor 48(%rsp),%xmm6
+ pxor %xmm15,%xmm11
+ pxor %xmm9,%xmm2
+ pxor %xmm3,%xmm7
+ movdqu %xmm6,64(%rdi)
+ movdqu %xmm11,80(%rdi)
+ movdqu %xmm2,96(%rdi)
+ movdqu %xmm7,112(%rdi)
+ leaq 128(%rdi),%rdi
+
+ subq $256,%rdx
+ jnz .Loop_outer4x
+
+ jmp .Ldone4x
+
+.Ltail4x:
+ cmpq $192,%rdx
+ jae .L192_or_more4x
+ cmpq $128,%rdx
+ jae .L128_or_more4x
+ cmpq $64,%rdx
+ jae .L64_or_more4x
+
+
+ xorq %r10,%r10
+
+ movdqa %xmm12,16(%rsp)
+ movdqa %xmm4,32(%rsp)
+ movdqa %xmm0,48(%rsp)
+ jmp .Loop_tail4x
+
+.align 32
+.L64_or_more4x:
+ movdqu 0(%rsi),%xmm6
+ movdqu 16(%rsi),%xmm11
+ movdqu 32(%rsi),%xmm2
+ movdqu 48(%rsi),%xmm7
+ pxor 0(%rsp),%xmm6
+ pxor %xmm12,%xmm11
+ pxor %xmm4,%xmm2
+ pxor %xmm0,%xmm7
+ movdqu %xmm6,0(%rdi)
+ movdqu %xmm11,16(%rdi)
+ movdqu %xmm2,32(%rdi)
+ movdqu %xmm7,48(%rdi)
+ je .Ldone4x
+
+ movdqa 16(%rsp),%xmm6
+ leaq 64(%rsi),%rsi
+ xorq %r10,%r10
+ movdqa %xmm6,0(%rsp)
+ movdqa %xmm13,16(%rsp)
+ leaq 64(%rdi),%rdi
+ movdqa %xmm5,32(%rsp)
+ subq $64,%rdx
+ movdqa %xmm1,48(%rsp)
+ jmp .Loop_tail4x
+
+.align 32
+.L128_or_more4x:
+ movdqu 0(%rsi),%xmm6
+ movdqu 16(%rsi),%xmm11
+ movdqu 32(%rsi),%xmm2
+ movdqu 48(%rsi),%xmm7
+ pxor 0(%rsp),%xmm6
+ pxor %xmm12,%xmm11
+ pxor %xmm4,%xmm2
+ pxor %xmm0,%xmm7
+
+ movdqu %xmm6,0(%rdi)
+ movdqu 64(%rsi),%xmm6
+ movdqu %xmm11,16(%rdi)
+ movdqu 80(%rsi),%xmm11
+ movdqu %xmm2,32(%rdi)
+ movdqu 96(%rsi),%xmm2
+ movdqu %xmm7,48(%rdi)
+ movdqu 112(%rsi),%xmm7
+ pxor 16(%rsp),%xmm6
+ pxor %xmm13,%xmm11
+ pxor %xmm5,%xmm2
+ pxor %xmm1,%xmm7
+ movdqu %xmm6,64(%rdi)
+ movdqu %xmm11,80(%rdi)
+ movdqu %xmm2,96(%rdi)
+ movdqu %xmm7,112(%rdi)
+ je .Ldone4x
+
+ movdqa 32(%rsp),%xmm6
+ leaq 128(%rsi),%rsi
+ xorq %r10,%r10
+ movdqa %xmm6,0(%rsp)
+ movdqa %xmm10,16(%rsp)
+ leaq 128(%rdi),%rdi
+ movdqa %xmm14,32(%rsp)
+ subq $128,%rdx
+ movdqa %xmm8,48(%rsp)
+ jmp .Loop_tail4x
+
+.align 32
+.L192_or_more4x:
+ movdqu 0(%rsi),%xmm6
+ movdqu 16(%rsi),%xmm11
+ movdqu 32(%rsi),%xmm2
+ movdqu 48(%rsi),%xmm7
+ pxor 0(%rsp),%xmm6
+ pxor %xmm12,%xmm11
+ pxor %xmm4,%xmm2
+ pxor %xmm0,%xmm7
+
+ movdqu %xmm6,0(%rdi)
+ movdqu 64(%rsi),%xmm6
+ movdqu %xmm11,16(%rdi)
+ movdqu 80(%rsi),%xmm11
+ movdqu %xmm2,32(%rdi)
+ movdqu 96(%rsi),%xmm2
+ movdqu %xmm7,48(%rdi)
+ movdqu 112(%rsi),%xmm7
+ leaq 128(%rsi),%rsi
+ pxor 16(%rsp),%xmm6
+ pxor %xmm13,%xmm11
+ pxor %xmm5,%xmm2
+ pxor %xmm1,%xmm7
+
+ movdqu %xmm6,64(%rdi)
+ movdqu 0(%rsi),%xmm6
+ movdqu %xmm11,80(%rdi)
+ movdqu 16(%rsi),%xmm11
+ movdqu %xmm2,96(%rdi)
+ movdqu 32(%rsi),%xmm2
+ movdqu %xmm7,112(%rdi)
+ leaq 128(%rdi),%rdi
+ movdqu 48(%rsi),%xmm7
+ pxor 32(%rsp),%xmm6
+ pxor %xmm10,%xmm11
+ pxor %xmm14,%xmm2
+ pxor %xmm8,%xmm7
+ movdqu %xmm6,0(%rdi)
+ movdqu %xmm11,16(%rdi)
+ movdqu %xmm2,32(%rdi)
+ movdqu %xmm7,48(%rdi)
+ je .Ldone4x
+
+ movdqa 48(%rsp),%xmm6
+ leaq 64(%rsi),%rsi
+ xorq %r10,%r10
+ movdqa %xmm6,0(%rsp)
+ movdqa %xmm15,16(%rsp)
+ leaq 64(%rdi),%rdi
+ movdqa %xmm9,32(%rsp)
+ subq $192,%rdx
+ movdqa %xmm3,48(%rsp)
+
+.Loop_tail4x:
+ movzbl (%rsi,%r10,1),%eax
+ movzbl (%rsp,%r10,1),%ecx
+ leaq 1(%r10),%r10
+ xorl %ecx,%eax
+ movb %al,-1(%rdi,%r10,1)
+ decq %rdx
+ jnz .Loop_tail4x
+
+.Ldone4x:
+ leaq (%r9),%rsp
+.cfi_def_cfa_register %rsp
+.L4x_epilogue:
+ .byte 0xf3,0xc3
+.cfi_endproc
+.size ChaCha20_4x,.-ChaCha20_4x
+.type ChaCha20_4xop,@function
+.align 32
+ChaCha20_4xop:
+.cfi_startproc
+.LChaCha20_4xop:
+ movq %rsp,%r9
+.cfi_def_cfa_register %r9
+ subq $0x140+8,%rsp
+ vzeroupper
+
+ vmovdqa .Lsigma(%rip),%xmm11
+ vmovdqu (%rcx),%xmm3
+ vmovdqu 16(%rcx),%xmm15
+ vmovdqu (%r8),%xmm7
+ leaq 256(%rsp),%rcx
+
+ vpshufd $0x00,%xmm11,%xmm8
+ vpshufd $0x55,%xmm11,%xmm9
+ vmovdqa %xmm8,64(%rsp)
+ vpshufd $0xaa,%xmm11,%xmm10
+ vmovdqa %xmm9,80(%rsp)
+ vpshufd $0xff,%xmm11,%xmm11
+ vmovdqa %xmm10,96(%rsp)
+ vmovdqa %xmm11,112(%rsp)
+
+ vpshufd $0x00,%xmm3,%xmm0
+ vpshufd $0x55,%xmm3,%xmm1
+ vmovdqa %xmm0,128-256(%rcx)
+ vpshufd $0xaa,%xmm3,%xmm2
+ vmovdqa %xmm1,144-256(%rcx)
+ vpshufd $0xff,%xmm3,%xmm3
+ vmovdqa %xmm2,160-256(%rcx)
+ vmovdqa %xmm3,176-256(%rcx)
+
+ vpshufd $0x00,%xmm15,%xmm12
+ vpshufd $0x55,%xmm15,%xmm13
+ vmovdqa %xmm12,192-256(%rcx)
+ vpshufd $0xaa,%xmm15,%xmm14
+ vmovdqa %xmm13,208-256(%rcx)
+ vpshufd $0xff,%xmm15,%xmm15
+ vmovdqa %xmm14,224-256(%rcx)
+ vmovdqa %xmm15,240-256(%rcx)
+
+ vpshufd $0x00,%xmm7,%xmm4
+ vpshufd $0x55,%xmm7,%xmm5
+ vpaddd .Linc(%rip),%xmm4,%xmm4
+ vpshufd $0xaa,%xmm7,%xmm6
+ vmovdqa %xmm5,272-256(%rcx)
+ vpshufd $0xff,%xmm7,%xmm7
+ vmovdqa %xmm6,288-256(%rcx)
+ vmovdqa %xmm7,304-256(%rcx)
+
+ jmp .Loop_enter4xop
+
+.align 32
+.Loop_outer4xop:
+ vmovdqa 64(%rsp),%xmm8
+ vmovdqa 80(%rsp),%xmm9
+ vmovdqa 96(%rsp),%xmm10
+ vmovdqa 112(%rsp),%xmm11
+ vmovdqa 128-256(%rcx),%xmm0
+ vmovdqa 144-256(%rcx),%xmm1
+ vmovdqa 160-256(%rcx),%xmm2
+ vmovdqa 176-256(%rcx),%xmm3
+ vmovdqa 192-256(%rcx),%xmm12
+ vmovdqa 208-256(%rcx),%xmm13
+ vmovdqa 224-256(%rcx),%xmm14
+ vmovdqa 240-256(%rcx),%xmm15
+ vmovdqa 256-256(%rcx),%xmm4
+ vmovdqa 272-256(%rcx),%xmm5
+ vmovdqa 288-256(%rcx),%xmm6
+ vmovdqa 304-256(%rcx),%xmm7
+ vpaddd .Lfour(%rip),%xmm4,%xmm4
+
+.Loop_enter4xop:
+ movl $10,%eax
+ vmovdqa %xmm4,256-256(%rcx)
+ jmp .Loop4xop
+
+.align 32
+.Loop4xop:
+ vpaddd %xmm0,%xmm8,%xmm8
+ vpaddd %xmm1,%xmm9,%xmm9
+ vpaddd %xmm2,%xmm10,%xmm10
+ vpaddd %xmm3,%xmm11,%xmm11
+ vpxor %xmm4,%xmm8,%xmm4
+ vpxor %xmm5,%xmm9,%xmm5
+ vpxor %xmm6,%xmm10,%xmm6
+ vpxor %xmm7,%xmm11,%xmm7
+.byte 143,232,120,194,228,16
+.byte 143,232,120,194,237,16
+.byte 143,232,120,194,246,16
+.byte 143,232,120,194,255,16
+ vpaddd %xmm4,%xmm12,%xmm12
+ vpaddd %xmm5,%xmm13,%xmm13
+ vpaddd %xmm6,%xmm14,%xmm14
+ vpaddd %xmm7,%xmm15,%xmm15
+ vpxor %xmm0,%xmm12,%xmm0
+ vpxor %xmm1,%xmm13,%xmm1
+ vpxor %xmm14,%xmm2,%xmm2
+ vpxor %xmm15,%xmm3,%xmm3
+.byte 143,232,120,194,192,12
+.byte 143,232,120,194,201,12
+.byte 143,232,120,194,210,12
+.byte 143,232,120,194,219,12
+ vpaddd %xmm8,%xmm0,%xmm8
+ vpaddd %xmm9,%xmm1,%xmm9
+ vpaddd %xmm2,%xmm10,%xmm10
+ vpaddd %xmm3,%xmm11,%xmm11
+ vpxor %xmm4,%xmm8,%xmm4
+ vpxor %xmm5,%xmm9,%xmm5
+ vpxor %xmm6,%xmm10,%xmm6
+ vpxor %xmm7,%xmm11,%xmm7
+.byte 143,232,120,194,228,8
+.byte 143,232,120,194,237,8
+.byte 143,232,120,194,246,8
+.byte 143,232,120,194,255,8
+ vpaddd %xmm4,%xmm12,%xmm12
+ vpaddd %xmm5,%xmm13,%xmm13
+ vpaddd %xmm6,%xmm14,%xmm14
+ vpaddd %xmm7,%xmm15,%xmm15
+ vpxor %xmm0,%xmm12,%xmm0
+ vpxor %xmm1,%xmm13,%xmm1
+ vpxor %xmm14,%xmm2,%xmm2
+ vpxor %xmm15,%xmm3,%xmm3
+.byte 143,232,120,194,192,7
+.byte 143,232,120,194,201,7
+.byte 143,232,120,194,210,7
+.byte 143,232,120,194,219,7
+ vpaddd %xmm1,%xmm8,%xmm8
+ vpaddd %xmm2,%xmm9,%xmm9
+ vpaddd %xmm3,%xmm10,%xmm10
+ vpaddd %xmm0,%xmm11,%xmm11
+ vpxor %xmm7,%xmm8,%xmm7
+ vpxor %xmm4,%xmm9,%xmm4
+ vpxor %xmm5,%xmm10,%xmm5
+ vpxor %xmm6,%xmm11,%xmm6
+.byte 143,232,120,194,255,16
+.byte 143,232,120,194,228,16
+.byte 143,232,120,194,237,16
+.byte 143,232,120,194,246,16
+ vpaddd %xmm7,%xmm14,%xmm14
+ vpaddd %xmm4,%xmm15,%xmm15
+ vpaddd %xmm5,%xmm12,%xmm12
+ vpaddd %xmm6,%xmm13,%xmm13
+ vpxor %xmm1,%xmm14,%xmm1
+ vpxor %xmm2,%xmm15,%xmm2
+ vpxor %xmm12,%xmm3,%xmm3
+ vpxor %xmm13,%xmm0,%xmm0
+.byte 143,232,120,194,201,12
+.byte 143,232,120,194,210,12
+.byte 143,232,120,194,219,12
+.byte 143,232,120,194,192,12
+ vpaddd %xmm8,%xmm1,%xmm8
+ vpaddd %xmm9,%xmm2,%xmm9
+ vpaddd %xmm3,%xmm10,%xmm10
+ vpaddd %xmm0,%xmm11,%xmm11
+ vpxor %xmm7,%xmm8,%xmm7
+ vpxor %xmm4,%xmm9,%xmm4
+ vpxor %xmm5,%xmm10,%xmm5
+ vpxor %xmm6,%xmm11,%xmm6
+.byte 143,232,120,194,255,8
+.byte 143,232,120,194,228,8
+.byte 143,232,120,194,237,8
+.byte 143,232,120,194,246,8
+ vpaddd %xmm7,%xmm14,%xmm14
+ vpaddd %xmm4,%xmm15,%xmm15
+ vpaddd %xmm5,%xmm12,%xmm12
+ vpaddd %xmm6,%xmm13,%xmm13
+ vpxor %xmm1,%xmm14,%xmm1
+ vpxor %xmm2,%xmm15,%xmm2
+ vpxor %xmm12,%xmm3,%xmm3
+ vpxor %xmm13,%xmm0,%xmm0
+.byte 143,232,120,194,201,7
+.byte 143,232,120,194,210,7
+.byte 143,232,120,194,219,7
+.byte 143,232,120,194,192,7
+ decl %eax
+ jnz .Loop4xop
+
+ vpaddd 64(%rsp),%xmm8,%xmm8
+ vpaddd 80(%rsp),%xmm9,%xmm9
+ vpaddd 96(%rsp),%xmm10,%xmm10
+ vpaddd 112(%rsp),%xmm11,%xmm11
+
+ vmovdqa %xmm14,32(%rsp)
+ vmovdqa %xmm15,48(%rsp)
+
+ vpunpckldq %xmm9,%xmm8,%xmm14
+ vpunpckldq %xmm11,%xmm10,%xmm15
+ vpunpckhdq %xmm9,%xmm8,%xmm8
+ vpunpckhdq %xmm11,%xmm10,%xmm10
+ vpunpcklqdq %xmm15,%xmm14,%xmm9
+ vpunpckhqdq %xmm15,%xmm14,%xmm14
+ vpunpcklqdq %xmm10,%xmm8,%xmm11
+ vpunpckhqdq %xmm10,%xmm8,%xmm8
+ vpaddd 128-256(%rcx),%xmm0,%xmm0
+ vpaddd 144-256(%rcx),%xmm1,%xmm1
+ vpaddd 160-256(%rcx),%xmm2,%xmm2
+ vpaddd 176-256(%rcx),%xmm3,%xmm3
+
+ vmovdqa %xmm9,0(%rsp)
+ vmovdqa %xmm14,16(%rsp)
+ vmovdqa 32(%rsp),%xmm9
+ vmovdqa 48(%rsp),%xmm14
+
+ vpunpckldq %xmm1,%xmm0,%xmm10
+ vpunpckldq %xmm3,%xmm2,%xmm15
+ vpunpckhdq %xmm1,%xmm0,%xmm0
+ vpunpckhdq %xmm3,%xmm2,%xmm2
+ vpunpcklqdq %xmm15,%xmm10,%xmm1
+ vpunpckhqdq %xmm15,%xmm10,%xmm10
+ vpunpcklqdq %xmm2,%xmm0,%xmm3
+ vpunpckhqdq %xmm2,%xmm0,%xmm0
+ vpaddd 192-256(%rcx),%xmm12,%xmm12
+ vpaddd 208-256(%rcx),%xmm13,%xmm13
+ vpaddd 224-256(%rcx),%xmm9,%xmm9
+ vpaddd 240-256(%rcx),%xmm14,%xmm14
+
+ vpunpckldq %xmm13,%xmm12,%xmm2
+ vpunpckldq %xmm14,%xmm9,%xmm15
+ vpunpckhdq %xmm13,%xmm12,%xmm12
+ vpunpckhdq %xmm14,%xmm9,%xmm9
+ vpunpcklqdq %xmm15,%xmm2,%xmm13
+ vpunpckhqdq %xmm15,%xmm2,%xmm2
+ vpunpcklqdq %xmm9,%xmm12,%xmm14
+ vpunpckhqdq %xmm9,%xmm12,%xmm12
+ vpaddd 256-256(%rcx),%xmm4,%xmm4
+ vpaddd 272-256(%rcx),%xmm5,%xmm5
+ vpaddd 288-256(%rcx),%xmm6,%xmm6
+ vpaddd 304-256(%rcx),%xmm7,%xmm7
+
+ vpunpckldq %xmm5,%xmm4,%xmm9
+ vpunpckldq %xmm7,%xmm6,%xmm15
+ vpunpckhdq %xmm5,%xmm4,%xmm4
+ vpunpckhdq %xmm7,%xmm6,%xmm6
+ vpunpcklqdq %xmm15,%xmm9,%xmm5
+ vpunpckhqdq %xmm15,%xmm9,%xmm9
+ vpunpcklqdq %xmm6,%xmm4,%xmm7
+ vpunpckhqdq %xmm6,%xmm4,%xmm4
+ vmovdqa 0(%rsp),%xmm6
+ vmovdqa 16(%rsp),%xmm15
+
+ cmpq $256,%rdx
+ jb .Ltail4xop
+
+ vpxor 0(%rsi),%xmm6,%xmm6
+ vpxor 16(%rsi),%xmm1,%xmm1
+ vpxor 32(%rsi),%xmm13,%xmm13
+ vpxor 48(%rsi),%xmm5,%xmm5
+ vpxor 64(%rsi),%xmm15,%xmm15
+ vpxor 80(%rsi),%xmm10,%xmm10
+ vpxor 96(%rsi),%xmm2,%xmm2
+ vpxor 112(%rsi),%xmm9,%xmm9
+ leaq 128(%rsi),%rsi
+ vpxor 0(%rsi),%xmm11,%xmm11
+ vpxor 16(%rsi),%xmm3,%xmm3
+ vpxor 32(%rsi),%xmm14,%xmm14
+ vpxor 48(%rsi),%xmm7,%xmm7
+ vpxor 64(%rsi),%xmm8,%xmm8
+ vpxor 80(%rsi),%xmm0,%xmm0
+ vpxor 96(%rsi),%xmm12,%xmm12
+ vpxor 112(%rsi),%xmm4,%xmm4
+ leaq 128(%rsi),%rsi
+
+ vmovdqu %xmm6,0(%rdi)
+ vmovdqu %xmm1,16(%rdi)
+ vmovdqu %xmm13,32(%rdi)
+ vmovdqu %xmm5,48(%rdi)
+ vmovdqu %xmm15,64(%rdi)
+ vmovdqu %xmm10,80(%rdi)
+ vmovdqu %xmm2,96(%rdi)
+ vmovdqu %xmm9,112(%rdi)
+ leaq 128(%rdi),%rdi
+ vmovdqu %xmm11,0(%rdi)
+ vmovdqu %xmm3,16(%rdi)
+ vmovdqu %xmm14,32(%rdi)
+ vmovdqu %xmm7,48(%rdi)
+ vmovdqu %xmm8,64(%rdi)
+ vmovdqu %xmm0,80(%rdi)
+ vmovdqu %xmm12,96(%rdi)
+ vmovdqu %xmm4,112(%rdi)
+ leaq 128(%rdi),%rdi
+
+ subq $256,%rdx
+ jnz .Loop_outer4xop
+
+ jmp .Ldone4xop
+
+.align 32
+.Ltail4xop:
+ cmpq $192,%rdx
+ jae .L192_or_more4xop
+ cmpq $128,%rdx
+ jae .L128_or_more4xop
+ cmpq $64,%rdx
+ jae .L64_or_more4xop
+
+ xorq %r10,%r10
+ vmovdqa %xmm6,0(%rsp)
+ vmovdqa %xmm1,16(%rsp)
+ vmovdqa %xmm13,32(%rsp)
+ vmovdqa %xmm5,48(%rsp)
+ jmp .Loop_tail4xop
+
+.align 32
+.L64_or_more4xop:
+ vpxor 0(%rsi),%xmm6,%xmm6
+ vpxor 16(%rsi),%xmm1,%xmm1
+ vpxor 32(%rsi),%xmm13,%xmm13
+ vpxor 48(%rsi),%xmm5,%xmm5
+ vmovdqu %xmm6,0(%rdi)
+ vmovdqu %xmm1,16(%rdi)
+ vmovdqu %xmm13,32(%rdi)
+ vmovdqu %xmm5,48(%rdi)
+ je .Ldone4xop
+
+ leaq 64(%rsi),%rsi
+ vmovdqa %xmm15,0(%rsp)
+ xorq %r10,%r10
+ vmovdqa %xmm10,16(%rsp)
+ leaq 64(%rdi),%rdi
+ vmovdqa %xmm2,32(%rsp)
+ subq $64,%rdx
+ vmovdqa %xmm9,48(%rsp)
+ jmp .Loop_tail4xop
+
+.align 32
+.L128_or_more4xop:
+ vpxor 0(%rsi),%xmm6,%xmm6
+ vpxor 16(%rsi),%xmm1,%xmm1
+ vpxor 32(%rsi),%xmm13,%xmm13
+ vpxor 48(%rsi),%xmm5,%xmm5
+ vpxor 64(%rsi),%xmm15,%xmm15
+ vpxor 80(%rsi),%xmm10,%xmm10
+ vpxor 96(%rsi),%xmm2,%xmm2
+ vpxor 112(%rsi),%xmm9,%xmm9
+
+ vmovdqu %xmm6,0(%rdi)
+ vmovdqu %xmm1,16(%rdi)
+ vmovdqu %xmm13,32(%rdi)
+ vmovdqu %xmm5,48(%rdi)
+ vmovdqu %xmm15,64(%rdi)
+ vmovdqu %xmm10,80(%rdi)
+ vmovdqu %xmm2,96(%rdi)
+ vmovdqu %xmm9,112(%rdi)
+ je .Ldone4xop
+
+ leaq 128(%rsi),%rsi
+ vmovdqa %xmm11,0(%rsp)
+ xorq %r10,%r10
+ vmovdqa %xmm3,16(%rsp)
+ leaq 128(%rdi),%rdi
+ vmovdqa %xmm14,32(%rsp)
+ subq $128,%rdx
+ vmovdqa %xmm7,48(%rsp)
+ jmp .Loop_tail4xop
+
+.align 32
+.L192_or_more4xop:
+ vpxor 0(%rsi),%xmm6,%xmm6
+ vpxor 16(%rsi),%xmm1,%xmm1
+ vpxor 32(%rsi),%xmm13,%xmm13
+ vpxor 48(%rsi),%xmm5,%xmm5
+ vpxor 64(%rsi),%xmm15,%xmm15
+ vpxor 80(%rsi),%xmm10,%xmm10
+ vpxor 96(%rsi),%xmm2,%xmm2
+ vpxor 112(%rsi),%xmm9,%xmm9
+ leaq 128(%rsi),%rsi
+ vpxor 0(%rsi),%xmm11,%xmm11
+ vpxor 16(%rsi),%xmm3,%xmm3
+ vpxor 32(%rsi),%xmm14,%xmm14
+ vpxor 48(%rsi),%xmm7,%xmm7
+
+ vmovdqu %xmm6,0(%rdi)
+ vmovdqu %xmm1,16(%rdi)
+ vmovdqu %xmm13,32(%rdi)
+ vmovdqu %xmm5,48(%rdi)
+ vmovdqu %xmm15,64(%rdi)
+ vmovdqu %xmm10,80(%rdi)
+ vmovdqu %xmm2,96(%rdi)
+ vmovdqu %xmm9,112(%rdi)
+ leaq 128(%rdi),%rdi
+ vmovdqu %xmm11,0(%rdi)
+ vmovdqu %xmm3,16(%rdi)
+ vmovdqu %xmm14,32(%rdi)
+ vmovdqu %xmm7,48(%rdi)
+ je .Ldone4xop
+
+ leaq 64(%rsi),%rsi
+ vmovdqa %xmm8,0(%rsp)
+ xorq %r10,%r10
+ vmovdqa %xmm0,16(%rsp)
+ leaq 64(%rdi),%rdi
+ vmovdqa %xmm12,32(%rsp)
+ subq $192,%rdx
+ vmovdqa %xmm4,48(%rsp)
+
+.Loop_tail4xop:
+ movzbl (%rsi,%r10,1),%eax
+ movzbl (%rsp,%r10,1),%ecx
+ leaq 1(%r10),%r10
+ xorl %ecx,%eax
+ movb %al,-1(%rdi,%r10,1)
+ decq %rdx
+ jnz .Loop_tail4xop
+
+.Ldone4xop:
+ vzeroupper
+ leaq (%r9),%rsp
+.cfi_def_cfa_register %rsp
+.L4xop_epilogue:
+ .byte 0xf3,0xc3
+.cfi_endproc
+.size ChaCha20_4xop,.-ChaCha20_4xop
+.type ChaCha20_8x,@function
+.align 32
+ChaCha20_8x:
+.cfi_startproc
+.LChaCha20_8x:
+ movq %rsp,%r9
+.cfi_def_cfa_register %r9
+ subq $0x280+8,%rsp
+ andq $-32,%rsp
+ vzeroupper
+
+
+
+
+
+
+
+
+
+
+ vbroadcasti128 .Lsigma(%rip),%ymm11
+ vbroadcasti128 (%rcx),%ymm3
+ vbroadcasti128 16(%rcx),%ymm15
+ vbroadcasti128 (%r8),%ymm7
+ leaq 256(%rsp),%rcx
+ leaq 512(%rsp),%rax
+ leaq .Lrot16(%rip),%r10
+ leaq .Lrot24(%rip),%r11
+
+ vpshufd $0x00,%ymm11,%ymm8
+ vpshufd $0x55,%ymm11,%ymm9
+ vmovdqa %ymm8,128-256(%rcx)
+ vpshufd $0xaa,%ymm11,%ymm10
+ vmovdqa %ymm9,160-256(%rcx)
+ vpshufd $0xff,%ymm11,%ymm11
+ vmovdqa %ymm10,192-256(%rcx)
+ vmovdqa %ymm11,224-256(%rcx)
+
+ vpshufd $0x00,%ymm3,%ymm0
+ vpshufd $0x55,%ymm3,%ymm1
+ vmovdqa %ymm0,256-256(%rcx)
+ vpshufd $0xaa,%ymm3,%ymm2
+ vmovdqa %ymm1,288-256(%rcx)
+ vpshufd $0xff,%ymm3,%ymm3
+ vmovdqa %ymm2,320-256(%rcx)
+ vmovdqa %ymm3,352-256(%rcx)
+
+ vpshufd $0x00,%ymm15,%ymm12
+ vpshufd $0x55,%ymm15,%ymm13
+ vmovdqa %ymm12,384-512(%rax)
+ vpshufd $0xaa,%ymm15,%ymm14
+ vmovdqa %ymm13,416-512(%rax)
+ vpshufd $0xff,%ymm15,%ymm15
+ vmovdqa %ymm14,448-512(%rax)
+ vmovdqa %ymm15,480-512(%rax)
+
+ vpshufd $0x00,%ymm7,%ymm4
+ vpshufd $0x55,%ymm7,%ymm5
+ vpaddd .Lincy(%rip),%ymm4,%ymm4
+ vpshufd $0xaa,%ymm7,%ymm6
+ vmovdqa %ymm5,544-512(%rax)
+ vpshufd $0xff,%ymm7,%ymm7
+ vmovdqa %ymm6,576-512(%rax)
+ vmovdqa %ymm7,608-512(%rax)
+
+ jmp .Loop_enter8x
+
+.align 32
+.Loop_outer8x:
+ vmovdqa 128-256(%rcx),%ymm8
+ vmovdqa 160-256(%rcx),%ymm9
+ vmovdqa 192-256(%rcx),%ymm10
+ vmovdqa 224-256(%rcx),%ymm11
+ vmovdqa 256-256(%rcx),%ymm0
+ vmovdqa 288-256(%rcx),%ymm1
+ vmovdqa 320-256(%rcx),%ymm2
+ vmovdqa 352-256(%rcx),%ymm3
+ vmovdqa 384-512(%rax),%ymm12
+ vmovdqa 416-512(%rax),%ymm13
+ vmovdqa 448-512(%rax),%ymm14
+ vmovdqa 480-512(%rax),%ymm15
+ vmovdqa 512-512(%rax),%ymm4
+ vmovdqa 544-512(%rax),%ymm5
+ vmovdqa 576-512(%rax),%ymm6
+ vmovdqa 608-512(%rax),%ymm7
+ vpaddd .Leight(%rip),%ymm4,%ymm4
+
+.Loop_enter8x:
+ vmovdqa %ymm14,64(%rsp)
+ vmovdqa %ymm15,96(%rsp)
+ vbroadcasti128 (%r10),%ymm15
+ vmovdqa %ymm4,512-512(%rax)
+ movl $10,%eax
+ jmp .Loop8x
+
+.align 32
+.Loop8x:
+ vpaddd %ymm0,%ymm8,%ymm8
+ vpxor %ymm4,%ymm8,%ymm4
+ vpshufb %ymm15,%ymm4,%ymm4
+ vpaddd %ymm1,%ymm9,%ymm9
+ vpxor %ymm5,%ymm9,%ymm5
+ vpshufb %ymm15,%ymm5,%ymm5
+ vpaddd %ymm4,%ymm12,%ymm12
+ vpxor %ymm0,%ymm12,%ymm0
+ vpslld $12,%ymm0,%ymm14
+ vpsrld $20,%ymm0,%ymm0
+ vpor %ymm0,%ymm14,%ymm0
+ vbroadcasti128 (%r11),%ymm14
+ vpaddd %ymm5,%ymm13,%ymm13
+ vpxor %ymm1,%ymm13,%ymm1
+ vpslld $12,%ymm1,%ymm15
+ vpsrld $20,%ymm1,%ymm1
+ vpor %ymm1,%ymm15,%ymm1
+ vpaddd %ymm0,%ymm8,%ymm8
+ vpxor %ymm4,%ymm8,%ymm4
+ vpshufb %ymm14,%ymm4,%ymm4
+ vpaddd %ymm1,%ymm9,%ymm9
+ vpxor %ymm5,%ymm9,%ymm5
+ vpshufb %ymm14,%ymm5,%ymm5
+ vpaddd %ymm4,%ymm12,%ymm12
+ vpxor %ymm0,%ymm12,%ymm0
+ vpslld $7,%ymm0,%ymm15
+ vpsrld $25,%ymm0,%ymm0
+ vpor %ymm0,%ymm15,%ymm0
+ vbroadcasti128 (%r10),%ymm15
+ vpaddd %ymm5,%ymm13,%ymm13
+ vpxor %ymm1,%ymm13,%ymm1
+ vpslld $7,%ymm1,%ymm14
+ vpsrld $25,%ymm1,%ymm1
+ vpor %ymm1,%ymm14,%ymm1
+ vmovdqa %ymm12,0(%rsp)
+ vmovdqa %ymm13,32(%rsp)
+ vmovdqa 64(%rsp),%ymm12
+ vmovdqa 96(%rsp),%ymm13
+ vpaddd %ymm2,%ymm10,%ymm10
+ vpxor %ymm6,%ymm10,%ymm6
+ vpshufb %ymm15,%ymm6,%ymm6
+ vpaddd %ymm3,%ymm11,%ymm11
+ vpxor %ymm7,%ymm11,%ymm7
+ vpshufb %ymm15,%ymm7,%ymm7
+ vpaddd %ymm6,%ymm12,%ymm12
+ vpxor %ymm2,%ymm12,%ymm2
+ vpslld $12,%ymm2,%ymm14
+ vpsrld $20,%ymm2,%ymm2
+ vpor %ymm2,%ymm14,%ymm2
+ vbroadcasti128 (%r11),%ymm14
+ vpaddd %ymm7,%ymm13,%ymm13
+ vpxor %ymm3,%ymm13,%ymm3
+ vpslld $12,%ymm3,%ymm15
+ vpsrld $20,%ymm3,%ymm3
+ vpor %ymm3,%ymm15,%ymm3
+ vpaddd %ymm2,%ymm10,%ymm10
+ vpxor %ymm6,%ymm10,%ymm6
+ vpshufb %ymm14,%ymm6,%ymm6
+ vpaddd %ymm3,%ymm11,%ymm11
+ vpxor %ymm7,%ymm11,%ymm7
+ vpshufb %ymm14,%ymm7,%ymm7
+ vpaddd %ymm6,%ymm12,%ymm12
+ vpxor %ymm2,%ymm12,%ymm2
+ vpslld $7,%ymm2,%ymm15
+ vpsrld $25,%ymm2,%ymm2
+ vpor %ymm2,%ymm15,%ymm2
+ vbroadcasti128 (%r10),%ymm15
+ vpaddd %ymm7,%ymm13,%ymm13
+ vpxor %ymm3,%ymm13,%ymm3
+ vpslld $7,%ymm3,%ymm14
+ vpsrld $25,%ymm3,%ymm3
+ vpor %ymm3,%ymm14,%ymm3
+ vpaddd %ymm1,%ymm8,%ymm8
+ vpxor %ymm7,%ymm8,%ymm7
+ vpshufb %ymm15,%ymm7,%ymm7
+ vpaddd %ymm2,%ymm9,%ymm9
+ vpxor %ymm4,%ymm9,%ymm4
+ vpshufb %ymm15,%ymm4,%ymm4
+ vpaddd %ymm7,%ymm12,%ymm12
+ vpxor %ymm1,%ymm12,%ymm1
+ vpslld $12,%ymm1,%ymm14
+ vpsrld $20,%ymm1,%ymm1
+ vpor %ymm1,%ymm14,%ymm1
+ vbroadcasti128 (%r11),%ymm14
+ vpaddd %ymm4,%ymm13,%ymm13
+ vpxor %ymm2,%ymm13,%ymm2
+ vpslld $12,%ymm2,%ymm15
+ vpsrld $20,%ymm2,%ymm2
+ vpor %ymm2,%ymm15,%ymm2
+ vpaddd %ymm1,%ymm8,%ymm8
+ vpxor %ymm7,%ymm8,%ymm7
+ vpshufb %ymm14,%ymm7,%ymm7
+ vpaddd %ymm2,%ymm9,%ymm9
+ vpxor %ymm4,%ymm9,%ymm4
+ vpshufb %ymm14,%ymm4,%ymm4
+ vpaddd %ymm7,%ymm12,%ymm12
+ vpxor %ymm1,%ymm12,%ymm1
+ vpslld $7,%ymm1,%ymm15
+ vpsrld $25,%ymm1,%ymm1
+ vpor %ymm1,%ymm15,%ymm1
+ vbroadcasti128 (%r10),%ymm15
+ vpaddd %ymm4,%ymm13,%ymm13
+ vpxor %ymm2,%ymm13,%ymm2
+ vpslld $7,%ymm2,%ymm14
+ vpsrld $25,%ymm2,%ymm2
+ vpor %ymm2,%ymm14,%ymm2
+ vmovdqa %ymm12,64(%rsp)
+ vmovdqa %ymm13,96(%rsp)
+ vmovdqa 0(%rsp),%ymm12
+ vmovdqa 32(%rsp),%ymm13
+ vpaddd %ymm3,%ymm10,%ymm10
+ vpxor %ymm5,%ymm10,%ymm5
+ vpshufb %ymm15,%ymm5,%ymm5
+ vpaddd %ymm0,%ymm11,%ymm11
+ vpxor %ymm6,%ymm11,%ymm6
+ vpshufb %ymm15,%ymm6,%ymm6
+ vpaddd %ymm5,%ymm12,%ymm12
+ vpxor %ymm3,%ymm12,%ymm3
+ vpslld $12,%ymm3,%ymm14
+ vpsrld $20,%ymm3,%ymm3
+ vpor %ymm3,%ymm14,%ymm3
+ vbroadcasti128 (%r11),%ymm14
+ vpaddd %ymm6,%ymm13,%ymm13
+ vpxor %ymm0,%ymm13,%ymm0
+ vpslld $12,%ymm0,%ymm15
+ vpsrld $20,%ymm0,%ymm0
+ vpor %ymm0,%ymm15,%ymm0
+ vpaddd %ymm3,%ymm10,%ymm10
+ vpxor %ymm5,%ymm10,%ymm5
+ vpshufb %ymm14,%ymm5,%ymm5
+ vpaddd %ymm0,%ymm11,%ymm11
+ vpxor %ymm6,%ymm11,%ymm6
+ vpshufb %ymm14,%ymm6,%ymm6
+ vpaddd %ymm5,%ymm12,%ymm12
+ vpxor %ymm3,%ymm12,%ymm3
+ vpslld $7,%ymm3,%ymm15
+ vpsrld $25,%ymm3,%ymm3
+ vpor %ymm3,%ymm15,%ymm3
+ vbroadcasti128 (%r10),%ymm15
+ vpaddd %ymm6,%ymm13,%ymm13
+ vpxor %ymm0,%ymm13,%ymm0
+ vpslld $7,%ymm0,%ymm14
+ vpsrld $25,%ymm0,%ymm0
+ vpor %ymm0,%ymm14,%ymm0
+ decl %eax
+ jnz .Loop8x
+
+ leaq 512(%rsp),%rax
+ vpaddd 128-256(%rcx),%ymm8,%ymm8
+ vpaddd 160-256(%rcx),%ymm9,%ymm9
+ vpaddd 192-256(%rcx),%ymm10,%ymm10
+ vpaddd 224-256(%rcx),%ymm11,%ymm11
+
+ vpunpckldq %ymm9,%ymm8,%ymm14
+ vpunpckldq %ymm11,%ymm10,%ymm15
+ vpunpckhdq %ymm9,%ymm8,%ymm8
+ vpunpckhdq %ymm11,%ymm10,%ymm10
+ vpunpcklqdq %ymm15,%ymm14,%ymm9
+ vpunpckhqdq %ymm15,%ymm14,%ymm14
+ vpunpcklqdq %ymm10,%ymm8,%ymm11
+ vpunpckhqdq %ymm10,%ymm8,%ymm8
+ vpaddd 256-256(%rcx),%ymm0,%ymm0
+ vpaddd 288-256(%rcx),%ymm1,%ymm1
+ vpaddd 320-256(%rcx),%ymm2,%ymm2
+ vpaddd 352-256(%rcx),%ymm3,%ymm3
+
+ vpunpckldq %ymm1,%ymm0,%ymm10
+ vpunpckldq %ymm3,%ymm2,%ymm15
+ vpunpckhdq %ymm1,%ymm0,%ymm0
+ vpunpckhdq %ymm3,%ymm2,%ymm2
+ vpunpcklqdq %ymm15,%ymm10,%ymm1
+ vpunpckhqdq %ymm15,%ymm10,%ymm10
+ vpunpcklqdq %ymm2,%ymm0,%ymm3
+ vpunpckhqdq %ymm2,%ymm0,%ymm0
+ vperm2i128 $0x20,%ymm1,%ymm9,%ymm15
+ vperm2i128 $0x31,%ymm1,%ymm9,%ymm1
+ vperm2i128 $0x20,%ymm10,%ymm14,%ymm9
+ vperm2i128 $0x31,%ymm10,%ymm14,%ymm10
+ vperm2i128 $0x20,%ymm3,%ymm11,%ymm14
+ vperm2i128 $0x31,%ymm3,%ymm11,%ymm3
+ vperm2i128 $0x20,%ymm0,%ymm8,%ymm11
+ vperm2i128 $0x31,%ymm0,%ymm8,%ymm0
+ vmovdqa %ymm15,0(%rsp)
+ vmovdqa %ymm9,32(%rsp)
+ vmovdqa 64(%rsp),%ymm15
+ vmovdqa 96(%rsp),%ymm9
+
+ vpaddd 384-512(%rax),%ymm12,%ymm12
+ vpaddd 416-512(%rax),%ymm13,%ymm13
+ vpaddd 448-512(%rax),%ymm15,%ymm15
+ vpaddd 480-512(%rax),%ymm9,%ymm9
+
+ vpunpckldq %ymm13,%ymm12,%ymm2
+ vpunpckldq %ymm9,%ymm15,%ymm8
+ vpunpckhdq %ymm13,%ymm12,%ymm12
+ vpunpckhdq %ymm9,%ymm15,%ymm15
+ vpunpcklqdq %ymm8,%ymm2,%ymm13
+ vpunpckhqdq %ymm8,%ymm2,%ymm2
+ vpunpcklqdq %ymm15,%ymm12,%ymm9
+ vpunpckhqdq %ymm15,%ymm12,%ymm12
+ vpaddd 512-512(%rax),%ymm4,%ymm4
+ vpaddd 544-512(%rax),%ymm5,%ymm5
+ vpaddd 576-512(%rax),%ymm6,%ymm6
+ vpaddd 608-512(%rax),%ymm7,%ymm7
+
+ vpunpckldq %ymm5,%ymm4,%ymm15
+ vpunpckldq %ymm7,%ymm6,%ymm8
+ vpunpckhdq %ymm5,%ymm4,%ymm4
+ vpunpckhdq %ymm7,%ymm6,%ymm6
+ vpunpcklqdq %ymm8,%ymm15,%ymm5
+ vpunpckhqdq %ymm8,%ymm15,%ymm15
+ vpunpcklqdq %ymm6,%ymm4,%ymm7
+ vpunpckhqdq %ymm6,%ymm4,%ymm4
+ vperm2i128 $0x20,%ymm5,%ymm13,%ymm8
+ vperm2i128 $0x31,%ymm5,%ymm13,%ymm5
+ vperm2i128 $0x20,%ymm15,%ymm2,%ymm13
+ vperm2i128 $0x31,%ymm15,%ymm2,%ymm15
+ vperm2i128 $0x20,%ymm7,%ymm9,%ymm2
+ vperm2i128 $0x31,%ymm7,%ymm9,%ymm7
+ vperm2i128 $0x20,%ymm4,%ymm12,%ymm9
+ vperm2i128 $0x31,%ymm4,%ymm12,%ymm4
+ vmovdqa 0(%rsp),%ymm6
+ vmovdqa 32(%rsp),%ymm12
+
+ cmpq $512,%rdx
+ jb .Ltail8x
+
+ vpxor 0(%rsi),%ymm6,%ymm6
+ vpxor 32(%rsi),%ymm8,%ymm8
+ vpxor 64(%rsi),%ymm1,%ymm1
+ vpxor 96(%rsi),%ymm5,%ymm5
+ leaq 128(%rsi),%rsi
+ vmovdqu %ymm6,0(%rdi)
+ vmovdqu %ymm8,32(%rdi)
+ vmovdqu %ymm1,64(%rdi)
+ vmovdqu %ymm5,96(%rdi)
+ leaq 128(%rdi),%rdi
+
+ vpxor 0(%rsi),%ymm12,%ymm12
+ vpxor 32(%rsi),%ymm13,%ymm13
+ vpxor 64(%rsi),%ymm10,%ymm10
+ vpxor 96(%rsi),%ymm15,%ymm15
+ leaq 128(%rsi),%rsi
+ vmovdqu %ymm12,0(%rdi)
+ vmovdqu %ymm13,32(%rdi)
+ vmovdqu %ymm10,64(%rdi)
+ vmovdqu %ymm15,96(%rdi)
+ leaq 128(%rdi),%rdi
+
+ vpxor 0(%rsi),%ymm14,%ymm14
+ vpxor 32(%rsi),%ymm2,%ymm2
+ vpxor 64(%rsi),%ymm3,%ymm3
+ vpxor 96(%rsi),%ymm7,%ymm7
+ leaq 128(%rsi),%rsi
+ vmovdqu %ymm14,0(%rdi)
+ vmovdqu %ymm2,32(%rdi)
+ vmovdqu %ymm3,64(%rdi)
+ vmovdqu %ymm7,96(%rdi)
+ leaq 128(%rdi),%rdi
+
+ vpxor 0(%rsi),%ymm11,%ymm11
+ vpxor 32(%rsi),%ymm9,%ymm9
+ vpxor 64(%rsi),%ymm0,%ymm0
+ vpxor 96(%rsi),%ymm4,%ymm4
+ leaq 128(%rsi),%rsi
+ vmovdqu %ymm11,0(%rdi)
+ vmovdqu %ymm9,32(%rdi)
+ vmovdqu %ymm0,64(%rdi)
+ vmovdqu %ymm4,96(%rdi)
+ leaq 128(%rdi),%rdi
+
+ subq $512,%rdx
+ jnz .Loop_outer8x
+
+ jmp .Ldone8x
+
+.Ltail8x:
+ cmpq $448,%rdx
+ jae .L448_or_more8x
+ cmpq $384,%rdx
+ jae .L384_or_more8x
+ cmpq $320,%rdx
+ jae .L320_or_more8x
+ cmpq $256,%rdx
+ jae .L256_or_more8x
+ cmpq $192,%rdx
+ jae .L192_or_more8x
+ cmpq $128,%rdx
+ jae .L128_or_more8x
+ cmpq $64,%rdx
+ jae .L64_or_more8x
+
+ xorq %r10,%r10
+ vmovdqa %ymm6,0(%rsp)
+ vmovdqa %ymm8,32(%rsp)
+ jmp .Loop_tail8x
+
+.align 32
+.L64_or_more8x:
+ vpxor 0(%rsi),%ymm6,%ymm6
+ vpxor 32(%rsi),%ymm8,%ymm8
+ vmovdqu %ymm6,0(%rdi)
+ vmovdqu %ymm8,32(%rdi)
+ je .Ldone8x
+
+ leaq 64(%rsi),%rsi
+ xorq %r10,%r10
+ vmovdqa %ymm1,0(%rsp)
+ leaq 64(%rdi),%rdi
+ subq $64,%rdx
+ vmovdqa %ymm5,32(%rsp)
+ jmp .Loop_tail8x
+
+.align 32
+.L128_or_more8x:
+ vpxor 0(%rsi),%ymm6,%ymm6
+ vpxor 32(%rsi),%ymm8,%ymm8
+ vpxor 64(%rsi),%ymm1,%ymm1
+ vpxor 96(%rsi),%ymm5,%ymm5
+ vmovdqu %ymm6,0(%rdi)
+ vmovdqu %ymm8,32(%rdi)
+ vmovdqu %ymm1,64(%rdi)
+ vmovdqu %ymm5,96(%rdi)
+ je .Ldone8x
+
+ leaq 128(%rsi),%rsi
+ xorq %r10,%r10
+ vmovdqa %ymm12,0(%rsp)
+ leaq 128(%rdi),%rdi
+ subq $128,%rdx
+ vmovdqa %ymm13,32(%rsp)
+ jmp .Loop_tail8x
+
+.align 32
+.L192_or_more8x:
+ vpxor 0(%rsi),%ymm6,%ymm6
+ vpxor 32(%rsi),%ymm8,%ymm8
+ vpxor 64(%rsi),%ymm1,%ymm1
+ vpxor 96(%rsi),%ymm5,%ymm5
+ vpxor 128(%rsi),%ymm12,%ymm12
+ vpxor 160(%rsi),%ymm13,%ymm13
+ vmovdqu %ymm6,0(%rdi)
+ vmovdqu %ymm8,32(%rdi)
+ vmovdqu %ymm1,64(%rdi)
+ vmovdqu %ymm5,96(%rdi)
+ vmovdqu %ymm12,128(%rdi)
+ vmovdqu %ymm13,160(%rdi)
+ je .Ldone8x
+
+ leaq 192(%rsi),%rsi
+ xorq %r10,%r10
+ vmovdqa %ymm10,0(%rsp)
+ leaq 192(%rdi),%rdi
+ subq $192,%rdx
+ vmovdqa %ymm15,32(%rsp)
+ jmp .Loop_tail8x
+
+.align 32
+.L256_or_more8x:
+ vpxor 0(%rsi),%ymm6,%ymm6
+ vpxor 32(%rsi),%ymm8,%ymm8
+ vpxor 64(%rsi),%ymm1,%ymm1
+ vpxor 96(%rsi),%ymm5,%ymm5
+ vpxor 128(%rsi),%ymm12,%ymm12
+ vpxor 160(%rsi),%ymm13,%ymm13
+ vpxor 192(%rsi),%ymm10,%ymm10
+ vpxor 224(%rsi),%ymm15,%ymm15
+ vmovdqu %ymm6,0(%rdi)
+ vmovdqu %ymm8,32(%rdi)
+ vmovdqu %ymm1,64(%rdi)
+ vmovdqu %ymm5,96(%rdi)
+ vmovdqu %ymm12,128(%rdi)
+ vmovdqu %ymm13,160(%rdi)
+ vmovdqu %ymm10,192(%rdi)
+ vmovdqu %ymm15,224(%rdi)
+ je .Ldone8x
+
+ leaq 256(%rsi),%rsi
+ xorq %r10,%r10
+ vmovdqa %ymm14,0(%rsp)
+ leaq 256(%rdi),%rdi
+ subq $256,%rdx
+ vmovdqa %ymm2,32(%rsp)
+ jmp .Loop_tail8x
+
+.align 32
+.L320_or_more8x:
+ vpxor 0(%rsi),%ymm6,%ymm6
+ vpxor 32(%rsi),%ymm8,%ymm8
+ vpxor 64(%rsi),%ymm1,%ymm1
+ vpxor 96(%rsi),%ymm5,%ymm5
+ vpxor 128(%rsi),%ymm12,%ymm12
+ vpxor 160(%rsi),%ymm13,%ymm13
+ vpxor 192(%rsi),%ymm10,%ymm10
+ vpxor 224(%rsi),%ymm15,%ymm15
+ vpxor 256(%rsi),%ymm14,%ymm14
+ vpxor 288(%rsi),%ymm2,%ymm2
+ vmovdqu %ymm6,0(%rdi)
+ vmovdqu %ymm8,32(%rdi)
+ vmovdqu %ymm1,64(%rdi)
+ vmovdqu %ymm5,96(%rdi)
+ vmovdqu %ymm12,128(%rdi)
+ vmovdqu %ymm13,160(%rdi)
+ vmovdqu %ymm10,192(%rdi)
+ vmovdqu %ymm15,224(%rdi)
+ vmovdqu %ymm14,256(%rdi)
+ vmovdqu %ymm2,288(%rdi)
+ je .Ldone8x
+
+ leaq 320(%rsi),%rsi
+ xorq %r10,%r10
+ vmovdqa %ymm3,0(%rsp)
+ leaq 320(%rdi),%rdi
+ subq $320,%rdx
+ vmovdqa %ymm7,32(%rsp)
+ jmp .Loop_tail8x
+
+.align 32
+.L384_or_more8x:
+ vpxor 0(%rsi),%ymm6,%ymm6
+ vpxor 32(%rsi),%ymm8,%ymm8
+ vpxor 64(%rsi),%ymm1,%ymm1
+ vpxor 96(%rsi),%ymm5,%ymm5
+ vpxor 128(%rsi),%ymm12,%ymm12
+ vpxor 160(%rsi),%ymm13,%ymm13
+ vpxor 192(%rsi),%ymm10,%ymm10
+ vpxor 224(%rsi),%ymm15,%ymm15
+ vpxor 256(%rsi),%ymm14,%ymm14
+ vpxor 288(%rsi),%ymm2,%ymm2
+ vpxor 320(%rsi),%ymm3,%ymm3
+ vpxor 352(%rsi),%ymm7,%ymm7
+ vmovdqu %ymm6,0(%rdi)
+ vmovdqu %ymm8,32(%rdi)
+ vmovdqu %ymm1,64(%rdi)
+ vmovdqu %ymm5,96(%rdi)
+ vmovdqu %ymm12,128(%rdi)
+ vmovdqu %ymm13,160(%rdi)
+ vmovdqu %ymm10,192(%rdi)
+ vmovdqu %ymm15,224(%rdi)
+ vmovdqu %ymm14,256(%rdi)
+ vmovdqu %ymm2,288(%rdi)
+ vmovdqu %ymm3,320(%rdi)
+ vmovdqu %ymm7,352(%rdi)
+ je .Ldone8x
+
+ leaq 384(%rsi),%rsi
+ xorq %r10,%r10
+ vmovdqa %ymm11,0(%rsp)
+ leaq 384(%rdi),%rdi
+ subq $384,%rdx
+ vmovdqa %ymm9,32(%rsp)
+ jmp .Loop_tail8x
+
+.align 32
+.L448_or_more8x:
+ vpxor 0(%rsi),%ymm6,%ymm6
+ vpxor 32(%rsi),%ymm8,%ymm8
+ vpxor 64(%rsi),%ymm1,%ymm1
+ vpxor 96(%rsi),%ymm5,%ymm5
+ vpxor 128(%rsi),%ymm12,%ymm12
+ vpxor 160(%rsi),%ymm13,%ymm13
+ vpxor 192(%rsi),%ymm10,%ymm10
+ vpxor 224(%rsi),%ymm15,%ymm15
+ vpxor 256(%rsi),%ymm14,%ymm14
+ vpxor 288(%rsi),%ymm2,%ymm2
+ vpxor 320(%rsi),%ymm3,%ymm3
+ vpxor 352(%rsi),%ymm7,%ymm7
+ vpxor 384(%rsi),%ymm11,%ymm11
+ vpxor 416(%rsi),%ymm9,%ymm9
+ vmovdqu %ymm6,0(%rdi)
+ vmovdqu %ymm8,32(%rdi)
+ vmovdqu %ymm1,64(%rdi)
+ vmovdqu %ymm5,96(%rdi)
+ vmovdqu %ymm12,128(%rdi)
+ vmovdqu %ymm13,160(%rdi)
+ vmovdqu %ymm10,192(%rdi)
+ vmovdqu %ymm15,224(%rdi)
+ vmovdqu %ymm14,256(%rdi)
+ vmovdqu %ymm2,288(%rdi)
+ vmovdqu %ymm3,320(%rdi)
+ vmovdqu %ymm7,352(%rdi)
+ vmovdqu %ymm11,384(%rdi)
+ vmovdqu %ymm9,416(%rdi)
+ je .Ldone8x
+
+ leaq 448(%rsi),%rsi
+ xorq %r10,%r10
+ vmovdqa %ymm0,0(%rsp)
+ leaq 448(%rdi),%rdi
+ subq $448,%rdx
+ vmovdqa %ymm4,32(%rsp)
+
+.Loop_tail8x:
+ movzbl (%rsi,%r10,1),%eax
+ movzbl (%rsp,%r10,1),%ecx
+ leaq 1(%r10),%r10
+ xorl %ecx,%eax
+ movb %al,-1(%rdi,%r10,1)
+ decq %rdx
+ jnz .Loop_tail8x
+
+.Ldone8x:
+ vzeroall
+ leaq (%r9),%rsp
+.cfi_def_cfa_register %rsp
+.L8x_epilogue:
+ .byte 0xf3,0xc3
+.cfi_endproc
+.size ChaCha20_8x,.-ChaCha20_8x
+.type ChaCha20_avx512,@function
+.align 32
+ChaCha20_avx512:
+.cfi_startproc
+.LChaCha20_avx512:
+ movq %rsp,%r9
+.cfi_def_cfa_register %r9
+ cmpq $512,%rdx
+ ja .LChaCha20_16x
+
+ subq $64+8,%rsp
+ vbroadcasti32x4 .Lsigma(%rip),%zmm0
+ vbroadcasti32x4 (%rcx),%zmm1
+ vbroadcasti32x4 16(%rcx),%zmm2
+ vbroadcasti32x4 (%r8),%zmm3
+
+ vmovdqa32 %zmm0,%zmm16
+ vmovdqa32 %zmm1,%zmm17
+ vmovdqa32 %zmm2,%zmm18
+ vpaddd .Lzeroz(%rip),%zmm3,%zmm3
+ vmovdqa32 .Lfourz(%rip),%zmm20
+ movq $10,%r8
+ vmovdqa32 %zmm3,%zmm19
+ jmp .Loop_avx512
+
+.align 16
+.Loop_outer_avx512:
+ vmovdqa32 %zmm16,%zmm0
+ vmovdqa32 %zmm17,%zmm1
+ vmovdqa32 %zmm18,%zmm2
+ vpaddd %zmm20,%zmm19,%zmm3
+ movq $10,%r8
+ vmovdqa32 %zmm3,%zmm19
+ jmp .Loop_avx512
+
+.align 32
+.Loop_avx512:
+ vpaddd %zmm1,%zmm0,%zmm0
+ vpxord %zmm0,%zmm3,%zmm3
+ vprold $16,%zmm3,%zmm3
+ vpaddd %zmm3,%zmm2,%zmm2
+ vpxord %zmm2,%zmm1,%zmm1
+ vprold $12,%zmm1,%zmm1
+ vpaddd %zmm1,%zmm0,%zmm0
+ vpxord %zmm0,%zmm3,%zmm3
+ vprold $8,%zmm3,%zmm3
+ vpaddd %zmm3,%zmm2,%zmm2
+ vpxord %zmm2,%zmm1,%zmm1
+ vprold $7,%zmm1,%zmm1
+ vpshufd $78,%zmm2,%zmm2
+ vpshufd $57,%zmm1,%zmm1
+ vpshufd $147,%zmm3,%zmm3
+ vpaddd %zmm1,%zmm0,%zmm0
+ vpxord %zmm0,%zmm3,%zmm3
+ vprold $16,%zmm3,%zmm3
+ vpaddd %zmm3,%zmm2,%zmm2
+ vpxord %zmm2,%zmm1,%zmm1
+ vprold $12,%zmm1,%zmm1
+ vpaddd %zmm1,%zmm0,%zmm0
+ vpxord %zmm0,%zmm3,%zmm3
+ vprold $8,%zmm3,%zmm3
+ vpaddd %zmm3,%zmm2,%zmm2
+ vpxord %zmm2,%zmm1,%zmm1
+ vprold $7,%zmm1,%zmm1
+ vpshufd $78,%zmm2,%zmm2
+ vpshufd $147,%zmm1,%zmm1
+ vpshufd $57,%zmm3,%zmm3
+ decq %r8
+ jnz .Loop_avx512
+ vpaddd %zmm16,%zmm0,%zmm0
+ vpaddd %zmm17,%zmm1,%zmm1
+ vpaddd %zmm18,%zmm2,%zmm2
+ vpaddd %zmm19,%zmm3,%zmm3
+
+ subq $64,%rdx
+ jb .Ltail64_avx512
+
+ vpxor 0(%rsi),%xmm0,%xmm4
+ vpxor 16(%rsi),%xmm1,%xmm5
+ vpxor 32(%rsi),%xmm2,%xmm6
+ vpxor 48(%rsi),%xmm3,%xmm7
+ leaq 64(%rsi),%rsi
+
+ vmovdqu %xmm4,0(%rdi)
+ vmovdqu %xmm5,16(%rdi)
+ vmovdqu %xmm6,32(%rdi)
+ vmovdqu %xmm7,48(%rdi)
+ leaq 64(%rdi),%rdi
+
+ jz .Ldone_avx512
+
+ vextracti32x4 $1,%zmm0,%xmm4
+ vextracti32x4 $1,%zmm1,%xmm5
+ vextracti32x4 $1,%zmm2,%xmm6
+ vextracti32x4 $1,%zmm3,%xmm7
+
+ subq $64,%rdx
+ jb .Ltail_avx512
+
+ vpxor 0(%rsi),%xmm4,%xmm4
+ vpxor 16(%rsi),%xmm5,%xmm5
+ vpxor 32(%rsi),%xmm6,%xmm6
+ vpxor 48(%rsi),%xmm7,%xmm7
+ leaq 64(%rsi),%rsi
+
+ vmovdqu %xmm4,0(%rdi)
+ vmovdqu %xmm5,16(%rdi)
+ vmovdqu %xmm6,32(%rdi)
+ vmovdqu %xmm7,48(%rdi)
+ leaq 64(%rdi),%rdi
+
+ jz .Ldone_avx512
+
+ vextracti32x4 $2,%zmm0,%xmm4
+ vextracti32x4 $2,%zmm1,%xmm5
+ vextracti32x4 $2,%zmm2,%xmm6
+ vextracti32x4 $2,%zmm3,%xmm7
+
+ subq $64,%rdx
+ jb .Ltail_avx512
+
+ vpxor 0(%rsi),%xmm4,%xmm4
+ vpxor 16(%rsi),%xmm5,%xmm5
+ vpxor 32(%rsi),%xmm6,%xmm6
+ vpxor 48(%rsi),%xmm7,%xmm7
+ leaq 64(%rsi),%rsi
+
+ vmovdqu %xmm4,0(%rdi)
+ vmovdqu %xmm5,16(%rdi)
+ vmovdqu %xmm6,32(%rdi)
+ vmovdqu %xmm7,48(%rdi)
+ leaq 64(%rdi),%rdi
+
+ jz .Ldone_avx512
+
+ vextracti32x4 $3,%zmm0,%xmm4
+ vextracti32x4 $3,%zmm1,%xmm5
+ vextracti32x4 $3,%zmm2,%xmm6
+ vextracti32x4 $3,%zmm3,%xmm7
+
+ subq $64,%rdx
+ jb .Ltail_avx512
+
+ vpxor 0(%rsi),%xmm4,%xmm4
+ vpxor 16(%rsi),%xmm5,%xmm5
+ vpxor 32(%rsi),%xmm6,%xmm6
+ vpxor 48(%rsi),%xmm7,%xmm7
+ leaq 64(%rsi),%rsi
+
+ vmovdqu %xmm4,0(%rdi)
+ vmovdqu %xmm5,16(%rdi)
+ vmovdqu %xmm6,32(%rdi)
+ vmovdqu %xmm7,48(%rdi)
+ leaq 64(%rdi),%rdi
+
+ jnz .Loop_outer_avx512
+
+ jmp .Ldone_avx512
+
+.align 16
+.Ltail64_avx512:
+ vmovdqa %xmm0,0(%rsp)
+ vmovdqa %xmm1,16(%rsp)
+ vmovdqa %xmm2,32(%rsp)
+ vmovdqa %xmm3,48(%rsp)
+ addq $64,%rdx
+ jmp .Loop_tail_avx512
+
+.align 16
+.Ltail_avx512:
+ vmovdqa %xmm4,0(%rsp)
+ vmovdqa %xmm5,16(%rsp)
+ vmovdqa %xmm6,32(%rsp)
+ vmovdqa %xmm7,48(%rsp)
+ addq $64,%rdx
+
+.Loop_tail_avx512:
+ movzbl (%rsi,%r8,1),%eax
+ movzbl (%rsp,%r8,1),%ecx
+ leaq 1(%r8),%r8
+ xorl %ecx,%eax
+ movb %al,-1(%rdi,%r8,1)
+ decq %rdx
+ jnz .Loop_tail_avx512
+
+ vmovdqu32 %zmm16,0(%rsp)
+
+.Ldone_avx512:
+ vzeroall
+ leaq (%r9),%rsp
+.cfi_def_cfa_register %rsp
+.Lavx512_epilogue:
+ .byte 0xf3,0xc3
+.cfi_endproc
+.size ChaCha20_avx512,.-ChaCha20_avx512
+.type ChaCha20_avx512vl,@function
+.align 32
+ChaCha20_avx512vl:
+.cfi_startproc
+.LChaCha20_avx512vl:
+ movq %rsp,%r9
+.cfi_def_cfa_register %r9
+ cmpq $128,%rdx
+ ja .LChaCha20_8xvl
+
+ subq $64+8,%rsp
+ vbroadcasti128 .Lsigma(%rip),%ymm0
+ vbroadcasti128 (%rcx),%ymm1
+ vbroadcasti128 16(%rcx),%ymm2
+ vbroadcasti128 (%r8),%ymm3
+
+ vmovdqa32 %ymm0,%ymm16
+ vmovdqa32 %ymm1,%ymm17
+ vmovdqa32 %ymm2,%ymm18
+ vpaddd .Lzeroz(%rip),%ymm3,%ymm3
+ vmovdqa32 .Ltwoy(%rip),%ymm20
+ movq $10,%r8
+ vmovdqa32 %ymm3,%ymm19
+ jmp .Loop_avx512vl
+
+.align 16
+.Loop_outer_avx512vl:
+ vmovdqa32 %ymm18,%ymm2
+ vpaddd %ymm20,%ymm19,%ymm3
+ movq $10,%r8
+ vmovdqa32 %ymm3,%ymm19
+ jmp .Loop_avx512vl
+
+.align 32
+.Loop_avx512vl:
+ vpaddd %ymm1,%ymm0,%ymm0
+ vpxor %ymm0,%ymm3,%ymm3
+ vprold $16,%ymm3,%ymm3
+ vpaddd %ymm3,%ymm2,%ymm2
+ vpxor %ymm2,%ymm1,%ymm1
+ vprold $12,%ymm1,%ymm1
+ vpaddd %ymm1,%ymm0,%ymm0
+ vpxor %ymm0,%ymm3,%ymm3
+ vprold $8,%ymm3,%ymm3
+ vpaddd %ymm3,%ymm2,%ymm2
+ vpxor %ymm2,%ymm1,%ymm1
+ vprold $7,%ymm1,%ymm1
+ vpshufd $78,%ymm2,%ymm2
+ vpshufd $57,%ymm1,%ymm1
+ vpshufd $147,%ymm3,%ymm3
+ vpaddd %ymm1,%ymm0,%ymm0
+ vpxor %ymm0,%ymm3,%ymm3
+ vprold $16,%ymm3,%ymm3
+ vpaddd %ymm3,%ymm2,%ymm2
+ vpxor %ymm2,%ymm1,%ymm1
+ vprold $12,%ymm1,%ymm1
+ vpaddd %ymm1,%ymm0,%ymm0
+ vpxor %ymm0,%ymm3,%ymm3
+ vprold $8,%ymm3,%ymm3
+ vpaddd %ymm3,%ymm2,%ymm2
+ vpxor %ymm2,%ymm1,%ymm1
+ vprold $7,%ymm1,%ymm1
+ vpshufd $78,%ymm2,%ymm2
+ vpshufd $147,%ymm1,%ymm1
+ vpshufd $57,%ymm3,%ymm3
+ decq %r8
+ jnz .Loop_avx512vl
+ vpaddd %ymm16,%ymm0,%ymm0
+ vpaddd %ymm17,%ymm1,%ymm1
+ vpaddd %ymm18,%ymm2,%ymm2
+ vpaddd %ymm19,%ymm3,%ymm3
+
+ subq $64,%rdx
+ jb .Ltail64_avx512vl
+
+ vpxor 0(%rsi),%xmm0,%xmm4
+ vpxor 16(%rsi),%xmm1,%xmm5
+ vpxor 32(%rsi),%xmm2,%xmm6
+ vpxor 48(%rsi),%xmm3,%xmm7
+ leaq 64(%rsi),%rsi
+
+ vmovdqu %xmm4,0(%rdi)
+ vmovdqu %xmm5,16(%rdi)
+ vmovdqu %xmm6,32(%rdi)
+ vmovdqu %xmm7,48(%rdi)
+ leaq 64(%rdi),%rdi
+
+ jz .Ldone_avx512vl
+
+ vextracti128 $1,%ymm0,%xmm4
+ vextracti128 $1,%ymm1,%xmm5
+ vextracti128 $1,%ymm2,%xmm6
+ vextracti128 $1,%ymm3,%xmm7
+
+ subq $64,%rdx
+ jb .Ltail_avx512vl
+
+ vpxor 0(%rsi),%xmm4,%xmm4
+ vpxor 16(%rsi),%xmm5,%xmm5
+ vpxor 32(%rsi),%xmm6,%xmm6
+ vpxor 48(%rsi),%xmm7,%xmm7
+ leaq 64(%rsi),%rsi
+
+ vmovdqu %xmm4,0(%rdi)
+ vmovdqu %xmm5,16(%rdi)
+ vmovdqu %xmm6,32(%rdi)
+ vmovdqu %xmm7,48(%rdi)
+ leaq 64(%rdi),%rdi
+
+ vmovdqa32 %ymm16,%ymm0
+ vmovdqa32 %ymm17,%ymm1
+ jnz .Loop_outer_avx512vl
+
+ jmp .Ldone_avx512vl
+
+.align 16
+.Ltail64_avx512vl:
+ vmovdqa %xmm0,0(%rsp)
+ vmovdqa %xmm1,16(%rsp)
+ vmovdqa %xmm2,32(%rsp)
+ vmovdqa %xmm3,48(%rsp)
+ addq $64,%rdx
+ jmp .Loop_tail_avx512vl
+
+.align 16
+.Ltail_avx512vl:
+ vmovdqa %xmm4,0(%rsp)
+ vmovdqa %xmm5,16(%rsp)
+ vmovdqa %xmm6,32(%rsp)
+ vmovdqa %xmm7,48(%rsp)
+ addq $64,%rdx
+
+.Loop_tail_avx512vl:
+ movzbl (%rsi,%r8,1),%eax
+ movzbl (%rsp,%r8,1),%ecx
+ leaq 1(%r8),%r8
+ xorl %ecx,%eax
+ movb %al,-1(%rdi,%r8,1)
+ decq %rdx
+ jnz .Loop_tail_avx512vl
+
+ vmovdqu32 %ymm16,0(%rsp)
+ vmovdqu32 %ymm16,32(%rsp)
+
+.Ldone_avx512vl:
+ vzeroall
+ leaq (%r9),%rsp
+.cfi_def_cfa_register %rsp
+.Lavx512vl_epilogue:
+ .byte 0xf3,0xc3
+.cfi_endproc
+.size ChaCha20_avx512vl,.-ChaCha20_avx512vl
+.type ChaCha20_16x,@function
+.align 32
+ChaCha20_16x:
+.cfi_startproc
+.LChaCha20_16x:
+ movq %rsp,%r9
+.cfi_def_cfa_register %r9
+ subq $64+8,%rsp
+ andq $-64,%rsp
+ vzeroupper
+
+ leaq .Lsigma(%rip),%r10
+ vbroadcasti32x4 (%r10),%zmm3
+ vbroadcasti32x4 (%rcx),%zmm7
+ vbroadcasti32x4 16(%rcx),%zmm11
+ vbroadcasti32x4 (%r8),%zmm15
+
+ vpshufd $0x00,%zmm3,%zmm0
+ vpshufd $0x55,%zmm3,%zmm1
+ vpshufd $0xaa,%zmm3,%zmm2
+ vpshufd $0xff,%zmm3,%zmm3
+ vmovdqa64 %zmm0,%zmm16
+ vmovdqa64 %zmm1,%zmm17
+ vmovdqa64 %zmm2,%zmm18
+ vmovdqa64 %zmm3,%zmm19
+
+ vpshufd $0x00,%zmm7,%zmm4
+ vpshufd $0x55,%zmm7,%zmm5
+ vpshufd $0xaa,%zmm7,%zmm6
+ vpshufd $0xff,%zmm7,%zmm7
+ vmovdqa64 %zmm4,%zmm20
+ vmovdqa64 %zmm5,%zmm21
+ vmovdqa64 %zmm6,%zmm22
+ vmovdqa64 %zmm7,%zmm23
+
+ vpshufd $0x00,%zmm11,%zmm8
+ vpshufd $0x55,%zmm11,%zmm9
+ vpshufd $0xaa,%zmm11,%zmm10
+ vpshufd $0xff,%zmm11,%zmm11
+ vmovdqa64 %zmm8,%zmm24
+ vmovdqa64 %zmm9,%zmm25
+ vmovdqa64 %zmm10,%zmm26
+ vmovdqa64 %zmm11,%zmm27
+
+ vpshufd $0x00,%zmm15,%zmm12
+ vpshufd $0x55,%zmm15,%zmm13
+ vpshufd $0xaa,%zmm15,%zmm14
+ vpshufd $0xff,%zmm15,%zmm15
+ vpaddd .Lincz(%rip),%zmm12,%zmm12
+ vmovdqa64 %zmm12,%zmm28
+ vmovdqa64 %zmm13,%zmm29
+ vmovdqa64 %zmm14,%zmm30
+ vmovdqa64 %zmm15,%zmm31
+
+ movl $10,%eax
+ jmp .Loop16x
+
+.align 32
+.Loop_outer16x:
+ vpbroadcastd 0(%r10),%zmm0
+ vpbroadcastd 4(%r10),%zmm1
+ vpbroadcastd 8(%r10),%zmm2
+ vpbroadcastd 12(%r10),%zmm3
+ vpaddd .Lsixteen(%rip),%zmm28,%zmm28
+ vmovdqa64 %zmm20,%zmm4
+ vmovdqa64 %zmm21,%zmm5
+ vmovdqa64 %zmm22,%zmm6
+ vmovdqa64 %zmm23,%zmm7
+ vmovdqa64 %zmm24,%zmm8
+ vmovdqa64 %zmm25,%zmm9
+ vmovdqa64 %zmm26,%zmm10
+ vmovdqa64 %zmm27,%zmm11
+ vmovdqa64 %zmm28,%zmm12
+ vmovdqa64 %zmm29,%zmm13
+ vmovdqa64 %zmm30,%zmm14
+ vmovdqa64 %zmm31,%zmm15
+
+ vmovdqa64 %zmm0,%zmm16
+ vmovdqa64 %zmm1,%zmm17
+ vmovdqa64 %zmm2,%zmm18
+ vmovdqa64 %zmm3,%zmm19
+
+ movl $10,%eax
+ jmp .Loop16x
+
+.align 32
+.Loop16x:
+ vpaddd %zmm4,%zmm0,%zmm0
+ vpaddd %zmm5,%zmm1,%zmm1
+ vpaddd %zmm6,%zmm2,%zmm2
+ vpaddd %zmm7,%zmm3,%zmm3
+ vpxord %zmm0,%zmm12,%zmm12
+ vpxord %zmm1,%zmm13,%zmm13
+ vpxord %zmm2,%zmm14,%zmm14
+ vpxord %zmm3,%zmm15,%zmm15
+ vprold $16,%zmm12,%zmm12
+ vprold $16,%zmm13,%zmm13
+ vprold $16,%zmm14,%zmm14
+ vprold $16,%zmm15,%zmm15
+ vpaddd %zmm12,%zmm8,%zmm8
+ vpaddd %zmm13,%zmm9,%zmm9
+ vpaddd %zmm14,%zmm10,%zmm10
+ vpaddd %zmm15,%zmm11,%zmm11
+ vpxord %zmm8,%zmm4,%zmm4
+ vpxord %zmm9,%zmm5,%zmm5
+ vpxord %zmm10,%zmm6,%zmm6
+ vpxord %zmm11,%zmm7,%zmm7
+ vprold $12,%zmm4,%zmm4
+ vprold $12,%zmm5,%zmm5
+ vprold $12,%zmm6,%zmm6
+ vprold $12,%zmm7,%zmm7
+ vpaddd %zmm4,%zmm0,%zmm0
+ vpaddd %zmm5,%zmm1,%zmm1
+ vpaddd %zmm6,%zmm2,%zmm2
+ vpaddd %zmm7,%zmm3,%zmm3
+ vpxord %zmm0,%zmm12,%zmm12
+ vpxord %zmm1,%zmm13,%zmm13
+ vpxord %zmm2,%zmm14,%zmm14
+ vpxord %zmm3,%zmm15,%zmm15
+ vprold $8,%zmm12,%zmm12
+ vprold $8,%zmm13,%zmm13
+ vprold $8,%zmm14,%zmm14
+ vprold $8,%zmm15,%zmm15
+ vpaddd %zmm12,%zmm8,%zmm8
+ vpaddd %zmm13,%zmm9,%zmm9
+ vpaddd %zmm14,%zmm10,%zmm10
+ vpaddd %zmm15,%zmm11,%zmm11
+ vpxord %zmm8,%zmm4,%zmm4
+ vpxord %zmm9,%zmm5,%zmm5
+ vpxord %zmm10,%zmm6,%zmm6
+ vpxord %zmm11,%zmm7,%zmm7
+ vprold $7,%zmm4,%zmm4
+ vprold $7,%zmm5,%zmm5
+ vprold $7,%zmm6,%zmm6
+ vprold $7,%zmm7,%zmm7
+ vpaddd %zmm5,%zmm0,%zmm0
+ vpaddd %zmm6,%zmm1,%zmm1
+ vpaddd %zmm7,%zmm2,%zmm2
+ vpaddd %zmm4,%zmm3,%zmm3
+ vpxord %zmm0,%zmm15,%zmm15
+ vpxord %zmm1,%zmm12,%zmm12
+ vpxord %zmm2,%zmm13,%zmm13
+ vpxord %zmm3,%zmm14,%zmm14
+ vprold $16,%zmm15,%zmm15
+ vprold $16,%zmm12,%zmm12
+ vprold $16,%zmm13,%zmm13
+ vprold $16,%zmm14,%zmm14
+ vpaddd %zmm15,%zmm10,%zmm10
+ vpaddd %zmm12,%zmm11,%zmm11
+ vpaddd %zmm13,%zmm8,%zmm8
+ vpaddd %zmm14,%zmm9,%zmm9
+ vpxord %zmm10,%zmm5,%zmm5
+ vpxord %zmm11,%zmm6,%zmm6
+ vpxord %zmm8,%zmm7,%zmm7
+ vpxord %zmm9,%zmm4,%zmm4
+ vprold $12,%zmm5,%zmm5
+ vprold $12,%zmm6,%zmm6
+ vprold $12,%zmm7,%zmm7
+ vprold $12,%zmm4,%zmm4
+ vpaddd %zmm5,%zmm0,%zmm0
+ vpaddd %zmm6,%zmm1,%zmm1
+ vpaddd %zmm7,%zmm2,%zmm2
+ vpaddd %zmm4,%zmm3,%zmm3
+ vpxord %zmm0,%zmm15,%zmm15
+ vpxord %zmm1,%zmm12,%zmm12
+ vpxord %zmm2,%zmm13,%zmm13
+ vpxord %zmm3,%zmm14,%zmm14
+ vprold $8,%zmm15,%zmm15
+ vprold $8,%zmm12,%zmm12
+ vprold $8,%zmm13,%zmm13
+ vprold $8,%zmm14,%zmm14
+ vpaddd %zmm15,%zmm10,%zmm10
+ vpaddd %zmm12,%zmm11,%zmm11
+ vpaddd %zmm13,%zmm8,%zmm8
+ vpaddd %zmm14,%zmm9,%zmm9
+ vpxord %zmm10,%zmm5,%zmm5
+ vpxord %zmm11,%zmm6,%zmm6
+ vpxord %zmm8,%zmm7,%zmm7
+ vpxord %zmm9,%zmm4,%zmm4
+ vprold $7,%zmm5,%zmm5
+ vprold $7,%zmm6,%zmm6
+ vprold $7,%zmm7,%zmm7
+ vprold $7,%zmm4,%zmm4
+ decl %eax
+ jnz .Loop16x
+
+ vpaddd %zmm16,%zmm0,%zmm0
+ vpaddd %zmm17,%zmm1,%zmm1
+ vpaddd %zmm18,%zmm2,%zmm2
+ vpaddd %zmm19,%zmm3,%zmm3
+
+ vpunpckldq %zmm1,%zmm0,%zmm18
+ vpunpckldq %zmm3,%zmm2,%zmm19
+ vpunpckhdq %zmm1,%zmm0,%zmm0
+ vpunpckhdq %zmm3,%zmm2,%zmm2
+ vpunpcklqdq %zmm19,%zmm18,%zmm1
+ vpunpckhqdq %zmm19,%zmm18,%zmm18
+ vpunpcklqdq %zmm2,%zmm0,%zmm3
+ vpunpckhqdq %zmm2,%zmm0,%zmm0
+ vpaddd %zmm20,%zmm4,%zmm4
+ vpaddd %zmm21,%zmm5,%zmm5
+ vpaddd %zmm22,%zmm6,%zmm6
+ vpaddd %zmm23,%zmm7,%zmm7
+
+ vpunpckldq %zmm5,%zmm4,%zmm2
+ vpunpckldq %zmm7,%zmm6,%zmm19
+ vpunpckhdq %zmm5,%zmm4,%zmm4
+ vpunpckhdq %zmm7,%zmm6,%zmm6
+ vpunpcklqdq %zmm19,%zmm2,%zmm5
+ vpunpckhqdq %zmm19,%zmm2,%zmm2
+ vpunpcklqdq %zmm6,%zmm4,%zmm7
+ vpunpckhqdq %zmm6,%zmm4,%zmm4
+ vshufi32x4 $0x44,%zmm5,%zmm1,%zmm19
+ vshufi32x4 $0xee,%zmm5,%zmm1,%zmm5
+ vshufi32x4 $0x44,%zmm2,%zmm18,%zmm1
+ vshufi32x4 $0xee,%zmm2,%zmm18,%zmm2
+ vshufi32x4 $0x44,%zmm7,%zmm3,%zmm18
+ vshufi32x4 $0xee,%zmm7,%zmm3,%zmm7
+ vshufi32x4 $0x44,%zmm4,%zmm0,%zmm3
+ vshufi32x4 $0xee,%zmm4,%zmm0,%zmm4
+ vpaddd %zmm24,%zmm8,%zmm8
+ vpaddd %zmm25,%zmm9,%zmm9
+ vpaddd %zmm26,%zmm10,%zmm10
+ vpaddd %zmm27,%zmm11,%zmm11
+
+ vpunpckldq %zmm9,%zmm8,%zmm6
+ vpunpckldq %zmm11,%zmm10,%zmm0
+ vpunpckhdq %zmm9,%zmm8,%zmm8
+ vpunpckhdq %zmm11,%zmm10,%zmm10
+ vpunpcklqdq %zmm0,%zmm6,%zmm9
+ vpunpckhqdq %zmm0,%zmm6,%zmm6
+ vpunpcklqdq %zmm10,%zmm8,%zmm11
+ vpunpckhqdq %zmm10,%zmm8,%zmm8
+ vpaddd %zmm28,%zmm12,%zmm12
+ vpaddd %zmm29,%zmm13,%zmm13
+ vpaddd %zmm30,%zmm14,%zmm14
+ vpaddd %zmm31,%zmm15,%zmm15
+
+ vpunpckldq %zmm13,%zmm12,%zmm10
+ vpunpckldq %zmm15,%zmm14,%zmm0
+ vpunpckhdq %zmm13,%zmm12,%zmm12
+ vpunpckhdq %zmm15,%zmm14,%zmm14
+ vpunpcklqdq %zmm0,%zmm10,%zmm13
+ vpunpckhqdq %zmm0,%zmm10,%zmm10
+ vpunpcklqdq %zmm14,%zmm12,%zmm15
+ vpunpckhqdq %zmm14,%zmm12,%zmm12
+ vshufi32x4 $0x44,%zmm13,%zmm9,%zmm0
+ vshufi32x4 $0xee,%zmm13,%zmm9,%zmm13
+ vshufi32x4 $0x44,%zmm10,%zmm6,%zmm9
+ vshufi32x4 $0xee,%zmm10,%zmm6,%zmm10
+ vshufi32x4 $0x44,%zmm15,%zmm11,%zmm6
+ vshufi32x4 $0xee,%zmm15,%zmm11,%zmm15
+ vshufi32x4 $0x44,%zmm12,%zmm8,%zmm11
+ vshufi32x4 $0xee,%zmm12,%zmm8,%zmm12
+ vshufi32x4 $0x88,%zmm0,%zmm19,%zmm16
+ vshufi32x4 $0xdd,%zmm0,%zmm19,%zmm19
+ vshufi32x4 $0x88,%zmm13,%zmm5,%zmm0
+ vshufi32x4 $0xdd,%zmm13,%zmm5,%zmm13
+ vshufi32x4 $0x88,%zmm9,%zmm1,%zmm17
+ vshufi32x4 $0xdd,%zmm9,%zmm1,%zmm1
+ vshufi32x4 $0x88,%zmm10,%zmm2,%zmm9
+ vshufi32x4 $0xdd,%zmm10,%zmm2,%zmm10
+ vshufi32x4 $0x88,%zmm6,%zmm18,%zmm14
+ vshufi32x4 $0xdd,%zmm6,%zmm18,%zmm18
+ vshufi32x4 $0x88,%zmm15,%zmm7,%zmm6
+ vshufi32x4 $0xdd,%zmm15,%zmm7,%zmm15
+ vshufi32x4 $0x88,%zmm11,%zmm3,%zmm8
+ vshufi32x4 $0xdd,%zmm11,%zmm3,%zmm3
+ vshufi32x4 $0x88,%zmm12,%zmm4,%zmm11
+ vshufi32x4 $0xdd,%zmm12,%zmm4,%zmm12
+ cmpq $1024,%rdx
+ jb .Ltail16x
+
+ vpxord 0(%rsi),%zmm16,%zmm16
+ vpxord 64(%rsi),%zmm17,%zmm17
+ vpxord 128(%rsi),%zmm14,%zmm14
+ vpxord 192(%rsi),%zmm8,%zmm8
+ vmovdqu32 %zmm16,0(%rdi)
+ vmovdqu32 %zmm17,64(%rdi)
+ vmovdqu32 %zmm14,128(%rdi)
+ vmovdqu32 %zmm8,192(%rdi)
+
+ vpxord 256(%rsi),%zmm19,%zmm19
+ vpxord 320(%rsi),%zmm1,%zmm1
+ vpxord 384(%rsi),%zmm18,%zmm18
+ vpxord 448(%rsi),%zmm3,%zmm3
+ vmovdqu32 %zmm19,256(%rdi)
+ vmovdqu32 %zmm1,320(%rdi)
+ vmovdqu32 %zmm18,384(%rdi)
+ vmovdqu32 %zmm3,448(%rdi)
+
+ vpxord 512(%rsi),%zmm0,%zmm0
+ vpxord 576(%rsi),%zmm9,%zmm9
+ vpxord 640(%rsi),%zmm6,%zmm6
+ vpxord 704(%rsi),%zmm11,%zmm11
+ vmovdqu32 %zmm0,512(%rdi)
+ vmovdqu32 %zmm9,576(%rdi)
+ vmovdqu32 %zmm6,640(%rdi)
+ vmovdqu32 %zmm11,704(%rdi)
+
+ vpxord 768(%rsi),%zmm13,%zmm13
+ vpxord 832(%rsi),%zmm10,%zmm10
+ vpxord 896(%rsi),%zmm15,%zmm15
+ vpxord 960(%rsi),%zmm12,%zmm12
+ leaq 1024(%rsi),%rsi
+ vmovdqu32 %zmm13,768(%rdi)
+ vmovdqu32 %zmm10,832(%rdi)
+ vmovdqu32 %zmm15,896(%rdi)
+ vmovdqu32 %zmm12,960(%rdi)
+ leaq 1024(%rdi),%rdi
+
+ subq $1024,%rdx
+ jnz .Loop_outer16x
+
+ jmp .Ldone16x
+
+.align 32
+.Ltail16x:
+ xorq %r10,%r10
+ subq %rsi,%rdi
+ cmpq $64,%rdx
+ jb .Less_than_64_16x
+ vpxord (%rsi),%zmm16,%zmm16
+ vmovdqu32 %zmm16,(%rdi,%rsi,1)
+ je .Ldone16x
+ vmovdqa32 %zmm17,%zmm16
+ leaq 64(%rsi),%rsi
+
+ cmpq $128,%rdx
+ jb .Less_than_64_16x
+ vpxord (%rsi),%zmm17,%zmm17
+ vmovdqu32 %zmm17,(%rdi,%rsi,1)
+ je .Ldone16x
+ vmovdqa32 %zmm14,%zmm16
+ leaq 64(%rsi),%rsi
+
+ cmpq $192,%rdx
+ jb .Less_than_64_16x
+ vpxord (%rsi),%zmm14,%zmm14
+ vmovdqu32 %zmm14,(%rdi,%rsi,1)
+ je .Ldone16x
+ vmovdqa32 %zmm8,%zmm16
+ leaq 64(%rsi),%rsi
+
+ cmpq $256,%rdx
+ jb .Less_than_64_16x
+ vpxord (%rsi),%zmm8,%zmm8
+ vmovdqu32 %zmm8,(%rdi,%rsi,1)
+ je .Ldone16x
+ vmovdqa32 %zmm19,%zmm16
+ leaq 64(%rsi),%rsi
+
+ cmpq $320,%rdx
+ jb .Less_than_64_16x
+ vpxord (%rsi),%zmm19,%zmm19
+ vmovdqu32 %zmm19,(%rdi,%rsi,1)
+ je .Ldone16x
+ vmovdqa32 %zmm1,%zmm16
+ leaq 64(%rsi),%rsi
+
+ cmpq $384,%rdx
+ jb .Less_than_64_16x
+ vpxord (%rsi),%zmm1,%zmm1
+ vmovdqu32 %zmm1,(%rdi,%rsi,1)
+ je .Ldone16x
+ vmovdqa32 %zmm18,%zmm16
+ leaq 64(%rsi),%rsi
+
+ cmpq $448,%rdx
+ jb .Less_than_64_16x
+ vpxord (%rsi),%zmm18,%zmm18
+ vmovdqu32 %zmm18,(%rdi,%rsi,1)
+ je .Ldone16x
+ vmovdqa32 %zmm3,%zmm16
+ leaq 64(%rsi),%rsi
+
+ cmpq $512,%rdx
+ jb .Less_than_64_16x
+ vpxord (%rsi),%zmm3,%zmm3
+ vmovdqu32 %zmm3,(%rdi,%rsi,1)
+ je .Ldone16x
+ vmovdqa32 %zmm0,%zmm16
+ leaq 64(%rsi),%rsi
+
+ cmpq $576,%rdx
+ jb .Less_than_64_16x
+ vpxord (%rsi),%zmm0,%zmm0
+ vmovdqu32 %zmm0,(%rdi,%rsi,1)
+ je .Ldone16x
+ vmovdqa32 %zmm9,%zmm16
+ leaq 64(%rsi),%rsi
+
+ cmpq $640,%rdx
+ jb .Less_than_64_16x
+ vpxord (%rsi),%zmm9,%zmm9
+ vmovdqu32 %zmm9,(%rdi,%rsi,1)
+ je .Ldone16x
+ vmovdqa32 %zmm6,%zmm16
+ leaq 64(%rsi),%rsi
+
+ cmpq $704,%rdx
+ jb .Less_than_64_16x
+ vpxord (%rsi),%zmm6,%zmm6
+ vmovdqu32 %zmm6,(%rdi,%rsi,1)
+ je .Ldone16x
+ vmovdqa32 %zmm11,%zmm16
+ leaq 64(%rsi),%rsi
+
+ cmpq $768,%rdx
+ jb .Less_than_64_16x
+ vpxord (%rsi),%zmm11,%zmm11
+ vmovdqu32 %zmm11,(%rdi,%rsi,1)
+ je .Ldone16x
+ vmovdqa32 %zmm13,%zmm16
+ leaq 64(%rsi),%rsi
+
+ cmpq $832,%rdx
+ jb .Less_than_64_16x
+ vpxord (%rsi),%zmm13,%zmm13
+ vmovdqu32 %zmm13,(%rdi,%rsi,1)
+ je .Ldone16x
+ vmovdqa32 %zmm10,%zmm16
+ leaq 64(%rsi),%rsi
+
+ cmpq $896,%rdx
+ jb .Less_than_64_16x
+ vpxord (%rsi),%zmm10,%zmm10
+ vmovdqu32 %zmm10,(%rdi,%rsi,1)
+ je .Ldone16x
+ vmovdqa32 %zmm15,%zmm16
+ leaq 64(%rsi),%rsi
+
+ cmpq $960,%rdx
+ jb .Less_than_64_16x
+ vpxord (%rsi),%zmm15,%zmm15
+ vmovdqu32 %zmm15,(%rdi,%rsi,1)
+ je .Ldone16x
+ vmovdqa32 %zmm12,%zmm16
+ leaq 64(%rsi),%rsi
+
+.Less_than_64_16x:
+ vmovdqa32 %zmm16,0(%rsp)
+ leaq (%rdi,%rsi,1),%rdi
+ andq $63,%rdx
+
+.Loop_tail16x:
+ movzbl (%rsi,%r10,1),%eax
+ movzbl (%rsp,%r10,1),%ecx
+ leaq 1(%r10),%r10
+ xorl %ecx,%eax
+ movb %al,-1(%rdi,%r10,1)
+ decq %rdx
+ jnz .Loop_tail16x
+
+ vpxord %zmm16,%zmm16,%zmm16
+ vmovdqa32 %zmm16,0(%rsp)
+
+.Ldone16x:
+ vzeroall
+ leaq (%r9),%rsp
+.cfi_def_cfa_register %rsp
+.L16x_epilogue:
+ .byte 0xf3,0xc3
+.cfi_endproc
+.size ChaCha20_16x,.-ChaCha20_16x
+.type ChaCha20_8xvl,@function
+.align 32
+ChaCha20_8xvl:
+.cfi_startproc
+.LChaCha20_8xvl:
+ movq %rsp,%r9
+.cfi_def_cfa_register %r9
+ subq $64+8,%rsp
+ andq $-64,%rsp
+ vzeroupper
+
+ leaq .Lsigma(%rip),%r10
+ vbroadcasti128 (%r10),%ymm3
+ vbroadcasti128 (%rcx),%ymm7
+ vbroadcasti128 16(%rcx),%ymm11
+ vbroadcasti128 (%r8),%ymm15
+
+ vpshufd $0x00,%ymm3,%ymm0
+ vpshufd $0x55,%ymm3,%ymm1
+ vpshufd $0xaa,%ymm3,%ymm2
+ vpshufd $0xff,%ymm3,%ymm3
+ vmovdqa64 %ymm0,%ymm16
+ vmovdqa64 %ymm1,%ymm17
+ vmovdqa64 %ymm2,%ymm18
+ vmovdqa64 %ymm3,%ymm19
+
+ vpshufd $0x00,%ymm7,%ymm4
+ vpshufd $0x55,%ymm7,%ymm5
+ vpshufd $0xaa,%ymm7,%ymm6
+ vpshufd $0xff,%ymm7,%ymm7
+ vmovdqa64 %ymm4,%ymm20
+ vmovdqa64 %ymm5,%ymm21
+ vmovdqa64 %ymm6,%ymm22
+ vmovdqa64 %ymm7,%ymm23
+
+ vpshufd $0x00,%ymm11,%ymm8
+ vpshufd $0x55,%ymm11,%ymm9
+ vpshufd $0xaa,%ymm11,%ymm10
+ vpshufd $0xff,%ymm11,%ymm11
+ vmovdqa64 %ymm8,%ymm24
+ vmovdqa64 %ymm9,%ymm25
+ vmovdqa64 %ymm10,%ymm26
+ vmovdqa64 %ymm11,%ymm27
+
+ vpshufd $0x00,%ymm15,%ymm12
+ vpshufd $0x55,%ymm15,%ymm13
+ vpshufd $0xaa,%ymm15,%ymm14
+ vpshufd $0xff,%ymm15,%ymm15
+ vpaddd .Lincy(%rip),%ymm12,%ymm12
+ vmovdqa64 %ymm12,%ymm28
+ vmovdqa64 %ymm13,%ymm29
+ vmovdqa64 %ymm14,%ymm30
+ vmovdqa64 %ymm15,%ymm31
+
+ movl $10,%eax
+ jmp .Loop8xvl
+
+.align 32
+.Loop_outer8xvl:
+
+
+ vpbroadcastd 8(%r10),%ymm2
+ vpbroadcastd 12(%r10),%ymm3
+ vpaddd .Leight(%rip),%ymm28,%ymm28
+ vmovdqa64 %ymm20,%ymm4
+ vmovdqa64 %ymm21,%ymm5
+ vmovdqa64 %ymm22,%ymm6
+ vmovdqa64 %ymm23,%ymm7
+ vmovdqa64 %ymm24,%ymm8
+ vmovdqa64 %ymm25,%ymm9
+ vmovdqa64 %ymm26,%ymm10
+ vmovdqa64 %ymm27,%ymm11
+ vmovdqa64 %ymm28,%ymm12
+ vmovdqa64 %ymm29,%ymm13
+ vmovdqa64 %ymm30,%ymm14
+ vmovdqa64 %ymm31,%ymm15
+
+ vmovdqa64 %ymm0,%ymm16
+ vmovdqa64 %ymm1,%ymm17
+ vmovdqa64 %ymm2,%ymm18
+ vmovdqa64 %ymm3,%ymm19
+
+ movl $10,%eax
+ jmp .Loop8xvl
+
+.align 32
+.Loop8xvl:
+ vpaddd %ymm4,%ymm0,%ymm0
+ vpaddd %ymm5,%ymm1,%ymm1
+ vpaddd %ymm6,%ymm2,%ymm2
+ vpaddd %ymm7,%ymm3,%ymm3
+ vpxor %ymm0,%ymm12,%ymm12
+ vpxor %ymm1,%ymm13,%ymm13
+ vpxor %ymm2,%ymm14,%ymm14
+ vpxor %ymm3,%ymm15,%ymm15
+ vprold $16,%ymm12,%ymm12
+ vprold $16,%ymm13,%ymm13
+ vprold $16,%ymm14,%ymm14
+ vprold $16,%ymm15,%ymm15
+ vpaddd %ymm12,%ymm8,%ymm8
+ vpaddd %ymm13,%ymm9,%ymm9
+ vpaddd %ymm14,%ymm10,%ymm10
+ vpaddd %ymm15,%ymm11,%ymm11
+ vpxor %ymm8,%ymm4,%ymm4
+ vpxor %ymm9,%ymm5,%ymm5
+ vpxor %ymm10,%ymm6,%ymm6
+ vpxor %ymm11,%ymm7,%ymm7
+ vprold $12,%ymm4,%ymm4
+ vprold $12,%ymm5,%ymm5
+ vprold $12,%ymm6,%ymm6
+ vprold $12,%ymm7,%ymm7
+ vpaddd %ymm4,%ymm0,%ymm0
+ vpaddd %ymm5,%ymm1,%ymm1
+ vpaddd %ymm6,%ymm2,%ymm2
+ vpaddd %ymm7,%ymm3,%ymm3
+ vpxor %ymm0,%ymm12,%ymm12
+ vpxor %ymm1,%ymm13,%ymm13
+ vpxor %ymm2,%ymm14,%ymm14
+ vpxor %ymm3,%ymm15,%ymm15
+ vprold $8,%ymm12,%ymm12
+ vprold $8,%ymm13,%ymm13
+ vprold $8,%ymm14,%ymm14
+ vprold $8,%ymm15,%ymm15
+ vpaddd %ymm12,%ymm8,%ymm8
+ vpaddd %ymm13,%ymm9,%ymm9
+ vpaddd %ymm14,%ymm10,%ymm10
+ vpaddd %ymm15,%ymm11,%ymm11
+ vpxor %ymm8,%ymm4,%ymm4
+ vpxor %ymm9,%ymm5,%ymm5
+ vpxor %ymm10,%ymm6,%ymm6
+ vpxor %ymm11,%ymm7,%ymm7
+ vprold $7,%ymm4,%ymm4
+ vprold $7,%ymm5,%ymm5
+ vprold $7,%ymm6,%ymm6
+ vprold $7,%ymm7,%ymm7
+ vpaddd %ymm5,%ymm0,%ymm0
+ vpaddd %ymm6,%ymm1,%ymm1
+ vpaddd %ymm7,%ymm2,%ymm2
+ vpaddd %ymm4,%ymm3,%ymm3
+ vpxor %ymm0,%ymm15,%ymm15
+ vpxor %ymm1,%ymm12,%ymm12
+ vpxor %ymm2,%ymm13,%ymm13
+ vpxor %ymm3,%ymm14,%ymm14
+ vprold $16,%ymm15,%ymm15
+ vprold $16,%ymm12,%ymm12
+ vprold $16,%ymm13,%ymm13
+ vprold $16,%ymm14,%ymm14
+ vpaddd %ymm15,%ymm10,%ymm10
+ vpaddd %ymm12,%ymm11,%ymm11
+ vpaddd %ymm13,%ymm8,%ymm8
+ vpaddd %ymm14,%ymm9,%ymm9
+ vpxor %ymm10,%ymm5,%ymm5
+ vpxor %ymm11,%ymm6,%ymm6
+ vpxor %ymm8,%ymm7,%ymm7
+ vpxor %ymm9,%ymm4,%ymm4
+ vprold $12,%ymm5,%ymm5
+ vprold $12,%ymm6,%ymm6
+ vprold $12,%ymm7,%ymm7
+ vprold $12,%ymm4,%ymm4
+ vpaddd %ymm5,%ymm0,%ymm0
+ vpaddd %ymm6,%ymm1,%ymm1
+ vpaddd %ymm7,%ymm2,%ymm2
+ vpaddd %ymm4,%ymm3,%ymm3
+ vpxor %ymm0,%ymm15,%ymm15
+ vpxor %ymm1,%ymm12,%ymm12
+ vpxor %ymm2,%ymm13,%ymm13
+ vpxor %ymm3,%ymm14,%ymm14
+ vprold $8,%ymm15,%ymm15
+ vprold $8,%ymm12,%ymm12
+ vprold $8,%ymm13,%ymm13
+ vprold $8,%ymm14,%ymm14
+ vpaddd %ymm15,%ymm10,%ymm10
+ vpaddd %ymm12,%ymm11,%ymm11
+ vpaddd %ymm13,%ymm8,%ymm8
+ vpaddd %ymm14,%ymm9,%ymm9
+ vpxor %ymm10,%ymm5,%ymm5
+ vpxor %ymm11,%ymm6,%ymm6
+ vpxor %ymm8,%ymm7,%ymm7
+ vpxor %ymm9,%ymm4,%ymm4
+ vprold $7,%ymm5,%ymm5
+ vprold $7,%ymm6,%ymm6
+ vprold $7,%ymm7,%ymm7
+ vprold $7,%ymm4,%ymm4
+ decl %eax
+ jnz .Loop8xvl
+
+ vpaddd %ymm16,%ymm0,%ymm0
+ vpaddd %ymm17,%ymm1,%ymm1
+ vpaddd %ymm18,%ymm2,%ymm2
+ vpaddd %ymm19,%ymm3,%ymm3
+
+ vpunpckldq %ymm1,%ymm0,%ymm18
+ vpunpckldq %ymm3,%ymm2,%ymm19
+ vpunpckhdq %ymm1,%ymm0,%ymm0
+ vpunpckhdq %ymm3,%ymm2,%ymm2
+ vpunpcklqdq %ymm19,%ymm18,%ymm1
+ vpunpckhqdq %ymm19,%ymm18,%ymm18
+ vpunpcklqdq %ymm2,%ymm0,%ymm3
+ vpunpckhqdq %ymm2,%ymm0,%ymm0
+ vpaddd %ymm20,%ymm4,%ymm4
+ vpaddd %ymm21,%ymm5,%ymm5
+ vpaddd %ymm22,%ymm6,%ymm6
+ vpaddd %ymm23,%ymm7,%ymm7
+
+ vpunpckldq %ymm5,%ymm4,%ymm2
+ vpunpckldq %ymm7,%ymm6,%ymm19
+ vpunpckhdq %ymm5,%ymm4,%ymm4
+ vpunpckhdq %ymm7,%ymm6,%ymm6
+ vpunpcklqdq %ymm19,%ymm2,%ymm5
+ vpunpckhqdq %ymm19,%ymm2,%ymm2
+ vpunpcklqdq %ymm6,%ymm4,%ymm7
+ vpunpckhqdq %ymm6,%ymm4,%ymm4
+ vshufi32x4 $0,%ymm5,%ymm1,%ymm19
+ vshufi32x4 $3,%ymm5,%ymm1,%ymm5
+ vshufi32x4 $0,%ymm2,%ymm18,%ymm1
+ vshufi32x4 $3,%ymm2,%ymm18,%ymm2
+ vshufi32x4 $0,%ymm7,%ymm3,%ymm18
+ vshufi32x4 $3,%ymm7,%ymm3,%ymm7
+ vshufi32x4 $0,%ymm4,%ymm0,%ymm3
+ vshufi32x4 $3,%ymm4,%ymm0,%ymm4
+ vpaddd %ymm24,%ymm8,%ymm8
+ vpaddd %ymm25,%ymm9,%ymm9
+ vpaddd %ymm26,%ymm10,%ymm10
+ vpaddd %ymm27,%ymm11,%ymm11
+
+ vpunpckldq %ymm9,%ymm8,%ymm6
+ vpunpckldq %ymm11,%ymm10,%ymm0
+ vpunpckhdq %ymm9,%ymm8,%ymm8
+ vpunpckhdq %ymm11,%ymm10,%ymm10
+ vpunpcklqdq %ymm0,%ymm6,%ymm9
+ vpunpckhqdq %ymm0,%ymm6,%ymm6
+ vpunpcklqdq %ymm10,%ymm8,%ymm11
+ vpunpckhqdq %ymm10,%ymm8,%ymm8
+ vpaddd %ymm28,%ymm12,%ymm12
+ vpaddd %ymm29,%ymm13,%ymm13
+ vpaddd %ymm30,%ymm14,%ymm14
+ vpaddd %ymm31,%ymm15,%ymm15
+
+ vpunpckldq %ymm13,%ymm12,%ymm10
+ vpunpckldq %ymm15,%ymm14,%ymm0
+ vpunpckhdq %ymm13,%ymm12,%ymm12
+ vpunpckhdq %ymm15,%ymm14,%ymm14
+ vpunpcklqdq %ymm0,%ymm10,%ymm13
+ vpunpckhqdq %ymm0,%ymm10,%ymm10
+ vpunpcklqdq %ymm14,%ymm12,%ymm15
+ vpunpckhqdq %ymm14,%ymm12,%ymm12
+ vperm2i128 $0x20,%ymm13,%ymm9,%ymm0
+ vperm2i128 $0x31,%ymm13,%ymm9,%ymm13
+ vperm2i128 $0x20,%ymm10,%ymm6,%ymm9
+ vperm2i128 $0x31,%ymm10,%ymm6,%ymm10
+ vperm2i128 $0x20,%ymm15,%ymm11,%ymm6
+ vperm2i128 $0x31,%ymm15,%ymm11,%ymm15
+ vperm2i128 $0x20,%ymm12,%ymm8,%ymm11
+ vperm2i128 $0x31,%ymm12,%ymm8,%ymm12
+ cmpq $512,%rdx
+ jb .Ltail8xvl
+
+ movl $0x80,%eax
+ vpxord 0(%rsi),%ymm19,%ymm19
+ vpxor 32(%rsi),%ymm0,%ymm0
+ vpxor 64(%rsi),%ymm5,%ymm5
+ vpxor 96(%rsi),%ymm13,%ymm13
+ leaq (%rsi,%rax,1),%rsi
+ vmovdqu32 %ymm19,0(%rdi)
+ vmovdqu %ymm0,32(%rdi)
+ vmovdqu %ymm5,64(%rdi)
+ vmovdqu %ymm13,96(%rdi)
+ leaq (%rdi,%rax,1),%rdi
+
+ vpxor 0(%rsi),%ymm1,%ymm1
+ vpxor 32(%rsi),%ymm9,%ymm9
+ vpxor 64(%rsi),%ymm2,%ymm2
+ vpxor 96(%rsi),%ymm10,%ymm10
+ leaq (%rsi,%rax,1),%rsi
+ vmovdqu %ymm1,0(%rdi)
+ vmovdqu %ymm9,32(%rdi)
+ vmovdqu %ymm2,64(%rdi)
+ vmovdqu %ymm10,96(%rdi)
+ leaq (%rdi,%rax,1),%rdi
+
+ vpxord 0(%rsi),%ymm18,%ymm18
+ vpxor 32(%rsi),%ymm6,%ymm6
+ vpxor 64(%rsi),%ymm7,%ymm7
+ vpxor 96(%rsi),%ymm15,%ymm15
+ leaq (%rsi,%rax,1),%rsi
+ vmovdqu32 %ymm18,0(%rdi)
+ vmovdqu %ymm6,32(%rdi)
+ vmovdqu %ymm7,64(%rdi)
+ vmovdqu %ymm15,96(%rdi)
+ leaq (%rdi,%rax,1),%rdi
+
+ vpxor 0(%rsi),%ymm3,%ymm3
+ vpxor 32(%rsi),%ymm11,%ymm11
+ vpxor 64(%rsi),%ymm4,%ymm4
+ vpxor 96(%rsi),%ymm12,%ymm12
+ leaq (%rsi,%rax,1),%rsi
+ vmovdqu %ymm3,0(%rdi)
+ vmovdqu %ymm11,32(%rdi)
+ vmovdqu %ymm4,64(%rdi)
+ vmovdqu %ymm12,96(%rdi)
+ leaq (%rdi,%rax,1),%rdi
+
+ vpbroadcastd 0(%r10),%ymm0
+ vpbroadcastd 4(%r10),%ymm1
+
+ subq $512,%rdx
+ jnz .Loop_outer8xvl
+
+ jmp .Ldone8xvl
+
+.align 32
+.Ltail8xvl:
+ vmovdqa64 %ymm19,%ymm8
+ xorq %r10,%r10
+ subq %rsi,%rdi
+ cmpq $64,%rdx
+ jb .Less_than_64_8xvl
+ vpxor 0(%rsi),%ymm8,%ymm8
+ vpxor 32(%rsi),%ymm0,%ymm0
+ vmovdqu %ymm8,0(%rdi,%rsi,1)
+ vmovdqu %ymm0,32(%rdi,%rsi,1)
+ je .Ldone8xvl
+ vmovdqa %ymm5,%ymm8
+ vmovdqa %ymm13,%ymm0
+ leaq 64(%rsi),%rsi
+
+ cmpq $128,%rdx
+ jb .Less_than_64_8xvl
+ vpxor 0(%rsi),%ymm5,%ymm5
+ vpxor 32(%rsi),%ymm13,%ymm13
+ vmovdqu %ymm5,0(%rdi,%rsi,1)
+ vmovdqu %ymm13,32(%rdi,%rsi,1)
+ je .Ldone8xvl
+ vmovdqa %ymm1,%ymm8
+ vmovdqa %ymm9,%ymm0
+ leaq 64(%rsi),%rsi
+
+ cmpq $192,%rdx
+ jb .Less_than_64_8xvl
+ vpxor 0(%rsi),%ymm1,%ymm1
+ vpxor 32(%rsi),%ymm9,%ymm9
+ vmovdqu %ymm1,0(%rdi,%rsi,1)
+ vmovdqu %ymm9,32(%rdi,%rsi,1)
+ je .Ldone8xvl
+ vmovdqa %ymm2,%ymm8
+ vmovdqa %ymm10,%ymm0
+ leaq 64(%rsi),%rsi
+
+ cmpq $256,%rdx
+ jb .Less_than_64_8xvl
+ vpxor 0(%rsi),%ymm2,%ymm2
+ vpxor 32(%rsi),%ymm10,%ymm10
+ vmovdqu %ymm2,0(%rdi,%rsi,1)
+ vmovdqu %ymm10,32(%rdi,%rsi,1)
+ je .Ldone8xvl
+ vmovdqa32 %ymm18,%ymm8
+ vmovdqa %ymm6,%ymm0
+ leaq 64(%rsi),%rsi
+
+ cmpq $320,%rdx
+ jb .Less_than_64_8xvl
+ vpxord 0(%rsi),%ymm18,%ymm18
+ vpxor 32(%rsi),%ymm6,%ymm6
+ vmovdqu32 %ymm18,0(%rdi,%rsi,1)
+ vmovdqu %ymm6,32(%rdi,%rsi,1)
+ je .Ldone8xvl
+ vmovdqa %ymm7,%ymm8
+ vmovdqa %ymm15,%ymm0
+ leaq 64(%rsi),%rsi
+
+ cmpq $384,%rdx
+ jb .Less_than_64_8xvl
+ vpxor 0(%rsi),%ymm7,%ymm7
+ vpxor 32(%rsi),%ymm15,%ymm15
+ vmovdqu %ymm7,0(%rdi,%rsi,1)
+ vmovdqu %ymm15,32(%rdi,%rsi,1)
+ je .Ldone8xvl
+ vmovdqa %ymm3,%ymm8
+ vmovdqa %ymm11,%ymm0
+ leaq 64(%rsi),%rsi
+
+ cmpq $448,%rdx
+ jb .Less_than_64_8xvl
+ vpxor 0(%rsi),%ymm3,%ymm3
+ vpxor 32(%rsi),%ymm11,%ymm11
+ vmovdqu %ymm3,0(%rdi,%rsi,1)
+ vmovdqu %ymm11,32(%rdi,%rsi,1)
+ je .Ldone8xvl
+ vmovdqa %ymm4,%ymm8
+ vmovdqa %ymm12,%ymm0
+ leaq 64(%rsi),%rsi
+
+.Less_than_64_8xvl:
+ vmovdqa %ymm8,0(%rsp)
+ vmovdqa %ymm0,32(%rsp)
+ leaq (%rdi,%rsi,1),%rdi
+ andq $63,%rdx
+
+.Loop_tail8xvl:
+ movzbl (%rsi,%r10,1),%eax
+ movzbl (%rsp,%r10,1),%ecx
+ leaq 1(%r10),%r10
+ xorl %ecx,%eax
+ movb %al,-1(%rdi,%r10,1)
+ decq %rdx
+ jnz .Loop_tail8xvl
+
+ vpxor %ymm8,%ymm8,%ymm8
+ vmovdqa %ymm8,0(%rsp)
+ vmovdqa %ymm8,32(%rsp)
+
+.Ldone8xvl:
+ vzeroall
+ leaq (%r9),%rsp
+.cfi_def_cfa_register %rsp
+.L8xvl_epilogue:
+ .byte 0xf3,0xc3
+.cfi_endproc
+.size ChaCha20_8xvl,.-ChaCha20_8xvl
--
2.19.0
* [PATCH net-next v7 06/28] zinc: ChaCha20 x86_64 implementation
[not found] <20181006025709.4019-1-Jason@zx2c4.com>
` (2 preceding siblings ...)
2018-10-06 2:56 ` [PATCH net-next v7 05/28] zinc: import Andy Polyakov's ChaCha20 x86_64 implementation Jason A. Donenfeld
@ 2018-10-06 2:56 ` Jason A. Donenfeld
2018-10-06 2:56 ` [PATCH net-next v7 07/28] zinc: import Andy Polyakov's ChaCha20 ARM and ARM64 implementations Jason A. Donenfeld
` (19 subsequent siblings)
23 siblings, 0 replies; 28+ messages in thread
From: Jason A. Donenfeld @ 2018-10-06 2:56 UTC (permalink / raw)
To: linux-kernel, netdev, davem, gregkh
Cc: Jason A. Donenfeld, Samuel Neves, Thomas Gleixner, Ingo Molnar,
x86, Jean-Philippe Aumasson, Andy Lutomirski, Andrew Morton,
Linus Torvalds, kernel-hardening, linux-crypto
This ports the SSSE3, AVX-2, AVX-512F, and AVX-512VL implementations of
ChaCha20. The AVX-512F implementation is disabled on Skylake-X, because
using zmm registers there causes unacceptable downclocking, and the
AVX-512VL ymm implementation is used instead. These come from Andy
Polyakov's implementation, with the following modifications from
Samuel Neves:
- Some cosmetic changes, like renaming labels to .Lname, constants,
and other Linux conventions.
- CPU feature checking is done in C by the glue code, so that has been
removed from the assembly.
- The translation of certain instructions, such as pshufb, palignr,
vprotd, etc., to .byte directives has been eliminated. That
translation was meant for compatibility with ancient toolchains, but
it is presumably unnecessary here, since the build system already
checks what GNU as can assemble.
- When aligning the stack, the original code saved %rsp to %r9. To
keep objtool happy, we instead use the DRAP idiom and save %rsp to
%r10:
    leaq 8(%rsp),%r10
    ... code here ...
    leaq -8(%r10),%rsp
- The original code assumes the stack comes aligned to 16 bytes. This
is not necessarily the case, and to avoid crashes,
`andq $-alignment, %rsp` was added in the prolog of a few functions.
- The original hardcodes returns as .byte 0xf3,0xc3, aka "rep ret".
We replace this with "ret". "rep ret" was meant to help with AMD K8
chips, cf. http://repzret.org/p/repzret. It makes no sense to
continue to use this kludge for code that won't even run on ancient
AMD chips.
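As a rough illustration of the stack handling described in the list
above, the following C sketch models the pointer arithmetic only. It is
not part of the patch: the names align_down, frame_enter and
frame_leave, the struct frame type, and the example sizes are all
hypothetical, chosen just to show why the saved value in %r10 restores
the caller's %rsp no matter how much the `andq` rounded it down.

```c
#include <stdint.h>

/*
 * Hypothetical model of `andq $-alignment, %rsp`: round sp down to a
 * multiple of alignment, which is assumed to be a power of two.
 */
static uint64_t align_down(uint64_t sp, uint64_t alignment)
{
	return sp & ~(alignment - 1);	/* same bits as sp & -alignment */
}

/* Illustrative stand-ins for the two registers involved. */
struct frame {
	uint64_t saved;	/* models %r10 */
	uint64_t sp;	/* models %rsp */
};

/*
 * Models a prologue of the form:
 *	leaq 8(%rsp),%r10
 *	subq $size,%rsp
 *	andq $-alignment,%rsp
 */
static struct frame frame_enter(uint64_t entry_sp, uint64_t size,
				uint64_t alignment)
{
	struct frame f;

	f.saved = entry_sp + 8;
	f.sp = align_down(entry_sp - size, alignment);
	return f;
}

/* Models the epilogue: leaq -8(%r10),%rsp */
static uint64_t frame_leave(const struct frame *f)
{
	return f->saved - 8;
}
```

Because the restore reads only the saved value, the amount by which the
alignment step moved the stack pointer never needs to be recorded. With
an entry value that is only 8-byte aligned, frame_enter() still yields
an aligned scratch area and frame_leave() still returns the entry
value.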
Cycle counts on a Core i7 6700HQ using the AVX-2 codepath, comparing
this implementation ("new") to the implementation in the current crypto
API ("old"):
size old new
---- ---- ----
0 62 52
16 414 376
32 410 400
48 414 422
64 362 356
80 714 666
96 714 700
112 712 718
128 692 646
144 1042 674
160 1042 694
176 1042 726
192 1018 650
208 1366 686
224 1366 696
240 1366 722
256 640 656
272 988 1246
288 988 1276
304 992 1296
320 972 1222
336 1318 1256
352 1318 1276
368 1316 1294
384 1294 1218
400 1642 1258
416 1642 1282
432 1642 1302
448 1628 1224
464 1970 1258
480 1970 1280
496 1970 1300
512 656 676
528 1010 1290
544 1010 1306
560 1010 1332
576 986 1254
592 1340 1284
608 1334 1310
624 1340 1334
640 1314 1254
656 1664 1282
672 1674 1306
688 1662 1336
704 1638 1250
720 1992 1292
736 1994 1308
752 1988 1334
768 1252 1254
784 1596 1290
800 1596 1314
816 1596 1330
832 1576 1256
848 1922 1286
864 1922 1314
880 1926 1338
896 1898 1258
912 2248 1288
928 2248 1320
944 2248 1338
960 2226 1268
976 2574 1288
992 2576 1312
1008 2574 1340
Cycle counts on a Xeon Gold 5120 using the AVX-512 codepath:
size old new
---- ---- ----
0 64 54
16 386 372
32 388 396
48 388 420
64 366 350
80 708 666
96 708 692
112 706 736
128 692 648
144 1036 682
160 1036 708
176 1036 730
192 1016 658
208 1360 684
224 1362 708
240 1360 732
256 644 500
272 990 526
288 988 556
304 988 576
320 972 500
336 1314 532
352 1316 558
368 1318 578
384 1308 506
400 1644 532
416 1644 556
432 1644 594
448 1624 508
464 1970 534
480 1970 556
496 1968 582
512 660 624
528 1016 682
544 1016 702
560 1018 728
576 998 654
592 1344 680
608 1344 708
624 1344 730
640 1326 654
656 1670 686
672 1670 708
688 1670 732
704 1652 658
720 1998 682
736 1998 710
752 1996 734
768 1256 662
784 1606 688
800 1606 714
816 1606 736
832 1584 660
848 1948 688
864 1950 714
880 1948 736
896 1912 688
912 2258 718
928 2258 744
944 2256 768
960 2238 692
976 2584 718
992 2584 744
1008 2584 770
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Samuel Neves <sneves@dei.uc.pt>
Co-developed-by: Samuel Neves <sneves@dei.uc.pt>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: x86@kernel.org
Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: kernel-hardening@lists.openwall.com
Cc: linux-crypto@vger.kernel.org
---
lib/zinc/Makefile | 1 +
lib/zinc/chacha20/chacha20-x86_64-glue.c | 103 ++
...-x86_64-cryptogams.S => chacha20-x86_64.S} | 1557 ++++-------------
lib/zinc/chacha20/chacha20.c | 4 +
4 files changed, 486 insertions(+), 1179 deletions(-)
create mode 100644 lib/zinc/chacha20/chacha20-x86_64-glue.c
rename lib/zinc/chacha20/{chacha20-x86_64-cryptogams.S => chacha20-x86_64.S} (71%)
diff --git a/lib/zinc/Makefile b/lib/zinc/Makefile
index 3d80144d55a6..223a0816c918 100644
--- a/lib/zinc/Makefile
+++ b/lib/zinc/Makefile
@@ -3,4 +3,5 @@ ccflags-y += -D'pr_fmt(fmt)="zinc: " fmt'
ccflags-$(CONFIG_ZINC_DEBUG) += -DDEBUG
zinc_chacha20-y := chacha20/chacha20.o
+zinc_chacha20-$(CONFIG_ZINC_ARCH_X86_64) += chacha20/chacha20-x86_64.o
obj-$(CONFIG_ZINC_CHACHA20) += zinc_chacha20.o
diff --git a/lib/zinc/chacha20/chacha20-x86_64-glue.c b/lib/zinc/chacha20/chacha20-x86_64-glue.c
new file mode 100644
index 000000000000..8629d5d420e6
--- /dev/null
+++ b/lib/zinc/chacha20/chacha20-x86_64-glue.c
@@ -0,0 +1,103 @@
+// SPDX-License-Identifier: GPL-2.0 OR MIT
+/*
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#include <asm/fpu/api.h>
+#include <asm/cpufeature.h>
+#include <asm/processor.h>
+#include <asm/intel-family.h>
+
+asmlinkage void hchacha20_ssse3(u32 *derived_key, const u8 *nonce,
+ const u8 *key);
+asmlinkage void chacha20_ssse3(u8 *out, const u8 *in, const size_t len,
+ const u32 key[8], const u32 counter[4]);
+asmlinkage void chacha20_avx2(u8 *out, const u8 *in, const size_t len,
+ const u32 key[8], const u32 counter[4]);
+asmlinkage void chacha20_avx512(u8 *out, const u8 *in, const size_t len,
+ const u32 key[8], const u32 counter[4]);
+asmlinkage void chacha20_avx512vl(u8 *out, const u8 *in, const size_t len,
+ const u32 key[8], const u32 counter[4]);
+
+static bool chacha20_use_ssse3 __ro_after_init;
+static bool chacha20_use_avx2 __ro_after_init;
+static bool chacha20_use_avx512 __ro_after_init;
+static bool chacha20_use_avx512vl __ro_after_init;
+static bool *const chacha20_nobs[] __initconst = {
+ &chacha20_use_ssse3, &chacha20_use_avx2, &chacha20_use_avx512,
+ &chacha20_use_avx512vl };
+
+static void __init chacha20_fpu_init(void)
+{
+ chacha20_use_ssse3 = boot_cpu_has(X86_FEATURE_SSSE3);
+ chacha20_use_avx2 =
+ boot_cpu_has(X86_FEATURE_AVX) &&
+ boot_cpu_has(X86_FEATURE_AVX2) &&
+ cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL);
+ chacha20_use_avx512 =
+ boot_cpu_has(X86_FEATURE_AVX) &&
+ boot_cpu_has(X86_FEATURE_AVX2) &&
+ boot_cpu_has(X86_FEATURE_AVX512F) &&
+ cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM |
+ XFEATURE_MASK_AVX512, NULL) &&
+ /* Skylake downclocks unacceptably much when using zmm. */
+ boot_cpu_data.x86_model != INTEL_FAM6_SKYLAKE_X;
+ chacha20_use_avx512vl =
+ boot_cpu_has(X86_FEATURE_AVX) &&
+ boot_cpu_has(X86_FEATURE_AVX2) &&
+ boot_cpu_has(X86_FEATURE_AVX512F) &&
+ boot_cpu_has(X86_FEATURE_AVX512VL) &&
+ cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM |
+ XFEATURE_MASK_AVX512, NULL);
+}
+
+static inline bool chacha20_arch(struct chacha20_ctx *ctx, u8 *dst,
+ const u8 *src, size_t len,
+ simd_context_t *simd_context)
+{
+ /* SIMD disables preemption, so relax after processing each page. */
+ BUILD_BUG_ON(PAGE_SIZE < CHACHA20_BLOCK_SIZE ||
+ PAGE_SIZE % CHACHA20_BLOCK_SIZE);
+
+ if (!IS_ENABLED(CONFIG_AS_SSSE3) || !chacha20_use_ssse3 ||
+ len <= CHACHA20_BLOCK_SIZE || !simd_use(simd_context))
+ return false;
+
+ for (;;) {
+ const size_t bytes = min_t(size_t, len, PAGE_SIZE);
+
+ if (IS_ENABLED(CONFIG_AS_AVX512) && chacha20_use_avx512 &&
+ len >= CHACHA20_BLOCK_SIZE * 8)
+ chacha20_avx512(dst, src, bytes, ctx->key, ctx->counter);
+ else if (IS_ENABLED(CONFIG_AS_AVX512) && chacha20_use_avx512vl &&
+ len >= CHACHA20_BLOCK_SIZE * 4)
+ chacha20_avx512vl(dst, src, bytes, ctx->key, ctx->counter);
+ else if (IS_ENABLED(CONFIG_AS_AVX2) && chacha20_use_avx2 &&
+ len >= CHACHA20_BLOCK_SIZE * 4)
+ chacha20_avx2(dst, src, bytes, ctx->key, ctx->counter);
+ else
+ chacha20_ssse3(dst, src, bytes, ctx->key, ctx->counter);
+ ctx->counter[0] += (bytes + 63) / 64;
+ len -= bytes;
+ if (!len)
+ break;
+ dst += bytes;
+ src += bytes;
+ simd_relax(simd_context);
+ }
+
+ return true;
+}
+
+static inline bool hchacha20_arch(u32 derived_key[CHACHA20_KEY_WORDS],
+ const u8 nonce[HCHACHA20_NONCE_SIZE],
+ const u8 key[HCHACHA20_KEY_SIZE],
+ simd_context_t *simd_context)
+{
+ if (IS_ENABLED(CONFIG_AS_SSSE3) && chacha20_use_ssse3 &&
+ simd_use(simd_context)) {
+ hchacha20_ssse3(derived_key, nonce, key);
+ return true;
+ }
+ return false;
+}
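
Reviewer note: the dispatch and counter handling in chacha20_arch() above can be modeled in plain userspace C. The following is only a sketch under stated assumptions, not the kernel code. Every sketch_* name is hypothetical, the page size is assumed to be 4096 bytes, and the CPU feature flags are passed in as a struct rather than probed. It shows two things: which implementation tier the length thresholds select, and how far the 32-bit block counter advances when the input is consumed one page at a time.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BLOCK_SIZE 64u   /* ChaCha20 block size in bytes */
#define SKETCH_PAGE_SIZE  4096u /* assumption; the kernel uses PAGE_SIZE */

/* Hypothetical stand-ins for the four assembly entry points. */
enum sketch_impl {
	SKETCH_SSSE3,
	SKETCH_AVX2,
	SKETCH_AVX512VL,
	SKETCH_AVX512
};

/* Hypothetical stand-in for the chacha20_use_* feature flags. */
struct sketch_caps {
	bool avx2;
	bool avx512;
	bool avx512vl;
};

/* Pick the widest implementation whose minimum length is met. */
static enum sketch_impl sketch_pick(const struct sketch_caps *caps, size_t len)
{
	if (caps->avx512 && len >= SKETCH_BLOCK_SIZE * 8)
		return SKETCH_AVX512;
	if (caps->avx512vl && len >= SKETCH_BLOCK_SIZE * 4)
		return SKETCH_AVX512VL;
	if (caps->avx2 && len >= SKETCH_BLOCK_SIZE * 4)
		return SKETCH_AVX2;
	return SKETCH_SSSE3;
}

/*
 * Walk len bytes in page-sized chunks and return how many blocks the
 * counter advances. A partial final block still consumes one counter
 * value, hence the round-up.
 */
static uint32_t sketch_counter_advance(size_t len)
{
	uint32_t counter = 0;

	while (len) {
		size_t bytes = len < SKETCH_PAGE_SIZE ? len : SKETCH_PAGE_SIZE;

		counter += (uint32_t)((bytes + SKETCH_BLOCK_SIZE - 1) /
				      SKETCH_BLOCK_SIZE);
		len -= bytes;
	}
	return counter;
}
```

Because the page size is a multiple of the block size, which is what the BUILD_BUG_ON() in chacha20_arch() enforces, only the final chunk can contain a partial block, so the per-chunk round-up never counts a block twice.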
diff --git a/lib/zinc/chacha20/chacha20-x86_64-cryptogams.S b/lib/zinc/chacha20/chacha20-x86_64.S
similarity index 71%
rename from lib/zinc/chacha20/chacha20-x86_64-cryptogams.S
rename to lib/zinc/chacha20/chacha20-x86_64.S
index 2bfc76f7e01f..3d10c7f21642 100644
--- a/lib/zinc/chacha20/chacha20-x86_64-cryptogams.S
+++ b/lib/zinc/chacha20/chacha20-x86_64.S
@@ -1,351 +1,148 @@
/* SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause */
/*
+ * Copyright (C) 2017 Samuel Neves <sneves@dei.uc.pt>. All Rights Reserved.
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
* Copyright (C) 2006-2017 CRYPTOGAMS by <appro@openssl.org>. All Rights Reserved.
+ *
+ * This is based in part on Andy Polyakov's implementation from CRYPTOGAMS.
*/
-.text
+#include <linux/linkage.h>
-
-
-.align 64
+.section .rodata.cst16.Lzero, "aM", @progbits, 16
+.align 16
.Lzero:
.long 0,0,0,0
+.section .rodata.cst16.Lone, "aM", @progbits, 16
+.align 16
.Lone:
.long 1,0,0,0
+.section .rodata.cst16.Linc, "aM", @progbits, 16
+.align 16
.Linc:
.long 0,1,2,3
+.section .rodata.cst16.Lfour, "aM", @progbits, 16
+.align 16
.Lfour:
.long 4,4,4,4
+.section .rodata.cst32.Lincy, "aM", @progbits, 32
+.align 32
.Lincy:
.long 0,2,4,6,1,3,5,7
+.section .rodata.cst32.Leight, "aM", @progbits, 32
+.align 32
.Leight:
.long 8,8,8,8,8,8,8,8
+.section .rodata.cst16.Lrot16, "aM", @progbits, 16
+.align 16
.Lrot16:
.byte 0x2,0x3,0x0,0x1, 0x6,0x7,0x4,0x5, 0xa,0xb,0x8,0x9, 0xe,0xf,0xc,0xd
+.section .rodata.cst16.Lrot24, "aM", @progbits, 16
+.align 16
.Lrot24:
.byte 0x3,0x0,0x1,0x2, 0x7,0x4,0x5,0x6, 0xb,0x8,0x9,0xa, 0xf,0xc,0xd,0xe
-.Ltwoy:
-.long 2,0,0,0, 2,0,0,0
+.section .rodata.cst16.Lsigma, "aM", @progbits, 16
+.align 16
+.Lsigma:
+.byte 101,120,112,97,110,100,32,51,50,45,98,121,116,101,32,107
+.section .rodata.cst64.Lzeroz, "aM", @progbits, 64
.align 64
.Lzeroz:
.long 0,0,0,0, 1,0,0,0, 2,0,0,0, 3,0,0,0
+.section .rodata.cst64.Lfourz, "aM", @progbits, 64
+.align 64
.Lfourz:
.long 4,0,0,0, 4,0,0,0, 4,0,0,0, 4,0,0,0
+.section .rodata.cst64.Lincz, "aM", @progbits, 64
+.align 64
.Lincz:
.long 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
+.section .rodata.cst64.Lsixteen, "aM", @progbits, 64
+.align 64
.Lsixteen:
.long 16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16
-.Lsigma:
-.byte 101,120,112,97,110,100,32,51,50,45,98,121,116,101,32,107,0
-.byte 67,104,97,67,104,97,50,48,32,102,111,114,32,120,56,54,95,54,52,44,32,67,82,89,80,84,79,71,65,77,83,32,98,121,32,60,97,112,112,114,111,64,111,112,101,110,115,115,108,46,111,114,103,62,0
-.globl ChaCha20_ctr32
-.type ChaCha20_ctr32,@function
+.section .rodata.cst32.Ltwoy, "aM", @progbits, 32
.align 64
-ChaCha20_ctr32:
-.cfi_startproc
- cmpq $0,%rdx
- je .Lno_data
- movq OPENSSL_ia32cap_P+4(%rip),%r10
- btq $48,%r10
- jc .LChaCha20_avx512
- testq %r10,%r10
- js .LChaCha20_avx512vl
- testl $512,%r10d
- jnz .LChaCha20_ssse3
-
- pushq %rbx
-.cfi_adjust_cfa_offset 8
-.cfi_offset %rbx,-16
- pushq %rbp
-.cfi_adjust_cfa_offset 8
-.cfi_offset %rbp,-24
- pushq %r12
-.cfi_adjust_cfa_offset 8
-.cfi_offset %r12,-32
- pushq %r13
-.cfi_adjust_cfa_offset 8
-.cfi_offset %r13,-40
- pushq %r14
-.cfi_adjust_cfa_offset 8
-.cfi_offset %r14,-48
- pushq %r15
-.cfi_adjust_cfa_offset 8
-.cfi_offset %r15,-56
- subq $64+24,%rsp
-.cfi_adjust_cfa_offset 64+24
-.Lctr32_body:
-
-
- movdqu (%rcx),%xmm1
- movdqu 16(%rcx),%xmm2
- movdqu (%r8),%xmm3
- movdqa .Lone(%rip),%xmm4
-
+.Ltwoy:
+.long 2,0,0,0, 2,0,0,0
- movdqa %xmm1,16(%rsp)
- movdqa %xmm2,32(%rsp)
- movdqa %xmm3,48(%rsp)
- movq %rdx,%rbp
- jmp .Loop_outer
+.text
+#ifdef CONFIG_AS_SSSE3
.align 32
-.Loop_outer:
- movl $0x61707865,%eax
- movl $0x3320646e,%ebx
- movl $0x79622d32,%ecx
- movl $0x6b206574,%edx
- movl 16(%rsp),%r8d
- movl 20(%rsp),%r9d
- movl 24(%rsp),%r10d
- movl 28(%rsp),%r11d
- movd %xmm3,%r12d
- movl 52(%rsp),%r13d
- movl 56(%rsp),%r14d
- movl 60(%rsp),%r15d
-
- movq %rbp,64+0(%rsp)
- movl $10,%ebp
- movq %rsi,64+8(%rsp)
-.byte 102,72,15,126,214
- movq %rdi,64+16(%rsp)
- movq %rsi,%rdi
- shrq $32,%rdi
- jmp .Loop
+ENTRY(hchacha20_ssse3)
+ movdqa .Lsigma(%rip),%xmm0
+ movdqu (%rdx),%xmm1
+ movdqu 16(%rdx),%xmm2
+ movdqu (%rsi),%xmm3
+ movdqa .Lrot16(%rip),%xmm6
+ movdqa .Lrot24(%rip),%xmm7
+ movq $10,%r8
+ .align 32
+.Loop_hssse3:
+ paddd %xmm1,%xmm0
+ pxor %xmm0,%xmm3
+ pshufb %xmm6,%xmm3
+ paddd %xmm3,%xmm2
+ pxor %xmm2,%xmm1
+ movdqa %xmm1,%xmm4
+ psrld $20,%xmm1
+ pslld $12,%xmm4
+ por %xmm4,%xmm1
+ paddd %xmm1,%xmm0
+ pxor %xmm0,%xmm3
+ pshufb %xmm7,%xmm3
+ paddd %xmm3,%xmm2
+ pxor %xmm2,%xmm1
+ movdqa %xmm1,%xmm4
+ psrld $25,%xmm1
+ pslld $7,%xmm4
+ por %xmm4,%xmm1
+ pshufd $78,%xmm2,%xmm2
+ pshufd $57,%xmm1,%xmm1
+ pshufd $147,%xmm3,%xmm3
+ nop
+ paddd %xmm1,%xmm0
+ pxor %xmm0,%xmm3
+ pshufb %xmm6,%xmm3
+ paddd %xmm3,%xmm2
+ pxor %xmm2,%xmm1
+ movdqa %xmm1,%xmm4
+ psrld $20,%xmm1
+ pslld $12,%xmm4
+ por %xmm4,%xmm1
+ paddd %xmm1,%xmm0
+ pxor %xmm0,%xmm3
+ pshufb %xmm7,%xmm3
+ paddd %xmm3,%xmm2
+ pxor %xmm2,%xmm1
+ movdqa %xmm1,%xmm4
+ psrld $25,%xmm1
+ pslld $7,%xmm4
+ por %xmm4,%xmm1
+ pshufd $78,%xmm2,%xmm2
+ pshufd $147,%xmm1,%xmm1
+ pshufd $57,%xmm3,%xmm3
+ decq %r8
+ jnz .Loop_hssse3
+ movdqu %xmm0,0(%rdi)
+ movdqu %xmm3,16(%rdi)
+ ret
+ENDPROC(hchacha20_ssse3)
.align 32
-.Loop:
- addl %r8d,%eax
- xorl %eax,%r12d
- roll $16,%r12d
- addl %r9d,%ebx
- xorl %ebx,%r13d
- roll $16,%r13d
- addl %r12d,%esi
- xorl %esi,%r8d
- roll $12,%r8d
- addl %r13d,%edi
- xorl %edi,%r9d
- roll $12,%r9d
- addl %r8d,%eax
- xorl %eax,%r12d
- roll $8,%r12d
- addl %r9d,%ebx
- xorl %ebx,%r13d
- roll $8,%r13d
- addl %r12d,%esi
- xorl %esi,%r8d
- roll $7,%r8d
- addl %r13d,%edi
- xorl %edi,%r9d
- roll $7,%r9d
- movl %esi,32(%rsp)
- movl %edi,36(%rsp)
- movl 40(%rsp),%esi
- movl 44(%rsp),%edi
- addl %r10d,%ecx
- xorl %ecx,%r14d
- roll $16,%r14d
- addl %r11d,%edx
- xorl %edx,%r15d
- roll $16,%r15d
- addl %r14d,%esi
- xorl %esi,%r10d
- roll $12,%r10d
- addl %r15d,%edi
- xorl %edi,%r11d
- roll $12,%r11d
- addl %r10d,%ecx
- xorl %ecx,%r14d
- roll $8,%r14d
- addl %r11d,%edx
- xorl %edx,%r15d
- roll $8,%r15d
- addl %r14d,%esi
- xorl %esi,%r10d
- roll $7,%r10d
- addl %r15d,%edi
- xorl %edi,%r11d
- roll $7,%r11d
- addl %r9d,%eax
- xorl %eax,%r15d
- roll $16,%r15d
- addl %r10d,%ebx
- xorl %ebx,%r12d
- roll $16,%r12d
- addl %r15d,%esi
- xorl %esi,%r9d
- roll $12,%r9d
- addl %r12d,%edi
- xorl %edi,%r10d
- roll $12,%r10d
- addl %r9d,%eax
- xorl %eax,%r15d
- roll $8,%r15d
- addl %r10d,%ebx
- xorl %ebx,%r12d
- roll $8,%r12d
- addl %r15d,%esi
- xorl %esi,%r9d
- roll $7,%r9d
- addl %r12d,%edi
- xorl %edi,%r10d
- roll $7,%r10d
- movl %esi,40(%rsp)
- movl %edi,44(%rsp)
- movl 32(%rsp),%esi
- movl 36(%rsp),%edi
- addl %r11d,%ecx
- xorl %ecx,%r13d
- roll $16,%r13d
- addl %r8d,%edx
- xorl %edx,%r14d
- roll $16,%r14d
- addl %r13d,%esi
- xorl %esi,%r11d
- roll $12,%r11d
- addl %r14d,%edi
- xorl %edi,%r8d
- roll $12,%r8d
- addl %r11d,%ecx
- xorl %ecx,%r13d
- roll $8,%r13d
- addl %r8d,%edx
- xorl %edx,%r14d
- roll $8,%r14d
- addl %r13d,%esi
- xorl %esi,%r11d
- roll $7,%r11d
- addl %r14d,%edi
- xorl %edi,%r8d
- roll $7,%r8d
- decl %ebp
- jnz .Loop
- movl %edi,36(%rsp)
- movl %esi,32(%rsp)
- movq 64(%rsp),%rbp
- movdqa %xmm2,%xmm1
- movq 64+8(%rsp),%rsi
- paddd %xmm4,%xmm3
- movq 64+16(%rsp),%rdi
-
- addl $0x61707865,%eax
- addl $0x3320646e,%ebx
- addl $0x79622d32,%ecx
- addl $0x6b206574,%edx
- addl 16(%rsp),%r8d
- addl 20(%rsp),%r9d
- addl 24(%rsp),%r10d
- addl 28(%rsp),%r11d
- addl 48(%rsp),%r12d
- addl 52(%rsp),%r13d
- addl 56(%rsp),%r14d
- addl 60(%rsp),%r15d
- paddd 32(%rsp),%xmm1
-
- cmpq $64,%rbp
- jb .Ltail
-
- xorl 0(%rsi),%eax
- xorl 4(%rsi),%ebx
- xorl 8(%rsi),%ecx
- xorl 12(%rsi),%edx
- xorl 16(%rsi),%r8d
- xorl 20(%rsi),%r9d
- xorl 24(%rsi),%r10d
- xorl 28(%rsi),%r11d
- movdqu 32(%rsi),%xmm0
- xorl 48(%rsi),%r12d
- xorl 52(%rsi),%r13d
- xorl 56(%rsi),%r14d
- xorl 60(%rsi),%r15d
- leaq 64(%rsi),%rsi
- pxor %xmm1,%xmm0
-
- movdqa %xmm2,32(%rsp)
- movd %xmm3,48(%rsp)
-
- movl %eax,0(%rdi)
- movl %ebx,4(%rdi)
- movl %ecx,8(%rdi)
- movl %edx,12(%rdi)
- movl %r8d,16(%rdi)
- movl %r9d,20(%rdi)
- movl %r10d,24(%rdi)
- movl %r11d,28(%rdi)
- movdqu %xmm0,32(%rdi)
- movl %r12d,48(%rdi)
- movl %r13d,52(%rdi)
- movl %r14d,56(%rdi)
- movl %r15d,60(%rdi)
- leaq 64(%rdi),%rdi
-
- subq $64,%rbp
- jnz .Loop_outer
-
- jmp .Ldone
+ENTRY(chacha20_ssse3)
+.Lchacha20_ssse3:
+ cmpq $0,%rdx
+ je .Lssse3_epilogue
+ leaq 8(%rsp),%r10
-.align 16
-.Ltail:
- movl %eax,0(%rsp)
- movl %ebx,4(%rsp)
- xorq %rbx,%rbx
- movl %ecx,8(%rsp)
- movl %edx,12(%rsp)
- movl %r8d,16(%rsp)
- movl %r9d,20(%rsp)
- movl %r10d,24(%rsp)
- movl %r11d,28(%rsp)
- movdqa %xmm1,32(%rsp)
- movl %r12d,48(%rsp)
- movl %r13d,52(%rsp)
- movl %r14d,56(%rsp)
- movl %r15d,60(%rsp)
-
-.Loop_tail:
- movzbl (%rsi,%rbx,1),%eax
- movzbl (%rsp,%rbx,1),%edx
- leaq 1(%rbx),%rbx
- xorl %edx,%eax
- movb %al,-1(%rdi,%rbx,1)
- decq %rbp
- jnz .Loop_tail
-
-.Ldone:
- leaq 64+24+48(%rsp),%rsi
-.cfi_def_cfa %rsi,8
- movq -48(%rsi),%r15
-.cfi_restore %r15
- movq -40(%rsi),%r14
-.cfi_restore %r14
- movq -32(%rsi),%r13
-.cfi_restore %r13
- movq -24(%rsi),%r12
-.cfi_restore %r12
- movq -16(%rsi),%rbp
-.cfi_restore %rbp
- movq -8(%rsi),%rbx
-.cfi_restore %rbx
- leaq (%rsi),%rsp
-.cfi_def_cfa_register %rsp
-.Lno_data:
- .byte 0xf3,0xc3
-.cfi_endproc
-.size ChaCha20_ctr32,.-ChaCha20_ctr32
-.type ChaCha20_ssse3,@function
-.align 32
-ChaCha20_ssse3:
-.cfi_startproc
-.LChaCha20_ssse3:
- movq %rsp,%r9
-.cfi_def_cfa_register %r9
- testl $2048,%r10d
- jnz .LChaCha20_4xop
cmpq $128,%rdx
- je .LChaCha20_128
- ja .LChaCha20_4x
+ ja .Lchacha20_4x
.Ldo_sse3_after_all:
subq $64+8,%rsp
+ andq $-32,%rsp
movdqa .Lsigma(%rip),%xmm0
movdqu (%rcx),%xmm1
movdqu 16(%rcx),%xmm2
@@ -375,7 +172,7 @@ ChaCha20_ssse3:
.Loop_ssse3:
paddd %xmm1,%xmm0
pxor %xmm0,%xmm3
-.byte 102,15,56,0,222
+ pshufb %xmm6,%xmm3
paddd %xmm3,%xmm2
pxor %xmm2,%xmm1
movdqa %xmm1,%xmm4
@@ -384,7 +181,7 @@ ChaCha20_ssse3:
por %xmm4,%xmm1
paddd %xmm1,%xmm0
pxor %xmm0,%xmm3
-.byte 102,15,56,0,223
+ pshufb %xmm7,%xmm3
paddd %xmm3,%xmm2
pxor %xmm2,%xmm1
movdqa %xmm1,%xmm4
@@ -397,7 +194,7 @@ ChaCha20_ssse3:
nop
paddd %xmm1,%xmm0
pxor %xmm0,%xmm3
-.byte 102,15,56,0,222
+ pshufb %xmm6,%xmm3
paddd %xmm3,%xmm2
pxor %xmm2,%xmm1
movdqa %xmm1,%xmm4
@@ -406,7 +203,7 @@ ChaCha20_ssse3:
por %xmm4,%xmm1
paddd %xmm1,%xmm0
pxor %xmm0,%xmm3
-.byte 102,15,56,0,223
+ pshufb %xmm7,%xmm3
paddd %xmm3,%xmm2
pxor %xmm2,%xmm1
movdqa %xmm1,%xmm4
@@ -465,194 +262,24 @@ ChaCha20_ssse3:
jnz .Loop_tail_ssse3
.Ldone_ssse3:
- leaq (%r9),%rsp
-.cfi_def_cfa_register %rsp
-.Lssse3_epilogue:
- .byte 0xf3,0xc3
-.cfi_endproc
-.size ChaCha20_ssse3,.-ChaCha20_ssse3
-.type ChaCha20_128,@function
-.align 32
-ChaCha20_128:
-.cfi_startproc
-.LChaCha20_128:
- movq %rsp,%r9
-.cfi_def_cfa_register %r9
- subq $64+8,%rsp
- movdqa .Lsigma(%rip),%xmm8
- movdqu (%rcx),%xmm9
- movdqu 16(%rcx),%xmm2
- movdqu (%r8),%xmm3
- movdqa .Lone(%rip),%xmm1
- movdqa .Lrot16(%rip),%xmm6
- movdqa .Lrot24(%rip),%xmm7
+ leaq -8(%r10),%rsp
- movdqa %xmm8,%xmm10
- movdqa %xmm8,0(%rsp)
- movdqa %xmm9,%xmm11
- movdqa %xmm9,16(%rsp)
- movdqa %xmm2,%xmm0
- movdqa %xmm2,32(%rsp)
- paddd %xmm3,%xmm1
- movdqa %xmm3,48(%rsp)
- movq $10,%r8
- jmp .Loop_128
-
-.align 32
-.Loop_128:
- paddd %xmm9,%xmm8
- pxor %xmm8,%xmm3
- paddd %xmm11,%xmm10
- pxor %xmm10,%xmm1
-.byte 102,15,56,0,222
-.byte 102,15,56,0,206
- paddd %xmm3,%xmm2
- paddd %xmm1,%xmm0
- pxor %xmm2,%xmm9
- pxor %xmm0,%xmm11
- movdqa %xmm9,%xmm4
- psrld $20,%xmm9
- movdqa %xmm11,%xmm5
- pslld $12,%xmm4
- psrld $20,%xmm11
- por %xmm4,%xmm9
- pslld $12,%xmm5
- por %xmm5,%xmm11
- paddd %xmm9,%xmm8
- pxor %xmm8,%xmm3
- paddd %xmm11,%xmm10
- pxor %xmm10,%xmm1
-.byte 102,15,56,0,223
-.byte 102,15,56,0,207
- paddd %xmm3,%xmm2
- paddd %xmm1,%xmm0
- pxor %xmm2,%xmm9
- pxor %xmm0,%xmm11
- movdqa %xmm9,%xmm4
- psrld $25,%xmm9
- movdqa %xmm11,%xmm5
- pslld $7,%xmm4
- psrld $25,%xmm11
- por %xmm4,%xmm9
- pslld $7,%xmm5
- por %xmm5,%xmm11
- pshufd $78,%xmm2,%xmm2
- pshufd $57,%xmm9,%xmm9
- pshufd $147,%xmm3,%xmm3
- pshufd $78,%xmm0,%xmm0
- pshufd $57,%xmm11,%xmm11
- pshufd $147,%xmm1,%xmm1
- paddd %xmm9,%xmm8
- pxor %xmm8,%xmm3
- paddd %xmm11,%xmm10
- pxor %xmm10,%xmm1
-.byte 102,15,56,0,222
-.byte 102,15,56,0,206
- paddd %xmm3,%xmm2
- paddd %xmm1,%xmm0
- pxor %xmm2,%xmm9
- pxor %xmm0,%xmm11
- movdqa %xmm9,%xmm4
- psrld $20,%xmm9
- movdqa %xmm11,%xmm5
- pslld $12,%xmm4
- psrld $20,%xmm11
- por %xmm4,%xmm9
- pslld $12,%xmm5
- por %xmm5,%xmm11
- paddd %xmm9,%xmm8
- pxor %xmm8,%xmm3
- paddd %xmm11,%xmm10
- pxor %xmm10,%xmm1
-.byte 102,15,56,0,223
-.byte 102,15,56,0,207
- paddd %xmm3,%xmm2
- paddd %xmm1,%xmm0
- pxor %xmm2,%xmm9
- pxor %xmm0,%xmm11
- movdqa %xmm9,%xmm4
- psrld $25,%xmm9
- movdqa %xmm11,%xmm5
- pslld $7,%xmm4
- psrld $25,%xmm11
- por %xmm4,%xmm9
- pslld $7,%xmm5
- por %xmm5,%xmm11
- pshufd $78,%xmm2,%xmm2
- pshufd $147,%xmm9,%xmm9
- pshufd $57,%xmm3,%xmm3
- pshufd $78,%xmm0,%xmm0
- pshufd $147,%xmm11,%xmm11
- pshufd $57,%xmm1,%xmm1
- decq %r8
- jnz .Loop_128
- paddd 0(%rsp),%xmm8
- paddd 16(%rsp),%xmm9
- paddd 32(%rsp),%xmm2
- paddd 48(%rsp),%xmm3
- paddd .Lone(%rip),%xmm1
- paddd 0(%rsp),%xmm10
- paddd 16(%rsp),%xmm11
- paddd 32(%rsp),%xmm0
- paddd 48(%rsp),%xmm1
-
- movdqu 0(%rsi),%xmm4
- movdqu 16(%rsi),%xmm5
- pxor %xmm4,%xmm8
- movdqu 32(%rsi),%xmm4
- pxor %xmm5,%xmm9
- movdqu 48(%rsi),%xmm5
- pxor %xmm4,%xmm2
- movdqu 64(%rsi),%xmm4
- pxor %xmm5,%xmm3
- movdqu 80(%rsi),%xmm5
- pxor %xmm4,%xmm10
- movdqu 96(%rsi),%xmm4
- pxor %xmm5,%xmm11
- movdqu 112(%rsi),%xmm5
- pxor %xmm4,%xmm0
- pxor %xmm5,%xmm1
+.Lssse3_epilogue:
+ ret
- movdqu %xmm8,0(%rdi)
- movdqu %xmm9,16(%rdi)
- movdqu %xmm2,32(%rdi)
- movdqu %xmm3,48(%rdi)
- movdqu %xmm10,64(%rdi)
- movdqu %xmm11,80(%rdi)
- movdqu %xmm0,96(%rdi)
- movdqu %xmm1,112(%rdi)
- leaq (%r9),%rsp
-.cfi_def_cfa_register %rsp
-.L128_epilogue:
- .byte 0xf3,0xc3
-.cfi_endproc
-.size ChaCha20_128,.-ChaCha20_128
-.type ChaCha20_4x,@function
.align 32
-ChaCha20_4x:
-.cfi_startproc
-.LChaCha20_4x:
- movq %rsp,%r9
-.cfi_def_cfa_register %r9
- movq %r10,%r11
- shrq $32,%r10
- testq $32,%r10
- jnz .LChaCha20_8x
- cmpq $192,%rdx
- ja .Lproceed4x
-
- andq $71303168,%r11
- cmpq $4194304,%r11
- je .Ldo_sse3_after_all
+.Lchacha20_4x:
+ leaq 8(%rsp),%r10
.Lproceed4x:
subq $0x140+8,%rsp
+ andq $-32,%rsp
movdqa .Lsigma(%rip),%xmm11
movdqu (%rcx),%xmm15
movdqu 16(%rcx),%xmm7
movdqu (%r8),%xmm3
leaq 256(%rsp),%rcx
- leaq .Lrot16(%rip),%r10
+ leaq .Lrot16(%rip),%r9
leaq .Lrot24(%rip),%r11
pshufd $0x00,%xmm11,%xmm8
@@ -716,7 +343,7 @@ ChaCha20_4x:
.Loop_enter4x:
movdqa %xmm6,32(%rsp)
movdqa %xmm7,48(%rsp)
- movdqa (%r10),%xmm7
+ movdqa (%r9),%xmm7
movl $10,%eax
movdqa %xmm0,256-256(%rcx)
jmp .Loop4x
@@ -727,8 +354,8 @@ ChaCha20_4x:
paddd %xmm13,%xmm9
pxor %xmm8,%xmm0
pxor %xmm9,%xmm1
-.byte 102,15,56,0,199
-.byte 102,15,56,0,207
+ pshufb %xmm7,%xmm0
+ pshufb %xmm7,%xmm1
paddd %xmm0,%xmm4
paddd %xmm1,%xmm5
pxor %xmm4,%xmm12
@@ -746,8 +373,8 @@ ChaCha20_4x:
paddd %xmm13,%xmm9
pxor %xmm8,%xmm0
pxor %xmm9,%xmm1
-.byte 102,15,56,0,198
-.byte 102,15,56,0,206
+ pshufb %xmm6,%xmm0
+ pshufb %xmm6,%xmm1
paddd %xmm0,%xmm4
paddd %xmm1,%xmm5
pxor %xmm4,%xmm12
@@ -759,7 +386,7 @@ ChaCha20_4x:
pslld $7,%xmm13
por %xmm7,%xmm12
psrld $25,%xmm6
- movdqa (%r10),%xmm7
+ movdqa (%r9),%xmm7
por %xmm6,%xmm13
movdqa %xmm4,0(%rsp)
movdqa %xmm5,16(%rsp)
@@ -769,8 +396,8 @@ ChaCha20_4x:
paddd %xmm15,%xmm11
pxor %xmm10,%xmm2
pxor %xmm11,%xmm3
-.byte 102,15,56,0,215
-.byte 102,15,56,0,223
+ pshufb %xmm7,%xmm2
+ pshufb %xmm7,%xmm3
paddd %xmm2,%xmm4
paddd %xmm3,%xmm5
pxor %xmm4,%xmm14
@@ -788,8 +415,8 @@ ChaCha20_4x:
paddd %xmm15,%xmm11
pxor %xmm10,%xmm2
pxor %xmm11,%xmm3
-.byte 102,15,56,0,214
-.byte 102,15,56,0,222
+ pshufb %xmm6,%xmm2
+ pshufb %xmm6,%xmm3
paddd %xmm2,%xmm4
paddd %xmm3,%xmm5
pxor %xmm4,%xmm14
@@ -801,14 +428,14 @@ ChaCha20_4x:
pslld $7,%xmm15
por %xmm7,%xmm14
psrld $25,%xmm6
- movdqa (%r10),%xmm7
+ movdqa (%r9),%xmm7
por %xmm6,%xmm15
paddd %xmm13,%xmm8
paddd %xmm14,%xmm9
pxor %xmm8,%xmm3
pxor %xmm9,%xmm0
-.byte 102,15,56,0,223
-.byte 102,15,56,0,199
+ pshufb %xmm7,%xmm3
+ pshufb %xmm7,%xmm0
paddd %xmm3,%xmm4
paddd %xmm0,%xmm5
pxor %xmm4,%xmm13
@@ -826,8 +453,8 @@ ChaCha20_4x:
paddd %xmm14,%xmm9
pxor %xmm8,%xmm3
pxor %xmm9,%xmm0
-.byte 102,15,56,0,222
-.byte 102,15,56,0,198
+ pshufb %xmm6,%xmm3
+ pshufb %xmm6,%xmm0
paddd %xmm3,%xmm4
paddd %xmm0,%xmm5
pxor %xmm4,%xmm13
@@ -839,7 +466,7 @@ ChaCha20_4x:
pslld $7,%xmm14
por %xmm7,%xmm13
psrld $25,%xmm6
- movdqa (%r10),%xmm7
+ movdqa (%r9),%xmm7
por %xmm6,%xmm14
movdqa %xmm4,32(%rsp)
movdqa %xmm5,48(%rsp)
@@ -849,8 +476,8 @@ ChaCha20_4x:
paddd %xmm12,%xmm11
pxor %xmm10,%xmm1
pxor %xmm11,%xmm2
-.byte 102,15,56,0,207
-.byte 102,15,56,0,215
+ pshufb %xmm7,%xmm1
+ pshufb %xmm7,%xmm2
paddd %xmm1,%xmm4
paddd %xmm2,%xmm5
pxor %xmm4,%xmm15
@@ -868,8 +495,8 @@ ChaCha20_4x:
paddd %xmm12,%xmm11
pxor %xmm10,%xmm1
pxor %xmm11,%xmm2
-.byte 102,15,56,0,206
-.byte 102,15,56,0,214
+ pshufb %xmm6,%xmm1
+ pshufb %xmm6,%xmm2
paddd %xmm1,%xmm4
paddd %xmm2,%xmm5
pxor %xmm4,%xmm15
@@ -881,7 +508,7 @@ ChaCha20_4x:
pslld $7,%xmm12
por %xmm7,%xmm15
psrld $25,%xmm6
- movdqa (%r10),%xmm7
+ movdqa (%r9),%xmm7
por %xmm6,%xmm12
decl %eax
jnz .Loop4x
@@ -1035,7 +662,7 @@ ChaCha20_4x:
jae .L64_or_more4x
- xorq %r10,%r10
+ xorq %r9,%r9
movdqa %xmm12,16(%rsp)
movdqa %xmm4,32(%rsp)
@@ -1060,7 +687,7 @@ ChaCha20_4x:
movdqa 16(%rsp),%xmm6
leaq 64(%rsi),%rsi
- xorq %r10,%r10
+ xorq %r9,%r9
movdqa %xmm6,0(%rsp)
movdqa %xmm13,16(%rsp)
leaq 64(%rdi),%rdi
@@ -1100,7 +727,7 @@ ChaCha20_4x:
movdqa 32(%rsp),%xmm6
leaq 128(%rsi),%rsi
- xorq %r10,%r10
+ xorq %r9,%r9
movdqa %xmm6,0(%rsp)
movdqa %xmm10,16(%rsp)
leaq 128(%rdi),%rdi
@@ -1155,7 +782,7 @@ ChaCha20_4x:
movdqa 48(%rsp),%xmm6
leaq 64(%rsi),%rsi
- xorq %r10,%r10
+ xorq %r9,%r9
movdqa %xmm6,0(%rsp)
movdqa %xmm15,16(%rsp)
leaq 64(%rdi),%rdi
@@ -1164,463 +791,41 @@ ChaCha20_4x:
movdqa %xmm3,48(%rsp)
.Loop_tail4x:
- movzbl (%rsi,%r10,1),%eax
- movzbl (%rsp,%r10,1),%ecx
- leaq 1(%r10),%r10
+ movzbl (%rsi,%r9,1),%eax
+ movzbl (%rsp,%r9,1),%ecx
+ leaq 1(%r9),%r9
xorl %ecx,%eax
- movb %al,-1(%rdi,%r10,1)
+ movb %al,-1(%rdi,%r9,1)
decq %rdx
jnz .Loop_tail4x
.Ldone4x:
- leaq (%r9),%rsp
-.cfi_def_cfa_register %rsp
-.L4x_epilogue:
- .byte 0xf3,0xc3
-.cfi_endproc
-.size ChaCha20_4x,.-ChaCha20_4x
-.type ChaCha20_4xop,@function
-.align 32
-ChaCha20_4xop:
-.cfi_startproc
-.LChaCha20_4xop:
- movq %rsp,%r9
-.cfi_def_cfa_register %r9
- subq $0x140+8,%rsp
- vzeroupper
+ leaq -8(%r10),%rsp
- vmovdqa .Lsigma(%rip),%xmm11
- vmovdqu (%rcx),%xmm3
- vmovdqu 16(%rcx),%xmm15
- vmovdqu (%r8),%xmm7
- leaq 256(%rsp),%rcx
-
- vpshufd $0x00,%xmm11,%xmm8
- vpshufd $0x55,%xmm11,%xmm9
- vmovdqa %xmm8,64(%rsp)
- vpshufd $0xaa,%xmm11,%xmm10
- vmovdqa %xmm9,80(%rsp)
- vpshufd $0xff,%xmm11,%xmm11
- vmovdqa %xmm10,96(%rsp)
- vmovdqa %xmm11,112(%rsp)
-
- vpshufd $0x00,%xmm3,%xmm0
- vpshufd $0x55,%xmm3,%xmm1
- vmovdqa %xmm0,128-256(%rcx)
- vpshufd $0xaa,%xmm3,%xmm2
- vmovdqa %xmm1,144-256(%rcx)
- vpshufd $0xff,%xmm3,%xmm3
- vmovdqa %xmm2,160-256(%rcx)
- vmovdqa %xmm3,176-256(%rcx)
-
- vpshufd $0x00,%xmm15,%xmm12
- vpshufd $0x55,%xmm15,%xmm13
- vmovdqa %xmm12,192-256(%rcx)
- vpshufd $0xaa,%xmm15,%xmm14
- vmovdqa %xmm13,208-256(%rcx)
- vpshufd $0xff,%xmm15,%xmm15
- vmovdqa %xmm14,224-256(%rcx)
- vmovdqa %xmm15,240-256(%rcx)
-
- vpshufd $0x00,%xmm7,%xmm4
- vpshufd $0x55,%xmm7,%xmm5
- vpaddd .Linc(%rip),%xmm4,%xmm4
- vpshufd $0xaa,%xmm7,%xmm6
- vmovdqa %xmm5,272-256(%rcx)
- vpshufd $0xff,%xmm7,%xmm7
- vmovdqa %xmm6,288-256(%rcx)
- vmovdqa %xmm7,304-256(%rcx)
-
- jmp .Loop_enter4xop
-
-.align 32
-.Loop_outer4xop:
- vmovdqa 64(%rsp),%xmm8
- vmovdqa 80(%rsp),%xmm9
- vmovdqa 96(%rsp),%xmm10
- vmovdqa 112(%rsp),%xmm11
- vmovdqa 128-256(%rcx),%xmm0
- vmovdqa 144-256(%rcx),%xmm1
- vmovdqa 160-256(%rcx),%xmm2
- vmovdqa 176-256(%rcx),%xmm3
- vmovdqa 192-256(%rcx),%xmm12
- vmovdqa 208-256(%rcx),%xmm13
- vmovdqa 224-256(%rcx),%xmm14
- vmovdqa 240-256(%rcx),%xmm15
- vmovdqa 256-256(%rcx),%xmm4
- vmovdqa 272-256(%rcx),%xmm5
- vmovdqa 288-256(%rcx),%xmm6
- vmovdqa 304-256(%rcx),%xmm7
- vpaddd .Lfour(%rip),%xmm4,%xmm4
-
-.Loop_enter4xop:
- movl $10,%eax
- vmovdqa %xmm4,256-256(%rcx)
- jmp .Loop4xop
-
-.align 32
-.Loop4xop:
- vpaddd %xmm0,%xmm8,%xmm8
- vpaddd %xmm1,%xmm9,%xmm9
- vpaddd %xmm2,%xmm10,%xmm10
- vpaddd %xmm3,%xmm11,%xmm11
- vpxor %xmm4,%xmm8,%xmm4
- vpxor %xmm5,%xmm9,%xmm5
- vpxor %xmm6,%xmm10,%xmm6
- vpxor %xmm7,%xmm11,%xmm7
-.byte 143,232,120,194,228,16
-.byte 143,232,120,194,237,16
-.byte 143,232,120,194,246,16
-.byte 143,232,120,194,255,16
- vpaddd %xmm4,%xmm12,%xmm12
- vpaddd %xmm5,%xmm13,%xmm13
- vpaddd %xmm6,%xmm14,%xmm14
- vpaddd %xmm7,%xmm15,%xmm15
- vpxor %xmm0,%xmm12,%xmm0
- vpxor %xmm1,%xmm13,%xmm1
- vpxor %xmm14,%xmm2,%xmm2
- vpxor %xmm15,%xmm3,%xmm3
-.byte 143,232,120,194,192,12
-.byte 143,232,120,194,201,12
-.byte 143,232,120,194,210,12
-.byte 143,232,120,194,219,12
- vpaddd %xmm8,%xmm0,%xmm8
- vpaddd %xmm9,%xmm1,%xmm9
- vpaddd %xmm2,%xmm10,%xmm10
- vpaddd %xmm3,%xmm11,%xmm11
- vpxor %xmm4,%xmm8,%xmm4
- vpxor %xmm5,%xmm9,%xmm5
- vpxor %xmm6,%xmm10,%xmm6
- vpxor %xmm7,%xmm11,%xmm7
-.byte 143,232,120,194,228,8
-.byte 143,232,120,194,237,8
-.byte 143,232,120,194,246,8
-.byte 143,232,120,194,255,8
- vpaddd %xmm4,%xmm12,%xmm12
- vpaddd %xmm5,%xmm13,%xmm13
- vpaddd %xmm6,%xmm14,%xmm14
- vpaddd %xmm7,%xmm15,%xmm15
- vpxor %xmm0,%xmm12,%xmm0
- vpxor %xmm1,%xmm13,%xmm1
- vpxor %xmm14,%xmm2,%xmm2
- vpxor %xmm15,%xmm3,%xmm3
-.byte 143,232,120,194,192,7
-.byte 143,232,120,194,201,7
-.byte 143,232,120,194,210,7
-.byte 143,232,120,194,219,7
- vpaddd %xmm1,%xmm8,%xmm8
- vpaddd %xmm2,%xmm9,%xmm9
- vpaddd %xmm3,%xmm10,%xmm10
- vpaddd %xmm0,%xmm11,%xmm11
- vpxor %xmm7,%xmm8,%xmm7
- vpxor %xmm4,%xmm9,%xmm4
- vpxor %xmm5,%xmm10,%xmm5
- vpxor %xmm6,%xmm11,%xmm6
-.byte 143,232,120,194,255,16
-.byte 143,232,120,194,228,16
-.byte 143,232,120,194,237,16
-.byte 143,232,120,194,246,16
- vpaddd %xmm7,%xmm14,%xmm14
- vpaddd %xmm4,%xmm15,%xmm15
- vpaddd %xmm5,%xmm12,%xmm12
- vpaddd %xmm6,%xmm13,%xmm13
- vpxor %xmm1,%xmm14,%xmm1
- vpxor %xmm2,%xmm15,%xmm2
- vpxor %xmm12,%xmm3,%xmm3
- vpxor %xmm13,%xmm0,%xmm0
-.byte 143,232,120,194,201,12
-.byte 143,232,120,194,210,12
-.byte 143,232,120,194,219,12
-.byte 143,232,120,194,192,12
- vpaddd %xmm8,%xmm1,%xmm8
- vpaddd %xmm9,%xmm2,%xmm9
- vpaddd %xmm3,%xmm10,%xmm10
- vpaddd %xmm0,%xmm11,%xmm11
- vpxor %xmm7,%xmm8,%xmm7
- vpxor %xmm4,%xmm9,%xmm4
- vpxor %xmm5,%xmm10,%xmm5
- vpxor %xmm6,%xmm11,%xmm6
-.byte 143,232,120,194,255,8
-.byte 143,232,120,194,228,8
-.byte 143,232,120,194,237,8
-.byte 143,232,120,194,246,8
- vpaddd %xmm7,%xmm14,%xmm14
- vpaddd %xmm4,%xmm15,%xmm15
- vpaddd %xmm5,%xmm12,%xmm12
- vpaddd %xmm6,%xmm13,%xmm13
- vpxor %xmm1,%xmm14,%xmm1
- vpxor %xmm2,%xmm15,%xmm2
- vpxor %xmm12,%xmm3,%xmm3
- vpxor %xmm13,%xmm0,%xmm0
-.byte 143,232,120,194,201,7
-.byte 143,232,120,194,210,7
-.byte 143,232,120,194,219,7
-.byte 143,232,120,194,192,7
- decl %eax
- jnz .Loop4xop
-
- vpaddd 64(%rsp),%xmm8,%xmm8
- vpaddd 80(%rsp),%xmm9,%xmm9
- vpaddd 96(%rsp),%xmm10,%xmm10
- vpaddd 112(%rsp),%xmm11,%xmm11
-
- vmovdqa %xmm14,32(%rsp)
- vmovdqa %xmm15,48(%rsp)
-
- vpunpckldq %xmm9,%xmm8,%xmm14
- vpunpckldq %xmm11,%xmm10,%xmm15
- vpunpckhdq %xmm9,%xmm8,%xmm8
- vpunpckhdq %xmm11,%xmm10,%xmm10
- vpunpcklqdq %xmm15,%xmm14,%xmm9
- vpunpckhqdq %xmm15,%xmm14,%xmm14
- vpunpcklqdq %xmm10,%xmm8,%xmm11
- vpunpckhqdq %xmm10,%xmm8,%xmm8
- vpaddd 128-256(%rcx),%xmm0,%xmm0
- vpaddd 144-256(%rcx),%xmm1,%xmm1
- vpaddd 160-256(%rcx),%xmm2,%xmm2
- vpaddd 176-256(%rcx),%xmm3,%xmm3
-
- vmovdqa %xmm9,0(%rsp)
- vmovdqa %xmm14,16(%rsp)
- vmovdqa 32(%rsp),%xmm9
- vmovdqa 48(%rsp),%xmm14
-
- vpunpckldq %xmm1,%xmm0,%xmm10
- vpunpckldq %xmm3,%xmm2,%xmm15
- vpunpckhdq %xmm1,%xmm0,%xmm0
- vpunpckhdq %xmm3,%xmm2,%xmm2
- vpunpcklqdq %xmm15,%xmm10,%xmm1
- vpunpckhqdq %xmm15,%xmm10,%xmm10
- vpunpcklqdq %xmm2,%xmm0,%xmm3
- vpunpckhqdq %xmm2,%xmm0,%xmm0
- vpaddd 192-256(%rcx),%xmm12,%xmm12
- vpaddd 208-256(%rcx),%xmm13,%xmm13
- vpaddd 224-256(%rcx),%xmm9,%xmm9
- vpaddd 240-256(%rcx),%xmm14,%xmm14
-
- vpunpckldq %xmm13,%xmm12,%xmm2
- vpunpckldq %xmm14,%xmm9,%xmm15
- vpunpckhdq %xmm13,%xmm12,%xmm12
- vpunpckhdq %xmm14,%xmm9,%xmm9
- vpunpcklqdq %xmm15,%xmm2,%xmm13
- vpunpckhqdq %xmm15,%xmm2,%xmm2
- vpunpcklqdq %xmm9,%xmm12,%xmm14
- vpunpckhqdq %xmm9,%xmm12,%xmm12
- vpaddd 256-256(%rcx),%xmm4,%xmm4
- vpaddd 272-256(%rcx),%xmm5,%xmm5
- vpaddd 288-256(%rcx),%xmm6,%xmm6
- vpaddd 304-256(%rcx),%xmm7,%xmm7
-
- vpunpckldq %xmm5,%xmm4,%xmm9
- vpunpckldq %xmm7,%xmm6,%xmm15
- vpunpckhdq %xmm5,%xmm4,%xmm4
- vpunpckhdq %xmm7,%xmm6,%xmm6
- vpunpcklqdq %xmm15,%xmm9,%xmm5
- vpunpckhqdq %xmm15,%xmm9,%xmm9
- vpunpcklqdq %xmm6,%xmm4,%xmm7
- vpunpckhqdq %xmm6,%xmm4,%xmm4
- vmovdqa 0(%rsp),%xmm6
- vmovdqa 16(%rsp),%xmm15
-
- cmpq $256,%rdx
- jb .Ltail4xop
-
- vpxor 0(%rsi),%xmm6,%xmm6
- vpxor 16(%rsi),%xmm1,%xmm1
- vpxor 32(%rsi),%xmm13,%xmm13
- vpxor 48(%rsi),%xmm5,%xmm5
- vpxor 64(%rsi),%xmm15,%xmm15
- vpxor 80(%rsi),%xmm10,%xmm10
- vpxor 96(%rsi),%xmm2,%xmm2
- vpxor 112(%rsi),%xmm9,%xmm9
- leaq 128(%rsi),%rsi
- vpxor 0(%rsi),%xmm11,%xmm11
- vpxor 16(%rsi),%xmm3,%xmm3
- vpxor 32(%rsi),%xmm14,%xmm14
- vpxor 48(%rsi),%xmm7,%xmm7
- vpxor 64(%rsi),%xmm8,%xmm8
- vpxor 80(%rsi),%xmm0,%xmm0
- vpxor 96(%rsi),%xmm12,%xmm12
- vpxor 112(%rsi),%xmm4,%xmm4
- leaq 128(%rsi),%rsi
-
- vmovdqu %xmm6,0(%rdi)
- vmovdqu %xmm1,16(%rdi)
- vmovdqu %xmm13,32(%rdi)
- vmovdqu %xmm5,48(%rdi)
- vmovdqu %xmm15,64(%rdi)
- vmovdqu %xmm10,80(%rdi)
- vmovdqu %xmm2,96(%rdi)
- vmovdqu %xmm9,112(%rdi)
- leaq 128(%rdi),%rdi
- vmovdqu %xmm11,0(%rdi)
- vmovdqu %xmm3,16(%rdi)
- vmovdqu %xmm14,32(%rdi)
- vmovdqu %xmm7,48(%rdi)
- vmovdqu %xmm8,64(%rdi)
- vmovdqu %xmm0,80(%rdi)
- vmovdqu %xmm12,96(%rdi)
- vmovdqu %xmm4,112(%rdi)
- leaq 128(%rdi),%rdi
-
- subq $256,%rdx
- jnz .Loop_outer4xop
-
- jmp .Ldone4xop
-
-.align 32
-.Ltail4xop:
- cmpq $192,%rdx
- jae .L192_or_more4xop
- cmpq $128,%rdx
- jae .L128_or_more4xop
- cmpq $64,%rdx
- jae .L64_or_more4xop
-
- xorq %r10,%r10
- vmovdqa %xmm6,0(%rsp)
- vmovdqa %xmm1,16(%rsp)
- vmovdqa %xmm13,32(%rsp)
- vmovdqa %xmm5,48(%rsp)
- jmp .Loop_tail4xop
-
-.align 32
-.L64_or_more4xop:
- vpxor 0(%rsi),%xmm6,%xmm6
- vpxor 16(%rsi),%xmm1,%xmm1
- vpxor 32(%rsi),%xmm13,%xmm13
- vpxor 48(%rsi),%xmm5,%xmm5
- vmovdqu %xmm6,0(%rdi)
- vmovdqu %xmm1,16(%rdi)
- vmovdqu %xmm13,32(%rdi)
- vmovdqu %xmm5,48(%rdi)
- je .Ldone4xop
-
- leaq 64(%rsi),%rsi
- vmovdqa %xmm15,0(%rsp)
- xorq %r10,%r10
- vmovdqa %xmm10,16(%rsp)
- leaq 64(%rdi),%rdi
- vmovdqa %xmm2,32(%rsp)
- subq $64,%rdx
- vmovdqa %xmm9,48(%rsp)
- jmp .Loop_tail4xop
-
-.align 32
-.L128_or_more4xop:
- vpxor 0(%rsi),%xmm6,%xmm6
- vpxor 16(%rsi),%xmm1,%xmm1
- vpxor 32(%rsi),%xmm13,%xmm13
- vpxor 48(%rsi),%xmm5,%xmm5
- vpxor 64(%rsi),%xmm15,%xmm15
- vpxor 80(%rsi),%xmm10,%xmm10
- vpxor 96(%rsi),%xmm2,%xmm2
- vpxor 112(%rsi),%xmm9,%xmm9
-
- vmovdqu %xmm6,0(%rdi)
- vmovdqu %xmm1,16(%rdi)
- vmovdqu %xmm13,32(%rdi)
- vmovdqu %xmm5,48(%rdi)
- vmovdqu %xmm15,64(%rdi)
- vmovdqu %xmm10,80(%rdi)
- vmovdqu %xmm2,96(%rdi)
- vmovdqu %xmm9,112(%rdi)
- je .Ldone4xop
-
- leaq 128(%rsi),%rsi
- vmovdqa %xmm11,0(%rsp)
- xorq %r10,%r10
- vmovdqa %xmm3,16(%rsp)
- leaq 128(%rdi),%rdi
- vmovdqa %xmm14,32(%rsp)
- subq $128,%rdx
- vmovdqa %xmm7,48(%rsp)
- jmp .Loop_tail4xop
+.L4x_epilogue:
+ ret
+ENDPROC(chacha20_ssse3)
+#endif /* CONFIG_AS_SSSE3 */
+#ifdef CONFIG_AS_AVX2
.align 32
-.L192_or_more4xop:
- vpxor 0(%rsi),%xmm6,%xmm6
- vpxor 16(%rsi),%xmm1,%xmm1
- vpxor 32(%rsi),%xmm13,%xmm13
- vpxor 48(%rsi),%xmm5,%xmm5
- vpxor 64(%rsi),%xmm15,%xmm15
- vpxor 80(%rsi),%xmm10,%xmm10
- vpxor 96(%rsi),%xmm2,%xmm2
- vpxor 112(%rsi),%xmm9,%xmm9
- leaq 128(%rsi),%rsi
- vpxor 0(%rsi),%xmm11,%xmm11
- vpxor 16(%rsi),%xmm3,%xmm3
- vpxor 32(%rsi),%xmm14,%xmm14
- vpxor 48(%rsi),%xmm7,%xmm7
-
- vmovdqu %xmm6,0(%rdi)
- vmovdqu %xmm1,16(%rdi)
- vmovdqu %xmm13,32(%rdi)
- vmovdqu %xmm5,48(%rdi)
- vmovdqu %xmm15,64(%rdi)
- vmovdqu %xmm10,80(%rdi)
- vmovdqu %xmm2,96(%rdi)
- vmovdqu %xmm9,112(%rdi)
- leaq 128(%rdi),%rdi
- vmovdqu %xmm11,0(%rdi)
- vmovdqu %xmm3,16(%rdi)
- vmovdqu %xmm14,32(%rdi)
- vmovdqu %xmm7,48(%rdi)
- je .Ldone4xop
-
- leaq 64(%rsi),%rsi
- vmovdqa %xmm8,0(%rsp)
- xorq %r10,%r10
- vmovdqa %xmm0,16(%rsp)
- leaq 64(%rdi),%rdi
- vmovdqa %xmm12,32(%rsp)
- subq $192,%rdx
- vmovdqa %xmm4,48(%rsp)
-
-.Loop_tail4xop:
- movzbl (%rsi,%r10,1),%eax
- movzbl (%rsp,%r10,1),%ecx
- leaq 1(%r10),%r10
- xorl %ecx,%eax
- movb %al,-1(%rdi,%r10,1)
- decq %rdx
- jnz .Loop_tail4xop
+ENTRY(chacha20_avx2)
+.Lchacha20_avx2:
+ cmpq $0,%rdx
+ je .L8x_epilogue
+ leaq 8(%rsp),%r10
-.Ldone4xop:
- vzeroupper
- leaq (%r9),%rsp
-.cfi_def_cfa_register %rsp
-.L4xop_epilogue:
- .byte 0xf3,0xc3
-.cfi_endproc
-.size ChaCha20_4xop,.-ChaCha20_4xop
-.type ChaCha20_8x,@function
-.align 32
-ChaCha20_8x:
-.cfi_startproc
-.LChaCha20_8x:
- movq %rsp,%r9
-.cfi_def_cfa_register %r9
subq $0x280+8,%rsp
andq $-32,%rsp
vzeroupper
-
-
-
-
-
-
-
-
-
vbroadcasti128 .Lsigma(%rip),%ymm11
vbroadcasti128 (%rcx),%ymm3
vbroadcasti128 16(%rcx),%ymm15
vbroadcasti128 (%r8),%ymm7
leaq 256(%rsp),%rcx
leaq 512(%rsp),%rax
- leaq .Lrot16(%rip),%r10
+ leaq .Lrot16(%rip),%r9
leaq .Lrot24(%rip),%r11
vpshufd $0x00,%ymm11,%ymm8
@@ -1684,7 +889,7 @@ ChaCha20_8x:
.Loop_enter8x:
vmovdqa %ymm14,64(%rsp)
vmovdqa %ymm15,96(%rsp)
- vbroadcasti128 (%r10),%ymm15
+ vbroadcasti128 (%r9),%ymm15
vmovdqa %ymm4,512-512(%rax)
movl $10,%eax
jmp .Loop8x
@@ -1719,7 +924,7 @@ ChaCha20_8x:
vpslld $7,%ymm0,%ymm15
vpsrld $25,%ymm0,%ymm0
vpor %ymm0,%ymm15,%ymm0
- vbroadcasti128 (%r10),%ymm15
+ vbroadcasti128 (%r9),%ymm15
vpaddd %ymm5,%ymm13,%ymm13
vpxor %ymm1,%ymm13,%ymm1
vpslld $7,%ymm1,%ymm14
@@ -1757,7 +962,7 @@ ChaCha20_8x:
vpslld $7,%ymm2,%ymm15
vpsrld $25,%ymm2,%ymm2
vpor %ymm2,%ymm15,%ymm2
- vbroadcasti128 (%r10),%ymm15
+ vbroadcasti128 (%r9),%ymm15
vpaddd %ymm7,%ymm13,%ymm13
vpxor %ymm3,%ymm13,%ymm3
vpslld $7,%ymm3,%ymm14
@@ -1791,7 +996,7 @@ ChaCha20_8x:
vpslld $7,%ymm1,%ymm15
vpsrld $25,%ymm1,%ymm1
vpor %ymm1,%ymm15,%ymm1
- vbroadcasti128 (%r10),%ymm15
+ vbroadcasti128 (%r9),%ymm15
vpaddd %ymm4,%ymm13,%ymm13
vpxor %ymm2,%ymm13,%ymm2
vpslld $7,%ymm2,%ymm14
@@ -1829,7 +1034,7 @@ ChaCha20_8x:
vpslld $7,%ymm3,%ymm15
vpsrld $25,%ymm3,%ymm3
vpor %ymm3,%ymm15,%ymm3
- vbroadcasti128 (%r10),%ymm15
+ vbroadcasti128 (%r9),%ymm15
vpaddd %ymm6,%ymm13,%ymm13
vpxor %ymm0,%ymm13,%ymm0
vpslld $7,%ymm0,%ymm14
@@ -1983,7 +1188,7 @@ ChaCha20_8x:
cmpq $64,%rdx
jae .L64_or_more8x
- xorq %r10,%r10
+ xorq %r9,%r9
vmovdqa %ymm6,0(%rsp)
vmovdqa %ymm8,32(%rsp)
jmp .Loop_tail8x
@@ -1997,7 +1202,7 @@ ChaCha20_8x:
je .Ldone8x
leaq 64(%rsi),%rsi
- xorq %r10,%r10
+ xorq %r9,%r9
vmovdqa %ymm1,0(%rsp)
leaq 64(%rdi),%rdi
subq $64,%rdx
@@ -2017,7 +1222,7 @@ ChaCha20_8x:
je .Ldone8x
leaq 128(%rsi),%rsi
- xorq %r10,%r10
+ xorq %r9,%r9
vmovdqa %ymm12,0(%rsp)
leaq 128(%rdi),%rdi
subq $128,%rdx
@@ -2041,7 +1246,7 @@ ChaCha20_8x:
je .Ldone8x
leaq 192(%rsi),%rsi
- xorq %r10,%r10
+ xorq %r9,%r9
vmovdqa %ymm10,0(%rsp)
leaq 192(%rdi),%rdi
subq $192,%rdx
@@ -2069,7 +1274,7 @@ ChaCha20_8x:
je .Ldone8x
leaq 256(%rsi),%rsi
- xorq %r10,%r10
+ xorq %r9,%r9
vmovdqa %ymm14,0(%rsp)
leaq 256(%rdi),%rdi
subq $256,%rdx
@@ -2101,7 +1306,7 @@ ChaCha20_8x:
je .Ldone8x
leaq 320(%rsi),%rsi
- xorq %r10,%r10
+ xorq %r9,%r9
vmovdqa %ymm3,0(%rsp)
leaq 320(%rdi),%rdi
subq $320,%rdx
@@ -2137,7 +1342,7 @@ ChaCha20_8x:
je .Ldone8x
leaq 384(%rsi),%rsi
- xorq %r10,%r10
+ xorq %r9,%r9
vmovdqa %ymm11,0(%rsp)
leaq 384(%rdi),%rdi
subq $384,%rdx
@@ -2177,40 +1382,43 @@ ChaCha20_8x:
je .Ldone8x
leaq 448(%rsi),%rsi
- xorq %r10,%r10
+ xorq %r9,%r9
vmovdqa %ymm0,0(%rsp)
leaq 448(%rdi),%rdi
subq $448,%rdx
vmovdqa %ymm4,32(%rsp)
.Loop_tail8x:
- movzbl (%rsi,%r10,1),%eax
- movzbl (%rsp,%r10,1),%ecx
- leaq 1(%r10),%r10
+ movzbl (%rsi,%r9,1),%eax
+ movzbl (%rsp,%r9,1),%ecx
+ leaq 1(%r9),%r9
xorl %ecx,%eax
- movb %al,-1(%rdi,%r10,1)
+ movb %al,-1(%rdi,%r9,1)
decq %rdx
jnz .Loop_tail8x
.Ldone8x:
vzeroall
- leaq (%r9),%rsp
-.cfi_def_cfa_register %rsp
+ leaq -8(%r10),%rsp
+
.L8x_epilogue:
- .byte 0xf3,0xc3
-.cfi_endproc
-.size ChaCha20_8x,.-ChaCha20_8x
-.type ChaCha20_avx512,@function
+ ret
+ENDPROC(chacha20_avx2)
+#endif /* CONFIG_AS_AVX2 */
+
+#ifdef CONFIG_AS_AVX512
.align 32
-ChaCha20_avx512:
-.cfi_startproc
-.LChaCha20_avx512:
- movq %rsp,%r9
-.cfi_def_cfa_register %r9
+ENTRY(chacha20_avx512)
+.Lchacha20_avx512:
+ cmpq $0,%rdx
+ je .Lavx512_epilogue
+ leaq 8(%rsp),%r10
+
cmpq $512,%rdx
- ja .LChaCha20_16x
+ ja .Lchacha20_16x
subq $64+8,%rsp
+ andq $-64,%rsp
vbroadcasti32x4 .Lsigma(%rip),%zmm0
vbroadcasti32x4 (%rcx),%zmm1
vbroadcasti32x4 16(%rcx),%zmm2
@@ -2385,181 +1593,25 @@ ChaCha20_avx512:
decq %rdx
jnz .Loop_tail_avx512
- vmovdqu32 %zmm16,0(%rsp)
+ vmovdqa32 %zmm16,0(%rsp)
.Ldone_avx512:
vzeroall
- leaq (%r9),%rsp
-.cfi_def_cfa_register %rsp
-.Lavx512_epilogue:
- .byte 0xf3,0xc3
-.cfi_endproc
-.size ChaCha20_avx512,.-ChaCha20_avx512
-.type ChaCha20_avx512vl,@function
-.align 32
-ChaCha20_avx512vl:
-.cfi_startproc
-.LChaCha20_avx512vl:
- movq %rsp,%r9
-.cfi_def_cfa_register %r9
- cmpq $128,%rdx
- ja .LChaCha20_8xvl
-
- subq $64+8,%rsp
- vbroadcasti128 .Lsigma(%rip),%ymm0
- vbroadcasti128 (%rcx),%ymm1
- vbroadcasti128 16(%rcx),%ymm2
- vbroadcasti128 (%r8),%ymm3
+ leaq -8(%r10),%rsp
- vmovdqa32 %ymm0,%ymm16
- vmovdqa32 %ymm1,%ymm17
- vmovdqa32 %ymm2,%ymm18
- vpaddd .Lzeroz(%rip),%ymm3,%ymm3
- vmovdqa32 .Ltwoy(%rip),%ymm20
- movq $10,%r8
- vmovdqa32 %ymm3,%ymm19
- jmp .Loop_avx512vl
-
-.align 16
-.Loop_outer_avx512vl:
- vmovdqa32 %ymm18,%ymm2
- vpaddd %ymm20,%ymm19,%ymm3
- movq $10,%r8
- vmovdqa32 %ymm3,%ymm19
- jmp .Loop_avx512vl
+.Lavx512_epilogue:
+ ret
.align 32
-.Loop_avx512vl:
- vpaddd %ymm1,%ymm0,%ymm0
- vpxor %ymm0,%ymm3,%ymm3
- vprold $16,%ymm3,%ymm3
- vpaddd %ymm3,%ymm2,%ymm2
- vpxor %ymm2,%ymm1,%ymm1
- vprold $12,%ymm1,%ymm1
- vpaddd %ymm1,%ymm0,%ymm0
- vpxor %ymm0,%ymm3,%ymm3
- vprold $8,%ymm3,%ymm3
- vpaddd %ymm3,%ymm2,%ymm2
- vpxor %ymm2,%ymm1,%ymm1
- vprold $7,%ymm1,%ymm1
- vpshufd $78,%ymm2,%ymm2
- vpshufd $57,%ymm1,%ymm1
- vpshufd $147,%ymm3,%ymm3
- vpaddd %ymm1,%ymm0,%ymm0
- vpxor %ymm0,%ymm3,%ymm3
- vprold $16,%ymm3,%ymm3
- vpaddd %ymm3,%ymm2,%ymm2
- vpxor %ymm2,%ymm1,%ymm1
- vprold $12,%ymm1,%ymm1
- vpaddd %ymm1,%ymm0,%ymm0
- vpxor %ymm0,%ymm3,%ymm3
- vprold $8,%ymm3,%ymm3
- vpaddd %ymm3,%ymm2,%ymm2
- vpxor %ymm2,%ymm1,%ymm1
- vprold $7,%ymm1,%ymm1
- vpshufd $78,%ymm2,%ymm2
- vpshufd $147,%ymm1,%ymm1
- vpshufd $57,%ymm3,%ymm3
- decq %r8
- jnz .Loop_avx512vl
- vpaddd %ymm16,%ymm0,%ymm0
- vpaddd %ymm17,%ymm1,%ymm1
- vpaddd %ymm18,%ymm2,%ymm2
- vpaddd %ymm19,%ymm3,%ymm3
-
- subq $64,%rdx
- jb .Ltail64_avx512vl
-
- vpxor 0(%rsi),%xmm0,%xmm4
- vpxor 16(%rsi),%xmm1,%xmm5
- vpxor 32(%rsi),%xmm2,%xmm6
- vpxor 48(%rsi),%xmm3,%xmm7
- leaq 64(%rsi),%rsi
-
- vmovdqu %xmm4,0(%rdi)
- vmovdqu %xmm5,16(%rdi)
- vmovdqu %xmm6,32(%rdi)
- vmovdqu %xmm7,48(%rdi)
- leaq 64(%rdi),%rdi
-
- jz .Ldone_avx512vl
-
- vextracti128 $1,%ymm0,%xmm4
- vextracti128 $1,%ymm1,%xmm5
- vextracti128 $1,%ymm2,%xmm6
- vextracti128 $1,%ymm3,%xmm7
-
- subq $64,%rdx
- jb .Ltail_avx512vl
-
- vpxor 0(%rsi),%xmm4,%xmm4
- vpxor 16(%rsi),%xmm5,%xmm5
- vpxor 32(%rsi),%xmm6,%xmm6
- vpxor 48(%rsi),%xmm7,%xmm7
- leaq 64(%rsi),%rsi
+.Lchacha20_16x:
+ leaq 8(%rsp),%r10
- vmovdqu %xmm4,0(%rdi)
- vmovdqu %xmm5,16(%rdi)
- vmovdqu %xmm6,32(%rdi)
- vmovdqu %xmm7,48(%rdi)
- leaq 64(%rdi),%rdi
-
- vmovdqa32 %ymm16,%ymm0
- vmovdqa32 %ymm17,%ymm1
- jnz .Loop_outer_avx512vl
-
- jmp .Ldone_avx512vl
-
-.align 16
-.Ltail64_avx512vl:
- vmovdqa %xmm0,0(%rsp)
- vmovdqa %xmm1,16(%rsp)
- vmovdqa %xmm2,32(%rsp)
- vmovdqa %xmm3,48(%rsp)
- addq $64,%rdx
- jmp .Loop_tail_avx512vl
-
-.align 16
-.Ltail_avx512vl:
- vmovdqa %xmm4,0(%rsp)
- vmovdqa %xmm5,16(%rsp)
- vmovdqa %xmm6,32(%rsp)
- vmovdqa %xmm7,48(%rsp)
- addq $64,%rdx
-
-.Loop_tail_avx512vl:
- movzbl (%rsi,%r8,1),%eax
- movzbl (%rsp,%r8,1),%ecx
- leaq 1(%r8),%r8
- xorl %ecx,%eax
- movb %al,-1(%rdi,%r8,1)
- decq %rdx
- jnz .Loop_tail_avx512vl
-
- vmovdqu32 %ymm16,0(%rsp)
- vmovdqu32 %ymm16,32(%rsp)
-
-.Ldone_avx512vl:
- vzeroall
- leaq (%r9),%rsp
-.cfi_def_cfa_register %rsp
-.Lavx512vl_epilogue:
- .byte 0xf3,0xc3
-.cfi_endproc
-.size ChaCha20_avx512vl,.-ChaCha20_avx512vl
-.type ChaCha20_16x,@function
-.align 32
-ChaCha20_16x:
-.cfi_startproc
-.LChaCha20_16x:
- movq %rsp,%r9
-.cfi_def_cfa_register %r9
subq $64+8,%rsp
andq $-64,%rsp
vzeroupper
- leaq .Lsigma(%rip),%r10
- vbroadcasti32x4 (%r10),%zmm3
+ leaq .Lsigma(%rip),%r9
+ vbroadcasti32x4 (%r9),%zmm3
vbroadcasti32x4 (%rcx),%zmm7
vbroadcasti32x4 16(%rcx),%zmm11
vbroadcasti32x4 (%r8),%zmm15
@@ -2606,10 +1658,10 @@ ChaCha20_16x:
.align 32
.Loop_outer16x:
- vpbroadcastd 0(%r10),%zmm0
- vpbroadcastd 4(%r10),%zmm1
- vpbroadcastd 8(%r10),%zmm2
- vpbroadcastd 12(%r10),%zmm3
+ vpbroadcastd 0(%r9),%zmm0
+ vpbroadcastd 4(%r9),%zmm1
+ vpbroadcastd 8(%r9),%zmm2
+ vpbroadcastd 12(%r9),%zmm3
vpaddd .Lsixteen(%rip),%zmm28,%zmm28
vmovdqa64 %zmm20,%zmm4
vmovdqa64 %zmm21,%zmm5
@@ -2865,7 +1917,7 @@ ChaCha20_16x:
.align 32
.Ltail16x:
- xorq %r10,%r10
+ xorq %r9,%r9
subq %rsi,%rdi
cmpq $64,%rdx
jb .Less_than_64_16x
@@ -2993,11 +2045,11 @@ ChaCha20_16x:
andq $63,%rdx
.Loop_tail16x:
- movzbl (%rsi,%r10,1),%eax
- movzbl (%rsp,%r10,1),%ecx
- leaq 1(%r10),%r10
+ movzbl (%rsi,%r9,1),%eax
+ movzbl (%rsp,%r9,1),%ecx
+ leaq 1(%r9),%r9
xorl %ecx,%eax
- movb %al,-1(%rdi,%r10,1)
+ movb %al,-1(%rdi,%r9,1)
decq %rdx
jnz .Loop_tail16x
@@ -3006,25 +2058,172 @@ ChaCha20_16x:
.Ldone16x:
vzeroall
- leaq (%r9),%rsp
-.cfi_def_cfa_register %rsp
+ leaq -8(%r10),%rsp
+
.L16x_epilogue:
- .byte 0xf3,0xc3
-.cfi_endproc
-.size ChaCha20_16x,.-ChaCha20_16x
-.type ChaCha20_8xvl,@function
+ ret
+ENDPROC(chacha20_avx512)
+
.align 32
-ChaCha20_8xvl:
-.cfi_startproc
-.LChaCha20_8xvl:
- movq %rsp,%r9
-.cfi_def_cfa_register %r9
+ENTRY(chacha20_avx512vl)
+ cmpq $0,%rdx
+ je .Lavx512vl_epilogue
+
+ leaq 8(%rsp),%r10
+
+ cmpq $128,%rdx
+ ja .Lchacha20_8xvl
+
+ subq $64+8,%rsp
+ andq $-64,%rsp
+ vbroadcasti128 .Lsigma(%rip),%ymm0
+ vbroadcasti128 (%rcx),%ymm1
+ vbroadcasti128 16(%rcx),%ymm2
+ vbroadcasti128 (%r8),%ymm3
+
+ vmovdqa32 %ymm0,%ymm16
+ vmovdqa32 %ymm1,%ymm17
+ vmovdqa32 %ymm2,%ymm18
+ vpaddd .Lzeroz(%rip),%ymm3,%ymm3
+ vmovdqa32 .Ltwoy(%rip),%ymm20
+ movq $10,%r8
+ vmovdqa32 %ymm3,%ymm19
+ jmp .Loop_avx512vl
+
+.align 16
+.Loop_outer_avx512vl:
+ vmovdqa32 %ymm18,%ymm2
+ vpaddd %ymm20,%ymm19,%ymm3
+ movq $10,%r8
+ vmovdqa32 %ymm3,%ymm19
+ jmp .Loop_avx512vl
+
+.align 32
+.Loop_avx512vl:
+ vpaddd %ymm1,%ymm0,%ymm0
+ vpxor %ymm0,%ymm3,%ymm3
+ vprold $16,%ymm3,%ymm3
+ vpaddd %ymm3,%ymm2,%ymm2
+ vpxor %ymm2,%ymm1,%ymm1
+ vprold $12,%ymm1,%ymm1
+ vpaddd %ymm1,%ymm0,%ymm0
+ vpxor %ymm0,%ymm3,%ymm3
+ vprold $8,%ymm3,%ymm3
+ vpaddd %ymm3,%ymm2,%ymm2
+ vpxor %ymm2,%ymm1,%ymm1
+ vprold $7,%ymm1,%ymm1
+ vpshufd $78,%ymm2,%ymm2
+ vpshufd $57,%ymm1,%ymm1
+ vpshufd $147,%ymm3,%ymm3
+ vpaddd %ymm1,%ymm0,%ymm0
+ vpxor %ymm0,%ymm3,%ymm3
+ vprold $16,%ymm3,%ymm3
+ vpaddd %ymm3,%ymm2,%ymm2
+ vpxor %ymm2,%ymm1,%ymm1
+ vprold $12,%ymm1,%ymm1
+ vpaddd %ymm1,%ymm0,%ymm0
+ vpxor %ymm0,%ymm3,%ymm3
+ vprold $8,%ymm3,%ymm3
+ vpaddd %ymm3,%ymm2,%ymm2
+ vpxor %ymm2,%ymm1,%ymm1
+ vprold $7,%ymm1,%ymm1
+ vpshufd $78,%ymm2,%ymm2
+ vpshufd $147,%ymm1,%ymm1
+ vpshufd $57,%ymm3,%ymm3
+ decq %r8
+ jnz .Loop_avx512vl
+ vpaddd %ymm16,%ymm0,%ymm0
+ vpaddd %ymm17,%ymm1,%ymm1
+ vpaddd %ymm18,%ymm2,%ymm2
+ vpaddd %ymm19,%ymm3,%ymm3
+
+ subq $64,%rdx
+ jb .Ltail64_avx512vl
+
+ vpxor 0(%rsi),%xmm0,%xmm4
+ vpxor 16(%rsi),%xmm1,%xmm5
+ vpxor 32(%rsi),%xmm2,%xmm6
+ vpxor 48(%rsi),%xmm3,%xmm7
+ leaq 64(%rsi),%rsi
+
+ vmovdqu %xmm4,0(%rdi)
+ vmovdqu %xmm5,16(%rdi)
+ vmovdqu %xmm6,32(%rdi)
+ vmovdqu %xmm7,48(%rdi)
+ leaq 64(%rdi),%rdi
+
+ jz .Ldone_avx512vl
+
+ vextracti128 $1,%ymm0,%xmm4
+ vextracti128 $1,%ymm1,%xmm5
+ vextracti128 $1,%ymm2,%xmm6
+ vextracti128 $1,%ymm3,%xmm7
+
+ subq $64,%rdx
+ jb .Ltail_avx512vl
+
+ vpxor 0(%rsi),%xmm4,%xmm4
+ vpxor 16(%rsi),%xmm5,%xmm5
+ vpxor 32(%rsi),%xmm6,%xmm6
+ vpxor 48(%rsi),%xmm7,%xmm7
+ leaq 64(%rsi),%rsi
+
+ vmovdqu %xmm4,0(%rdi)
+ vmovdqu %xmm5,16(%rdi)
+ vmovdqu %xmm6,32(%rdi)
+ vmovdqu %xmm7,48(%rdi)
+ leaq 64(%rdi),%rdi
+
+ vmovdqa32 %ymm16,%ymm0
+ vmovdqa32 %ymm17,%ymm1
+ jnz .Loop_outer_avx512vl
+
+ jmp .Ldone_avx512vl
+
+.align 16
+.Ltail64_avx512vl:
+ vmovdqa %xmm0,0(%rsp)
+ vmovdqa %xmm1,16(%rsp)
+ vmovdqa %xmm2,32(%rsp)
+ vmovdqa %xmm3,48(%rsp)
+ addq $64,%rdx
+ jmp .Loop_tail_avx512vl
+
+.align 16
+.Ltail_avx512vl:
+ vmovdqa %xmm4,0(%rsp)
+ vmovdqa %xmm5,16(%rsp)
+ vmovdqa %xmm6,32(%rsp)
+ vmovdqa %xmm7,48(%rsp)
+ addq $64,%rdx
+
+.Loop_tail_avx512vl:
+ movzbl (%rsi,%r8,1),%eax
+ movzbl (%rsp,%r8,1),%ecx
+ leaq 1(%r8),%r8
+ xorl %ecx,%eax
+ movb %al,-1(%rdi,%r8,1)
+ decq %rdx
+ jnz .Loop_tail_avx512vl
+
+ vmovdqa32 %ymm16,0(%rsp)
+ vmovdqa32 %ymm16,32(%rsp)
+
+.Ldone_avx512vl:
+ vzeroall
+ leaq -8(%r10),%rsp
+.Lavx512vl_epilogue:
+ ret
+
+.align 32
+.Lchacha20_8xvl:
+ leaq 8(%rsp),%r10
subq $64+8,%rsp
andq $-64,%rsp
vzeroupper
- leaq .Lsigma(%rip),%r10
- vbroadcasti128 (%r10),%ymm3
+ leaq .Lsigma(%rip),%r9
+ vbroadcasti128 (%r9),%ymm3
vbroadcasti128 (%rcx),%ymm7
vbroadcasti128 16(%rcx),%ymm11
vbroadcasti128 (%r8),%ymm15
@@ -3073,8 +2272,8 @@ ChaCha20_8xvl:
.Loop_outer8xvl:
- vpbroadcastd 8(%r10),%ymm2
- vpbroadcastd 12(%r10),%ymm3
+ vpbroadcastd 8(%r9),%ymm2
+ vpbroadcastd 12(%r9),%ymm3
vpaddd .Leight(%rip),%ymm28,%ymm28
vmovdqa64 %ymm20,%ymm4
vmovdqa64 %ymm21,%ymm5
@@ -3314,8 +2513,8 @@ ChaCha20_8xvl:
vmovdqu %ymm12,96(%rdi)
leaq (%rdi,%rax,1),%rdi
- vpbroadcastd 0(%r10),%ymm0
- vpbroadcastd 4(%r10),%ymm1
+ vpbroadcastd 0(%r9),%ymm0
+ vpbroadcastd 4(%r9),%ymm1
subq $512,%rdx
jnz .Loop_outer8xvl
@@ -3325,7 +2524,7 @@ ChaCha20_8xvl:
.align 32
.Ltail8xvl:
vmovdqa64 %ymm19,%ymm8
- xorq %r10,%r10
+ xorq %r9,%r9
subq %rsi,%rdi
cmpq $64,%rdx
jb .Less_than_64_8xvl
@@ -3411,11 +2610,11 @@ ChaCha20_8xvl:
andq $63,%rdx
.Loop_tail8xvl:
- movzbl (%rsi,%r10,1),%eax
- movzbl (%rsp,%r10,1),%ecx
- leaq 1(%r10),%r10
+ movzbl (%rsi,%r9,1),%eax
+ movzbl (%rsp,%r9,1),%ecx
+ leaq 1(%r9),%r9
xorl %ecx,%eax
- movb %al,-1(%rdi,%r10,1)
+ movb %al,-1(%rdi,%r9,1)
decq %rdx
jnz .Loop_tail8xvl
@@ -3425,9 +2624,9 @@ ChaCha20_8xvl:
.Ldone8xvl:
vzeroall
- leaq (%r9),%rsp
-.cfi_def_cfa_register %rsp
+ leaq -8(%r10),%rsp
.L8xvl_epilogue:
- .byte 0xf3,0xc3
-.cfi_endproc
-.size ChaCha20_8xvl,.-ChaCha20_8xvl
+ ret
+ENDPROC(chacha20_avx512vl)
+
+#endif /* CONFIG_AS_AVX512 */
diff --git a/lib/zinc/chacha20/chacha20.c b/lib/zinc/chacha20/chacha20.c
index 03209c15d1ca..22a21431c221 100644
--- a/lib/zinc/chacha20/chacha20.c
+++ b/lib/zinc/chacha20/chacha20.c
@@ -16,6 +16,9 @@
#include <linux/vmalloc.h>
#include <crypto/algapi.h> // For crypto_xor_cpy.
+#if defined(CONFIG_ZINC_ARCH_X86_64)
+#include "chacha20-x86_64-glue.c"
+#else
static bool *const chacha20_nobs[] __initconst = { };
static void __init chacha20_fpu_init(void)
{
@@ -33,6 +36,7 @@ static inline bool hchacha20_arch(u32 derived_key[CHACHA20_KEY_WORDS],
{
return false;
}
+#endif
#define QUARTER_ROUND(x, a, b, c, d) ( \
x[a] += x[b], \
--
2.19.0
* [PATCH net-next v7 07/28] zinc: import Andy Polyakov's ChaCha20 ARM and ARM64 implementations
[not found] <20181006025709.4019-1-Jason@zx2c4.com>
` (3 preceding siblings ...)
2018-10-06 2:56 ` [PATCH net-next v7 06/28] zinc: " Jason A. Donenfeld
@ 2018-10-06 2:56 ` Jason A. Donenfeld
2018-10-06 2:56 ` [PATCH net-next v7 08/28] zinc: port " Jason A. Donenfeld
` (18 subsequent siblings)
23 siblings, 0 replies; 28+ messages in thread
From: Jason A. Donenfeld @ 2018-10-06 2:56 UTC (permalink / raw)
To: linux-kernel, netdev, davem, gregkh
Cc: Jason A. Donenfeld, Andy Polyakov, Russell King, linux-arm-kernel,
Samuel Neves, Jean-Philippe Aumasson, Andy Lutomirski,
Andrew Morton, Linus Torvalds, kernel-hardening, linux-crypto
These NEON and non-NEON implementations come from Andy Polyakov's
CRYPTOGAMS code and are included here in raw form, without modification,
so that the subsequent commits that adapt them for the kernel show
exactly what has changed.
While this is CRYPTOGAMS code, the originating code happens to be the
same as OpenSSL's commit 87cc649f30aaf69b351701875b9dac07c29ce8a2.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Based-on-code-from: Andy Polyakov <appro@openssl.org>
Cc: Andy Polyakov <appro@openssl.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Samuel Neves <sneves@dei.uc.pt>
Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: kernel-hardening@lists.openwall.com
Cc: linux-crypto@vger.kernel.org
---
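Reviewer note (not part of the imported code): for readers following the
assembly below, here is a minimal plain-C sketch of the ChaCha quarter
round that the scalar and NEON loops compute, plus a check that the
.Lsigma words are the little-endian encoding of "expand 32-byte k". The
helper names rotl32, chacha_quarter_round and chacha_sigma are
hypothetical and chosen only for illustration; this is neither the
QUARTER_ROUND macro in chacha20.c nor a translation of the assembly,
which schedules and represents the same operations differently.

```c
#include <stdint.h>

/* Rotate a 32-bit word left by n bits (valid for 0 < n < 32). */
static inline uint32_t rotl32(uint32_t v, unsigned int n)
{
	return (v << n) | (v >> (32 - n));
}

/*
 * One ChaCha quarter round on the state words at indices a, b, c, d.
 * The add/xor/rotate sequence uses rotation amounts 16, 12, 8 and 7.
 * (Hypothetical helper, for illustration only.)
 */
static void chacha_quarter_round(uint32_t x[16], int a, int b, int c, int d)
{
	x[a] += x[b]; x[d] = rotl32(x[d] ^ x[a], 16);
	x[c] += x[d]; x[b] = rotl32(x[b] ^ x[c], 12);
	x[a] += x[b]; x[d] = rotl32(x[d] ^ x[a], 8);
	x[c] += x[d]; x[b] = rotl32(x[b] ^ x[c], 7);
}

/*
 * Build the four constant words from the ASCII string
 * "expand 32-byte k", reading each group of four bytes little-endian.
 * (Hypothetical helper, for illustration only.)
 */
static void chacha_sigma(uint32_t out[4])
{
	static const unsigned char sigma[] = "expand 32-byte k";
	int i;

	for (i = 0; i < 4; i++)
		out[i] = (uint32_t)sigma[4 * i] |
			 ((uint32_t)sigma[4 * i + 1] << 8) |
			 ((uint32_t)sigma[4 * i + 2] << 16) |
			 ((uint32_t)sigma[4 * i + 3] << 24);
}
```

A full double round applies the quarter round to the four columns
(0,4,8,12), (1,5,9,13), (2,6,10,14), (3,7,11,15) and then to the four
diagonals (0,5,10,15), (1,6,11,12), (2,7,8,13), (3,4,9,14); ten double
rounds give the twenty rounds of ChaCha20, which matches the loop counter
of 10 loaded before .Loop and .Loop_neon.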
lib/zinc/chacha20/chacha20-arm-cryptogams.S | 1440 ++++++++++++
lib/zinc/chacha20/chacha20-arm64-cryptogams.S | 1973 +++++++++++++++++
2 files changed, 3413 insertions(+)
create mode 100644 lib/zinc/chacha20/chacha20-arm-cryptogams.S
create mode 100644 lib/zinc/chacha20/chacha20-arm64-cryptogams.S
diff --git a/lib/zinc/chacha20/chacha20-arm-cryptogams.S b/lib/zinc/chacha20/chacha20-arm-cryptogams.S
new file mode 100644
index 000000000000..05a3a9e6e93f
--- /dev/null
+++ b/lib/zinc/chacha20/chacha20-arm-cryptogams.S
@@ -0,0 +1,1440 @@
+/* SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause */
+/*
+ * Copyright (C) 2006-2017 CRYPTOGAMS by <appro@openssl.org>. All Rights Reserved.
+ */
+
+#include "arm_arch.h"
+
+.text
+#if defined(__thumb2__) || defined(__clang__)
+.syntax unified
+#endif
+#if defined(__thumb2__)
+.thumb
+#else
+.code 32
+#endif
+
+#if defined(__thumb2__) || defined(__clang__)
+#define ldrhsb ldrbhs
+#endif
+
+.align 5
+.Lsigma:
+.long 0x61707865,0x3320646e,0x79622d32,0x6b206574 @ endian-neutral
+.Lone:
+.long 1,0,0,0
+.Lrot8:
+.long 0x02010003,0x06050407
+#if __ARM_MAX_ARCH__>=7
+.LOPENSSL_armcap:
+.word OPENSSL_armcap_P-.LChaCha20_ctr32
+#else
+.word -1
+#endif
+
+.globl ChaCha20_ctr32
+.type ChaCha20_ctr32,%function
+.align 5
+ChaCha20_ctr32:
+.LChaCha20_ctr32:
+ ldr r12,[sp,#0] @ pull pointer to counter and nonce
+ stmdb sp!,{r0-r2,r4-r11,lr}
+#if __ARM_ARCH__<7 && !defined(__thumb2__)
+ sub r14,pc,#16 @ ChaCha20_ctr32
+#else
+ adr r14,.LChaCha20_ctr32
+#endif
+ cmp r2,#0 @ len==0?
+#ifdef __thumb2__
+ itt eq
+#endif
+ addeq sp,sp,#4*3
+ beq .Lno_data
+#if __ARM_MAX_ARCH__>=7
+ cmp r2,#192 @ test len
+ bls .Lshort
+ ldr r4,[r14,#-24]
+ ldr r4,[r14,r4]
+# ifdef __APPLE__
+ ldr r4,[r4]
+# endif
+ tst r4,#ARMV7_NEON
+ bne .LChaCha20_neon
+.Lshort:
+#endif
+ ldmia r12,{r4-r7} @ load counter and nonce
+ sub sp,sp,#4*(16) @ off-load area
+ sub r14,r14,#64 @ .Lsigma
+ stmdb sp!,{r4-r7} @ copy counter and nonce
+ ldmia r3,{r4-r11} @ load key
+ ldmia r14,{r0-r3} @ load sigma
+ stmdb sp!,{r4-r11} @ copy key
+ stmdb sp!,{r0-r3} @ copy sigma
+ str r10,[sp,#4*(16+10)] @ off-load "rx"
+ str r11,[sp,#4*(16+11)] @ off-load "rx"
+ b .Loop_outer_enter
+
+.align 4
+.Loop_outer:
+ ldmia sp,{r0-r9} @ load key material
+ str r11,[sp,#4*(32+2)] @ save len
+ str r12, [sp,#4*(32+1)] @ save inp
+ str r14, [sp,#4*(32+0)] @ save out
+.Loop_outer_enter:
+ ldr r11, [sp,#4*(15)]
+ mov r4,r4,ror#19 @ twist b[0..3]
+ ldr r12,[sp,#4*(12)] @ modulo-scheduled load
+ mov r5,r5,ror#19
+ ldr r10, [sp,#4*(13)]
+ mov r6,r6,ror#19
+ ldr r14,[sp,#4*(14)]
+ mov r7,r7,ror#19
+ mov r11,r11,ror#8 @ twist d[0..3]
+ mov r12,r12,ror#8
+ mov r10,r10,ror#8
+ mov r14,r14,ror#8
+ str r11, [sp,#4*(16+15)]
+ mov r11,#10
+ b .Loop
+
+.align 4
+.Loop:
+ subs r11,r11,#1
+ add r0,r0,r4,ror#13
+ add r1,r1,r5,ror#13
+ eor r12,r0,r12,ror#24
+ eor r10,r1,r10,ror#24
+ add r8,r8,r12,ror#16
+ add r9,r9,r10,ror#16
+ eor r4,r8,r4,ror#13
+ eor r5,r9,r5,ror#13
+ add r0,r0,r4,ror#20
+ add r1,r1,r5,ror#20
+ eor r12,r0,r12,ror#16
+ eor r10,r1,r10,ror#16
+ add r8,r8,r12,ror#24
+ str r10,[sp,#4*(16+13)]
+ add r9,r9,r10,ror#24
+ ldr r10,[sp,#4*(16+15)]
+ str r8,[sp,#4*(16+8)]
+ eor r4,r4,r8,ror#12
+ str r9,[sp,#4*(16+9)]
+ eor r5,r5,r9,ror#12
+ ldr r8,[sp,#4*(16+10)]
+ add r2,r2,r6,ror#13
+ ldr r9,[sp,#4*(16+11)]
+ add r3,r3,r7,ror#13
+ eor r14,r2,r14,ror#24
+ eor r10,r3,r10,ror#24
+ add r8,r8,r14,ror#16
+ add r9,r9,r10,ror#16
+ eor r6,r8,r6,ror#13
+ eor r7,r9,r7,ror#13
+ add r2,r2,r6,ror#20
+ add r3,r3,r7,ror#20
+ eor r14,r2,r14,ror#16
+ eor r10,r3,r10,ror#16
+ add r8,r8,r14,ror#24
+ add r9,r9,r10,ror#24
+ eor r6,r6,r8,ror#12
+ eor r7,r7,r9,ror#12
+ add r0,r0,r5,ror#13
+ add r1,r1,r6,ror#13
+ eor r10,r0,r10,ror#24
+ eor r12,r1,r12,ror#24
+ add r8,r8,r10,ror#16
+ add r9,r9,r12,ror#16
+ eor r5,r8,r5,ror#13
+ eor r6,r9,r6,ror#13
+ add r0,r0,r5,ror#20
+ add r1,r1,r6,ror#20
+ eor r10,r0,r10,ror#16
+ eor r12,r1,r12,ror#16
+ str r10,[sp,#4*(16+15)]
+ add r8,r8,r10,ror#24
+ ldr r10,[sp,#4*(16+13)]
+ add r9,r9,r12,ror#24
+ str r8,[sp,#4*(16+10)]
+ eor r5,r5,r8,ror#12
+ str r9,[sp,#4*(16+11)]
+ eor r6,r6,r9,ror#12
+ ldr r8,[sp,#4*(16+8)]
+ add r2,r2,r7,ror#13
+ ldr r9,[sp,#4*(16+9)]
+ add r3,r3,r4,ror#13
+ eor r10,r2,r10,ror#24
+ eor r14,r3,r14,ror#24
+ add r8,r8,r10,ror#16
+ add r9,r9,r14,ror#16
+ eor r7,r8,r7,ror#13
+ eor r4,r9,r4,ror#13
+ add r2,r2,r7,ror#20
+ add r3,r3,r4,ror#20
+ eor r10,r2,r10,ror#16
+ eor r14,r3,r14,ror#16
+ add r8,r8,r10,ror#24
+ add r9,r9,r14,ror#24
+ eor r7,r7,r8,ror#12
+ eor r4,r4,r9,ror#12
+ bne .Loop
+
+ ldr r11,[sp,#4*(32+2)] @ load len
+
+ str r8, [sp,#4*(16+8)] @ modulo-scheduled store
+ str r9, [sp,#4*(16+9)]
+ str r12,[sp,#4*(16+12)]
+ str r10, [sp,#4*(16+13)]
+ str r14,[sp,#4*(16+14)]
+
+ @ at this point we have first half of 512-bit result in
+ @ rx and second half at sp+4*(16+8)
+
+ cmp r11,#64 @ done yet?
+#ifdef __thumb2__
+ itete lo
+#endif
+ addlo r12,sp,#4*(0) @ shortcut or ...
+ ldrhs r12,[sp,#4*(32+1)] @ ... load inp
+ addlo r14,sp,#4*(0) @ shortcut or ...
+ ldrhs r14,[sp,#4*(32+0)] @ ... load out
+
+ ldr r8,[sp,#4*(0)] @ load key material
+ ldr r9,[sp,#4*(1)]
+
+#if __ARM_ARCH__>=6 || !defined(__ARMEB__)
+# if __ARM_ARCH__<7
+ orr r10,r12,r14
+ tst r10,#3 @ are input and output aligned?
+ ldr r10,[sp,#4*(2)]
+ bne .Lunaligned
+ cmp r11,#64 @ restore flags
+# else
+ ldr r10,[sp,#4*(2)]
+# endif
+ ldr r11,[sp,#4*(3)]
+
+ add r0,r0,r8 @ accumulate key material
+ add r1,r1,r9
+# ifdef __thumb2__
+ itt hs
+# endif
+ ldrhs r8,[r12],#16 @ load input
+ ldrhs r9,[r12,#-12]
+
+ add r2,r2,r10
+ add r3,r3,r11
+# ifdef __thumb2__
+ itt hs
+# endif
+ ldrhs r10,[r12,#-8]
+ ldrhs r11,[r12,#-4]
+# if __ARM_ARCH__>=6 && defined(__ARMEB__)
+ rev r0,r0
+ rev r1,r1
+ rev r2,r2
+ rev r3,r3
+# endif
+# ifdef __thumb2__
+ itt hs
+# endif
+ eorhs r0,r0,r8 @ xor with input
+ eorhs r1,r1,r9
+ add r8,sp,#4*(4)
+ str r0,[r14],#16 @ store output
+# ifdef __thumb2__
+ itt hs
+# endif
+ eorhs r2,r2,r10
+ eorhs r3,r3,r11
+ ldmia r8,{r8-r11} @ load key material
+ str r1,[r14,#-12]
+ str r2,[r14,#-8]
+ str r3,[r14,#-4]
+
+ add r4,r8,r4,ror#13 @ accumulate key material
+ add r5,r9,r5,ror#13
+# ifdef __thumb2__
+ itt hs
+# endif
+ ldrhs r8,[r12],#16 @ load input
+ ldrhs r9,[r12,#-12]
+ add r6,r10,r6,ror#13
+ add r7,r11,r7,ror#13
+# ifdef __thumb2__
+ itt hs
+# endif
+ ldrhs r10,[r12,#-8]
+ ldrhs r11,[r12,#-4]
+# if __ARM_ARCH__>=6 && defined(__ARMEB__)
+ rev r4,r4
+ rev r5,r5
+ rev r6,r6
+ rev r7,r7
+# endif
+# ifdef __thumb2__
+ itt hs
+# endif
+ eorhs r4,r4,r8
+ eorhs r5,r5,r9
+ add r8,sp,#4*(8)
+ str r4,[r14],#16 @ store output
+# ifdef __thumb2__
+ itt hs
+# endif
+ eorhs r6,r6,r10
+ eorhs r7,r7,r11
+ str r5,[r14,#-12]
+ ldmia r8,{r8-r11} @ load key material
+ str r6,[r14,#-8]
+ add r0,sp,#4*(16+8)
+ str r7,[r14,#-4]
+
+ ldmia r0,{r0-r7} @ load second half
+
+ add r0,r0,r8 @ accumulate key material
+ add r1,r1,r9
+# ifdef __thumb2__
+ itt hs
+# endif
+ ldrhs r8,[r12],#16 @ load input
+ ldrhs r9,[r12,#-12]
+# ifdef __thumb2__
+ itt hi
+# endif
+ strhi r10,[sp,#4*(16+10)] @ copy "rx" while at it
+ strhi r11,[sp,#4*(16+11)] @ copy "rx" while at it
+ add r2,r2,r10
+ add r3,r3,r11
+# ifdef __thumb2__
+ itt hs
+# endif
+ ldrhs r10,[r12,#-8]
+ ldrhs r11,[r12,#-4]
+# if __ARM_ARCH__>=6 && defined(__ARMEB__)
+ rev r0,r0
+ rev r1,r1
+ rev r2,r2
+ rev r3,r3
+# endif
+# ifdef __thumb2__
+ itt hs
+# endif
+ eorhs r0,r0,r8
+ eorhs r1,r1,r9
+ add r8,sp,#4*(12)
+ str r0,[r14],#16 @ store output
+# ifdef __thumb2__
+ itt hs
+# endif
+ eorhs r2,r2,r10
+ eorhs r3,r3,r11
+ str r1,[r14,#-12]
+ ldmia r8,{r8-r11} @ load key material
+ str r2,[r14,#-8]
+ str r3,[r14,#-4]
+
+ add r4,r8,r4,ror#24 @ accumulate key material
+ add r5,r9,r5,ror#24
+# ifdef __thumb2__
+ itt hi
+# endif
+ addhi r8,r8,#1 @ next counter value
+ strhi r8,[sp,#4*(12)] @ save next counter value
+# ifdef __thumb2__
+ itt hs
+# endif
+ ldrhs r8,[r12],#16 @ load input
+ ldrhs r9,[r12,#-12]
+ add r6,r10,r6,ror#24
+ add r7,r11,r7,ror#24
+# ifdef __thumb2__
+ itt hs
+# endif
+ ldrhs r10,[r12,#-8]
+ ldrhs r11,[r12,#-4]
+# if __ARM_ARCH__>=6 && defined(__ARMEB__)
+ rev r4,r4
+ rev r5,r5
+ rev r6,r6
+ rev r7,r7
+# endif
+# ifdef __thumb2__
+ itt hs
+# endif
+ eorhs r4,r4,r8
+ eorhs r5,r5,r9
+# ifdef __thumb2__
+ it ne
+# endif
+ ldrne r8,[sp,#4*(32+2)] @ re-load len
+# ifdef __thumb2__
+ itt hs
+# endif
+ eorhs r6,r6,r10
+ eorhs r7,r7,r11
+ str r4,[r14],#16 @ store output
+ str r5,[r14,#-12]
+# ifdef __thumb2__
+ it hs
+# endif
+ subhs r11,r8,#64 @ len-=64
+ str r6,[r14,#-8]
+ str r7,[r14,#-4]
+ bhi .Loop_outer
+
+ beq .Ldone
+# if __ARM_ARCH__<7
+ b .Ltail
+
+.align 4
+.Lunaligned: @ unaligned endian-neutral path
+ cmp r11,#64 @ restore flags
+# endif
+#endif
+#if __ARM_ARCH__<7
+ ldr r11,[sp,#4*(3)]
+ add r0,r8,r0 @ accumulate key material
+ add r1,r9,r1
+ add r2,r10,r2
+# ifdef __thumb2__
+ itete lo
+# endif
+ eorlo r8,r8,r8 @ zero or ...
+ ldrhsb r8,[r12],#16 @ ... load input
+ eorlo r9,r9,r9
+ ldrhsb r9,[r12,#-12]
+
+ add r3,r11,r3
+# ifdef __thumb2__
+ itete lo
+# endif
+ eorlo r10,r10,r10
+ ldrhsb r10,[r12,#-8]
+ eorlo r11,r11,r11
+ ldrhsb r11,[r12,#-4]
+
+ eor r0,r8,r0 @ xor with input (or zero)
+ eor r1,r9,r1
+# ifdef __thumb2__
+ itt hs
+# endif
+ ldrhsb r8,[r12,#-15] @ load more input
+ ldrhsb r9,[r12,#-11]
+ eor r2,r10,r2
+ strb r0,[r14],#16 @ store output
+ eor r3,r11,r3
+# ifdef __thumb2__
+ itt hs
+# endif
+ ldrhsb r10,[r12,#-7]
+ ldrhsb r11,[r12,#-3]
+ strb r1,[r14,#-12]
+ eor r0,r8,r0,lsr#8
+ strb r2,[r14,#-8]
+ eor r1,r9,r1,lsr#8
+# ifdef __thumb2__
+ itt hs
+# endif
+ ldrhsb r8,[r12,#-14] @ load more input
+ ldrhsb r9,[r12,#-10]
+ strb r3,[r14,#-4]
+ eor r2,r10,r2,lsr#8
+ strb r0,[r14,#-15]
+ eor r3,r11,r3,lsr#8
+# ifdef __thumb2__
+ itt hs
+# endif
+ ldrhsb r10,[r12,#-6]
+ ldrhsb r11,[r12,#-2]
+ strb r1,[r14,#-11]
+ eor r0,r8,r0,lsr#8
+ strb r2,[r14,#-7]
+ eor r1,r9,r1,lsr#8
+# ifdef __thumb2__
+ itt hs
+# endif
+ ldrhsb r8,[r12,#-13] @ load more input
+ ldrhsb r9,[r12,#-9]
+ strb r3,[r14,#-3]
+ eor r2,r10,r2,lsr#8
+ strb r0,[r14,#-14]
+ eor r3,r11,r3,lsr#8
+# ifdef __thumb2__
+ itt hs
+# endif
+ ldrhsb r10,[r12,#-5]
+ ldrhsb r11,[r12,#-1]
+ strb r1,[r14,#-10]
+ strb r2,[r14,#-6]
+ eor r0,r8,r0,lsr#8
+ strb r3,[r14,#-2]
+ eor r1,r9,r1,lsr#8
+ strb r0,[r14,#-13]
+ eor r2,r10,r2,lsr#8
+ strb r1,[r14,#-9]
+ eor r3,r11,r3,lsr#8
+ strb r2,[r14,#-5]
+ strb r3,[r14,#-1]
+ add r8,sp,#4*(4+0)
+ ldmia r8,{r8-r11} @ load key material
+ add r0,sp,#4*(16+8)
+ add r4,r8,r4,ror#13 @ accumulate key material
+ add r5,r9,r5,ror#13
+ add r6,r10,r6,ror#13
+# ifdef __thumb2__
+ itete lo
+# endif
+ eorlo r8,r8,r8 @ zero or ...
+ ldrhsb r8,[r12],#16 @ ... load input
+ eorlo r9,r9,r9
+ ldrhsb r9,[r12,#-12]
+
+ add r7,r11,r7,ror#13
+# ifdef __thumb2__
+ itete lo
+# endif
+ eorlo r10,r10,r10
+ ldrhsb r10,[r12,#-8]
+ eorlo r11,r11,r11
+ ldrhsb r11,[r12,#-4]
+
+ eor r4,r8,r4 @ xor with input (or zero)
+ eor r5,r9,r5
+# ifdef __thumb2__
+ itt hs
+# endif
+ ldrhsb r8,[r12,#-15] @ load more input
+ ldrhsb r9,[r12,#-11]
+ eor r6,r10,r6
+ strb r4,[r14],#16 @ store output
+ eor r7,r11,r7
+# ifdef __thumb2__
+ itt hs
+# endif
+ ldrhsb r10,[r12,#-7]
+ ldrhsb r11,[r12,#-3]
+ strb r5,[r14,#-12]
+ eor r4,r8,r4,lsr#8
+ strb r6,[r14,#-8]
+ eor r5,r9,r5,lsr#8
+# ifdef __thumb2__
+ itt hs
+# endif
+ ldrhsb r8,[r12,#-14] @ load more input
+ ldrhsb r9,[r12,#-10]
+ strb r7,[r14,#-4]
+ eor r6,r10,r6,lsr#8
+ strb r4,[r14,#-15]
+ eor r7,r11,r7,lsr#8
+# ifdef __thumb2__
+ itt hs
+# endif
+ ldrhsb r10,[r12,#-6]
+ ldrhsb r11,[r12,#-2]
+ strb r5,[r14,#-11]
+ eor r4,r8,r4,lsr#8
+ strb r6,[r14,#-7]
+ eor r5,r9,r5,lsr#8
+# ifdef __thumb2__
+ itt hs
+# endif
+ ldrhsb r8,[r12,#-13] @ load more input
+ ldrhsb r9,[r12,#-9]
+ strb r7,[r14,#-3]
+ eor r6,r10,r6,lsr#8
+ strb r4,[r14,#-14]
+ eor r7,r11,r7,lsr#8
+# ifdef __thumb2__
+ itt hs
+# endif
+ ldrhsb r10,[r12,#-5]
+ ldrhsb r11,[r12,#-1]
+ strb r5,[r14,#-10]
+ strb r6,[r14,#-6]
+ eor r4,r8,r4,lsr#8
+ strb r7,[r14,#-2]
+ eor r5,r9,r5,lsr#8
+ strb r4,[r14,#-13]
+ eor r6,r10,r6,lsr#8
+ strb r5,[r14,#-9]
+ eor r7,r11,r7,lsr#8
+ strb r6,[r14,#-5]
+ strb r7,[r14,#-1]
+ add r8,sp,#4*(4+4)
+ ldmia r8,{r8-r11} @ load key material
+ ldmia r0,{r0-r7} @ load second half
+# ifdef __thumb2__
+ itt hi
+# endif
+ strhi r10,[sp,#4*(16+10)] @ copy "rx"
+ strhi r11,[sp,#4*(16+11)] @ copy "rx"
+ add r0,r8,r0 @ accumulate key material
+ add r1,r9,r1
+ add r2,r10,r2
+# ifdef __thumb2__
+ itete lo
+# endif
+ eorlo r8,r8,r8 @ zero or ...
+ ldrhsb r8,[r12],#16 @ ... load input
+ eorlo r9,r9,r9
+ ldrhsb r9,[r12,#-12]
+
+ add r3,r11,r3
+# ifdef __thumb2__
+ itete lo
+# endif
+ eorlo r10,r10,r10
+ ldrhsb r10,[r12,#-8]
+ eorlo r11,r11,r11
+ ldrhsb r11,[r12,#-4]
+
+ eor r0,r8,r0 @ xor with input (or zero)
+ eor r1,r9,r1
+# ifdef __thumb2__
+ itt hs
+# endif
+ ldrhsb r8,[r12,#-15] @ load more input
+ ldrhsb r9,[r12,#-11]
+ eor r2,r10,r2
+ strb r0,[r14],#16 @ store output
+ eor r3,r11,r3
+# ifdef __thumb2__
+ itt hs
+# endif
+ ldrhsb r10,[r12,#-7]
+ ldrhsb r11,[r12,#-3]
+ strb r1,[r14,#-12]
+ eor r0,r8,r0,lsr#8
+ strb r2,[r14,#-8]
+ eor r1,r9,r1,lsr#8
+# ifdef __thumb2__
+ itt hs
+# endif
+ ldrhsb r8,[r12,#-14] @ load more input
+ ldrhsb r9,[r12,#-10]
+ strb r3,[r14,#-4]
+ eor r2,r10,r2,lsr#8
+ strb r0,[r14,#-15]
+ eor r3,r11,r3,lsr#8
+# ifdef __thumb2__
+ itt hs
+# endif
+ ldrhsb r10,[r12,#-6]
+ ldrhsb r11,[r12,#-2]
+ strb r1,[r14,#-11]
+ eor r0,r8,r0,lsr#8
+ strb r2,[r14,#-7]
+ eor r1,r9,r1,lsr#8
+# ifdef __thumb2__
+ itt hs
+# endif
+ ldrhsb r8,[r12,#-13] @ load more input
+ ldrhsb r9,[r12,#-9]
+ strb r3,[r14,#-3]
+ eor r2,r10,r2,lsr#8
+ strb r0,[r14,#-14]
+ eor r3,r11,r3,lsr#8
+# ifdef __thumb2__
+ itt hs
+# endif
+ ldrhsb r10,[r12,#-5]
+ ldrhsb r11,[r12,#-1]
+ strb r1,[r14,#-10]
+ strb r2,[r14,#-6]
+ eor r0,r8,r0,lsr#8
+ strb r3,[r14,#-2]
+ eor r1,r9,r1,lsr#8
+ strb r0,[r14,#-13]
+ eor r2,r10,r2,lsr#8
+ strb r1,[r14,#-9]
+ eor r3,r11,r3,lsr#8
+ strb r2,[r14,#-5]
+ strb r3,[r14,#-1]
+ add r8,sp,#4*(4+8)
+ ldmia r8,{r8-r11} @ load key material
+ add r4,r8,r4,ror#24 @ accumulate key material
+# ifdef __thumb2__
+ itt hi
+# endif
+ addhi r8,r8,#1 @ next counter value
+ strhi r8,[sp,#4*(12)] @ save next counter value
+ add r5,r9,r5,ror#24
+ add r6,r10,r6,ror#24
+# ifdef __thumb2__
+ itete lo
+# endif
+ eorlo r8,r8,r8 @ zero or ...
+ ldrhsb r8,[r12],#16 @ ... load input
+ eorlo r9,r9,r9
+ ldrhsb r9,[r12,#-12]
+
+ add r7,r11,r7,ror#24
+# ifdef __thumb2__
+ itete lo
+# endif
+ eorlo r10,r10,r10
+ ldrhsb r10,[r12,#-8]
+ eorlo r11,r11,r11
+ ldrhsb r11,[r12,#-4]
+
+ eor r4,r8,r4 @ xor with input (or zero)
+ eor r5,r9,r5
+# ifdef __thumb2__
+ itt hs
+# endif
+ ldrhsb r8,[r12,#-15] @ load more input
+ ldrhsb r9,[r12,#-11]
+ eor r6,r10,r6
+ strb r4,[r14],#16 @ store output
+ eor r7,r11,r7
+# ifdef __thumb2__
+ itt hs
+# endif
+ ldrhsb r10,[r12,#-7]
+ ldrhsb r11,[r12,#-3]
+ strb r5,[r14,#-12]
+ eor r4,r8,r4,lsr#8
+ strb r6,[r14,#-8]
+ eor r5,r9,r5,lsr#8
+# ifdef __thumb2__
+ itt hs
+# endif
+ ldrhsb r8,[r12,#-14] @ load more input
+ ldrhsb r9,[r12,#-10]
+ strb r7,[r14,#-4]
+ eor r6,r10,r6,lsr#8
+ strb r4,[r14,#-15]
+ eor r7,r11,r7,lsr#8
+# ifdef __thumb2__
+ itt hs
+# endif
+ ldrhsb r10,[r12,#-6]
+ ldrhsb r11,[r12,#-2]
+ strb r5,[r14,#-11]
+ eor r4,r8,r4,lsr#8
+ strb r6,[r14,#-7]
+ eor r5,r9,r5,lsr#8
+# ifdef __thumb2__
+ itt hs
+# endif
+ ldrhsb r8,[r12,#-13] @ load more input
+ ldrhsb r9,[r12,#-9]
+ strb r7,[r14,#-3]
+ eor r6,r10,r6,lsr#8
+ strb r4,[r14,#-14]
+ eor r7,r11,r7,lsr#8
+# ifdef __thumb2__
+ itt hs
+# endif
+ ldrhsb r10,[r12,#-5]
+ ldrhsb r11,[r12,#-1]
+ strb r5,[r14,#-10]
+ strb r6,[r14,#-6]
+ eor r4,r8,r4,lsr#8
+ strb r7,[r14,#-2]
+ eor r5,r9,r5,lsr#8
+ strb r4,[r14,#-13]
+ eor r6,r10,r6,lsr#8
+ strb r5,[r14,#-9]
+ eor r7,r11,r7,lsr#8
+ strb r6,[r14,#-5]
+ strb r7,[r14,#-1]
+# ifdef __thumb2__
+ it ne
+# endif
+ ldrne r8,[sp,#4*(32+2)] @ re-load len
+# ifdef __thumb2__
+ it hs
+# endif
+ subhs r11,r8,#64 @ len-=64
+ bhi .Loop_outer
+
+ beq .Ldone
+#endif
+
+.Ltail:
+ ldr r12,[sp,#4*(32+1)] @ load inp
+ add r9,sp,#4*(0)
+ ldr r14,[sp,#4*(32+0)] @ load out
+
+.Loop_tail:
+ ldrb r10,[r9],#1 @ read buffer on stack
+ ldrb r11,[r12],#1 @ read input
+ subs r8,r8,#1
+ eor r11,r11,r10
+ strb r11,[r14],#1 @ store output
+ bne .Loop_tail
+
+.Ldone:
+ add sp,sp,#4*(32+3)
+.Lno_data:
+ ldmia sp!,{r4-r11,pc}
+.size ChaCha20_ctr32,.-ChaCha20_ctr32
+#if __ARM_MAX_ARCH__>=7
+.arch armv7-a
+.fpu neon
+
+.type ChaCha20_neon,%function
+.align 5
+ChaCha20_neon:
+ ldr r12,[sp,#0] @ pull pointer to counter and nonce
+ stmdb sp!,{r0-r2,r4-r11,lr}
+.LChaCha20_neon:
+ adr r14,.Lsigma
+ vstmdb sp!,{d8-d15} @ ABI spec says so
+ stmdb sp!,{r0-r3}
+
+ vld1.32 {q1-q2},[r3] @ load key
+ ldmia r3,{r4-r11} @ load key
+
+ sub sp,sp,#4*(16+16)
+ vld1.32 {q3},[r12] @ load counter and nonce
+ add r12,sp,#4*8
+ ldmia r14,{r0-r3} @ load sigma
+ vld1.32 {q0},[r14]! @ load sigma
+ vld1.32 {q12},[r14]! @ one
+ @ vld1.32 {d30},[r14] @ rot8
+ vst1.32 {q2-q3},[r12] @ copy 1/2key|counter|nonce
+ vst1.32 {q0-q1},[sp] @ copy sigma|1/2key
+
+ str r10,[sp,#4*(16+10)] @ off-load "rx"
+ str r11,[sp,#4*(16+11)] @ off-load "rx"
+ vshl.i32 d26,d24,#1 @ two
+ vstr d24,[sp,#4*(16+0)]
+ vshl.i32 d28,d24,#2 @ four
+ vstr d26,[sp,#4*(16+2)]
+ vmov q4,q0
+ vstr d28,[sp,#4*(16+4)]
+ vmov q8,q0
+ @ vstr d30,[sp,#4*(16+6)]
+ vmov q5,q1
+ vmov q9,q1
+ b .Loop_neon_enter
+
+.align 4
+.Loop_neon_outer:
+ ldmia sp,{r0-r9} @ load key material
+ cmp r11,#64*2 @ if len<=64*2
+ bls .Lbreak_neon @ switch to integer-only
+ @ vldr d30,[sp,#4*(16+6)] @ rot8
+ vmov q4,q0
+ str r11,[sp,#4*(32+2)] @ save len
+ vmov q8,q0
+ str r12, [sp,#4*(32+1)] @ save inp
+ vmov q5,q1
+ str r14, [sp,#4*(32+0)] @ save out
+ vmov q9,q1
+.Loop_neon_enter:
+ ldr r11, [sp,#4*(15)]
+ mov r4,r4,ror#19 @ twist b[0..3]
+ vadd.i32 q7,q3,q12 @ counter+1
+ ldr r12,[sp,#4*(12)] @ modulo-scheduled load
+ mov r5,r5,ror#19
+ vmov q6,q2
+ ldr r10, [sp,#4*(13)]
+ mov r6,r6,ror#19
+ vmov q10,q2
+ ldr r14,[sp,#4*(14)]
+ mov r7,r7,ror#19
+ vadd.i32 q11,q7,q12 @ counter+2
+ add r12,r12,#3 @ counter+3
+ mov r11,r11,ror#8 @ twist d[0..3]
+ mov r12,r12,ror#8
+ mov r10,r10,ror#8
+ mov r14,r14,ror#8
+ str r11, [sp,#4*(16+15)]
+ mov r11,#10
+ b .Loop_neon
+
+.align 4
+.Loop_neon:
+ subs r11,r11,#1
+ vadd.i32 q0,q0,q1
+ add r0,r0,r4,ror#13
+ vadd.i32 q4,q4,q5
+ add r1,r1,r5,ror#13
+ vadd.i32 q8,q8,q9
+ eor r12,r0,r12,ror#24
+ veor q3,q3,q0
+ eor r10,r1,r10,ror#24
+ veor q7,q7,q4
+ add r8,r8,r12,ror#16
+ veor q11,q11,q8
+ add r9,r9,r10,ror#16
+ vrev32.16 q3,q3
+ eor r4,r8,r4,ror#13
+ vrev32.16 q7,q7
+ eor r5,r9,r5,ror#13
+ vrev32.16 q11,q11
+ add r0,r0,r4,ror#20
+ vadd.i32 q2,q2,q3
+ add r1,r1,r5,ror#20
+ vadd.i32 q6,q6,q7
+ eor r12,r0,r12,ror#16
+ vadd.i32 q10,q10,q11
+ eor r10,r1,r10,ror#16
+ veor q12,q1,q2
+ add r8,r8,r12,ror#24
+ veor q13,q5,q6
+ str r10,[sp,#4*(16+13)]
+ veor q14,q9,q10
+ add r9,r9,r10,ror#24
+ vshr.u32 q1,q12,#20
+ ldr r10,[sp,#4*(16+15)]
+ vshr.u32 q5,q13,#20
+ str r8,[sp,#4*(16+8)]
+ vshr.u32 q9,q14,#20
+ eor r4,r4,r8,ror#12
+ vsli.32 q1,q12,#12
+ str r9,[sp,#4*(16+9)]
+ vsli.32 q5,q13,#12
+ eor r5,r5,r9,ror#12
+ vsli.32 q9,q14,#12
+ ldr r8,[sp,#4*(16+10)]
+ vadd.i32 q0,q0,q1
+ add r2,r2,r6,ror#13
+ vadd.i32 q4,q4,q5
+ ldr r9,[sp,#4*(16+11)]
+ vadd.i32 q8,q8,q9
+ add r3,r3,r7,ror#13
+ veor q12,q3,q0
+ eor r14,r2,r14,ror#24
+ veor q13,q7,q4
+ eor r10,r3,r10,ror#24
+ veor q14,q11,q8
+ add r8,r8,r14,ror#16
+ vshr.u32 q3,q12,#24
+ add r9,r9,r10,ror#16
+ vshr.u32 q7,q13,#24
+ eor r6,r8,r6,ror#13
+ vshr.u32 q11,q14,#24
+ eor r7,r9,r7,ror#13
+ vsli.32 q3,q12,#8
+ add r2,r2,r6,ror#20
+ vsli.32 q7,q13,#8
+ add r3,r3,r7,ror#20
+ vsli.32 q11,q14,#8
+ eor r14,r2,r14,ror#16
+ vadd.i32 q2,q2,q3
+ eor r10,r3,r10,ror#16
+ vadd.i32 q6,q6,q7
+ add r8,r8,r14,ror#24
+ vadd.i32 q10,q10,q11
+ add r9,r9,r10,ror#24
+ veor q12,q1,q2
+ eor r6,r6,r8,ror#12
+ veor q13,q5,q6
+ eor r7,r7,r9,ror#12
+ veor q14,q9,q10
+ vshr.u32 q1,q12,#25
+ vshr.u32 q5,q13,#25
+ vshr.u32 q9,q14,#25
+ vsli.32 q1,q12,#7
+ vsli.32 q5,q13,#7
+ vsli.32 q9,q14,#7
+ vext.8 q2,q2,q2,#8
+ vext.8 q6,q6,q6,#8
+ vext.8 q10,q10,q10,#8
+ vext.8 q1,q1,q1,#4
+ vext.8 q5,q5,q5,#4
+ vext.8 q9,q9,q9,#4
+ vext.8 q3,q3,q3,#12
+ vext.8 q7,q7,q7,#12
+ vext.8 q11,q11,q11,#12
+ vadd.i32 q0,q0,q1
+ add r0,r0,r5,ror#13
+ vadd.i32 q4,q4,q5
+ add r1,r1,r6,ror#13
+ vadd.i32 q8,q8,q9
+ eor r10,r0,r10,ror#24
+ veor q3,q3,q0
+ eor r12,r1,r12,ror#24
+ veor q7,q7,q4
+ add r8,r8,r10,ror#16
+ veor q11,q11,q8
+ add r9,r9,r12,ror#16
+ vrev32.16 q3,q3
+ eor r5,r8,r5,ror#13
+ vrev32.16 q7,q7
+ eor r6,r9,r6,ror#13
+ vrev32.16 q11,q11
+ add r0,r0,r5,ror#20
+ vadd.i32 q2,q2,q3
+ add r1,r1,r6,ror#20
+ vadd.i32 q6,q6,q7
+ eor r10,r0,r10,ror#16
+ vadd.i32 q10,q10,q11
+ eor r12,r1,r12,ror#16
+ veor q12,q1,q2
+ str r10,[sp,#4*(16+15)]
+ veor q13,q5,q6
+ add r8,r8,r10,ror#24
+ veor q14,q9,q10
+ ldr r10,[sp,#4*(16+13)]
+ vshr.u32 q1,q12,#20
+ add r9,r9,r12,ror#24
+ vshr.u32 q5,q13,#20
+ str r8,[sp,#4*(16+10)]
+ vshr.u32 q9,q14,#20
+ eor r5,r5,r8,ror#12
+ vsli.32 q1,q12,#12
+ str r9,[sp,#4*(16+11)]
+ vsli.32 q5,q13,#12
+ eor r6,r6,r9,ror#12
+ vsli.32 q9,q14,#12
+ ldr r8,[sp,#4*(16+8)]
+ vadd.i32 q0,q0,q1
+ add r2,r2,r7,ror#13
+ vadd.i32 q4,q4,q5
+ ldr r9,[sp,#4*(16+9)]
+ vadd.i32 q8,q8,q9
+ add r3,r3,r4,ror#13
+ veor q12,q3,q0
+ eor r10,r2,r10,ror#24
+ veor q13,q7,q4
+ eor r14,r3,r14,ror#24
+ veor q14,q11,q8
+ add r8,r8,r10,ror#16
+ vshr.u32 q3,q12,#24
+ add r9,r9,r14,ror#16
+ vshr.u32 q7,q13,#24
+ eor r7,r8,r7,ror#13
+ vshr.u32 q11,q14,#24
+ eor r4,r9,r4,ror#13
+ vsli.32 q3,q12,#8
+ add r2,r2,r7,ror#20
+ vsli.32 q7,q13,#8
+ add r3,r3,r4,ror#20
+ vsli.32 q11,q14,#8
+ eor r10,r2,r10,ror#16
+ vadd.i32 q2,q2,q3
+ eor r14,r3,r14,ror#16
+ vadd.i32 q6,q6,q7
+ add r8,r8,r10,ror#24
+ vadd.i32 q10,q10,q11
+ add r9,r9,r14,ror#24
+ veor q12,q1,q2
+ eor r7,r7,r8,ror#12
+ veor q13,q5,q6
+ eor r4,r4,r9,ror#12
+ veor q14,q9,q10
+ vshr.u32 q1,q12,#25
+ vshr.u32 q5,q13,#25
+ vshr.u32 q9,q14,#25
+ vsli.32 q1,q12,#7
+ vsli.32 q5,q13,#7
+ vsli.32 q9,q14,#7
+ vext.8 q2,q2,q2,#8
+ vext.8 q6,q6,q6,#8
+ vext.8 q10,q10,q10,#8
+ vext.8 q1,q1,q1,#12
+ vext.8 q5,q5,q5,#12
+ vext.8 q9,q9,q9,#12
+ vext.8 q3,q3,q3,#4
+ vext.8 q7,q7,q7,#4
+ vext.8 q11,q11,q11,#4
+ bne .Loop_neon
+
+ add r11,sp,#32
+ vld1.32 {q12-q13},[sp] @ load key material
+ vld1.32 {q14-q15},[r11]
+
+ ldr r11,[sp,#4*(32+2)] @ load len
+
+ str r8, [sp,#4*(16+8)] @ modulo-scheduled store
+ str r9, [sp,#4*(16+9)]
+ str r12,[sp,#4*(16+12)]
+ str r10, [sp,#4*(16+13)]
+ str r14,[sp,#4*(16+14)]
+
+ @ at this point we have first half of 512-bit result in
+ @ rx and second half at sp+4*(16+8)
+
+ ldr r12,[sp,#4*(32+1)] @ load inp
+ ldr r14,[sp,#4*(32+0)] @ load out
+
+ vadd.i32 q0,q0,q12 @ accumulate key material
+ vadd.i32 q4,q4,q12
+ vadd.i32 q8,q8,q12
+ vldr d24,[sp,#4*(16+0)] @ one
+
+ vadd.i32 q1,q1,q13
+ vadd.i32 q5,q5,q13
+ vadd.i32 q9,q9,q13
+ vldr d26,[sp,#4*(16+2)] @ two
+
+ vadd.i32 q2,q2,q14
+ vadd.i32 q6,q6,q14
+ vadd.i32 q10,q10,q14
+ vadd.i32 d14,d14,d24 @ counter+1
+ vadd.i32 d22,d22,d26 @ counter+2
+
+ vadd.i32 q3,q3,q15
+ vadd.i32 q7,q7,q15
+ vadd.i32 q11,q11,q15
+
+ cmp r11,#64*4
+ blo .Ltail_neon
+
+ vld1.8 {q12-q13},[r12]! @ load input
+ mov r11,sp
+ vld1.8 {q14-q15},[r12]!
+ veor q0,q0,q12 @ xor with input
+ veor q1,q1,q13
+ vld1.8 {q12-q13},[r12]!
+ veor q2,q2,q14
+ veor q3,q3,q15
+ vld1.8 {q14-q15},[r12]!
+
+ veor q4,q4,q12
+ vst1.8 {q0-q1},[r14]! @ store output
+ veor q5,q5,q13
+ vld1.8 {q12-q13},[r12]!
+ veor q6,q6,q14
+ vst1.8 {q2-q3},[r14]!
+ veor q7,q7,q15
+ vld1.8 {q14-q15},[r12]!
+
+ veor q8,q8,q12
+ vld1.32 {q0-q1},[r11]! @ load for next iteration
+ veor d25,d25,d25
+ vldr d24,[sp,#4*(16+4)] @ four
+ veor q9,q9,q13
+ vld1.32 {q2-q3},[r11]
+ veor q10,q10,q14
+ vst1.8 {q4-q5},[r14]!
+ veor q11,q11,q15
+ vst1.8 {q6-q7},[r14]!
+
+ vadd.i32 d6,d6,d24 @ next counter value
+ vldr d24,[sp,#4*(16+0)] @ one
+
+ ldmia sp,{r8-r11} @ load key material
+ add r0,r0,r8 @ accumulate key material
+ ldr r8,[r12],#16 @ load input
+ vst1.8 {q8-q9},[r14]!
+ add r1,r1,r9
+ ldr r9,[r12,#-12]
+ vst1.8 {q10-q11},[r14]!
+ add r2,r2,r10
+ ldr r10,[r12,#-8]
+ add r3,r3,r11
+ ldr r11,[r12,#-4]
+# ifdef __ARMEB__
+ rev r0,r0
+ rev r1,r1
+ rev r2,r2
+ rev r3,r3
+# endif
+ eor r0,r0,r8 @ xor with input
+ add r8,sp,#4*(4)
+ eor r1,r1,r9
+ str r0,[r14],#16 @ store output
+ eor r2,r2,r10
+ str r1,[r14,#-12]
+ eor r3,r3,r11
+ ldmia r8,{r8-r11} @ load key material
+ str r2,[r14,#-8]
+ str r3,[r14,#-4]
+
+ add r4,r8,r4,ror#13 @ accumulate key material
+ ldr r8,[r12],#16 @ load input
+ add r5,r9,r5,ror#13
+ ldr r9,[r12,#-12]
+ add r6,r10,r6,ror#13
+ ldr r10,[r12,#-8]
+ add r7,r11,r7,ror#13
+ ldr r11,[r12,#-4]
+# ifdef __ARMEB__
+ rev r4,r4
+ rev r5,r5
+ rev r6,r6
+ rev r7,r7
+# endif
+ eor r4,r4,r8
+ add r8,sp,#4*(8)
+ eor r5,r5,r9
+ str r4,[r14],#16 @ store output
+ eor r6,r6,r10
+ str r5,[r14,#-12]
+ eor r7,r7,r11
+ ldmia r8,{r8-r11} @ load key material
+ str r6,[r14,#-8]
+ add r0,sp,#4*(16+8)
+ str r7,[r14,#-4]
+
+ ldmia r0,{r0-r7} @ load second half
+
+ add r0,r0,r8 @ accumulate key material
+ ldr r8,[r12],#16 @ load input
+ add r1,r1,r9
+ ldr r9,[r12,#-12]
+# ifdef __thumb2__
+ it hi
+# endif
+ strhi r10,[sp,#4*(16+10)] @ copy "rx" while at it
+ add r2,r2,r10
+ ldr r10,[r12,#-8]
+# ifdef __thumb2__
+ it hi
+# endif
+ strhi r11,[sp,#4*(16+11)] @ copy "rx" while at it
+ add r3,r3,r11
+ ldr r11,[r12,#-4]
+# ifdef __ARMEB__
+ rev r0,r0
+ rev r1,r1
+ rev r2,r2
+ rev r3,r3
+# endif
+ eor r0,r0,r8
+ add r8,sp,#4*(12)
+ eor r1,r1,r9
+ str r0,[r14],#16 @ store output
+ eor r2,r2,r10
+ str r1,[r14,#-12]
+ eor r3,r3,r11
+ ldmia r8,{r8-r11} @ load key material
+ str r2,[r14,#-8]
+ str r3,[r14,#-4]
+
+ add r4,r8,r4,ror#24 @ accumulate key material
+ add r8,r8,#4 @ next counter value
+ add r5,r9,r5,ror#24
+ str r8,[sp,#4*(12)] @ save next counter value
+ ldr r8,[r12],#16 @ load input
+ add r6,r10,r6,ror#24
+ add r4,r4,#3 @ counter+3
+ ldr r9,[r12,#-12]
+ add r7,r11,r7,ror#24
+ ldr r10,[r12,#-8]
+ ldr r11,[r12,#-4]
+# ifdef __ARMEB__
+ rev r4,r4
+ rev r5,r5
+ rev r6,r6
+ rev r7,r7
+# endif
+ eor r4,r4,r8
+# ifdef __thumb2__
+ it hi
+# endif
+ ldrhi r8,[sp,#4*(32+2)] @ re-load len
+ eor r5,r5,r9
+ eor r6,r6,r10
+ str r4,[r14],#16 @ store output
+ eor r7,r7,r11
+ str r5,[r14,#-12]
+ sub r11,r8,#64*4 @ len-=64*4
+ str r6,[r14,#-8]
+ str r7,[r14,#-4]
+ bhi .Loop_neon_outer
+
+ b .Ldone_neon
+
+.align 4
+.Lbreak_neon:
+ @ harmonize NEON and integer-only stack frames: load data
+ @ from NEON frame, but save to integer-only one; distance
+ @ between the two is 4*(32+4+16-32)=4*(20).
+
+ str r11, [sp,#4*(20+32+2)] @ save len
+ add r11,sp,#4*(32+4)
+ str r12, [sp,#4*(20+32+1)] @ save inp
+ str r14, [sp,#4*(20+32+0)] @ save out
+
+ ldr r12,[sp,#4*(16+10)]
+ ldr r14,[sp,#4*(16+11)]
+ vldmia r11,{d8-d15} @ fulfill ABI requirement
+ str r12,[sp,#4*(20+16+10)] @ copy "rx"
+ str r14,[sp,#4*(20+16+11)] @ copy "rx"
+
+ ldr r11, [sp,#4*(15)]
+ mov r4,r4,ror#19 @ twist b[0..3]
+ ldr r12,[sp,#4*(12)] @ modulo-scheduled load
+ mov r5,r5,ror#19
+ ldr r10, [sp,#4*(13)]
+ mov r6,r6,ror#19
+ ldr r14,[sp,#4*(14)]
+ mov r7,r7,ror#19
+ mov r11,r11,ror#8 @ twist d[0..3]
+ mov r12,r12,ror#8
+ mov r10,r10,ror#8
+ mov r14,r14,ror#8
+ str r11, [sp,#4*(20+16+15)]
+ add r11,sp,#4*(20)
+ vst1.32 {q0-q1},[r11]! @ copy key
+ add sp,sp,#4*(20) @ switch frame
+ vst1.32 {q2-q3},[r11]
+ mov r11,#10
+ b .Loop @ go integer-only
+
+.align 4
+.Ltail_neon:
+ cmp r11,#64*3
+ bhs .L192_or_more_neon
+ cmp r11,#64*2
+ bhs .L128_or_more_neon
+ cmp r11,#64*1
+ bhs .L64_or_more_neon
+
+ add r8,sp,#4*(8)
+ vst1.8 {q0-q1},[sp]
+ add r10,sp,#4*(0)
+ vst1.8 {q2-q3},[r8]
+ b .Loop_tail_neon
+
+.align 4
+.L64_or_more_neon:
+ vld1.8 {q12-q13},[r12]!
+ vld1.8 {q14-q15},[r12]!
+ veor q0,q0,q12
+ veor q1,q1,q13
+ veor q2,q2,q14
+ veor q3,q3,q15
+ vst1.8 {q0-q1},[r14]!
+ vst1.8 {q2-q3},[r14]!
+
+ beq .Ldone_neon
+
+ add r8,sp,#4*(8)
+ vst1.8 {q4-q5},[sp]
+ add r10,sp,#4*(0)
+ vst1.8 {q6-q7},[r8]
+ sub r11,r11,#64*1 @ len-=64*1
+ b .Loop_tail_neon
+
+.align 4
+.L128_or_more_neon:
+ vld1.8 {q12-q13},[r12]!
+ vld1.8 {q14-q15},[r12]!
+ veor q0,q0,q12
+ veor q1,q1,q13
+ vld1.8 {q12-q13},[r12]!
+ veor q2,q2,q14
+ veor q3,q3,q15
+ vld1.8 {q14-q15},[r12]!
+
+ veor q4,q4,q12
+ veor q5,q5,q13
+ vst1.8 {q0-q1},[r14]!
+ veor q6,q6,q14
+ vst1.8 {q2-q3},[r14]!
+ veor q7,q7,q15
+ vst1.8 {q4-q5},[r14]!
+ vst1.8 {q6-q7},[r14]!
+
+ beq .Ldone_neon
+
+ add r8,sp,#4*(8)
+ vst1.8 {q8-q9},[sp]
+ add r10,sp,#4*(0)
+ vst1.8 {q10-q11},[r8]
+ sub r11,r11,#64*2 @ len-=64*2
+ b .Loop_tail_neon
+
+.align 4
+.L192_or_more_neon:
+ vld1.8 {q12-q13},[r12]!
+ vld1.8 {q14-q15},[r12]!
+ veor q0,q0,q12
+ veor q1,q1,q13
+ vld1.8 {q12-q13},[r12]!
+ veor q2,q2,q14
+ veor q3,q3,q15
+ vld1.8 {q14-q15},[r12]!
+
+ veor q4,q4,q12
+ veor q5,q5,q13
+ vld1.8 {q12-q13},[r12]!
+ veor q6,q6,q14
+ vst1.8 {q0-q1},[r14]!
+ veor q7,q7,q15
+ vld1.8 {q14-q15},[r12]!
+
+ veor q8,q8,q12
+ vst1.8 {q2-q3},[r14]!
+ veor q9,q9,q13
+ vst1.8 {q4-q5},[r14]!
+ veor q10,q10,q14
+ vst1.8 {q6-q7},[r14]!
+ veor q11,q11,q15
+ vst1.8 {q8-q9},[r14]!
+ vst1.8 {q10-q11},[r14]!
+
+ beq .Ldone_neon
+
+ ldmia sp,{r8-r11} @ load key material
+ add r0,r0,r8 @ accumulate key material
+ add r8,sp,#4*(4)
+ add r1,r1,r9
+ add r2,r2,r10
+ add r3,r3,r11
+ ldmia r8,{r8-r11} @ load key material
+
+ add r4,r8,r4,ror#13 @ accumulate key material
+ add r8,sp,#4*(8)
+ add r5,r9,r5,ror#13
+ add r6,r10,r6,ror#13
+ add r7,r11,r7,ror#13
+ ldmia r8,{r8-r11} @ load key material
+# ifdef __ARMEB__
+ rev r0,r0
+ rev r1,r1
+ rev r2,r2
+ rev r3,r3
+ rev r4,r4
+ rev r5,r5
+ rev r6,r6
+ rev r7,r7
+# endif
+ stmia sp,{r0-r7}
+ add r0,sp,#4*(16+8)
+
+ ldmia r0,{r0-r7} @ load second half
+
+ add r0,r0,r8 @ accumulate key material
+ add r8,sp,#4*(12)
+ add r1,r1,r9
+ add r2,r2,r10
+ add r3,r3,r11
+ ldmia r8,{r8-r11} @ load key material
+
+ add r4,r8,r4,ror#24 @ accumulate key material
+ add r8,sp,#4*(8)
+ add r5,r9,r5,ror#24
+ add r4,r4,#3 @ counter+3
+ add r6,r10,r6,ror#24
+ add r7,r11,r7,ror#24
+ ldr r11,[sp,#4*(32+2)] @ re-load len
+# ifdef __ARMEB__
+ rev r0,r0
+ rev r1,r1
+ rev r2,r2
+ rev r3,r3
+ rev r4,r4
+ rev r5,r5
+ rev r6,r6
+ rev r7,r7
+# endif
+ stmia r8,{r0-r7}
+ add r10,sp,#4*(0)
+ sub r11,r11,#64*3 @ len-=64*3
+
+.Loop_tail_neon:
+ ldrb r8,[r10],#1 @ read buffer on stack
+ ldrb r9,[r12],#1 @ read input
+ subs r11,r11,#1
+ eor r8,r8,r9
+ strb r8,[r14],#1 @ store output
+ bne .Loop_tail_neon
+
+.Ldone_neon:
+ add sp,sp,#4*(32+4)
+ vldmia sp,{d8-d15}
+ add sp,sp,#4*(16+3)
+ ldmia sp!,{r4-r11,pc}
+.size ChaCha20_neon,.-ChaCha20_neon
+.comm OPENSSL_armcap_P,4,4
+#endif
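
Reading aid, not part of the patch: the assembly in this file and in the
arm64 file below interleaves scalar and NEON instruction streams, which makes
the underlying arithmetic hard to follow. The C sketch here restates the
ChaCha quarter round and double round that each stream computes. Each inner
loop iterates 10 times (mov r11,#10 / mov x4,#10), with each pass doing one
double round, for 20 rounds in total. In the arm64 scalar code the left
rotations by 16, 12, 8 and 7 appear as "ror" by 16, 20, 24 and 25; in the NEON
code they appear as rev32 on 16-bit lanes and as ushr/sli pairs. The function
names below are hypothetical and chosen only for illustration; the test
vector is the quarter-round example from RFC 7539, section 2.1.1.

```c
#include <stdint.h>

/* Rotate a 32-bit word left by n bits, for 0 < n < 32. */
static uint32_t rotl32(uint32_t v, unsigned int n)
{
	return (v << n) | (v >> (32 - n));
}

/* One ChaCha quarter round on words a, b, c, d of the 16-word state. */
void chacha_quarter_round(uint32_t x[16], int a, int b, int c, int d)
{
	x[a] += x[b]; x[d] = rotl32(x[d] ^ x[a], 16);
	x[c] += x[d]; x[b] = rotl32(x[b] ^ x[c], 12);
	x[a] += x[b]; x[d] = rotl32(x[d] ^ x[a], 8);
	x[c] += x[d]; x[b] = rotl32(x[b] ^ x[c], 7);
}

/* One double round: four column rounds, then four diagonal rounds. */
void chacha_double_round(uint32_t x[16])
{
	chacha_quarter_round(x, 0, 4,  8, 12);
	chacha_quarter_round(x, 1, 5,  9, 13);
	chacha_quarter_round(x, 2, 6, 10, 14);
	chacha_quarter_round(x, 3, 7, 11, 15);
	chacha_quarter_round(x, 0, 5, 10, 15);
	chacha_quarter_round(x, 1, 6, 11, 12);
	chacha_quarter_round(x, 2, 7,  8, 13);
	chacha_quarter_round(x, 3, 4,  9, 14);
}

/* Returns 1 if the quarter round reproduces the RFC 7539 2.1.1 vector. */
int chacha_quarter_round_selftest(void)
{
	uint32_t x[16] = { 0x11111111, 0x01020304, 0x9b8d6f43, 0x01234567 };

	chacha_quarter_round(x, 0, 1, 2, 3);
	return x[0] == 0xea2a92f4 && x[1] == 0xcb1cf8ce &&
	       x[2] == 0x4581472e && x[3] == 0x5881c4bb;
}

/*
 * Returns 1 if the two .quad values at .Lsigma, read byte by byte in
 * little-endian order, spell the ChaCha constant "expand 32-byte k".
 * Shifts are used instead of memcmp so the check is endian-neutral.
 */
int chacha_sigma_selftest(void)
{
	static const uint64_t sigma[2] = {
		0x3320646e61707865ULL, 0x6b20657479622d32ULL
	};
	static const char expected[] = "expand 32-byte k";
	int i;

	for (i = 0; i < 16; i++) {
		unsigned char byte = (sigma[i / 8] >> (8 * (i % 8))) & 0xff;

		if (byte != (unsigned char)expected[i])
			return 0;
	}
	return 1;
}
```

The sketch processes one block at a time and makes no attempt at the
scheduling the assembly does; it is meant only as a map from the instruction
sequences back to the four add/xor/rotate steps.
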
diff --git a/lib/zinc/chacha20/chacha20-arm64-cryptogams.S b/lib/zinc/chacha20/chacha20-arm64-cryptogams.S
new file mode 100644
index 000000000000..4d029bfdad3a
--- /dev/null
+++ b/lib/zinc/chacha20/chacha20-arm64-cryptogams.S
@@ -0,0 +1,1973 @@
+/* SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause */
+/*
+ * Copyright (C) 2006-2017 CRYPTOGAMS by <appro@openssl.org>. All Rights Reserved.
+ */
+
+#include "arm_arch.h"
+
+.text
+
+
+
+.align 5
+.Lsigma:
+.quad 0x3320646e61707865,0x6b20657479622d32 // endian-neutral
+.Lone:
+.long 1,0,0,0
+.LOPENSSL_armcap_P:
+#ifdef __ILP32__
+.long OPENSSL_armcap_P-.
+#else
+.quad OPENSSL_armcap_P-.
+#endif
+.byte 67,104,97,67,104,97,50,48,32,102,111,114,32,65,82,77,118,56,44,32,67,82,89,80,84,79,71,65,77,83,32,98,121,32,60,97,112,112,114,111,64,111,112,101,110,115,115,108,46,111,114,103,62,0
+.align 2
+
+.globl ChaCha20_ctr32
+.type ChaCha20_ctr32,%function
+.align 5
+ChaCha20_ctr32:
+ cbz x2,.Labort
+ adr x5,.LOPENSSL_armcap_P
+ cmp x2,#192
+ b.lo .Lshort
+#ifdef __ILP32__
+ ldrsw x6,[x5]
+#else
+ ldr x6,[x5]
+#endif
+ ldr w17,[x6,x5]
+ tst w17,#ARMV7_NEON
+ b.ne ChaCha20_neon
+
+.Lshort:
+ stp x29,x30,[sp,#-96]!
+ add x29,sp,#0
+
+ adr x5,.Lsigma
+ stp x19,x20,[sp,#16]
+ stp x21,x22,[sp,#32]
+ stp x23,x24,[sp,#48]
+ stp x25,x26,[sp,#64]
+ stp x27,x28,[sp,#80]
+ sub sp,sp,#64
+
+ ldp x22,x23,[x5] // load sigma
+ ldp x24,x25,[x3] // load key
+ ldp x26,x27,[x3,#16]
+ ldp x28,x30,[x4] // load counter
+#ifdef __ARMEB__
+ ror x24,x24,#32
+ ror x25,x25,#32
+ ror x26,x26,#32
+ ror x27,x27,#32
+ ror x28,x28,#32
+ ror x30,x30,#32
+#endif
+
+.Loop_outer:
+ mov w5,w22 // unpack key block
+ lsr x6,x22,#32
+ mov w7,w23
+ lsr x8,x23,#32
+ mov w9,w24
+ lsr x10,x24,#32
+ mov w11,w25
+ lsr x12,x25,#32
+ mov w13,w26
+ lsr x14,x26,#32
+ mov w15,w27
+ lsr x16,x27,#32
+ mov w17,w28
+ lsr x19,x28,#32
+ mov w20,w30
+ lsr x21,x30,#32
+
+ mov x4,#10
+ subs x2,x2,#64
+.Loop:
+ sub x4,x4,#1
+ add w5,w5,w9
+ add w6,w6,w10
+ add w7,w7,w11
+ add w8,w8,w12
+ eor w17,w17,w5
+ eor w19,w19,w6
+ eor w20,w20,w7
+ eor w21,w21,w8
+ ror w17,w17,#16
+ ror w19,w19,#16
+ ror w20,w20,#16
+ ror w21,w21,#16
+ add w13,w13,w17
+ add w14,w14,w19
+ add w15,w15,w20
+ add w16,w16,w21
+ eor w9,w9,w13
+ eor w10,w10,w14
+ eor w11,w11,w15
+ eor w12,w12,w16
+ ror w9,w9,#20
+ ror w10,w10,#20
+ ror w11,w11,#20
+ ror w12,w12,#20
+ add w5,w5,w9
+ add w6,w6,w10
+ add w7,w7,w11
+ add w8,w8,w12
+ eor w17,w17,w5
+ eor w19,w19,w6
+ eor w20,w20,w7
+ eor w21,w21,w8
+ ror w17,w17,#24
+ ror w19,w19,#24
+ ror w20,w20,#24
+ ror w21,w21,#24
+ add w13,w13,w17
+ add w14,w14,w19
+ add w15,w15,w20
+ add w16,w16,w21
+ eor w9,w9,w13
+ eor w10,w10,w14
+ eor w11,w11,w15
+ eor w12,w12,w16
+ ror w9,w9,#25
+ ror w10,w10,#25
+ ror w11,w11,#25
+ ror w12,w12,#25
+ add w5,w5,w10
+ add w6,w6,w11
+ add w7,w7,w12
+ add w8,w8,w9
+ eor w21,w21,w5
+ eor w17,w17,w6
+ eor w19,w19,w7
+ eor w20,w20,w8
+ ror w21,w21,#16
+ ror w17,w17,#16
+ ror w19,w19,#16
+ ror w20,w20,#16
+ add w15,w15,w21
+ add w16,w16,w17
+ add w13,w13,w19
+ add w14,w14,w20
+ eor w10,w10,w15
+ eor w11,w11,w16
+ eor w12,w12,w13
+ eor w9,w9,w14
+ ror w10,w10,#20
+ ror w11,w11,#20
+ ror w12,w12,#20
+ ror w9,w9,#20
+ add w5,w5,w10
+ add w6,w6,w11
+ add w7,w7,w12
+ add w8,w8,w9
+ eor w21,w21,w5
+ eor w17,w17,w6
+ eor w19,w19,w7
+ eor w20,w20,w8
+ ror w21,w21,#24
+ ror w17,w17,#24
+ ror w19,w19,#24
+ ror w20,w20,#24
+ add w15,w15,w21
+ add w16,w16,w17
+ add w13,w13,w19
+ add w14,w14,w20
+ eor w10,w10,w15
+ eor w11,w11,w16
+ eor w12,w12,w13
+ eor w9,w9,w14
+ ror w10,w10,#25
+ ror w11,w11,#25
+ ror w12,w12,#25
+ ror w9,w9,#25
+ cbnz x4,.Loop
+
+ add w5,w5,w22 // accumulate key block
+ add x6,x6,x22,lsr#32
+ add w7,w7,w23
+ add x8,x8,x23,lsr#32
+ add w9,w9,w24
+ add x10,x10,x24,lsr#32
+ add w11,w11,w25
+ add x12,x12,x25,lsr#32
+ add w13,w13,w26
+ add x14,x14,x26,lsr#32
+ add w15,w15,w27
+ add x16,x16,x27,lsr#32
+ add w17,w17,w28
+ add x19,x19,x28,lsr#32
+ add w20,w20,w30
+ add x21,x21,x30,lsr#32
+
+ b.lo .Ltail
+
+ add x5,x5,x6,lsl#32 // pack
+ add x7,x7,x8,lsl#32
+ ldp x6,x8,[x1,#0] // load input
+ add x9,x9,x10,lsl#32
+ add x11,x11,x12,lsl#32
+ ldp x10,x12,[x1,#16]
+ add x13,x13,x14,lsl#32
+ add x15,x15,x16,lsl#32
+ ldp x14,x16,[x1,#32]
+ add x17,x17,x19,lsl#32
+ add x20,x20,x21,lsl#32
+ ldp x19,x21,[x1,#48]
+ add x1,x1,#64
+#ifdef __ARMEB__
+ rev x5,x5
+ rev x7,x7
+ rev x9,x9
+ rev x11,x11
+ rev x13,x13
+ rev x15,x15
+ rev x17,x17
+ rev x20,x20
+#endif
+ eor x5,x5,x6
+ eor x7,x7,x8
+ eor x9,x9,x10
+ eor x11,x11,x12
+ eor x13,x13,x14
+ eor x15,x15,x16
+ eor x17,x17,x19
+ eor x20,x20,x21
+
+ stp x5,x7,[x0,#0] // store output
+ add x28,x28,#1 // increment counter
+ stp x9,x11,[x0,#16]
+ stp x13,x15,[x0,#32]
+ stp x17,x20,[x0,#48]
+ add x0,x0,#64
+
+ b.hi .Loop_outer
+
+ ldp x19,x20,[x29,#16]
+ add sp,sp,#64
+ ldp x21,x22,[x29,#32]
+ ldp x23,x24,[x29,#48]
+ ldp x25,x26,[x29,#64]
+ ldp x27,x28,[x29,#80]
+ ldp x29,x30,[sp],#96
+.Labort:
+ ret
+
+.align 4
+.Ltail:
+ add x2,x2,#64
+.Less_than_64:
+ sub x0,x0,#1
+ add x1,x1,x2
+ add x0,x0,x2
+ add x4,sp,x2
+ neg x2,x2
+
+ add x5,x5,x6,lsl#32 // pack
+ add x7,x7,x8,lsl#32
+ add x9,x9,x10,lsl#32
+ add x11,x11,x12,lsl#32
+ add x13,x13,x14,lsl#32
+ add x15,x15,x16,lsl#32
+ add x17,x17,x19,lsl#32
+ add x20,x20,x21,lsl#32
+#ifdef __ARMEB__
+ rev x5,x5
+ rev x7,x7
+ rev x9,x9
+ rev x11,x11
+ rev x13,x13
+ rev x15,x15
+ rev x17,x17
+ rev x20,x20
+#endif
+ stp x5,x7,[sp,#0]
+ stp x9,x11,[sp,#16]
+ stp x13,x15,[sp,#32]
+ stp x17,x20,[sp,#48]
+
+.Loop_tail:
+ ldrb w10,[x1,x2]
+ ldrb w11,[x4,x2]
+ add x2,x2,#1
+ eor w10,w10,w11
+ strb w10,[x0,x2]
+ cbnz x2,.Loop_tail
+
+ stp xzr,xzr,[sp,#0]
+ stp xzr,xzr,[sp,#16]
+ stp xzr,xzr,[sp,#32]
+ stp xzr,xzr,[sp,#48]
+
+ ldp x19,x20,[x29,#16]
+ add sp,sp,#64
+ ldp x21,x22,[x29,#32]
+ ldp x23,x24,[x29,#48]
+ ldp x25,x26,[x29,#64]
+ ldp x27,x28,[x29,#80]
+ ldp x29,x30,[sp],#96
+ ret
+.size ChaCha20_ctr32,.-ChaCha20_ctr32
+
+.type ChaCha20_neon,%function
+.align 5
+ChaCha20_neon:
+ stp x29,x30,[sp,#-96]!
+ add x29,sp,#0
+
+ adr x5,.Lsigma
+ stp x19,x20,[sp,#16]
+ stp x21,x22,[sp,#32]
+ stp x23,x24,[sp,#48]
+ stp x25,x26,[sp,#64]
+ stp x27,x28,[sp,#80]
+ cmp x2,#512
+ b.hs .L512_or_more_neon
+
+ sub sp,sp,#64
+
+ ldp x22,x23,[x5] // load sigma
+ ld1 {v24.4s},[x5],#16
+ ldp x24,x25,[x3] // load key
+ ldp x26,x27,[x3,#16]
+ ld1 {v25.4s,v26.4s},[x3]
+ ldp x28,x30,[x4] // load counter
+ ld1 {v27.4s},[x4]
+ ld1 {v31.4s},[x5]
+#ifdef __ARMEB__
+ rev64 v24.4s,v24.4s
+ ror x24,x24,#32
+ ror x25,x25,#32
+ ror x26,x26,#32
+ ror x27,x27,#32
+ ror x28,x28,#32
+ ror x30,x30,#32
+#endif
+ add v27.4s,v27.4s,v31.4s // += 1
+ add v28.4s,v27.4s,v31.4s
+ add v29.4s,v28.4s,v31.4s
+ shl v31.4s,v31.4s,#2 // 1 -> 4
+
+.Loop_outer_neon:
+ mov w5,w22 // unpack key block
+ lsr x6,x22,#32
+ mov v0.16b,v24.16b
+ mov w7,w23
+ lsr x8,x23,#32
+ mov v4.16b,v24.16b
+ mov w9,w24
+ lsr x10,x24,#32
+ mov v16.16b,v24.16b
+ mov w11,w25
+ mov v1.16b,v25.16b
+ lsr x12,x25,#32
+ mov v5.16b,v25.16b
+ mov w13,w26
+ mov v17.16b,v25.16b
+ lsr x14,x26,#32
+ mov v3.16b,v27.16b
+ mov w15,w27
+ mov v7.16b,v28.16b
+ lsr x16,x27,#32
+ mov v19.16b,v29.16b
+ mov w17,w28
+ mov v2.16b,v26.16b
+ lsr x19,x28,#32
+ mov v6.16b,v26.16b
+ mov w20,w30
+ mov v18.16b,v26.16b
+ lsr x21,x30,#32
+
+ mov x4,#10
+ subs x2,x2,#256
+.Loop_neon:
+ sub x4,x4,#1
+ add v0.4s,v0.4s,v1.4s
+ add w5,w5,w9
+ add v4.4s,v4.4s,v5.4s
+ add w6,w6,w10
+ add v16.4s,v16.4s,v17.4s
+ add w7,w7,w11
+ eor v3.16b,v3.16b,v0.16b
+ add w8,w8,w12
+ eor v7.16b,v7.16b,v4.16b
+ eor w17,w17,w5
+ eor v19.16b,v19.16b,v16.16b
+ eor w19,w19,w6
+ rev32 v3.8h,v3.8h
+ eor w20,w20,w7
+ rev32 v7.8h,v7.8h
+ eor w21,w21,w8
+ rev32 v19.8h,v19.8h
+ ror w17,w17,#16
+ add v2.4s,v2.4s,v3.4s
+ ror w19,w19,#16
+ add v6.4s,v6.4s,v7.4s
+ ror w20,w20,#16
+ add v18.4s,v18.4s,v19.4s
+ ror w21,w21,#16
+ eor v20.16b,v1.16b,v2.16b
+ add w13,w13,w17
+ eor v21.16b,v5.16b,v6.16b
+ add w14,w14,w19
+ eor v22.16b,v17.16b,v18.16b
+ add w15,w15,w20
+ ushr v1.4s,v20.4s,#20
+ add w16,w16,w21
+ ushr v5.4s,v21.4s,#20
+ eor w9,w9,w13
+ ushr v17.4s,v22.4s,#20
+ eor w10,w10,w14
+ sli v1.4s,v20.4s,#12
+ eor w11,w11,w15
+ sli v5.4s,v21.4s,#12
+ eor w12,w12,w16
+ sli v17.4s,v22.4s,#12
+ ror w9,w9,#20
+ add v0.4s,v0.4s,v1.4s
+ ror w10,w10,#20
+ add v4.4s,v4.4s,v5.4s
+ ror w11,w11,#20
+ add v16.4s,v16.4s,v17.4s
+ ror w12,w12,#20
+ eor v20.16b,v3.16b,v0.16b
+ add w5,w5,w9
+ eor v21.16b,v7.16b,v4.16b
+ add w6,w6,w10
+ eor v22.16b,v19.16b,v16.16b
+ add w7,w7,w11
+ ushr v3.4s,v20.4s,#24
+ add w8,w8,w12
+ ushr v7.4s,v21.4s,#24
+ eor w17,w17,w5
+ ushr v19.4s,v22.4s,#24
+ eor w19,w19,w6
+ sli v3.4s,v20.4s,#8
+ eor w20,w20,w7
+ sli v7.4s,v21.4s,#8
+ eor w21,w21,w8
+ sli v19.4s,v22.4s,#8
+ ror w17,w17,#24
+ add v2.4s,v2.4s,v3.4s
+ ror w19,w19,#24
+ add v6.4s,v6.4s,v7.4s
+ ror w20,w20,#24
+ add v18.4s,v18.4s,v19.4s
+ ror w21,w21,#24
+ eor v20.16b,v1.16b,v2.16b
+ add w13,w13,w17
+ eor v21.16b,v5.16b,v6.16b
+ add w14,w14,w19
+ eor v22.16b,v17.16b,v18.16b
+ add w15,w15,w20
+ ushr v1.4s,v20.4s,#25
+ add w16,w16,w21
+ ushr v5.4s,v21.4s,#25
+ eor w9,w9,w13
+ ushr v17.4s,v22.4s,#25
+ eor w10,w10,w14
+ sli v1.4s,v20.4s,#7
+ eor w11,w11,w15
+ sli v5.4s,v21.4s,#7
+ eor w12,w12,w16
+ sli v17.4s,v22.4s,#7
+ ror w9,w9,#25
+ ext v2.16b,v2.16b,v2.16b,#8
+ ror w10,w10,#25
+ ext v6.16b,v6.16b,v6.16b,#8
+ ror w11,w11,#25
+ ext v18.16b,v18.16b,v18.16b,#8
+ ror w12,w12,#25
+ ext v3.16b,v3.16b,v3.16b,#12
+ ext v7.16b,v7.16b,v7.16b,#12
+ ext v19.16b,v19.16b,v19.16b,#12
+ ext v1.16b,v1.16b,v1.16b,#4
+ ext v5.16b,v5.16b,v5.16b,#4
+ ext v17.16b,v17.16b,v17.16b,#4
+ add v0.4s,v0.4s,v1.4s
+ add w5,w5,w10
+ add v4.4s,v4.4s,v5.4s
+ add w6,w6,w11
+ add v16.4s,v16.4s,v17.4s
+ add w7,w7,w12
+ eor v3.16b,v3.16b,v0.16b
+ add w8,w8,w9
+ eor v7.16b,v7.16b,v4.16b
+ eor w21,w21,w5
+ eor v19.16b,v19.16b,v16.16b
+ eor w17,w17,w6
+ rev32 v3.8h,v3.8h
+ eor w19,w19,w7
+ rev32 v7.8h,v7.8h
+ eor w20,w20,w8
+ rev32 v19.8h,v19.8h
+ ror w21,w21,#16
+ add v2.4s,v2.4s,v3.4s
+ ror w17,w17,#16
+ add v6.4s,v6.4s,v7.4s
+ ror w19,w19,#16
+ add v18.4s,v18.4s,v19.4s
+ ror w20,w20,#16
+ eor v20.16b,v1.16b,v2.16b
+ add w15,w15,w21
+ eor v21.16b,v5.16b,v6.16b
+ add w16,w16,w17
+ eor v22.16b,v17.16b,v18.16b
+ add w13,w13,w19
+ ushr v1.4s,v20.4s,#20
+ add w14,w14,w20
+ ushr v5.4s,v21.4s,#20
+ eor w10,w10,w15
+ ushr v17.4s,v22.4s,#20
+ eor w11,w11,w16
+ sli v1.4s,v20.4s,#12
+ eor w12,w12,w13
+ sli v5.4s,v21.4s,#12
+ eor w9,w9,w14
+ sli v17.4s,v22.4s,#12
+ ror w10,w10,#20
+ add v0.4s,v0.4s,v1.4s
+ ror w11,w11,#20
+ add v4.4s,v4.4s,v5.4s
+ ror w12,w12,#20
+ add v16.4s,v16.4s,v17.4s
+ ror w9,w9,#20
+ eor v20.16b,v3.16b,v0.16b
+ add w5,w5,w10
+ eor v21.16b,v7.16b,v4.16b
+ add w6,w6,w11
+ eor v22.16b,v19.16b,v16.16b
+ add w7,w7,w12
+ ushr v3.4s,v20.4s,#24
+ add w8,w8,w9
+ ushr v7.4s,v21.4s,#24
+ eor w21,w21,w5
+ ushr v19.4s,v22.4s,#24
+ eor w17,w17,w6
+ sli v3.4s,v20.4s,#8
+ eor w19,w19,w7
+ sli v7.4s,v21.4s,#8
+ eor w20,w20,w8
+ sli v19.4s,v22.4s,#8
+ ror w21,w21,#24
+ add v2.4s,v2.4s,v3.4s
+ ror w17,w17,#24
+ add v6.4s,v6.4s,v7.4s
+ ror w19,w19,#24
+ add v18.4s,v18.4s,v19.4s
+ ror w20,w20,#24
+ eor v20.16b,v1.16b,v2.16b
+ add w15,w15,w21
+ eor v21.16b,v5.16b,v6.16b
+ add w16,w16,w17
+ eor v22.16b,v17.16b,v18.16b
+ add w13,w13,w19
+ ushr v1.4s,v20.4s,#25
+ add w14,w14,w20
+ ushr v5.4s,v21.4s,#25
+ eor w10,w10,w15
+ ushr v17.4s,v22.4s,#25
+ eor w11,w11,w16
+ sli v1.4s,v20.4s,#7
+ eor w12,w12,w13
+ sli v5.4s,v21.4s,#7
+ eor w9,w9,w14
+ sli v17.4s,v22.4s,#7
+ ror w10,w10,#25
+ ext v2.16b,v2.16b,v2.16b,#8
+ ror w11,w11,#25
+ ext v6.16b,v6.16b,v6.16b,#8
+ ror w12,w12,#25
+ ext v18.16b,v18.16b,v18.16b,#8
+ ror w9,w9,#25
+ ext v3.16b,v3.16b,v3.16b,#4
+ ext v7.16b,v7.16b,v7.16b,#4
+ ext v19.16b,v19.16b,v19.16b,#4
+ ext v1.16b,v1.16b,v1.16b,#12
+ ext v5.16b,v5.16b,v5.16b,#12
+ ext v17.16b,v17.16b,v17.16b,#12
+ cbnz x4,.Loop_neon
+
+ add w5,w5,w22 // accumulate key block
+ add v0.4s,v0.4s,v24.4s
+ add x6,x6,x22,lsr#32
+ add v4.4s,v4.4s,v24.4s
+ add w7,w7,w23
+ add v16.4s,v16.4s,v24.4s
+ add x8,x8,x23,lsr#32
+ add v2.4s,v2.4s,v26.4s
+ add w9,w9,w24
+ add v6.4s,v6.4s,v26.4s
+ add x10,x10,x24,lsr#32
+ add v18.4s,v18.4s,v26.4s
+ add w11,w11,w25
+ add v3.4s,v3.4s,v27.4s
+ add x12,x12,x25,lsr#32
+ add w13,w13,w26
+ add v7.4s,v7.4s,v28.4s
+ add x14,x14,x26,lsr#32
+ add w15,w15,w27
+ add v19.4s,v19.4s,v29.4s
+ add x16,x16,x27,lsr#32
+ add w17,w17,w28
+ add v1.4s,v1.4s,v25.4s
+ add x19,x19,x28,lsr#32
+ add w20,w20,w30
+ add v5.4s,v5.4s,v25.4s
+ add x21,x21,x30,lsr#32
+ add v17.4s,v17.4s,v25.4s
+
+ b.lo .Ltail_neon
+
+ add x5,x5,x6,lsl#32 // pack
+ add x7,x7,x8,lsl#32
+ ldp x6,x8,[x1,#0] // load input
+ add x9,x9,x10,lsl#32
+ add x11,x11,x12,lsl#32
+ ldp x10,x12,[x1,#16]
+ add x13,x13,x14,lsl#32
+ add x15,x15,x16,lsl#32
+ ldp x14,x16,[x1,#32]
+ add x17,x17,x19,lsl#32
+ add x20,x20,x21,lsl#32
+ ldp x19,x21,[x1,#48]
+ add x1,x1,#64
+#ifdef __ARMEB__
+ rev x5,x5
+ rev x7,x7
+ rev x9,x9
+ rev x11,x11
+ rev x13,x13
+ rev x15,x15
+ rev x17,x17
+ rev x20,x20
+#endif
+ ld1 {v20.16b,v21.16b,v22.16b,v23.16b},[x1],#64
+ eor x5,x5,x6
+ eor x7,x7,x8
+ eor x9,x9,x10
+ eor x11,x11,x12
+ eor x13,x13,x14
+ eor v0.16b,v0.16b,v20.16b
+ eor x15,x15,x16
+ eor v1.16b,v1.16b,v21.16b
+ eor x17,x17,x19
+ eor v2.16b,v2.16b,v22.16b
+ eor x20,x20,x21
+ eor v3.16b,v3.16b,v23.16b
+ ld1 {v20.16b,v21.16b,v22.16b,v23.16b},[x1],#64
+
+ stp x5,x7,[x0,#0] // store output
+ add x28,x28,#4 // increment counter
+ stp x9,x11,[x0,#16]
+ add v27.4s,v27.4s,v31.4s // += 4
+ stp x13,x15,[x0,#32]
+ add v28.4s,v28.4s,v31.4s
+ stp x17,x20,[x0,#48]
+ add v29.4s,v29.4s,v31.4s
+ add x0,x0,#64
+
+ st1 {v0.16b,v1.16b,v2.16b,v3.16b},[x0],#64
+ ld1 {v0.16b,v1.16b,v2.16b,v3.16b},[x1],#64
+
+ eor v4.16b,v4.16b,v20.16b
+ eor v5.16b,v5.16b,v21.16b
+ eor v6.16b,v6.16b,v22.16b
+ eor v7.16b,v7.16b,v23.16b
+ st1 {v4.16b,v5.16b,v6.16b,v7.16b},[x0],#64
+
+ eor v16.16b,v16.16b,v0.16b
+ eor v17.16b,v17.16b,v1.16b
+ eor v18.16b,v18.16b,v2.16b
+ eor v19.16b,v19.16b,v3.16b
+ st1 {v16.16b,v17.16b,v18.16b,v19.16b},[x0],#64
+
+ b.hi .Loop_outer_neon
+
+ ldp x19,x20,[x29,#16]
+ add sp,sp,#64
+ ldp x21,x22,[x29,#32]
+ ldp x23,x24,[x29,#48]
+ ldp x25,x26,[x29,#64]
+ ldp x27,x28,[x29,#80]
+ ldp x29,x30,[sp],#96
+ ret
+
+.Ltail_neon:
+ add x2,x2,#256
+ cmp x2,#64
+ b.lo .Less_than_64
+
+ add x5,x5,x6,lsl#32 // pack
+ add x7,x7,x8,lsl#32
+ ldp x6,x8,[x1,#0] // load input
+ add x9,x9,x10,lsl#32
+ add x11,x11,x12,lsl#32
+ ldp x10,x12,[x1,#16]
+ add x13,x13,x14,lsl#32
+ add x15,x15,x16,lsl#32
+ ldp x14,x16,[x1,#32]
+ add x17,x17,x19,lsl#32
+ add x20,x20,x21,lsl#32
+ ldp x19,x21,[x1,#48]
+ add x1,x1,#64
+#ifdef __ARMEB__
+ rev x5,x5
+ rev x7,x7
+ rev x9,x9
+ rev x11,x11
+ rev x13,x13
+ rev x15,x15
+ rev x17,x17
+ rev x20,x20
+#endif
+ eor x5,x5,x6
+ eor x7,x7,x8
+ eor x9,x9,x10
+ eor x11,x11,x12
+ eor x13,x13,x14
+ eor x15,x15,x16
+ eor x17,x17,x19
+ eor x20,x20,x21
+
+ stp x5,x7,[x0,#0] // store output
+ add x28,x28,#4 // increment counter
+ stp x9,x11,[x0,#16]
+ stp x13,x15,[x0,#32]
+ stp x17,x20,[x0,#48]
+ add x0,x0,#64
+ b.eq .Ldone_neon
+ sub x2,x2,#64
+ cmp x2,#64
+ b.lo .Less_than_128
+
+ ld1 {v20.16b,v21.16b,v22.16b,v23.16b},[x1],#64
+ eor v0.16b,v0.16b,v20.16b
+ eor v1.16b,v1.16b,v21.16b
+ eor v2.16b,v2.16b,v22.16b
+ eor v3.16b,v3.16b,v23.16b
+ st1 {v0.16b,v1.16b,v2.16b,v3.16b},[x0],#64
+ b.eq .Ldone_neon
+ sub x2,x2,#64
+ cmp x2,#64
+ b.lo .Less_than_192
+
+ ld1 {v20.16b,v21.16b,v22.16b,v23.16b},[x1],#64
+ eor v4.16b,v4.16b,v20.16b
+ eor v5.16b,v5.16b,v21.16b
+ eor v6.16b,v6.16b,v22.16b
+ eor v7.16b,v7.16b,v23.16b
+ st1 {v4.16b,v5.16b,v6.16b,v7.16b},[x0],#64
+ b.eq .Ldone_neon
+ sub x2,x2,#64
+
+ st1 {v16.16b,v17.16b,v18.16b,v19.16b},[sp]
+ b .Last_neon
+
+.Less_than_128:
+ st1 {v0.16b,v1.16b,v2.16b,v3.16b},[sp]
+ b .Last_neon
+.Less_than_192:
+ st1 {v4.16b,v5.16b,v6.16b,v7.16b},[sp]
+ b .Last_neon
+
+.align 4
+.Last_neon:
+ sub x0,x0,#1
+ add x1,x1,x2
+ add x0,x0,x2
+ add x4,sp,x2
+ neg x2,x2
+
+.Loop_tail_neon:
+ ldrb w10,[x1,x2]
+ ldrb w11,[x4,x2]
+ add x2,x2,#1
+ eor w10,w10,w11
+ strb w10,[x0,x2]
+ cbnz x2,.Loop_tail_neon
+
+ stp xzr,xzr,[sp,#0]
+ stp xzr,xzr,[sp,#16]
+ stp xzr,xzr,[sp,#32]
+ stp xzr,xzr,[sp,#48]
+
+.Ldone_neon:
+ ldp x19,x20,[x29,#16]
+ add sp,sp,#64
+ ldp x21,x22,[x29,#32]
+ ldp x23,x24,[x29,#48]
+ ldp x25,x26,[x29,#64]
+ ldp x27,x28,[x29,#80]
+ ldp x29,x30,[sp],#96
+ ret
+.size ChaCha20_neon,.-ChaCha20_neon
+.type ChaCha20_512_neon,%function
+.align 5
+ChaCha20_512_neon:
+ stp x29,x30,[sp,#-96]!
+ add x29,sp,#0
+
+ adr x5,.Lsigma
+ stp x19,x20,[sp,#16]
+ stp x21,x22,[sp,#32]
+ stp x23,x24,[sp,#48]
+ stp x25,x26,[sp,#64]
+ stp x27,x28,[sp,#80]
+
+.L512_or_more_neon:
+ sub sp,sp,#128+64
+
+ ldp x22,x23,[x5] // load sigma
+ ld1 {v24.4s},[x5],#16
+ ldp x24,x25,[x3] // load key
+ ldp x26,x27,[x3,#16]
+ ld1 {v25.4s,v26.4s},[x3]
+ ldp x28,x30,[x4] // load counter
+ ld1 {v27.4s},[x4]
+ ld1 {v31.4s},[x5]
+#ifdef __ARMEB__
+ rev64 v24.4s,v24.4s
+ ror x24,x24,#32
+ ror x25,x25,#32
+ ror x26,x26,#32
+ ror x27,x27,#32
+ ror x28,x28,#32
+ ror x30,x30,#32
+#endif
+ add v27.4s,v27.4s,v31.4s // += 1
+ stp q24,q25,[sp,#0] // off-load key block, invariant part
+ add v27.4s,v27.4s,v31.4s // not typo
+ str q26,[sp,#32]
+ add v28.4s,v27.4s,v31.4s
+ add v29.4s,v28.4s,v31.4s
+ add v30.4s,v29.4s,v31.4s
+ shl v31.4s,v31.4s,#2 // 1 -> 4
+
+ stp d8,d9,[sp,#128+0] // meet ABI requirements
+ stp d10,d11,[sp,#128+16]
+ stp d12,d13,[sp,#128+32]
+ stp d14,d15,[sp,#128+48]
+
+ sub x2,x2,#512 // not typo
+
+.Loop_outer_512_neon:
+ mov v0.16b,v24.16b
+ mov v4.16b,v24.16b
+ mov v8.16b,v24.16b
+ mov v12.16b,v24.16b
+ mov v16.16b,v24.16b
+ mov v20.16b,v24.16b
+ mov v1.16b,v25.16b
+ mov w5,w22 // unpack key block
+ mov v5.16b,v25.16b
+ lsr x6,x22,#32
+ mov v9.16b,v25.16b
+ mov w7,w23
+ mov v13.16b,v25.16b
+ lsr x8,x23,#32
+ mov v17.16b,v25.16b
+ mov w9,w24
+ mov v21.16b,v25.16b
+ lsr x10,x24,#32
+ mov v3.16b,v27.16b
+ mov w11,w25
+ mov v7.16b,v28.16b
+ lsr x12,x25,#32
+ mov v11.16b,v29.16b
+ mov w13,w26
+ mov v15.16b,v30.16b
+ lsr x14,x26,#32
+ mov v2.16b,v26.16b
+ mov w15,w27
+ mov v6.16b,v26.16b
+ lsr x16,x27,#32
+ add v19.4s,v3.4s,v31.4s // +4
+ mov w17,w28
+ add v23.4s,v7.4s,v31.4s // +4
+ lsr x19,x28,#32
+ mov v10.16b,v26.16b
+ mov w20,w30
+ mov v14.16b,v26.16b
+ lsr x21,x30,#32
+ mov v18.16b,v26.16b
+ stp q27,q28,[sp,#48] // off-load key block, variable part
+ mov v22.16b,v26.16b
+ str q29,[sp,#80]
+
+ mov x4,#5
+ subs x2,x2,#512
+.Loop_upper_neon:
+ sub x4,x4,#1
+ add v0.4s,v0.4s,v1.4s
+ add w5,w5,w9
+ add v4.4s,v4.4s,v5.4s
+ add w6,w6,w10
+ add v8.4s,v8.4s,v9.4s
+ add w7,w7,w11
+ add v12.4s,v12.4s,v13.4s
+ add w8,w8,w12
+ add v16.4s,v16.4s,v17.4s
+ eor w17,w17,w5
+ add v20.4s,v20.4s,v21.4s
+ eor w19,w19,w6
+ eor v3.16b,v3.16b,v0.16b
+ eor w20,w20,w7
+ eor v7.16b,v7.16b,v4.16b
+ eor w21,w21,w8
+ eor v11.16b,v11.16b,v8.16b
+ ror w17,w17,#16
+ eor v15.16b,v15.16b,v12.16b
+ ror w19,w19,#16
+ eor v19.16b,v19.16b,v16.16b
+ ror w20,w20,#16
+ eor v23.16b,v23.16b,v20.16b
+ ror w21,w21,#16
+ rev32 v3.8h,v3.8h
+ add w13,w13,w17
+ rev32 v7.8h,v7.8h
+ add w14,w14,w19
+ rev32 v11.8h,v11.8h
+ add w15,w15,w20
+ rev32 v15.8h,v15.8h
+ add w16,w16,w21
+ rev32 v19.8h,v19.8h
+ eor w9,w9,w13
+ rev32 v23.8h,v23.8h
+ eor w10,w10,w14
+ add v2.4s,v2.4s,v3.4s
+ eor w11,w11,w15
+ add v6.4s,v6.4s,v7.4s
+ eor w12,w12,w16
+ add v10.4s,v10.4s,v11.4s
+ ror w9,w9,#20
+ add v14.4s,v14.4s,v15.4s
+ ror w10,w10,#20
+ add v18.4s,v18.4s,v19.4s
+ ror w11,w11,#20
+ add v22.4s,v22.4s,v23.4s
+ ror w12,w12,#20
+ eor v24.16b,v1.16b,v2.16b
+ add w5,w5,w9
+ eor v25.16b,v5.16b,v6.16b
+ add w6,w6,w10
+ eor v26.16b,v9.16b,v10.16b
+ add w7,w7,w11
+ eor v27.16b,v13.16b,v14.16b
+ add w8,w8,w12
+ eor v28.16b,v17.16b,v18.16b
+ eor w17,w17,w5
+ eor v29.16b,v21.16b,v22.16b
+ eor w19,w19,w6
+ ushr v1.4s,v24.4s,#20
+ eor w20,w20,w7
+ ushr v5.4s,v25.4s,#20
+ eor w21,w21,w8
+ ushr v9.4s,v26.4s,#20
+ ror w17,w17,#24
+ ushr v13.4s,v27.4s,#20
+ ror w19,w19,#24
+ ushr v17.4s,v28.4s,#20
+ ror w20,w20,#24
+ ushr v21.4s,v29.4s,#20
+ ror w21,w21,#24
+ sli v1.4s,v24.4s,#12
+ add w13,w13,w17
+ sli v5.4s,v25.4s,#12
+ add w14,w14,w19
+ sli v9.4s,v26.4s,#12
+ add w15,w15,w20
+ sli v13.4s,v27.4s,#12
+ add w16,w16,w21
+ sli v17.4s,v28.4s,#12
+ eor w9,w9,w13
+ sli v21.4s,v29.4s,#12
+ eor w10,w10,w14
+ add v0.4s,v0.4s,v1.4s
+ eor w11,w11,w15
+ add v4.4s,v4.4s,v5.4s
+ eor w12,w12,w16
+ add v8.4s,v8.4s,v9.4s
+ ror w9,w9,#25
+ add v12.4s,v12.4s,v13.4s
+ ror w10,w10,#25
+ add v16.4s,v16.4s,v17.4s
+ ror w11,w11,#25
+ add v20.4s,v20.4s,v21.4s
+ ror w12,w12,#25
+ eor v24.16b,v3.16b,v0.16b
+ add w5,w5,w10
+ eor v25.16b,v7.16b,v4.16b
+ add w6,w6,w11
+ eor v26.16b,v11.16b,v8.16b
+ add w7,w7,w12
+ eor v27.16b,v15.16b,v12.16b
+ add w8,w8,w9
+ eor v28.16b,v19.16b,v16.16b
+ eor w21,w21,w5
+ eor v29.16b,v23.16b,v20.16b
+ eor w17,w17,w6
+ ushr v3.4s,v24.4s,#24
+ eor w19,w19,w7
+ ushr v7.4s,v25.4s,#24
+ eor w20,w20,w8
+ ushr v11.4s,v26.4s,#24
+ ror w21,w21,#16
+ ushr v15.4s,v27.4s,#24
+ ror w17,w17,#16
+ ushr v19.4s,v28.4s,#24
+ ror w19,w19,#16
+ ushr v23.4s,v29.4s,#24
+ ror w20,w20,#16
+ sli v3.4s,v24.4s,#8
+ add w15,w15,w21
+ sli v7.4s,v25.4s,#8
+ add w16,w16,w17
+ sli v11.4s,v26.4s,#8
+ add w13,w13,w19
+ sli v15.4s,v27.4s,#8
+ add w14,w14,w20
+ sli v19.4s,v28.4s,#8
+ eor w10,w10,w15
+ sli v23.4s,v29.4s,#8
+ eor w11,w11,w16
+ add v2.4s,v2.4s,v3.4s
+ eor w12,w12,w13
+ add v6.4s,v6.4s,v7.4s
+ eor w9,w9,w14
+ add v10.4s,v10.4s,v11.4s
+ ror w10,w10,#20
+ add v14.4s,v14.4s,v15.4s
+ ror w11,w11,#20
+ add v18.4s,v18.4s,v19.4s
+ ror w12,w12,#20
+ add v22.4s,v22.4s,v23.4s
+ ror w9,w9,#20
+ eor v24.16b,v1.16b,v2.16b
+ add w5,w5,w10
+ eor v25.16b,v5.16b,v6.16b
+ add w6,w6,w11
+ eor v26.16b,v9.16b,v10.16b
+ add w7,w7,w12
+ eor v27.16b,v13.16b,v14.16b
+ add w8,w8,w9
+ eor v28.16b,v17.16b,v18.16b
+ eor w21,w21,w5
+ eor v29.16b,v21.16b,v22.16b
+ eor w17,w17,w6
+ ushr v1.4s,v24.4s,#25
+ eor w19,w19,w7
+ ushr v5.4s,v25.4s,#25
+ eor w20,w20,w8
+ ushr v9.4s,v26.4s,#25
+ ror w21,w21,#24
+ ushr v13.4s,v27.4s,#25
+ ror w17,w17,#24
+ ushr v17.4s,v28.4s,#25
+ ror w19,w19,#24
+ ushr v21.4s,v29.4s,#25
+ ror w20,w20,#24
+ sli v1.4s,v24.4s,#7
+ add w15,w15,w21
+ sli v5.4s,v25.4s,#7
+ add w16,w16,w17
+ sli v9.4s,v26.4s,#7
+ add w13,w13,w19
+ sli v13.4s,v27.4s,#7
+ add w14,w14,w20
+ sli v17.4s,v28.4s,#7
+ eor w10,w10,w15
+ sli v21.4s,v29.4s,#7
+ eor w11,w11,w16
+ ext v2.16b,v2.16b,v2.16b,#8
+ eor w12,w12,w13
+ ext v6.16b,v6.16b,v6.16b,#8
+ eor w9,w9,w14
+ ext v10.16b,v10.16b,v10.16b,#8
+ ror w10,w10,#25
+ ext v14.16b,v14.16b,v14.16b,#8
+ ror w11,w11,#25
+ ext v18.16b,v18.16b,v18.16b,#8
+ ror w12,w12,#25
+ ext v22.16b,v22.16b,v22.16b,#8
+ ror w9,w9,#25
+ ext v3.16b,v3.16b,v3.16b,#12
+ ext v7.16b,v7.16b,v7.16b,#12
+ ext v11.16b,v11.16b,v11.16b,#12
+ ext v15.16b,v15.16b,v15.16b,#12
+ ext v19.16b,v19.16b,v19.16b,#12
+ ext v23.16b,v23.16b,v23.16b,#12
+ ext v1.16b,v1.16b,v1.16b,#4
+ ext v5.16b,v5.16b,v5.16b,#4
+ ext v9.16b,v9.16b,v9.16b,#4
+ ext v13.16b,v13.16b,v13.16b,#4
+ ext v17.16b,v17.16b,v17.16b,#4
+ ext v21.16b,v21.16b,v21.16b,#4
+ add v0.4s,v0.4s,v1.4s
+ add w5,w5,w9
+ add v4.4s,v4.4s,v5.4s
+ add w6,w6,w10
+ add v8.4s,v8.4s,v9.4s
+ add w7,w7,w11
+ add v12.4s,v12.4s,v13.4s
+ add w8,w8,w12
+ add v16.4s,v16.4s,v17.4s
+ eor w17,w17,w5
+ add v20.4s,v20.4s,v21.4s
+ eor w19,w19,w6
+ eor v3.16b,v3.16b,v0.16b
+ eor w20,w20,w7
+ eor v7.16b,v7.16b,v4.16b
+ eor w21,w21,w8
+ eor v11.16b,v11.16b,v8.16b
+ ror w17,w17,#16
+ eor v15.16b,v15.16b,v12.16b
+ ror w19,w19,#16
+ eor v19.16b,v19.16b,v16.16b
+ ror w20,w20,#16
+ eor v23.16b,v23.16b,v20.16b
+ ror w21,w21,#16
+ rev32 v3.8h,v3.8h
+ add w13,w13,w17
+ rev32 v7.8h,v7.8h
+ add w14,w14,w19
+ rev32 v11.8h,v11.8h
+ add w15,w15,w20
+ rev32 v15.8h,v15.8h
+ add w16,w16,w21
+ rev32 v19.8h,v19.8h
+ eor w9,w9,w13
+ rev32 v23.8h,v23.8h
+ eor w10,w10,w14
+ add v2.4s,v2.4s,v3.4s
+ eor w11,w11,w15
+ add v6.4s,v6.4s,v7.4s
+ eor w12,w12,w16
+ add v10.4s,v10.4s,v11.4s
+ ror w9,w9,#20
+ add v14.4s,v14.4s,v15.4s
+ ror w10,w10,#20
+ add v18.4s,v18.4s,v19.4s
+ ror w11,w11,#20
+ add v22.4s,v22.4s,v23.4s
+ ror w12,w12,#20
+ eor v24.16b,v1.16b,v2.16b
+ add w5,w5,w9
+ eor v25.16b,v5.16b,v6.16b
+ add w6,w6,w10
+ eor v26.16b,v9.16b,v10.16b
+ add w7,w7,w11
+ eor v27.16b,v13.16b,v14.16b
+ add w8,w8,w12
+ eor v28.16b,v17.16b,v18.16b
+ eor w17,w17,w5
+ eor v29.16b,v21.16b,v22.16b
+ eor w19,w19,w6
+ ushr v1.4s,v24.4s,#20
+ eor w20,w20,w7
+ ushr v5.4s,v25.4s,#20
+ eor w21,w21,w8
+ ushr v9.4s,v26.4s,#20
+ ror w17,w17,#24
+ ushr v13.4s,v27.4s,#20
+ ror w19,w19,#24
+ ushr v17.4s,v28.4s,#20
+ ror w20,w20,#24
+ ushr v21.4s,v29.4s,#20
+ ror w21,w21,#24
+ sli v1.4s,v24.4s,#12
+ add w13,w13,w17
+ sli v5.4s,v25.4s,#12
+ add w14,w14,w19
+ sli v9.4s,v26.4s,#12
+ add w15,w15,w20
+ sli v13.4s,v27.4s,#12
+ add w16,w16,w21
+ sli v17.4s,v28.4s,#12
+ eor w9,w9,w13
+ sli v21.4s,v29.4s,#12
+ eor w10,w10,w14
+ add v0.4s,v0.4s,v1.4s
+ eor w11,w11,w15
+ add v4.4s,v4.4s,v5.4s
+ eor w12,w12,w16
+ add v8.4s,v8.4s,v9.4s
+ ror w9,w9,#25
+ add v12.4s,v12.4s,v13.4s
+ ror w10,w10,#25
+ add v16.4s,v16.4s,v17.4s
+ ror w11,w11,#25
+ add v20.4s,v20.4s,v21.4s
+ ror w12,w12,#25
+ eor v24.16b,v3.16b,v0.16b
+ add w5,w5,w10
+ eor v25.16b,v7.16b,v4.16b
+ add w6,w6,w11
+ eor v26.16b,v11.16b,v8.16b
+ add w7,w7,w12
+ eor v27.16b,v15.16b,v12.16b
+ add w8,w8,w9
+ eor v28.16b,v19.16b,v16.16b
+ eor w21,w21,w5
+ eor v29.16b,v23.16b,v20.16b
+ eor w17,w17,w6
+ ushr v3.4s,v24.4s,#24
+ eor w19,w19,w7
+ ushr v7.4s,v25.4s,#24
+ eor w20,w20,w8
+ ushr v11.4s,v26.4s,#24
+ ror w21,w21,#16
+ ushr v15.4s,v27.4s,#24
+ ror w17,w17,#16
+ ushr v19.4s,v28.4s,#24
+ ror w19,w19,#16
+ ushr v23.4s,v29.4s,#24
+ ror w20,w20,#16
+ sli v3.4s,v24.4s,#8
+ add w15,w15,w21
+ sli v7.4s,v25.4s,#8
+ add w16,w16,w17
+ sli v11.4s,v26.4s,#8
+ add w13,w13,w19
+ sli v15.4s,v27.4s,#8
+ add w14,w14,w20
+ sli v19.4s,v28.4s,#8
+ eor w10,w10,w15
+ sli v23.4s,v29.4s,#8
+ eor w11,w11,w16
+ add v2.4s,v2.4s,v3.4s
+ eor w12,w12,w13
+ add v6.4s,v6.4s,v7.4s
+ eor w9,w9,w14
+ add v10.4s,v10.4s,v11.4s
+ ror w10,w10,#20
+ add v14.4s,v14.4s,v15.4s
+ ror w11,w11,#20
+ add v18.4s,v18.4s,v19.4s
+ ror w12,w12,#20
+ add v22.4s,v22.4s,v23.4s
+ ror w9,w9,#20
+ eor v24.16b,v1.16b,v2.16b
+ add w5,w5,w10
+ eor v25.16b,v5.16b,v6.16b
+ add w6,w6,w11
+ eor v26.16b,v9.16b,v10.16b
+ add w7,w7,w12
+ eor v27.16b,v13.16b,v14.16b
+ add w8,w8,w9
+ eor v28.16b,v17.16b,v18.16b
+ eor w21,w21,w5
+ eor v29.16b,v21.16b,v22.16b
+ eor w17,w17,w6
+ ushr v1.4s,v24.4s,#25
+ eor w19,w19,w7
+ ushr v5.4s,v25.4s,#25
+ eor w20,w20,w8
+ ushr v9.4s,v26.4s,#25
+ ror w21,w21,#24
+ ushr v13.4s,v27.4s,#25
+ ror w17,w17,#24
+ ushr v17.4s,v28.4s,#25
+ ror w19,w19,#24
+ ushr v21.4s,v29.4s,#25
+ ror w20,w20,#24
+ sli v1.4s,v24.4s,#7
+ add w15,w15,w21
+ sli v5.4s,v25.4s,#7
+ add w16,w16,w17
+ sli v9.4s,v26.4s,#7
+ add w13,w13,w19
+ sli v13.4s,v27.4s,#7
+ add w14,w14,w20
+ sli v17.4s,v28.4s,#7
+ eor w10,w10,w15
+ sli v21.4s,v29.4s,#7
+ eor w11,w11,w16
+ ext v2.16b,v2.16b,v2.16b,#8
+ eor w12,w12,w13
+ ext v6.16b,v6.16b,v6.16b,#8
+ eor w9,w9,w14
+ ext v10.16b,v10.16b,v10.16b,#8
+ ror w10,w10,#25
+ ext v14.16b,v14.16b,v14.16b,#8
+ ror w11,w11,#25
+ ext v18.16b,v18.16b,v18.16b,#8
+ ror w12,w12,#25
+ ext v22.16b,v22.16b,v22.16b,#8
+ ror w9,w9,#25
+ ext v3.16b,v3.16b,v3.16b,#4
+ ext v7.16b,v7.16b,v7.16b,#4
+ ext v11.16b,v11.16b,v11.16b,#4
+ ext v15.16b,v15.16b,v15.16b,#4
+ ext v19.16b,v19.16b,v19.16b,#4
+ ext v23.16b,v23.16b,v23.16b,#4
+ ext v1.16b,v1.16b,v1.16b,#12
+ ext v5.16b,v5.16b,v5.16b,#12
+ ext v9.16b,v9.16b,v9.16b,#12
+ ext v13.16b,v13.16b,v13.16b,#12
+ ext v17.16b,v17.16b,v17.16b,#12
+ ext v21.16b,v21.16b,v21.16b,#12
+ cbnz x4,.Loop_upper_neon
+
+ add w5,w5,w22 // accumulate key block
+ add x6,x6,x22,lsr#32
+ add w7,w7,w23
+ add x8,x8,x23,lsr#32
+ add w9,w9,w24
+ add x10,x10,x24,lsr#32
+ add w11,w11,w25
+ add x12,x12,x25,lsr#32
+ add w13,w13,w26
+ add x14,x14,x26,lsr#32
+ add w15,w15,w27
+ add x16,x16,x27,lsr#32
+ add w17,w17,w28
+ add x19,x19,x28,lsr#32
+ add w20,w20,w30
+ add x21,x21,x30,lsr#32
+
+ add x5,x5,x6,lsl#32 // pack
+ add x7,x7,x8,lsl#32
+ ldp x6,x8,[x1,#0] // load input
+ add x9,x9,x10,lsl#32
+ add x11,x11,x12,lsl#32
+ ldp x10,x12,[x1,#16]
+ add x13,x13,x14,lsl#32
+ add x15,x15,x16,lsl#32
+ ldp x14,x16,[x1,#32]
+ add x17,x17,x19,lsl#32
+ add x20,x20,x21,lsl#32
+ ldp x19,x21,[x1,#48]
+ add x1,x1,#64
+#ifdef __ARMEB__
+ rev x5,x5
+ rev x7,x7
+ rev x9,x9
+ rev x11,x11
+ rev x13,x13
+ rev x15,x15
+ rev x17,x17
+ rev x20,x20
+#endif
+ eor x5,x5,x6
+ eor x7,x7,x8
+ eor x9,x9,x10
+ eor x11,x11,x12
+ eor x13,x13,x14
+ eor x15,x15,x16
+ eor x17,x17,x19
+ eor x20,x20,x21
+
+ stp x5,x7,[x0,#0] // store output
+ add x28,x28,#1 // increment counter
+ mov w5,w22 // unpack key block
+ lsr x6,x22,#32
+ stp x9,x11,[x0,#16]
+ mov w7,w23
+ lsr x8,x23,#32
+ stp x13,x15,[x0,#32]
+ mov w9,w24
+ lsr x10,x24,#32
+ stp x17,x20,[x0,#48]
+ add x0,x0,#64
+ mov w11,w25
+ lsr x12,x25,#32
+ mov w13,w26
+ lsr x14,x26,#32
+ mov w15,w27
+ lsr x16,x27,#32
+ mov w17,w28
+ lsr x19,x28,#32
+ mov w20,w30
+ lsr x21,x30,#32
+
+ mov x4,#5
+.Loop_lower_neon:
+ sub x4,x4,#1
+ add v0.4s,v0.4s,v1.4s
+ add w5,w5,w9
+ add v4.4s,v4.4s,v5.4s
+ add w6,w6,w10
+ add v8.4s,v8.4s,v9.4s
+ add w7,w7,w11
+ add v12.4s,v12.4s,v13.4s
+ add w8,w8,w12
+ add v16.4s,v16.4s,v17.4s
+ eor w17,w17,w5
+ add v20.4s,v20.4s,v21.4s
+ eor w19,w19,w6
+ eor v3.16b,v3.16b,v0.16b
+ eor w20,w20,w7
+ eor v7.16b,v7.16b,v4.16b
+ eor w21,w21,w8
+ eor v11.16b,v11.16b,v8.16b
+ ror w17,w17,#16
+ eor v15.16b,v15.16b,v12.16b
+ ror w19,w19,#16
+ eor v19.16b,v19.16b,v16.16b
+ ror w20,w20,#16
+ eor v23.16b,v23.16b,v20.16b
+ ror w21,w21,#16
+ rev32 v3.8h,v3.8h
+ add w13,w13,w17
+ rev32 v7.8h,v7.8h
+ add w14,w14,w19
+ rev32 v11.8h,v11.8h
+ add w15,w15,w20
+ rev32 v15.8h,v15.8h
+ add w16,w16,w21
+ rev32 v19.8h,v19.8h
+ eor w9,w9,w13
+ rev32 v23.8h,v23.8h
+ eor w10,w10,w14
+ add v2.4s,v2.4s,v3.4s
+ eor w11,w11,w15
+ add v6.4s,v6.4s,v7.4s
+ eor w12,w12,w16
+ add v10.4s,v10.4s,v11.4s
+ ror w9,w9,#20
+ add v14.4s,v14.4s,v15.4s
+ ror w10,w10,#20
+ add v18.4s,v18.4s,v19.4s
+ ror w11,w11,#20
+ add v22.4s,v22.4s,v23.4s
+ ror w12,w12,#20
+ eor v24.16b,v1.16b,v2.16b
+ add w5,w5,w9
+ eor v25.16b,v5.16b,v6.16b
+ add w6,w6,w10
+ eor v26.16b,v9.16b,v10.16b
+ add w7,w7,w11
+ eor v27.16b,v13.16b,v14.16b
+ add w8,w8,w12
+ eor v28.16b,v17.16b,v18.16b
+ eor w17,w17,w5
+ eor v29.16b,v21.16b,v22.16b
+ eor w19,w19,w6
+ ushr v1.4s,v24.4s,#20
+ eor w20,w20,w7
+ ushr v5.4s,v25.4s,#20
+ eor w21,w21,w8
+ ushr v9.4s,v26.4s,#20
+ ror w17,w17,#24
+ ushr v13.4s,v27.4s,#20
+ ror w19,w19,#24
+ ushr v17.4s,v28.4s,#20
+ ror w20,w20,#24
+ ushr v21.4s,v29.4s,#20
+ ror w21,w21,#24
+ sli v1.4s,v24.4s,#12
+ add w13,w13,w17
+ sli v5.4s,v25.4s,#12
+ add w14,w14,w19
+ sli v9.4s,v26.4s,#12
+ add w15,w15,w20
+ sli v13.4s,v27.4s,#12
+ add w16,w16,w21
+ sli v17.4s,v28.4s,#12
+ eor w9,w9,w13
+ sli v21.4s,v29.4s,#12
+ eor w10,w10,w14
+ add v0.4s,v0.4s,v1.4s
+ eor w11,w11,w15
+ add v4.4s,v4.4s,v5.4s
+ eor w12,w12,w16
+ add v8.4s,v8.4s,v9.4s
+ ror w9,w9,#25
+ add v12.4s,v12.4s,v13.4s
+ ror w10,w10,#25
+ add v16.4s,v16.4s,v17.4s
+ ror w11,w11,#25
+ add v20.4s,v20.4s,v21.4s
+ ror w12,w12,#25
+ eor v24.16b,v3.16b,v0.16b
+ add w5,w5,w10
+ eor v25.16b,v7.16b,v4.16b
+ add w6,w6,w11
+ eor v26.16b,v11.16b,v8.16b
+ add w7,w7,w12
+ eor v27.16b,v15.16b,v12.16b
+ add w8,w8,w9
+ eor v28.16b,v19.16b,v16.16b
+ eor w21,w21,w5
+ eor v29.16b,v23.16b,v20.16b
+ eor w17,w17,w6
+ ushr v3.4s,v24.4s,#24
+ eor w19,w19,w7
+ ushr v7.4s,v25.4s,#24
+ eor w20,w20,w8
+ ushr v11.4s,v26.4s,#24
+ ror w21,w21,#16
+ ushr v15.4s,v27.4s,#24
+ ror w17,w17,#16
+ ushr v19.4s,v28.4s,#24
+ ror w19,w19,#16
+ ushr v23.4s,v29.4s,#24
+ ror w20,w20,#16
+ sli v3.4s,v24.4s,#8
+ add w15,w15,w21
+ sli v7.4s,v25.4s,#8
+ add w16,w16,w17
+ sli v11.4s,v26.4s,#8
+ add w13,w13,w19
+ sli v15.4s,v27.4s,#8
+ add w14,w14,w20
+ sli v19.4s,v28.4s,#8
+ eor w10,w10,w15
+ sli v23.4s,v29.4s,#8
+ eor w11,w11,w16
+ add v2.4s,v2.4s,v3.4s
+ eor w12,w12,w13
+ add v6.4s,v6.4s,v7.4s
+ eor w9,w9,w14
+ add v10.4s,v10.4s,v11.4s
+ ror w10,w10,#20
+ add v14.4s,v14.4s,v15.4s
+ ror w11,w11,#20
+ add v18.4s,v18.4s,v19.4s
+ ror w12,w12,#20
+ add v22.4s,v22.4s,v23.4s
+ ror w9,w9,#20
+ eor v24.16b,v1.16b,v2.16b
+ add w5,w5,w10
+ eor v25.16b,v5.16b,v6.16b
+ add w6,w6,w11
+ eor v26.16b,v9.16b,v10.16b
+ add w7,w7,w12
+ eor v27.16b,v13.16b,v14.16b
+ add w8,w8,w9
+ eor v28.16b,v17.16b,v18.16b
+ eor w21,w21,w5
+ eor v29.16b,v21.16b,v22.16b
+ eor w17,w17,w6
+ ushr v1.4s,v24.4s,#25
+ eor w19,w19,w7
+ ushr v5.4s,v25.4s,#25
+ eor w20,w20,w8
+ ushr v9.4s,v26.4s,#25
+ ror w21,w21,#24
+ ushr v13.4s,v27.4s,#25
+ ror w17,w17,#24
+ ushr v17.4s,v28.4s,#25
+ ror w19,w19,#24
+ ushr v21.4s,v29.4s,#25
+ ror w20,w20,#24
+ sli v1.4s,v24.4s,#7
+ add w15,w15,w21
+ sli v5.4s,v25.4s,#7
+ add w16,w16,w17
+ sli v9.4s,v26.4s,#7
+ add w13,w13,w19
+ sli v13.4s,v27.4s,#7
+ add w14,w14,w20
+ sli v17.4s,v28.4s,#7
+ eor w10,w10,w15
+ sli v21.4s,v29.4s,#7
+ eor w11,w11,w16
+ ext v2.16b,v2.16b,v2.16b,#8
+ eor w12,w12,w13
+ ext v6.16b,v6.16b,v6.16b,#8
+ eor w9,w9,w14
+ ext v10.16b,v10.16b,v10.16b,#8
+ ror w10,w10,#25
+ ext v14.16b,v14.16b,v14.16b,#8
+ ror w11,w11,#25
+ ext v18.16b,v18.16b,v18.16b,#8
+ ror w12,w12,#25
+ ext v22.16b,v22.16b,v22.16b,#8
+ ror w9,w9,#25
+ ext v3.16b,v3.16b,v3.16b,#12
+ ext v7.16b,v7.16b,v7.16b,#12
+ ext v11.16b,v11.16b,v11.16b,#12
+ ext v15.16b,v15.16b,v15.16b,#12
+ ext v19.16b,v19.16b,v19.16b,#12
+ ext v23.16b,v23.16b,v23.16b,#12
+ ext v1.16b,v1.16b,v1.16b,#4
+ ext v5.16b,v5.16b,v5.16b,#4
+ ext v9.16b,v9.16b,v9.16b,#4
+ ext v13.16b,v13.16b,v13.16b,#4
+ ext v17.16b,v17.16b,v17.16b,#4
+ ext v21.16b,v21.16b,v21.16b,#4
+ add v0.4s,v0.4s,v1.4s
+ add w5,w5,w9
+ add v4.4s,v4.4s,v5.4s
+ add w6,w6,w10
+ add v8.4s,v8.4s,v9.4s
+ add w7,w7,w11
+ add v12.4s,v12.4s,v13.4s
+ add w8,w8,w12
+ add v16.4s,v16.4s,v17.4s
+ eor w17,w17,w5
+ add v20.4s,v20.4s,v21.4s
+ eor w19,w19,w6
+ eor v3.16b,v3.16b,v0.16b
+ eor w20,w20,w7
+ eor v7.16b,v7.16b,v4.16b
+ eor w21,w21,w8
+ eor v11.16b,v11.16b,v8.16b
+ ror w17,w17,#16
+ eor v15.16b,v15.16b,v12.16b
+ ror w19,w19,#16
+ eor v19.16b,v19.16b,v16.16b
+ ror w20,w20,#16
+ eor v23.16b,v23.16b,v20.16b
+ ror w21,w21,#16
+ rev32 v3.8h,v3.8h
+ add w13,w13,w17
+ rev32 v7.8h,v7.8h
+ add w14,w14,w19
+ rev32 v11.8h,v11.8h
+ add w15,w15,w20
+ rev32 v15.8h,v15.8h
+ add w16,w16,w21
+ rev32 v19.8h,v19.8h
+ eor w9,w9,w13
+ rev32 v23.8h,v23.8h
+ eor w10,w10,w14
+ add v2.4s,v2.4s,v3.4s
+ eor w11,w11,w15
+ add v6.4s,v6.4s,v7.4s
+ eor w12,w12,w16
+ add v10.4s,v10.4s,v11.4s
+ ror w9,w9,#20
+ add v14.4s,v14.4s,v15.4s
+ ror w10,w10,#20
+ add v18.4s,v18.4s,v19.4s
+ ror w11,w11,#20
+ add v22.4s,v22.4s,v23.4s
+ ror w12,w12,#20
+ eor v24.16b,v1.16b,v2.16b
+ add w5,w5,w9
+ eor v25.16b,v5.16b,v6.16b
+ add w6,w6,w10
+ eor v26.16b,v9.16b,v10.16b
+ add w7,w7,w11
+ eor v27.16b,v13.16b,v14.16b
+ add w8,w8,w12
+ eor v28.16b,v17.16b,v18.16b
+ eor w17,w17,w5
+ eor v29.16b,v21.16b,v22.16b
+ eor w19,w19,w6
+ ushr v1.4s,v24.4s,#20
+ eor w20,w20,w7
+ ushr v5.4s,v25.4s,#20
+ eor w21,w21,w8
+ ushr v9.4s,v26.4s,#20
+ ror w17,w17,#24
+ ushr v13.4s,v27.4s,#20
+ ror w19,w19,#24
+ ushr v17.4s,v28.4s,#20
+ ror w20,w20,#24
+ ushr v21.4s,v29.4s,#20
+ ror w21,w21,#24
+ sli v1.4s,v24.4s,#12
+ add w13,w13,w17
+ sli v5.4s,v25.4s,#12
+ add w14,w14,w19
+ sli v9.4s,v26.4s,#12
+ add w15,w15,w20
+ sli v13.4s,v27.4s,#12
+ add w16,w16,w21
+ sli v17.4s,v28.4s,#12
+ eor w9,w9,w13
+ sli v21.4s,v29.4s,#12
+ eor w10,w10,w14
+ add v0.4s,v0.4s,v1.4s
+ eor w11,w11,w15
+ add v4.4s,v4.4s,v5.4s
+ eor w12,w12,w16
+ add v8.4s,v8.4s,v9.4s
+ ror w9,w9,#25
+ add v12.4s,v12.4s,v13.4s
+ ror w10,w10,#25
+ add v16.4s,v16.4s,v17.4s
+ ror w11,w11,#25
+ add v20.4s,v20.4s,v21.4s
+ ror w12,w12,#25
+ eor v24.16b,v3.16b,v0.16b
+ add w5,w5,w10
+ eor v25.16b,v7.16b,v4.16b
+ add w6,w6,w11
+ eor v26.16b,v11.16b,v8.16b
+ add w7,w7,w12
+ eor v27.16b,v15.16b,v12.16b
+ add w8,w8,w9
+ eor v28.16b,v19.16b,v16.16b
+ eor w21,w21,w5
+ eor v29.16b,v23.16b,v20.16b
+ eor w17,w17,w6
+ ushr v3.4s,v24.4s,#24
+ eor w19,w19,w7
+ ushr v7.4s,v25.4s,#24
+ eor w20,w20,w8
+ ushr v11.4s,v26.4s,#24
+ ror w21,w21,#16
+ ushr v15.4s,v27.4s,#24
+ ror w17,w17,#16
+ ushr v19.4s,v28.4s,#24
+ ror w19,w19,#16
+ ushr v23.4s,v29.4s,#24
+ ror w20,w20,#16
+ sli v3.4s,v24.4s,#8
+ add w15,w15,w21
+ sli v7.4s,v25.4s,#8
+ add w16,w16,w17
+ sli v11.4s,v26.4s,#8
+ add w13,w13,w19
+ sli v15.4s,v27.4s,#8
+ add w14,w14,w20
+ sli v19.4s,v28.4s,#8
+ eor w10,w10,w15
+ sli v23.4s,v29.4s,#8
+ eor w11,w11,w16
+ add v2.4s,v2.4s,v3.4s
+ eor w12,w12,w13
+ add v6.4s,v6.4s,v7.4s
+ eor w9,w9,w14
+ add v10.4s,v10.4s,v11.4s
+ ror w10,w10,#20
+ add v14.4s,v14.4s,v15.4s
+ ror w11,w11,#20
+ add v18.4s,v18.4s,v19.4s
+ ror w12,w12,#20
+ add v22.4s,v22.4s,v23.4s
+ ror w9,w9,#20
+ eor v24.16b,v1.16b,v2.16b
+ add w5,w5,w10
+ eor v25.16b,v5.16b,v6.16b
+ add w6,w6,w11
+ eor v26.16b,v9.16b,v10.16b
+ add w7,w7,w12
+ eor v27.16b,v13.16b,v14.16b
+ add w8,w8,w9
+ eor v28.16b,v17.16b,v18.16b
+ eor w21,w21,w5
+ eor v29.16b,v21.16b,v22.16b
+ eor w17,w17,w6
+ ushr v1.4s,v24.4s,#25
+ eor w19,w19,w7
+ ushr v5.4s,v25.4s,#25
+ eor w20,w20,w8
+ ushr v9.4s,v26.4s,#25
+ ror w21,w21,#24
+ ushr v13.4s,v27.4s,#25
+ ror w17,w17,#24
+ ushr v17.4s,v28.4s,#25
+ ror w19,w19,#24
+ ushr v21.4s,v29.4s,#25
+ ror w20,w20,#24
+ sli v1.4s,v24.4s,#7
+ add w15,w15,w21
+ sli v5.4s,v25.4s,#7
+ add w16,w16,w17
+ sli v9.4s,v26.4s,#7
+ add w13,w13,w19
+ sli v13.4s,v27.4s,#7
+ add w14,w14,w20
+ sli v17.4s,v28.4s,#7
+ eor w10,w10,w15
+ sli v21.4s,v29.4s,#7
+ eor w11,w11,w16
+ ext v2.16b,v2.16b,v2.16b,#8
+ eor w12,w12,w13
+ ext v6.16b,v6.16b,v6.16b,#8
+ eor w9,w9,w14
+ ext v10.16b,v10.16b,v10.16b,#8
+ ror w10,w10,#25
+ ext v14.16b,v14.16b,v14.16b,#8
+ ror w11,w11,#25
+ ext v18.16b,v18.16b,v18.16b,#8
+ ror w12,w12,#25
+ ext v22.16b,v22.16b,v22.16b,#8
+ ror w9,w9,#25
+ ext v3.16b,v3.16b,v3.16b,#4
+ ext v7.16b,v7.16b,v7.16b,#4
+ ext v11.16b,v11.16b,v11.16b,#4
+ ext v15.16b,v15.16b,v15.16b,#4
+ ext v19.16b,v19.16b,v19.16b,#4
+ ext v23.16b,v23.16b,v23.16b,#4
+ ext v1.16b,v1.16b,v1.16b,#12
+ ext v5.16b,v5.16b,v5.16b,#12
+ ext v9.16b,v9.16b,v9.16b,#12
+ ext v13.16b,v13.16b,v13.16b,#12
+ ext v17.16b,v17.16b,v17.16b,#12
+ ext v21.16b,v21.16b,v21.16b,#12
+ cbnz x4,.Loop_lower_neon
+
+ add w5,w5,w22 // accumulate key block
+ ldp q24,q25,[sp,#0]
+ add x6,x6,x22,lsr#32
+ ldp q26,q27,[sp,#32]
+ add w7,w7,w23
+ ldp q28,q29,[sp,#64]
+ add x8,x8,x23,lsr#32
+ add v0.4s,v0.4s,v24.4s
+ add w9,w9,w24
+ add v4.4s,v4.4s,v24.4s
+ add x10,x10,x24,lsr#32
+ add v8.4s,v8.4s,v24.4s
+ add w11,w11,w25
+ add v12.4s,v12.4s,v24.4s
+ add x12,x12,x25,lsr#32
+ add v16.4s,v16.4s,v24.4s
+ add w13,w13,w26
+ add v20.4s,v20.4s,v24.4s
+ add x14,x14,x26,lsr#32
+ add v2.4s,v2.4s,v26.4s
+ add w15,w15,w27
+ add v6.4s,v6.4s,v26.4s
+ add x16,x16,x27,lsr#32
+ add v10.4s,v10.4s,v26.4s
+ add w17,w17,w28
+ add v14.4s,v14.4s,v26.4s
+ add x19,x19,x28,lsr#32
+ add v18.4s,v18.4s,v26.4s
+ add w20,w20,w30
+ add v22.4s,v22.4s,v26.4s
+ add x21,x21,x30,lsr#32
+ add v19.4s,v19.4s,v31.4s // +4
+ add x5,x5,x6,lsl#32 // pack
+ add v23.4s,v23.4s,v31.4s // +4
+ add x7,x7,x8,lsl#32
+ add v3.4s,v3.4s,v27.4s
+ ldp x6,x8,[x1,#0] // load input
+ add v7.4s,v7.4s,v28.4s
+ add x9,x9,x10,lsl#32
+ add v11.4s,v11.4s,v29.4s
+ add x11,x11,x12,lsl#32
+ add v15.4s,v15.4s,v30.4s
+ ldp x10,x12,[x1,#16]
+ add v19.4s,v19.4s,v27.4s
+ add x13,x13,x14,lsl#32
+ add v23.4s,v23.4s,v28.4s
+ add x15,x15,x16,lsl#32
+ add v1.4s,v1.4s,v25.4s
+ ldp x14,x16,[x1,#32]
+ add v5.4s,v5.4s,v25.4s
+ add x17,x17,x19,lsl#32
+ add v9.4s,v9.4s,v25.4s
+ add x20,x20,x21,lsl#32
+ add v13.4s,v13.4s,v25.4s
+ ldp x19,x21,[x1,#48]
+ add v17.4s,v17.4s,v25.4s
+ add x1,x1,#64
+ add v21.4s,v21.4s,v25.4s
+
+#ifdef __ARMEB__
+ rev x5,x5
+ rev x7,x7
+ rev x9,x9
+ rev x11,x11
+ rev x13,x13
+ rev x15,x15
+ rev x17,x17
+ rev x20,x20
+#endif
+ ld1 {v24.16b,v25.16b,v26.16b,v27.16b},[x1],#64
+ eor x5,x5,x6
+ eor x7,x7,x8
+ eor x9,x9,x10
+ eor x11,x11,x12
+ eor x13,x13,x14
+ eor v0.16b,v0.16b,v24.16b
+ eor x15,x15,x16
+ eor v1.16b,v1.16b,v25.16b
+ eor x17,x17,x19
+ eor v2.16b,v2.16b,v26.16b
+ eor x20,x20,x21
+ eor v3.16b,v3.16b,v27.16b
+ ld1 {v24.16b,v25.16b,v26.16b,v27.16b},[x1],#64
+
+ stp x5,x7,[x0,#0] // store output
+ add x28,x28,#7 // increment counter
+ stp x9,x11,[x0,#16]
+ stp x13,x15,[x0,#32]
+ stp x17,x20,[x0,#48]
+ add x0,x0,#64
+ st1 {v0.16b,v1.16b,v2.16b,v3.16b},[x0],#64
+
+ ld1 {v0.16b,v1.16b,v2.16b,v3.16b},[x1],#64
+ eor v4.16b,v4.16b,v24.16b
+ eor v5.16b,v5.16b,v25.16b
+ eor v6.16b,v6.16b,v26.16b
+ eor v7.16b,v7.16b,v27.16b
+ st1 {v4.16b,v5.16b,v6.16b,v7.16b},[x0],#64
+
+ ld1 {v4.16b,v5.16b,v6.16b,v7.16b},[x1],#64
+ eor v8.16b,v8.16b,v0.16b
+ ldp q24,q25,[sp,#0]
+ eor v9.16b,v9.16b,v1.16b
+ ldp q26,q27,[sp,#32]
+ eor v10.16b,v10.16b,v2.16b
+ eor v11.16b,v11.16b,v3.16b
+ st1 {v8.16b,v9.16b,v10.16b,v11.16b},[x0],#64
+
+ ld1 {v8.16b,v9.16b,v10.16b,v11.16b},[x1],#64
+ eor v12.16b,v12.16b,v4.16b
+ eor v13.16b,v13.16b,v5.16b
+ eor v14.16b,v14.16b,v6.16b
+ eor v15.16b,v15.16b,v7.16b
+ st1 {v12.16b,v13.16b,v14.16b,v15.16b},[x0],#64
+
+ ld1 {v12.16b,v13.16b,v14.16b,v15.16b},[x1],#64
+ eor v16.16b,v16.16b,v8.16b
+ eor v17.16b,v17.16b,v9.16b
+ eor v18.16b,v18.16b,v10.16b
+ eor v19.16b,v19.16b,v11.16b
+ st1 {v16.16b,v17.16b,v18.16b,v19.16b},[x0],#64
+
+ shl v0.4s,v31.4s,#1 // 4 -> 8
+ eor v20.16b,v20.16b,v12.16b
+ eor v21.16b,v21.16b,v13.16b
+ eor v22.16b,v22.16b,v14.16b
+ eor v23.16b,v23.16b,v15.16b
+ st1 {v20.16b,v21.16b,v22.16b,v23.16b},[x0],#64
+
+ add v27.4s,v27.4s,v0.4s // += 8
+ add v28.4s,v28.4s,v0.4s
+ add v29.4s,v29.4s,v0.4s
+ add v30.4s,v30.4s,v0.4s
+
+ b.hs .Loop_outer_512_neon
+
+ adds x2,x2,#512
+ ushr v0.4s,v31.4s,#2 // 4 -> 1
+
+ ldp d8,d9,[sp,#128+0] // meet ABI requirements
+ ldp d10,d11,[sp,#128+16]
+ ldp d12,d13,[sp,#128+32]
+ ldp d14,d15,[sp,#128+48]
+
+ stp q24,q31,[sp,#0] // wipe off-load area
+ stp q24,q31,[sp,#32]
+ stp q24,q31,[sp,#64]
+
+ b.eq .Ldone_512_neon
+
+ cmp x2,#192
+ sub v27.4s,v27.4s,v0.4s // -= 1
+ sub v28.4s,v28.4s,v0.4s
+ sub v29.4s,v29.4s,v0.4s
+ add sp,sp,#128
+ b.hs .Loop_outer_neon
+
+ eor v25.16b,v25.16b,v25.16b
+ eor v26.16b,v26.16b,v26.16b
+ eor v27.16b,v27.16b,v27.16b
+ eor v28.16b,v28.16b,v28.16b
+ eor v29.16b,v29.16b,v29.16b
+ eor v30.16b,v30.16b,v30.16b
+ b .Loop_outer
+
+.Ldone_512_neon:
+ ldp x19,x20,[x29,#16]
+ add sp,sp,#128+64
+ ldp x21,x22,[x29,#32]
+ ldp x23,x24,[x29,#48]
+ ldp x25,x26,[x29,#64]
+ ldp x27,x28,[x29,#80]
+ ldp x29,x30,[sp],#96
+ ret
+.size ChaCha20_512_neon,.-ChaCha20_512_neon
--
2.19.0
* [PATCH net-next v7 08/28] zinc: port Andy Polyakov's ChaCha20 ARM and ARM64 implementations
[not found] <20181006025709.4019-1-Jason@zx2c4.com>
` (4 preceding siblings ...)
2018-10-06 2:56 ` [PATCH net-next v7 07/28] zinc: import Andy Polyakov's ChaCha20 ARM and ARM64 implementations Jason A. Donenfeld
@ 2018-10-06 2:56 ` Jason A. Donenfeld
2018-10-06 2:56 ` [PATCH net-next v7 09/28] zinc: " Jason A. Donenfeld
` (17 subsequent siblings)
23 siblings, 0 replies; 28+ messages in thread
From: Jason A. Donenfeld @ 2018-10-06 2:56 UTC (permalink / raw)
To: linux-kernel, netdev, davem, gregkh
Cc: Jason A. Donenfeld, Russell King, linux-arm-kernel, Samuel Neves,
Jean-Philippe Aumasson, Andy Lutomirski, Andrew Morton,
Linus Torvalds, kernel-hardening, linux-crypto
This patch ports and prepares Andy Polyakov's implementations for the kernel,
but doesn't actually wire up any of the code yet. The wiring will be done
in a subsequent commit, since we'll need to merge these implementations
with another one. We make a few small changes to the assembly:
- Entries and exits use the proper kernel convention macro.
- CPU feature checking is done in C by the glue code, so that has been
removed from the assembly.
- The functions have been renamed to fit kernel conventions.
- Labels have been renamed (prefixed with .L) to fit kernel conventions.
- Constants have been rearranged so that they are closer to the code
that is using them. [ARM only]
- The neon code can jump to the scalar code when it makes sense to do
so.
- The neon_512 function as a separate function has been removed, leaving
the decision up to the main neon entry point. [ARM64 only]
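
To illustrate the second item in the list above, here is a minimal sketch of how
C glue code might choose between the scalar and NEON paths once the feature
check no longer lives in the assembly. Every name in it (chacha20_pick_impl,
the enum, the have_neon and simd_usable flags) is hypothetical and not taken
from this series. The 192-byte cutoff mirrors the "cmp r2,#192 / bls .Lshort"
test that this patch removes from the ARM assembly; whether the real glue code
keeps that exact value is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; none of these come from the patch. */
enum chacha20_impl {
	CHACHA20_IMPL_SCALAR,
	CHACHA20_IMPL_NEON,
};

/*
 * Inputs of this many bytes or fewer stay on the scalar path. The value is
 * borrowed from the removed "cmp r2,#192; bls .Lshort" check (assumption:
 * the glue code would use the same cutoff).
 */
#define CHACHA20_SCALAR_MAX_LEN 192

/* ChaCha20 produces 64-byte blocks, so the cutoff is a whole block count. */
static_assert(CHACHA20_SCALAR_MAX_LEN % 64 == 0,
	      "cutoff should be a multiple of the 64-byte block size");

/*
 * Pick an implementation in C instead of probing CPU capabilities from
 * assembly. have_neon stands in for a CPU feature test and simd_usable for
 * "SIMD registers may be used in the current context"; both are assumed
 * inputs supplied by the caller.
 */
static enum chacha20_impl chacha20_pick_impl(size_t len, bool have_neon,
					      bool simd_usable)
{
	if (have_neon && simd_usable && len > CHACHA20_SCALAR_MAX_LEN)
		return CHACHA20_IMPL_NEON;
	return CHACHA20_IMPL_SCALAR;
}
```

Keeping the decision in one C function means the assembly entry points need no
capability lookups of their own, which is what allows the OPENSSL_armcap_P
references to be dropped in the diff below.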
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Samuel Neves <sneves@dei.uc.pt>
Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: kernel-hardening@lists.openwall.com
Cc: linux-crypto@vger.kernel.org
---
lib/zinc/chacha20/chacha20-arm-cryptogams.S | 367 +++++++++---------
lib/zinc/chacha20/chacha20-arm64-cryptogams.S | 75 ++--
2 files changed, 202 insertions(+), 240 deletions(-)
diff --git a/lib/zinc/chacha20/chacha20-arm-cryptogams.S b/lib/zinc/chacha20/chacha20-arm-cryptogams.S
index 05a3a9e6e93f..770bab469171 100644
--- a/lib/zinc/chacha20/chacha20-arm-cryptogams.S
+++ b/lib/zinc/chacha20/chacha20-arm-cryptogams.S
@@ -1,9 +1,12 @@
/* SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause */
/*
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
* Copyright (C) 2006-2017 CRYPTOGAMS by <appro@openssl.org>. All Rights Reserved.
+ *
+ * This is based in part on Andy Polyakov's implementation from CRYPTOGAMS.
*/
-#include "arm_arch.h"
+#include <linux/linkage.h>
.text
#if defined(__thumb2__) || defined(__clang__)
@@ -24,48 +27,25 @@
.long 0x61707865,0x3320646e,0x79622d32,0x6b206574 @ endian-neutral
.Lone:
.long 1,0,0,0
-.Lrot8:
-.long 0x02010003,0x06050407
-#if __ARM_MAX_ARCH__>=7
-.LOPENSSL_armcap:
-.word OPENSSL_armcap_P-.LChaCha20_ctr32
-#else
.word -1
-#endif
-.globl ChaCha20_ctr32
-.type ChaCha20_ctr32,%function
.align 5
-ChaCha20_ctr32:
-.LChaCha20_ctr32:
+ENTRY(chacha20_arm)
ldr r12,[sp,#0] @ pull pointer to counter and nonce
stmdb sp!,{r0-r2,r4-r11,lr}
-#if __ARM_ARCH__<7 && !defined(__thumb2__)
- sub r14,pc,#16 @ ChaCha20_ctr32
-#else
- adr r14,.LChaCha20_ctr32
-#endif
cmp r2,#0 @ len==0?
-#ifdef __thumb2__
+#ifdef __thumb2__
itt eq
#endif
addeq sp,sp,#4*3
- beq .Lno_data
-#if __ARM_MAX_ARCH__>=7
- cmp r2,#192 @ test len
- bls .Lshort
- ldr r4,[r14,#-24]
- ldr r4,[r14,r4]
-# ifdef __APPLE__
- ldr r4,[r4]
-# endif
- tst r4,#ARMV7_NEON
- bne .LChaCha20_neon
-.Lshort:
-#endif
+ beq .Lno_data_arm
ldmia r12,{r4-r7} @ load counter and nonce
sub sp,sp,#4*(16) @ off-load area
- sub r14,r14,#64 @ .Lsigma
+#if __LINUX_ARM_ARCH__ < 7 && !defined(__thumb2__)
+ sub r14,pc,#100 @ .Lsigma
+#else
+ adr r14,.Lsigma @ .Lsigma
+#endif
stmdb sp!,{r4-r7} @ copy counter and nonce
ldmia r3,{r4-r11} @ load key
ldmia r14,{r0-r3} @ load sigma
@@ -191,7 +171,7 @@ ChaCha20_ctr32:
@ rx and second half at sp+4*(16+8)
cmp r11,#64 @ done yet?
-#ifdef __thumb2__
+#ifdef __thumb2__
itete lo
#endif
addlo r12,sp,#4*(0) @ shortcut or ...
@@ -202,49 +182,49 @@ ChaCha20_ctr32:
ldr r8,[sp,#4*(0)] @ load key material
ldr r9,[sp,#4*(1)]
-#if __ARM_ARCH__>=6 || !defined(__ARMEB__)
-# if __ARM_ARCH__<7
+#if __LINUX_ARM_ARCH__ >= 6 || !defined(__ARMEB__)
+#if __LINUX_ARM_ARCH__ < 7
orr r10,r12,r14
tst r10,#3 @ are input and output aligned?
ldr r10,[sp,#4*(2)]
bne .Lunaligned
cmp r11,#64 @ restore flags
-# else
+#else
ldr r10,[sp,#4*(2)]
-# endif
+#endif
ldr r11,[sp,#4*(3)]
add r0,r0,r8 @ accumulate key material
add r1,r1,r9
-# ifdef __thumb2__
+#ifdef __thumb2__
itt hs
-# endif
+#endif
ldrhs r8,[r12],#16 @ load input
ldrhs r9,[r12,#-12]
add r2,r2,r10
add r3,r3,r11
-# ifdef __thumb2__
+#ifdef __thumb2__
itt hs
-# endif
+#endif
ldrhs r10,[r12,#-8]
ldrhs r11,[r12,#-4]
-# if __ARM_ARCH__>=6 && defined(__ARMEB__)
+#if __LINUX_ARM_ARCH__ >= 6 && defined(__ARMEB__)
rev r0,r0
rev r1,r1
rev r2,r2
rev r3,r3
-# endif
-# ifdef __thumb2__
+#endif
+#ifdef __thumb2__
itt hs
-# endif
+#endif
eorhs r0,r0,r8 @ xor with input
eorhs r1,r1,r9
add r8,sp,#4*(4)
str r0,[r14],#16 @ store output
-# ifdef __thumb2__
+#ifdef __thumb2__
itt hs
-# endif
+#endif
eorhs r2,r2,r10
eorhs r3,r3,r11
ldmia r8,{r8-r11} @ load key material
@@ -254,34 +234,34 @@ ChaCha20_ctr32:
add r4,r8,r4,ror#13 @ accumulate key material
add r5,r9,r5,ror#13
-# ifdef __thumb2__
+#ifdef __thumb2__
itt hs
-# endif
+#endif
ldrhs r8,[r12],#16 @ load input
ldrhs r9,[r12,#-12]
add r6,r10,r6,ror#13
add r7,r11,r7,ror#13
-# ifdef __thumb2__
+#ifdef __thumb2__
itt hs
-# endif
+#endif
ldrhs r10,[r12,#-8]
ldrhs r11,[r12,#-4]
-# if __ARM_ARCH__>=6 && defined(__ARMEB__)
+#if __LINUX_ARM_ARCH__ >= 6 && defined(__ARMEB__)
rev r4,r4
rev r5,r5
rev r6,r6
rev r7,r7
-# endif
-# ifdef __thumb2__
+#endif
+#ifdef __thumb2__
itt hs
-# endif
+#endif
eorhs r4,r4,r8
eorhs r5,r5,r9
add r8,sp,#4*(8)
str r4,[r14],#16 @ store output
-# ifdef __thumb2__
+#ifdef __thumb2__
itt hs
-# endif
+#endif
eorhs r6,r6,r10
eorhs r7,r7,r11
str r5,[r14,#-12]
@@ -294,39 +274,39 @@ ChaCha20_ctr32:
add r0,r0,r8 @ accumulate key material
add r1,r1,r9
-# ifdef __thumb2__
+#ifdef __thumb2__
itt hs
-# endif
+#endif
ldrhs r8,[r12],#16 @ load input
ldrhs r9,[r12,#-12]
-# ifdef __thumb2__
+#ifdef __thumb2__
itt hi
-# endif
+#endif
strhi r10,[sp,#4*(16+10)] @ copy "rx" while at it
strhi r11,[sp,#4*(16+11)] @ copy "rx" while at it
add r2,r2,r10
add r3,r3,r11
-# ifdef __thumb2__
+#ifdef __thumb2__
itt hs
-# endif
+#endif
ldrhs r10,[r12,#-8]
ldrhs r11,[r12,#-4]
-# if __ARM_ARCH__>=6 && defined(__ARMEB__)
+#if __LINUX_ARM_ARCH__ >= 6 && defined(__ARMEB__)
rev r0,r0
rev r1,r1
rev r2,r2
rev r3,r3
-# endif
-# ifdef __thumb2__
+#endif
+#ifdef __thumb2__
itt hs
-# endif
+#endif
eorhs r0,r0,r8
eorhs r1,r1,r9
add r8,sp,#4*(12)
str r0,[r14],#16 @ store output
-# ifdef __thumb2__
+#ifdef __thumb2__
itt hs
-# endif
+#endif
eorhs r2,r2,r10
eorhs r3,r3,r11
str r1,[r14,#-12]
@@ -336,79 +316,79 @@ ChaCha20_ctr32:
add r4,r8,r4,ror#24 @ accumulate key material
add r5,r9,r5,ror#24
-# ifdef __thumb2__
+#ifdef __thumb2__
itt hi
-# endif
+#endif
addhi r8,r8,#1 @ next counter value
strhi r8,[sp,#4*(12)] @ save next counter value
-# ifdef __thumb2__
+#ifdef __thumb2__
itt hs
-# endif
+#endif
ldrhs r8,[r12],#16 @ load input
ldrhs r9,[r12,#-12]
add r6,r10,r6,ror#24
add r7,r11,r7,ror#24
-# ifdef __thumb2__
+#ifdef __thumb2__
itt hs
-# endif
+#endif
ldrhs r10,[r12,#-8]
ldrhs r11,[r12,#-4]
-# if __ARM_ARCH__>=6 && defined(__ARMEB__)
+#if __LINUX_ARM_ARCH__ >= 6 && defined(__ARMEB__)
rev r4,r4
rev r5,r5
rev r6,r6
rev r7,r7
-# endif
-# ifdef __thumb2__
+#endif
+#ifdef __thumb2__
itt hs
-# endif
+#endif
eorhs r4,r4,r8
eorhs r5,r5,r9
-# ifdef __thumb2__
+#ifdef __thumb2__
it ne
-# endif
+#endif
ldrne r8,[sp,#4*(32+2)] @ re-load len
-# ifdef __thumb2__
+#ifdef __thumb2__
itt hs
-# endif
+#endif
eorhs r6,r6,r10
eorhs r7,r7,r11
str r4,[r14],#16 @ store output
str r5,[r14,#-12]
-# ifdef __thumb2__
+#ifdef __thumb2__
it hs
-# endif
+#endif
subhs r11,r8,#64 @ len-=64
str r6,[r14,#-8]
str r7,[r14,#-4]
bhi .Loop_outer
beq .Ldone
-# if __ARM_ARCH__<7
+#if __LINUX_ARM_ARCH__ < 7
b .Ltail
.align 4
.Lunaligned: @ unaligned endian-neutral path
cmp r11,#64 @ restore flags
-# endif
#endif
-#if __ARM_ARCH__<7
+#endif
+#if __LINUX_ARM_ARCH__ < 7
ldr r11,[sp,#4*(3)]
add r0,r8,r0 @ accumulate key material
add r1,r9,r1
add r2,r10,r2
-# ifdef __thumb2__
+#ifdef __thumb2__
itete lo
-# endif
+#endif
eorlo r8,r8,r8 @ zero or ...
ldrhsb r8,[r12],#16 @ ... load input
eorlo r9,r9,r9
ldrhsb r9,[r12,#-12]
add r3,r11,r3
-# ifdef __thumb2__
+#ifdef __thumb2__
itete lo
-# endif
+#endif
eorlo r10,r10,r10
ldrhsb r10,[r12,#-8]
eorlo r11,r11,r11
@@ -416,53 +396,53 @@ ChaCha20_ctr32:
eor r0,r8,r0 @ xor with input (or zero)
eor r1,r9,r1
-# ifdef __thumb2__
+#ifdef __thumb2__
itt hs
-# endif
+#endif
ldrhsb r8,[r12,#-15] @ load more input
ldrhsb r9,[r12,#-11]
eor r2,r10,r2
strb r0,[r14],#16 @ store output
eor r3,r11,r3
-# ifdef __thumb2__
+#ifdef __thumb2__
itt hs
-# endif
+#endif
ldrhsb r10,[r12,#-7]
ldrhsb r11,[r12,#-3]
strb r1,[r14,#-12]
eor r0,r8,r0,lsr#8
strb r2,[r14,#-8]
eor r1,r9,r1,lsr#8
-# ifdef __thumb2__
+#ifdef __thumb2__
itt hs
-# endif
+#endif
ldrhsb r8,[r12,#-14] @ load more input
ldrhsb r9,[r12,#-10]
strb r3,[r14,#-4]
eor r2,r10,r2,lsr#8
strb r0,[r14,#-15]
eor r3,r11,r3,lsr#8
-# ifdef __thumb2__
+#ifdef __thumb2__
itt hs
-# endif
+#endif
ldrhsb r10,[r12,#-6]
ldrhsb r11,[r12,#-2]
strb r1,[r14,#-11]
eor r0,r8,r0,lsr#8
strb r2,[r14,#-7]
eor r1,r9,r1,lsr#8
-# ifdef __thumb2__
+#ifdef __thumb2__
itt hs
-# endif
+#endif
ldrhsb r8,[r12,#-13] @ load more input
ldrhsb r9,[r12,#-9]
strb r3,[r14,#-3]
eor r2,r10,r2,lsr#8
strb r0,[r14,#-14]
eor r3,r11,r3,lsr#8
-# ifdef __thumb2__
+#ifdef __thumb2__
itt hs
-# endif
+#endif
ldrhsb r10,[r12,#-5]
ldrhsb r11,[r12,#-1]
strb r1,[r14,#-10]
@@ -482,18 +462,18 @@ ChaCha20_ctr32:
add r4,r8,r4,ror#13 @ accumulate key material
add r5,r9,r5,ror#13
add r6,r10,r6,ror#13
-# ifdef __thumb2__
+#ifdef __thumb2__
itete lo
-# endif
+#endif
eorlo r8,r8,r8 @ zero or ...
ldrhsb r8,[r12],#16 @ ... load input
eorlo r9,r9,r9
ldrhsb r9,[r12,#-12]
add r7,r11,r7,ror#13
-# ifdef __thumb2__
+#ifdef __thumb2__
itete lo
-# endif
+#endif
eorlo r10,r10,r10
ldrhsb r10,[r12,#-8]
eorlo r11,r11,r11
@@ -501,53 +481,53 @@ ChaCha20_ctr32:
eor r4,r8,r4 @ xor with input (or zero)
eor r5,r9,r5
-# ifdef __thumb2__
+#ifdef __thumb2__
itt hs
-# endif
+#endif
ldrhsb r8,[r12,#-15] @ load more input
ldrhsb r9,[r12,#-11]
eor r6,r10,r6
strb r4,[r14],#16 @ store output
eor r7,r11,r7
-# ifdef __thumb2__
+#ifdef __thumb2__
itt hs
-# endif
+#endif
ldrhsb r10,[r12,#-7]
ldrhsb r11,[r12,#-3]
strb r5,[r14,#-12]
eor r4,r8,r4,lsr#8
strb r6,[r14,#-8]
eor r5,r9,r5,lsr#8
-# ifdef __thumb2__
+#ifdef __thumb2__
itt hs
-# endif
+#endif
ldrhsb r8,[r12,#-14] @ load more input
ldrhsb r9,[r12,#-10]
strb r7,[r14,#-4]
eor r6,r10,r6,lsr#8
strb r4,[r14,#-15]
eor r7,r11,r7,lsr#8
-# ifdef __thumb2__
+#ifdef __thumb2__
itt hs
-# endif
+#endif
ldrhsb r10,[r12,#-6]
ldrhsb r11,[r12,#-2]
strb r5,[r14,#-11]
eor r4,r8,r4,lsr#8
strb r6,[r14,#-7]
eor r5,r9,r5,lsr#8
-# ifdef __thumb2__
+#ifdef __thumb2__
itt hs
-# endif
+#endif
ldrhsb r8,[r12,#-13] @ load more input
ldrhsb r9,[r12,#-9]
strb r7,[r14,#-3]
eor r6,r10,r6,lsr#8
strb r4,[r14,#-14]
eor r7,r11,r7,lsr#8
-# ifdef __thumb2__
+#ifdef __thumb2__
itt hs
-# endif
+#endif
ldrhsb r10,[r12,#-5]
ldrhsb r11,[r12,#-1]
strb r5,[r14,#-10]
@@ -564,26 +544,26 @@ ChaCha20_ctr32:
add r8,sp,#4*(4+4)
ldmia r8,{r8-r11} @ load key material
ldmia r0,{r0-r7} @ load second half
-# ifdef __thumb2__
+#ifdef __thumb2__
itt hi
-# endif
+#endif
strhi r10,[sp,#4*(16+10)] @ copy "rx"
strhi r11,[sp,#4*(16+11)] @ copy "rx"
add r0,r8,r0 @ accumulate key material
add r1,r9,r1
add r2,r10,r2
-# ifdef __thumb2__
+#ifdef __thumb2__
itete lo
-# endif
+#endif
eorlo r8,r8,r8 @ zero or ...
ldrhsb r8,[r12],#16 @ ... load input
eorlo r9,r9,r9
ldrhsb r9,[r12,#-12]
add r3,r11,r3
-# ifdef __thumb2__
+#ifdef __thumb2__
itete lo
-# endif
+#endif
eorlo r10,r10,r10
ldrhsb r10,[r12,#-8]
eorlo r11,r11,r11
@@ -591,53 +571,53 @@ ChaCha20_ctr32:
eor r0,r8,r0 @ xor with input (or zero)
eor r1,r9,r1
-# ifdef __thumb2__
+#ifdef __thumb2__
itt hs
-# endif
+#endif
ldrhsb r8,[r12,#-15] @ load more input
ldrhsb r9,[r12,#-11]
eor r2,r10,r2
strb r0,[r14],#16 @ store output
eor r3,r11,r3
-# ifdef __thumb2__
+#ifdef __thumb2__
itt hs
-# endif
+#endif
ldrhsb r10,[r12,#-7]
ldrhsb r11,[r12,#-3]
strb r1,[r14,#-12]
eor r0,r8,r0,lsr#8
strb r2,[r14,#-8]
eor r1,r9,r1,lsr#8
-# ifdef __thumb2__
+#ifdef __thumb2__
itt hs
-# endif
+#endif
ldrhsb r8,[r12,#-14] @ load more input
ldrhsb r9,[r12,#-10]
strb r3,[r14,#-4]
eor r2,r10,r2,lsr#8
strb r0,[r14,#-15]
eor r3,r11,r3,lsr#8
-# ifdef __thumb2__
+#ifdef __thumb2__
itt hs
-# endif
+#endif
ldrhsb r10,[r12,#-6]
ldrhsb r11,[r12,#-2]
strb r1,[r14,#-11]
eor r0,r8,r0,lsr#8
strb r2,[r14,#-7]
eor r1,r9,r1,lsr#8
-# ifdef __thumb2__
+#ifdef __thumb2__
itt hs
-# endif
+#endif
ldrhsb r8,[r12,#-13] @ load more input
ldrhsb r9,[r12,#-9]
strb r3,[r14,#-3]
eor r2,r10,r2,lsr#8
strb r0,[r14,#-14]
eor r3,r11,r3,lsr#8
-# ifdef __thumb2__
+#ifdef __thumb2__
itt hs
-# endif
+#endif
ldrhsb r10,[r12,#-5]
ldrhsb r11,[r12,#-1]
strb r1,[r14,#-10]
@@ -654,25 +634,25 @@ ChaCha20_ctr32:
add r8,sp,#4*(4+8)
ldmia r8,{r8-r11} @ load key material
add r4,r8,r4,ror#24 @ accumulate key material
-# ifdef __thumb2__
+#ifdef __thumb2__
itt hi
-# endif
+#endif
addhi r8,r8,#1 @ next counter value
strhi r8,[sp,#4*(12)] @ save next counter value
add r5,r9,r5,ror#24
add r6,r10,r6,ror#24
-# ifdef __thumb2__
+#ifdef __thumb2__
itete lo
-# endif
+#endif
eorlo r8,r8,r8 @ zero or ...
ldrhsb r8,[r12],#16 @ ... load input
eorlo r9,r9,r9
ldrhsb r9,[r12,#-12]
add r7,r11,r7,ror#24
-# ifdef __thumb2__
+#ifdef __thumb2__
itete lo
-# endif
+#endif
eorlo r10,r10,r10
ldrhsb r10,[r12,#-8]
eorlo r11,r11,r11
@@ -680,53 +660,53 @@ ChaCha20_ctr32:
eor r4,r8,r4 @ xor with input (or zero)
eor r5,r9,r5
-# ifdef __thumb2__
+#ifdef __thumb2__
itt hs
-# endif
+#endif
ldrhsb r8,[r12,#-15] @ load more input
ldrhsb r9,[r12,#-11]
eor r6,r10,r6
strb r4,[r14],#16 @ store output
eor r7,r11,r7
-# ifdef __thumb2__
+#ifdef __thumb2__
itt hs
-# endif
+#endif
ldrhsb r10,[r12,#-7]
ldrhsb r11,[r12,#-3]
strb r5,[r14,#-12]
eor r4,r8,r4,lsr#8
strb r6,[r14,#-8]
eor r5,r9,r5,lsr#8
-# ifdef __thumb2__
+#ifdef __thumb2__
itt hs
-# endif
+#endif
ldrhsb r8,[r12,#-14] @ load more input
ldrhsb r9,[r12,#-10]
strb r7,[r14,#-4]
eor r6,r10,r6,lsr#8
strb r4,[r14,#-15]
eor r7,r11,r7,lsr#8
-# ifdef __thumb2__
+#ifdef __thumb2__
itt hs
-# endif
+#endif
ldrhsb r10,[r12,#-6]
ldrhsb r11,[r12,#-2]
strb r5,[r14,#-11]
eor r4,r8,r4,lsr#8
strb r6,[r14,#-7]
eor r5,r9,r5,lsr#8
-# ifdef __thumb2__
+#ifdef __thumb2__
itt hs
-# endif
+#endif
ldrhsb r8,[r12,#-13] @ load more input
ldrhsb r9,[r12,#-9]
strb r7,[r14,#-3]
eor r6,r10,r6,lsr#8
strb r4,[r14,#-14]
eor r7,r11,r7,lsr#8
-# ifdef __thumb2__
+#ifdef __thumb2__
itt hs
-# endif
+#endif
ldrhsb r10,[r12,#-5]
ldrhsb r11,[r12,#-1]
strb r5,[r14,#-10]
@@ -740,13 +720,13 @@ ChaCha20_ctr32:
eor r7,r11,r7,lsr#8
strb r6,[r14,#-5]
strb r7,[r14,#-1]
-# ifdef __thumb2__
+#ifdef __thumb2__
it ne
-# endif
+#endif
ldrne r8,[sp,#4*(32+2)] @ re-load len
-# ifdef __thumb2__
+#ifdef __thumb2__
it hs
-# endif
+#endif
subhs r11,r8,#64 @ len-=64
bhi .Loop_outer
@@ -768,20 +748,33 @@ ChaCha20_ctr32:
.Ldone:
add sp,sp,#4*(32+3)
-.Lno_data:
+.Lno_data_arm:
ldmia sp!,{r4-r11,pc}
-.size ChaCha20_ctr32,.-ChaCha20_ctr32
-#if __ARM_MAX_ARCH__>=7
+ENDPROC(chacha20_arm)
+
+#ifdef CONFIG_KERNEL_MODE_NEON
+.align 5
+.Lsigma2:
+.long 0x61707865,0x3320646e,0x79622d32,0x6b206574 @ endian-neutral
+.Lone2:
+.long 1,0,0,0
+.word -1
+
.arch armv7-a
.fpu neon
-.type ChaCha20_neon,%function
.align 5
-ChaCha20_neon:
+ENTRY(chacha20_neon)
ldr r12,[sp,#0] @ pull pointer to counter and nonce
stmdb sp!,{r0-r2,r4-r11,lr}
-.LChaCha20_neon:
- adr r14,.Lsigma
+ cmp r2,#0 @ len==0?
+#ifdef __thumb2__
+ itt eq
+#endif
+ addeq sp,sp,#4*3
+ beq .Lno_data_neon
+.Lchacha20_neon_begin:
+ adr r14,.Lsigma2
vstmdb sp!,{d8-d15} @ ABI spec says so
stmdb sp!,{r0-r3}
@@ -1121,12 +1114,12 @@ ChaCha20_neon:
ldr r10,[r12,#-8]
add r3,r3,r11
ldr r11,[r12,#-4]
-# ifdef __ARMEB__
+#ifdef __ARMEB__
rev r0,r0
rev r1,r1
rev r2,r2
rev r3,r3
-# endif
+#endif
eor r0,r0,r8 @ xor with input
add r8,sp,#4*(4)
eor r1,r1,r9
@@ -1146,12 +1139,12 @@ ChaCha20_neon:
ldr r10,[r12,#-8]
add r7,r11,r7,ror#13
ldr r11,[r12,#-4]
-# ifdef __ARMEB__
+#ifdef __ARMEB__
rev r4,r4
rev r5,r5
rev r6,r6
rev r7,r7
-# endif
+#endif
eor r4,r4,r8
add r8,sp,#4*(8)
eor r5,r5,r9
@@ -1170,24 +1163,24 @@ ChaCha20_neon:
ldr r8,[r12],#16 @ load input
add r1,r1,r9
ldr r9,[r12,#-12]
-# ifdef __thumb2__
+#ifdef __thumb2__
it hi
-# endif
+#endif
strhi r10,[sp,#4*(16+10)] @ copy "rx" while at it
add r2,r2,r10
ldr r10,[r12,#-8]
-# ifdef __thumb2__
+#ifdef __thumb2__
it hi
-# endif
+#endif
strhi r11,[sp,#4*(16+11)] @ copy "rx" while at it
add r3,r3,r11
ldr r11,[r12,#-4]
-# ifdef __ARMEB__
+#ifdef __ARMEB__
rev r0,r0
rev r1,r1
rev r2,r2
rev r3,r3
-# endif
+#endif
eor r0,r0,r8
add r8,sp,#4*(12)
eor r1,r1,r9
@@ -1210,16 +1203,16 @@ ChaCha20_neon:
add r7,r11,r7,ror#24
ldr r10,[r12,#-8]
ldr r11,[r12,#-4]
-# ifdef __ARMEB__
+#ifdef __ARMEB__
rev r4,r4
rev r5,r5
rev r6,r6
rev r7,r7
-# endif
+#endif
eor r4,r4,r8
-# ifdef __thumb2__
+#ifdef __thumb2__
it hi
-# endif
+#endif
ldrhi r8,[sp,#4*(32+2)] @ re-load len
eor r5,r5,r9
eor r6,r6,r10
@@ -1379,7 +1372,7 @@ ChaCha20_neon:
add r6,r10,r6,ror#13
add r7,r11,r7,ror#13
ldmia r8,{r8-r11} @ load key material
-# ifdef __ARMEB__
+#ifdef __ARMEB__
rev r0,r0
rev r1,r1
rev r2,r2
@@ -1388,7 +1381,7 @@ ChaCha20_neon:
rev r5,r5
rev r6,r6
rev r7,r7
-# endif
+#endif
stmia sp,{r0-r7}
add r0,sp,#4*(16+8)
@@ -1408,7 +1401,7 @@ ChaCha20_neon:
add r6,r10,r6,ror#24
add r7,r11,r7,ror#24
ldr r11,[sp,#4*(32+2)] @ re-load len
-# ifdef __ARMEB__
+#ifdef __ARMEB__
rev r0,r0
rev r1,r1
rev r2,r2
@@ -1417,7 +1410,7 @@ ChaCha20_neon:
rev r5,r5
rev r6,r6
rev r7,r7
-# endif
+#endif
stmia r8,{r0-r7}
add r10,sp,#4*(0)
sub r11,r11,#64*3 @ len-=64*3
@@ -1434,7 +1427,7 @@ ChaCha20_neon:
add sp,sp,#4*(32+4)
vldmia sp,{d8-d15}
add sp,sp,#4*(16+3)
+.Lno_data_neon:
ldmia sp!,{r4-r11,pc}
-.size ChaCha20_neon,.-ChaCha20_neon
-.comm OPENSSL_armcap_P,4,4
+ENDPROC(chacha20_neon)
#endif
diff --git a/lib/zinc/chacha20/chacha20-arm64-cryptogams.S b/lib/zinc/chacha20/chacha20-arm64-cryptogams.S
index 4d029bfdad3a..1ae11a5c5a14 100644
--- a/lib/zinc/chacha20/chacha20-arm64-cryptogams.S
+++ b/lib/zinc/chacha20/chacha20-arm64-cryptogams.S
@@ -1,46 +1,24 @@
/* SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause */
/*
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
* Copyright (C) 2006-2017 CRYPTOGAMS by <appro@openssl.org>. All Rights Reserved.
+ *
+ * This is based in part on Andy Polyakov's implementation from CRYPTOGAMS.
*/
-#include "arm_arch.h"
+#include <linux/linkage.h>
.text
-
-
-
.align 5
.Lsigma:
.quad 0x3320646e61707865,0x6b20657479622d32 // endian-neutral
.Lone:
.long 1,0,0,0
-.LOPENSSL_armcap_P:
-#ifdef __ILP32__
-.long OPENSSL_armcap_P-.
-#else
-.quad OPENSSL_armcap_P-.
-#endif
-.byte 67,104,97,67,104,97,50,48,32,102,111,114,32,65,82,77,118,56,44,32,67,82,89,80,84,79,71,65,77,83,32,98,121,32,60,97,112,112,114,111,64,111,112,101,110,115,115,108,46,111,114,103,62,0
-.align 2
-.globl ChaCha20_ctr32
-.type ChaCha20_ctr32,%function
.align 5
-ChaCha20_ctr32:
+ENTRY(chacha20_arm)
cbz x2,.Labort
- adr x5,.LOPENSSL_armcap_P
- cmp x2,#192
- b.lo .Lshort
-#ifdef __ILP32__
- ldrsw x6,[x5]
-#else
- ldr x6,[x5]
-#endif
- ldr w17,[x6,x5]
- tst w17,#ARMV7_NEON
- b.ne ChaCha20_neon
-.Lshort:
stp x29,x30,[sp,#-96]!
add x29,sp,#0
@@ -56,7 +34,7 @@ ChaCha20_ctr32:
ldp x24,x25,[x3] // load key
ldp x26,x27,[x3,#16]
ldp x28,x30,[x4] // load counter
-#ifdef __ARMEB__
+#ifdef __AARCH64EB__
ror x24,x24,#32
ror x25,x25,#32
ror x26,x26,#32
@@ -217,7 +195,7 @@ ChaCha20_ctr32:
add x20,x20,x21,lsl#32
ldp x19,x21,[x1,#48]
add x1,x1,#64
-#ifdef __ARMEB__
+#ifdef __AARCH64EB__
rev x5,x5
rev x7,x7
rev x9,x9
@@ -273,7 +251,7 @@ ChaCha20_ctr32:
add x15,x15,x16,lsl#32
add x17,x17,x19,lsl#32
add x20,x20,x21,lsl#32
-#ifdef __ARMEB__
+#ifdef __AARCH64EB__
rev x5,x5
rev x7,x7
rev x9,x9
@@ -309,11 +287,13 @@ ChaCha20_ctr32:
ldp x27,x28,[x29,#80]
ldp x29,x30,[sp],#96
ret
-.size ChaCha20_ctr32,.-ChaCha20_ctr32
+ENDPROC(chacha20_arm)
-.type ChaCha20_neon,%function
+#ifdef CONFIG_KERNEL_MODE_NEON
.align 5
-ChaCha20_neon:
+ENTRY(chacha20_neon)
+ cbz x2,.Labort_neon
+
stp x29,x30,[sp,#-96]!
add x29,sp,#0
@@ -336,7 +316,7 @@ ChaCha20_neon:
ldp x28,x30,[x4] // load counter
ld1 {v27.4s},[x4]
ld1 {v31.4s},[x5]
-#ifdef __ARMEB__
+#ifdef __AARCH64EB__
rev64 v24.4s,v24.4s
ror x24,x24,#32
ror x25,x25,#32
@@ -634,7 +614,7 @@ ChaCha20_neon:
add x20,x20,x21,lsl#32
ldp x19,x21,[x1,#48]
add x1,x1,#64
-#ifdef __ARMEB__
+#ifdef __AARCH64EB__
rev x5,x5
rev x7,x7
rev x9,x9
@@ -713,7 +693,7 @@ ChaCha20_neon:
add x20,x20,x21,lsl#32
ldp x19,x21,[x1,#48]
add x1,x1,#64
-#ifdef __ARMEB__
+#ifdef __AARCH64EB__
rev x5,x5
rev x7,x7
rev x9,x9
@@ -803,19 +783,6 @@ ChaCha20_neon:
ldp x27,x28,[x29,#80]
ldp x29,x30,[sp],#96
ret
-.size ChaCha20_neon,.-ChaCha20_neon
-.type ChaCha20_512_neon,%function
-.align 5
-ChaCha20_512_neon:
- stp x29,x30,[sp,#-96]!
- add x29,sp,#0
-
- adr x5,.Lsigma
- stp x19,x20,[sp,#16]
- stp x21,x22,[sp,#32]
- stp x23,x24,[sp,#48]
- stp x25,x26,[sp,#64]
- stp x27,x28,[sp,#80]
.L512_or_more_neon:
sub sp,sp,#128+64
@@ -828,7 +795,7 @@ ChaCha20_512_neon:
ldp x28,x30,[x4] // load counter
ld1 {v27.4s},[x4]
ld1 {v31.4s},[x5]
-#ifdef __ARMEB__
+#ifdef __AARCH64EB__
rev64 v24.4s,v24.4s
ror x24,x24,#32
ror x25,x25,#32
@@ -1341,7 +1308,7 @@ ChaCha20_512_neon:
add x20,x20,x21,lsl#32
ldp x19,x21,[x1,#48]
add x1,x1,#64
-#ifdef __ARMEB__
+#ifdef __AARCH64EB__
rev x5,x5
rev x7,x7
rev x9,x9
@@ -1855,7 +1822,7 @@ ChaCha20_512_neon:
add x1,x1,#64
add v21.4s,v21.4s,v25.4s
-#ifdef __ARMEB__
+#ifdef __AARCH64EB__
rev x5,x5
rev x7,x7
rev x9,x9
@@ -1969,5 +1936,7 @@ ChaCha20_512_neon:
ldp x25,x26,[x29,#64]
ldp x27,x28,[x29,#80]
ldp x29,x30,[sp],#96
+.Labort_neon:
ret
-.size ChaCha20_512_neon,.-ChaCha20_512_neon
+ENDPROC(chacha20_neon)
+#endif
--
2.19.0
* [PATCH net-next v7 09/28] zinc: ChaCha20 ARM and ARM64 implementations
[not found] <20181006025709.4019-1-Jason@zx2c4.com>
` (5 preceding siblings ...)
2018-10-06 2:56 ` [PATCH net-next v7 08/28] zinc: port " Jason A. Donenfeld
@ 2018-10-06 2:56 ` Jason A. Donenfeld
2018-10-06 2:56 ` [PATCH net-next v7 10/28] zinc: ChaCha20 MIPS32r2 implementation Jason A. Donenfeld
` (16 subsequent siblings)
23 siblings, 0 replies; 28+ messages in thread
From: Jason A. Donenfeld @ 2018-10-06 2:56 UTC (permalink / raw)
To: linux-kernel, netdev, davem, gregkh
Cc: Jason A. Donenfeld, Russell King, linux-arm-kernel, Samuel Neves,
Jean-Philippe Aumasson, Andy Lutomirski, Andrew Morton,
Linus Torvalds, kernel-hardening, linux-crypto
These wire Andy Polyakov's implementations up to the kernel for ARMv7 and
ARMv8 NEON, and introduce Eric Biggers' ultra-fast scalar implementation for
CPUs without NEON or for CPUs with slow NEON (Cortex-A5 and Cortex-A7).
This commit does the following:
- Adds the glue code for the assembly implementations.
- Renames the ARMv8 code into place, since it can at this point be
used wholesale.
- Merges Andy Polyakov's ARMv7 NEON code with Eric Biggers' <=ARMv7
scalar code.
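As a rough illustration of what the glue code decides at run time, the
selection policy can be sketched in plain C as below. The names cpu_caps,
pick_impl, blocks_for and walk, and the fixed 4096-byte chunk, are assumptions
made for this sketch and are not part of the patch; the real glue queries
elf_hwcap and read_cpuid_part() and calls into the assembly routines.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 64u		/* ChaCha20 block size in bytes */
#define CHUNK_SIZE 4096u	/* assumed page size, for illustration only */

enum impl { IMPL_SCALAR, IMPL_NEON };

/* Hypothetical summary of the running CPU. */
struct cpu_caps {
	bool has_neon;		/* NEON/ASIMD is advertised */
	bool slow_neon;		/* e.g. Cortex-A5 or Cortex-A7 */
};

/* NEON is chosen only for capable CPUs, inputs of at least three
 * blocks, and contexts in which SIMD may be used at all. */
static enum impl pick_impl(const struct cpu_caps *c, size_t len,
			   bool simd_usable)
{
	if (c->has_neon && !c->slow_neon &&
	    len >= 3 * BLOCK_SIZE && simd_usable)
		return IMPL_NEON;
	return IMPL_SCALAR;
}

/* Number of 64-byte blocks consumed by len bytes, rounding up. */
static uint32_t blocks_for(size_t len)
{
	return (uint32_t)((len + BLOCK_SIZE - 1) / BLOCK_SIZE);
}

/* Walk a buffer of len bytes the way the glue loop does: NEON handles
 * at most one chunk per iteration, scalar handles whatever remains.
 * Returns how far the block counter advances and reports the number
 * of NEON chunks through *neon_chunks. */
static uint32_t walk(const struct cpu_caps *c, size_t len, bool simd_usable,
		     unsigned int *neon_chunks)
{
	uint32_t counter = 0;

	*neon_chunks = 0;
	for (;;) {
		if (pick_impl(c, len, simd_usable) == IMPL_NEON) {
			size_t bytes = len < CHUNK_SIZE ? len : CHUNK_SIZE;

			counter += blocks_for(bytes);
			++*neon_chunks;
			len -= bytes;
			if (!len)
				break;
		} else {
			counter += blocks_for(len);
			break;
		}
	}
	return counter;
}
```

Because every NEON chunk except possibly the last is a whole number of blocks,
the counter advance returned by walk() should match blocks_for() of the total
length regardless of which implementation handled each piece.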
This commit delivers approximately the same or much better performance than
the existing crypto API's code, and has been measured to do so on:
- ARM1176JZF-S [ARMv6]
- Cortex-A7 [ARMv7]
- Cortex-A8 [ARMv7]
- Cortex-A9 [ARMv7]
- Cortex-A17 [ARMv7]
- Cortex-A53 [ARMv8]
- Cortex-A55 [ARMv8]
- Cortex-A73 [ARMv8]
- Cortex-A75 [ARMv8]
Interestingly, Andy Polyakov's scalar code is slower than Eric Biggers',
but is also significantly shorter. This has the advantage that it does
not evict other code from L1 cache -- particularly on ARM11 chips -- and
so in certain circumstances it can actually be faster. However, this was not
found to have an effect on any code existing in the kernel today.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Co-authored-by: Eric Biggers <ebiggers@google.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Samuel Neves <sneves@dei.uc.pt>
Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: kernel-hardening@lists.openwall.com
Cc: linux-crypto@vger.kernel.org
---
Notes:
Eric Biggers' scalar code is brand new, and quite possibly prematurely
added to this commit, and so it may require a bit of revision. In initial
evaluation and fuzzing so far, it seems fine. But we'll be looking at this
a bit more as well.
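One property that this kind of evaluation can check is that the
deferred-rotation quarter-round used by the scalar code agrees with the
textbook quarter-round. The following is a minimal sketch in C, not the test
harness actually used; ror32, qr_ref, qr_lazy and lazy_matches_ref are
hypothetical helper names introduced here for illustration.

```c
#include <stdint.h>

/* Rotate right by n bits; n is reduced modulo 32 so n == 0 is safe. */
static uint32_t ror32(uint32_t x, unsigned int n)
{
	n &= 31;
	return n ? (x >> n) | (x << (32 - n)) : x;
}

/* Textbook quarter-round: every rotate is applied immediately. */
static void qr_ref(uint32_t *a, uint32_t *b, uint32_t *c, uint32_t *d)
{
	*a += *b; *d ^= *a; *d = ror32(*d, 32 - 16);
	*c += *d; *b ^= *c; *b = ror32(*b, 32 - 12);
	*a += *b; *d ^= *a; *d = ror32(*d, 32 - 8);
	*c += *d; *b ^= *c; *b = ror32(*b, 32 - 7);
}

/*
 * Deferred-rotation quarter-round. On entry *b and *d still need to be
 * rotated right by brot and drot bits; on return they still need to be
 * rotated right by 25 and 24 bits respectively.
 */
static void qr_lazy(uint32_t *a, uint32_t *b, uint32_t *c, uint32_t *d,
		    unsigned int brot, unsigned int drot)
{
	*a += ror32(*b, brot); *d = *a ^ ror32(*d, drot); /* d pending: 16 */
	*c += ror32(*d, 16);   *b = *c ^ ror32(*b, brot); /* b pending: 20 */
	*a += ror32(*b, 20);   *d = *a ^ ror32(*d, 16);   /* d pending: 24 */
	*c += ror32(*d, 24);   *b = *c ^ ror32(*b, 20);   /* b pending: 25 */
}

/*
 * Run 'rounds' chained quarter-rounds both ways on the same input and
 * return 1 if the results agree once the pending rotations are applied.
 */
static int lazy_matches_ref(const uint32_t in[4], unsigned int rounds)
{
	uint32_t r[4] = { in[0], in[1], in[2], in[3] };
	uint32_t l[4] = { in[0], in[1], in[2], in[3] };
	unsigned int brot = 0, drot = 0, i;

	for (i = 0; i < rounds; i++) {
		qr_ref(&r[0], &r[1], &r[2], &r[3]);
		qr_lazy(&l[0], &l[1], &l[2], &l[3], brot, drot);
		brot = 25;
		drot = 24;
	}
	return r[0] == l[0] && r[1] == ror32(l[1], brot) &&
	       r[2] == l[2] && r[3] == ror32(l[3], drot);
}
```

For any four input words v, lazy_matches_ref(v, n) is expected to return 1;
a fuzzer can simply feed it random words and round counts.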
lib/zinc/Makefile | 2 +
lib/zinc/chacha20/chacha20-arm-glue.c | 98 ++++
...acha20-arm-cryptogams.S => chacha20-arm.S} | 503 ++++++++++++++++--
...20-arm64-cryptogams.S => chacha20-arm64.S} | 0
lib/zinc/chacha20/chacha20.c | 2 +
5 files changed, 567 insertions(+), 38 deletions(-)
create mode 100644 lib/zinc/chacha20/chacha20-arm-glue.c
rename lib/zinc/chacha20/{chacha20-arm-cryptogams.S => chacha20-arm.S} (71%)
rename lib/zinc/chacha20/{chacha20-arm64-cryptogams.S => chacha20-arm64.S} (100%)
diff --git a/lib/zinc/Makefile b/lib/zinc/Makefile
index 223a0816c918..e47f64e12bbd 100644
--- a/lib/zinc/Makefile
+++ b/lib/zinc/Makefile
@@ -4,4 +4,6 @@ ccflags-$(CONFIG_ZINC_DEBUG) += -DDEBUG
zinc_chacha20-y := chacha20/chacha20.o
zinc_chacha20-$(CONFIG_ZINC_ARCH_X86_64) += chacha20/chacha20-x86_64.o
+zinc_chacha20-$(CONFIG_ZINC_ARCH_ARM) += chacha20/chacha20-arm.o
+zinc_chacha20-$(CONFIG_ZINC_ARCH_ARM64) += chacha20/chacha20-arm64.o
obj-$(CONFIG_ZINC_CHACHA20) += zinc_chacha20.o
diff --git a/lib/zinc/chacha20/chacha20-arm-glue.c b/lib/zinc/chacha20/chacha20-arm-glue.c
new file mode 100644
index 000000000000..a0da95d3b9c4
--- /dev/null
+++ b/lib/zinc/chacha20/chacha20-arm-glue.c
@@ -0,0 +1,98 @@
+// SPDX-License-Identifier: GPL-2.0 OR MIT
+/*
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#include <asm/hwcap.h>
+#include <asm/neon.h>
+#if defined(CONFIG_ZINC_ARCH_ARM)
+#include <asm/system_info.h>
+#include <asm/cputype.h>
+#endif
+
+asmlinkage void chacha20_arm(u8 *out, const u8 *in, const size_t len,
+ const u32 key[8], const u32 counter[4]);
+asmlinkage void hchacha20_arm(const u32 state[16], u32 out[8]);
+asmlinkage void chacha20_neon(u8 *out, const u8 *in, const size_t len,
+ const u32 key[8], const u32 counter[4]);
+
+static bool chacha20_use_neon __ro_after_init;
+static bool *const chacha20_nobs[] __initconst = { &chacha20_use_neon };
+static void __init chacha20_fpu_init(void)
+{
+#if defined(CONFIG_ZINC_ARCH_ARM64)
+ chacha20_use_neon = elf_hwcap & HWCAP_ASIMD;
+#elif defined(CONFIG_ZINC_ARCH_ARM)
+ switch (read_cpuid_part()) {
+ case ARM_CPU_PART_CORTEX_A7:
+ case ARM_CPU_PART_CORTEX_A5:
+ /* The Cortex-A7 and Cortex-A5 do not perform well with the NEON
+ * implementation but do incredibly well with the scalar one and
+ * use less power.
+ */
+ break;
+ default:
+ chacha20_use_neon = elf_hwcap & HWCAP_NEON;
+ }
+#endif
+}
+
+static inline bool chacha20_arch(struct chacha20_ctx *ctx, u8 *dst,
+ const u8 *src, size_t len,
+ simd_context_t *simd_context)
+{
+ /* SIMD disables preemption, so relax after processing each page. */
+ BUILD_BUG_ON(PAGE_SIZE < CHACHA20_BLOCK_SIZE ||
+ PAGE_SIZE % CHACHA20_BLOCK_SIZE);
+
+ for (;;) {
+ if (IS_ENABLED(CONFIG_KERNEL_MODE_NEON) && chacha20_use_neon &&
+ len >= CHACHA20_BLOCK_SIZE * 3 && simd_use(simd_context)) {
+ const size_t bytes = min_t(size_t, len, PAGE_SIZE);
+
+ chacha20_neon(dst, src, bytes, ctx->key, ctx->counter);
+ ctx->counter[0] += (bytes + 63) / 64;
+ len -= bytes;
+ if (!len)
+ break;
+ dst += bytes;
+ src += bytes;
+ simd_relax(simd_context);
+ } else {
+ chacha20_arm(dst, src, len, ctx->key, ctx->counter);
+ ctx->counter[0] += (len + 63) / 64;
+ break;
+ }
+ }
+
+ return true;
+}
+
+static inline bool hchacha20_arch(u32 derived_key[CHACHA20_KEY_WORDS],
+ const u8 nonce[HCHACHA20_NONCE_SIZE],
+ const u8 key[HCHACHA20_KEY_SIZE],
+ simd_context_t *simd_context)
+{
+ if (IS_ENABLED(CONFIG_ZINC_ARCH_ARM)) {
+ u32 x[] = { CHACHA20_CONSTANT_EXPA,
+ CHACHA20_CONSTANT_ND_3,
+ CHACHA20_CONSTANT_2_BY,
+ CHACHA20_CONSTANT_TE_K,
+ get_unaligned_le32(key + 0),
+ get_unaligned_le32(key + 4),
+ get_unaligned_le32(key + 8),
+ get_unaligned_le32(key + 12),
+ get_unaligned_le32(key + 16),
+ get_unaligned_le32(key + 20),
+ get_unaligned_le32(key + 24),
+ get_unaligned_le32(key + 28),
+ get_unaligned_le32(nonce + 0),
+ get_unaligned_le32(nonce + 4),
+ get_unaligned_le32(nonce + 8),
+ get_unaligned_le32(nonce + 12)
+ };
+ hchacha20_arm(x, derived_key);
+ return true;
+ }
+ return false;
+}
diff --git a/lib/zinc/chacha20/chacha20-arm-cryptogams.S b/lib/zinc/chacha20/chacha20-arm.S
similarity index 71%
rename from lib/zinc/chacha20/chacha20-arm-cryptogams.S
rename to lib/zinc/chacha20/chacha20-arm.S
index 770bab469171..79ed18fbcce3 100644
--- a/lib/zinc/chacha20/chacha20-arm-cryptogams.S
+++ b/lib/zinc/chacha20/chacha20-arm.S
@@ -1,12 +1,475 @@
/* SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause */
/*
+ * Copyright (C) 2018 Google, Inc.
* Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
* Copyright (C) 2006-2017 CRYPTOGAMS by <appro@openssl.org>. All Rights Reserved.
- *
- * This is based in part on Andy Polyakov's implementation from CRYPTOGAMS.
*/
#include <linux/linkage.h>
+#include <asm/assembler.h>
+
+/*
+ * The following scalar routine was written by Eric Biggers.
+ *
+ * Design notes:
+ *
+ * 16 registers would be needed to hold the state matrix, but only 14 are
+ * available because 'sp' and 'pc' cannot be used. So we spill the elements
+ * (x8, x9) to the stack and swap them out with (x10, x11). This adds one
+ * 'ldrd' and one 'strd' instruction per round.
+ *
+ * All rotates are performed using the implicit rotate operand accepted by the
+ * 'add' and 'eor' instructions. This is faster than using explicit rotate
+ * instructions. To make this work, we allow the values in the second and last
+ * rows of the ChaCha state matrix (rows 'b' and 'd') to temporarily have the
+ * wrong rotation amount. The rotation amount is then fixed up just in time
+ * when the values are used. 'brot' is the number of bits the values in row 'b'
+ * need to be rotated right to arrive at the correct values, and 'drot'
+ * similarly for row 'd'. (brot, drot) start out as (0, 0) but we make it such
+ * that they end up as (25, 24) after every round.
+ */
+
+ // ChaCha state registers
+ X0 .req r0
+ X1 .req r1
+ X2 .req r2
+ X3 .req r3
+ X4 .req r4
+ X5 .req r5
+ X6 .req r6
+ X7 .req r7
+ X8_X10 .req r8 // shared by x8 and x10
+ X9_X11 .req r9 // shared by x9 and x11
+ X12 .req r10
+ X13 .req r11
+ X14 .req r12
+ X15 .req r14
+
+.Lexpand_32byte_k:
+ // "expand 32-byte k"
+ .word 0x61707865, 0x3320646e, 0x79622d32, 0x6b206574
+
+#ifdef __thumb2__
+# define adrl adr
+#endif
+
+.macro __rev out, in, t0, t1, t2
+.if __LINUX_ARM_ARCH__ >= 6
+ rev \out, \in
+.else
+ lsl \t0, \in, #24
+ and \t1, \in, #0xff00
+ and \t2, \in, #0xff0000
+ orr \out, \t0, \in, lsr #24
+ orr \out, \out, \t1, lsl #8
+ orr \out, \out, \t2, lsr #8
+.endif
+.endm
+
+.macro _le32_bswap x, t0, t1, t2
+#ifdef __ARMEB__
+ __rev \x, \x, \t0, \t1, \t2
+#endif
+.endm
+
+.macro _le32_bswap_4x a, b, c, d, t0, t1, t2
+ _le32_bswap \a, \t0, \t1, \t2
+ _le32_bswap \b, \t0, \t1, \t2
+ _le32_bswap \c, \t0, \t1, \t2
+ _le32_bswap \d, \t0, \t1, \t2
+.endm
+
+.macro __ldrd a, b, src, offset
+#if __LINUX_ARM_ARCH__ >= 6
+ ldrd \a, \b, [\src, #\offset]
+#else
+ ldr \a, [\src, #\offset]
+ ldr \b, [\src, #\offset + 4]
+#endif
+.endm
+
+.macro __strd a, b, dst, offset
+#if __LINUX_ARM_ARCH__ >= 6
+ strd \a, \b, [\dst, #\offset]
+#else
+ str \a, [\dst, #\offset]
+ str \b, [\dst, #\offset + 4]
+#endif
+.endm
+
+.macro _halfround a1, b1, c1, d1, a2, b2, c2, d2
+
+ // a += b; d ^= a; d = rol(d, 16);
+ add \a1, \a1, \b1, ror #brot
+ add \a2, \a2, \b2, ror #brot
+ eor \d1, \a1, \d1, ror #drot
+ eor \d2, \a2, \d2, ror #drot
+ // drot == 32 - 16 == 16
+
+ // c += d; b ^= c; b = rol(b, 12);
+ add \c1, \c1, \d1, ror #16
+ add \c2, \c2, \d2, ror #16
+ eor \b1, \c1, \b1, ror #brot
+ eor \b2, \c2, \b2, ror #brot
+ // brot == 32 - 12 == 20
+
+ // a += b; d ^= a; d = rol(d, 8);
+ add \a1, \a1, \b1, ror #20
+ add \a2, \a2, \b2, ror #20
+ eor \d1, \a1, \d1, ror #16
+ eor \d2, \a2, \d2, ror #16
+ // drot == 32 - 8 == 24
+
+ // c += d; b ^= c; b = rol(b, 7);
+ add \c1, \c1, \d1, ror #24
+ add \c2, \c2, \d2, ror #24
+ eor \b1, \c1, \b1, ror #20
+ eor \b2, \c2, \b2, ror #20
+ // brot == 32 - 7 == 25
+.endm
+
+.macro _doubleround
+
+ // column round
+
+ // quarterrounds: (x0, x4, x8, x12) and (x1, x5, x9, x13)
+ _halfround X0, X4, X8_X10, X12, X1, X5, X9_X11, X13
+
+ // save (x8, x9); restore (x10, x11)
+ __strd X8_X10, X9_X11, sp, 0
+ __ldrd X8_X10, X9_X11, sp, 8
+
+ // quarterrounds: (x2, x6, x10, x14) and (x3, x7, x11, x15)
+ _halfround X2, X6, X8_X10, X14, X3, X7, X9_X11, X15
+
+ .set brot, 25
+ .set drot, 24
+
+ // diagonal round
+
+ // quarterrounds: (x0, x5, x10, x15) and (x1, x6, x11, x12)
+ _halfround X0, X5, X8_X10, X15, X1, X6, X9_X11, X12
+
+ // save (x10, x11); restore (x8, x9)
+ __strd X8_X10, X9_X11, sp, 8
+ __ldrd X8_X10, X9_X11, sp, 0
+
+ // quarterrounds: (x2, x7, x8, x13) and (x3, x4, x9, x14)
+ _halfround X2, X7, X8_X10, X13, X3, X4, X9_X11, X14
+.endm
+
+.macro _chacha_permute nrounds
+ .set brot, 0
+ .set drot, 0
+ .rept \nrounds / 2
+ _doubleround
+ .endr
+.endm
+
+.macro _chacha nrounds
+
+.Lnext_block\@:
+ // Stack: unused0-unused1 x10-x11 x0-x15 OUT IN LEN
+ // Registers contain x0-x9,x12-x15.
+
+ // Do the core ChaCha permutation to update x0-x15.
+ _chacha_permute \nrounds
+
+ add sp, #8
+ // Stack: x10-x11 orig_x0-orig_x15 OUT IN LEN
+ // Registers contain x0-x9,x12-x15.
+ // x4-x7 are rotated by 'brot'; x12-x15 are rotated by 'drot'.
+
+ // Free up some registers (r8-r12,r14) by pushing (x8-x9,x12-x15).
+ push {X8_X10, X9_X11, X12, X13, X14, X15}
+
+ // Load (OUT, IN, LEN).
+ ldr r14, [sp, #96]
+ ldr r12, [sp, #100]
+ ldr r11, [sp, #104]
+
+ orr r10, r14, r12
+
+ // Use slow path if fewer than 64 bytes remain.
+ cmp r11, #64
+ blt .Lxor_slowpath\@
+
+ // Use slow path if IN and/or OUT isn't 4-byte aligned. Needed even on
+ // ARMv6+, since ldmia and stmia (used below) still require alignment.
+ tst r10, #3
+ bne .Lxor_slowpath\@
+
+ // Fast path: XOR 64 bytes of aligned data.
+
+ // Stack: x8-x9 x12-x15 x10-x11 orig_x0-orig_x15 OUT IN LEN
+ // Registers: r0-r7 are x0-x7; r8-r11 are free; r12 is IN; r14 is OUT.
+ // x4-x7 are rotated by 'brot'; x12-x15 are rotated by 'drot'.
+
+ // x0-x3
+ __ldrd r8, r9, sp, 32
+ __ldrd r10, r11, sp, 40
+ add X0, X0, r8
+ add X1, X1, r9
+ add X2, X2, r10
+ add X3, X3, r11
+ _le32_bswap_4x X0, X1, X2, X3, r8, r9, r10
+ ldmia r12!, {r8-r11}
+ eor X0, X0, r8
+ eor X1, X1, r9
+ eor X2, X2, r10
+ eor X3, X3, r11
+ stmia r14!, {X0-X3}
+
+ // x4-x7
+ __ldrd r8, r9, sp, 48
+ __ldrd r10, r11, sp, 56
+ add X4, r8, X4, ror #brot
+ add X5, r9, X5, ror #brot
+ ldmia r12!, {X0-X3}
+ add X6, r10, X6, ror #brot
+ add X7, r11, X7, ror #brot
+ _le32_bswap_4x X4, X5, X6, X7, r8, r9, r10
+ eor X4, X4, X0
+ eor X5, X5, X1
+ eor X6, X6, X2
+ eor X7, X7, X3
+ stmia r14!, {X4-X7}
+
+ // x8-x15
+ pop {r0-r7} // (x8-x9,x12-x15,x10-x11)
+ __ldrd r8, r9, sp, 32
+ __ldrd r10, r11, sp, 40
+ add r0, r0, r8 // x8
+ add r1, r1, r9 // x9
+ add r6, r6, r10 // x10
+ add r7, r7, r11 // x11
+ _le32_bswap_4x r0, r1, r6, r7, r8, r9, r10
+ ldmia r12!, {r8-r11}
+ eor r0, r0, r8 // x8
+ eor r1, r1, r9 // x9
+ eor r6, r6, r10 // x10
+ eor r7, r7, r11 // x11
+ stmia r14!, {r0,r1,r6,r7}
+ ldmia r12!, {r0,r1,r6,r7}
+ __ldrd r8, r9, sp, 48
+ __ldrd r10, r11, sp, 56
+ add r2, r8, r2, ror #drot // x12
+ add r3, r9, r3, ror #drot // x13
+ add r4, r10, r4, ror #drot // x14
+ add r5, r11, r5, ror #drot // x15
+ _le32_bswap_4x r2, r3, r4, r5, r9, r10, r11
+ ldr r9, [sp, #72] // load LEN
+ eor r2, r2, r0 // x12
+ eor r3, r3, r1 // x13
+ eor r4, r4, r6 // x14
+ eor r5, r5, r7 // x15
+ subs r9, #64 // decrement and check LEN
+ stmia r14!, {r2-r5}
+
+ beq .Ldone\@
+
+.Lprepare_for_next_block\@:
+
+ // Stack: x0-x15 OUT IN LEN
+
+ // Increment block counter (x12)
+ add r8, #1
+
+ // Store updated (OUT, IN, LEN)
+ str r14, [sp, #64]
+ str r12, [sp, #68]
+ str r9, [sp, #72]
+
+ mov r14, sp
+
+ // Store updated block counter (x12)
+ str r8, [sp, #48]
+
+ sub sp, #16
+
+ // Reload state and do next block
+ ldmia r14!, {r0-r11} // load x0-x11
+ __strd r10, r11, sp, 8 // store x10-x11 before state
+ ldmia r14, {r10-r12,r14} // load x12-x15
+ b .Lnext_block\@
+
+.Lxor_slowpath\@:
+ // Slow path: < 64 bytes remaining, or unaligned input or output buffer.
+ // We handle it by storing the 64 bytes of keystream to the stack, then
+ // XOR-ing the needed portion with the data.
+
+ // Allocate keystream buffer
+ sub sp, #64
+ mov r14, sp
+
+ // Stack: ks0-ks15 x8-x9 x12-x15 x10-x11 orig_x0-orig_x15 OUT IN LEN
+ // Registers: r0-r7 are x0-x7; r8-r11 are free; r12 is IN; r14 is &ks0.
+ // x4-x7 are rotated by 'brot'; x12-x15 are rotated by 'drot'.
+
+ // Save keystream for x0-x3
+ __ldrd r8, r9, sp, 96
+ __ldrd r10, r11, sp, 104
+ add X0, X0, r8
+ add X1, X1, r9
+ add X2, X2, r10
+ add X3, X3, r11
+ _le32_bswap_4x X0, X1, X2, X3, r8, r9, r10
+ stmia r14!, {X0-X3}
+
+ // Save keystream for x4-x7
+ __ldrd r8, r9, sp, 112
+ __ldrd r10, r11, sp, 120
+ add X4, r8, X4, ror #brot
+ add X5, r9, X5, ror #brot
+ add X6, r10, X6, ror #brot
+ add X7, r11, X7, ror #brot
+ _le32_bswap_4x X4, X5, X6, X7, r8, r9, r10
+ add r8, sp, #64
+ stmia r14!, {X4-X7}
+
+ // Save keystream for x8-x15
+ ldm r8, {r0-r7} // (x8-x9,x12-x15,x10-x11)
+ __ldrd r8, r9, sp, 128
+ __ldrd r10, r11, sp, 136
+ add r0, r0, r8 // x8
+ add r1, r1, r9 // x9
+ add r6, r6, r10 // x10
+ add r7, r7, r11 // x11
+ _le32_bswap_4x r0, r1, r6, r7, r8, r9, r10
+ stmia r14!, {r0,r1,r6,r7}
+ __ldrd r8, r9, sp, 144
+ __ldrd r10, r11, sp, 152
+ add r2, r8, r2, ror #drot // x12
+ add r3, r9, r3, ror #drot // x13
+ add r4, r10, r4, ror #drot // x14
+ add r5, r11, r5, ror #drot // x15
+ _le32_bswap_4x r2, r3, r4, r5, r9, r10, r11
+ stmia r14, {r2-r5}
+
+ // Stack: ks0-ks15 unused0-unused7 x0-x15 OUT IN LEN
+ // Registers: r8 is block counter, r12 is IN.
+
+ ldr r9, [sp, #168] // LEN
+ ldr r14, [sp, #160] // OUT
+ cmp r9, #64
+ mov r0, sp
+ movle r1, r9
+ movgt r1, #64
+ // r1 is number of bytes to XOR, in range [1, 64]
+
+.if __LINUX_ARM_ARCH__ < 6
+ orr r2, r12, r14
+ tst r2, #3 // IN or OUT misaligned?
+ bne .Lxor_next_byte\@
+.endif
+
+ // XOR a word at a time
+.rept 16
+ subs r1, #4
+ blt .Lxor_words_done\@
+ ldr r2, [r12], #4
+ ldr r3, [r0], #4
+ eor r2, r2, r3
+ str r2, [r14], #4
+.endr
+ b .Lxor_slowpath_done\@
+.Lxor_words_done\@:
+ ands r1, r1, #3
+ beq .Lxor_slowpath_done\@
+
+ // XOR a byte at a time
+.Lxor_next_byte\@:
+ ldrb r2, [r12], #1
+ ldrb r3, [r0], #1
+ eor r2, r2, r3
+ strb r2, [r14], #1
+ subs r1, #1
+ bne .Lxor_next_byte\@
+
+.Lxor_slowpath_done\@:
+ subs r9, #64
+ add sp, #96
+ bgt .Lprepare_for_next_block\@
+
+.Ldone\@:
+.endm // _chacha
+
+/*
+ * void chacha20_arm(u8 *out, const u8 *in, size_t len, const u32 key[8],
+ * const u32 iv[4]);
+ */
+ENTRY(chacha20_arm)
+ cmp r2, #0 // len == 0?
+ reteq lr
+
+ push {r0-r2,r4-r11,lr}
+
+ // Push state x0-x15 onto stack.
+ // Also store an extra copy of x10-x11 just before the state.
+
+ ldr r4, [sp, #48] // iv
+ mov r0, sp
+ sub sp, #80
+
+ // iv: x12-x15
+ ldm r4, {X12,X13,X14,X15}
+ stmdb r0!, {X12,X13,X14,X15}
+
+ // key: x4-x11
+ __ldrd X8_X10, X9_X11, r3, 24
+ __strd X8_X10, X9_X11, sp, 8
+ stmdb r0!, {X8_X10, X9_X11}
+ ldm r3, {X4-X9_X11}
+ stmdb r0!, {X4-X9_X11}
+
+ // constants: x0-x3
+ adrl X3, .Lexpand_32byte_k
+ ldm X3, {X0-X3}
+ __strd X0, X1, sp, 16
+ __strd X2, X3, sp, 24
+
+ _chacha 20
+
+ add sp, #76
+ pop {r4-r11, pc}
+ENDPROC(chacha20_arm)
+
+/*
+ * void hchacha20_arm(const u32 state[16], u32 out[8]);
+ */
+ENTRY(hchacha20_arm)
+ push {r1,r4-r11,lr}
+
+ mov r14, r0
+ ldmia r14!, {r0-r11} // load x0-x11
+ push {r10-r11} // store x10-x11 to stack
+ ldm r14, {r10-r12,r14} // load x12-x15
+ sub sp, #8
+
+ _chacha_permute 20
+
+ // Skip over (unused0-unused1, x10-x11)
+ add sp, #16
+
+ // Fix up rotations of x12-x15
+ ror X12, X12, #drot
+ ror X13, X13, #drot
+ pop {r4} // load 'out'
+ ror X14, X14, #drot
+ ror X15, X15, #drot
+
+ // Store (x0-x3,x12-x15) to 'out'
+ stm r4, {X0,X1,X2,X3,X12,X13,X14,X15}
+
+ pop {r4-r11,pc}
+ENDPROC(hchacha20_arm)
+
+#ifdef CONFIG_KERNEL_MODE_NEON
+/*
+ * The following NEON routine was ported from Andy Polyakov's implementation
+ * from CRYPTOGAMS. It begins with parts of the CRYPTOGAMS scalar routine,
+ * since certain NEON code paths actually branch to it.
+ */
.text
#if defined(__thumb2__) || defined(__clang__)
@@ -22,39 +485,6 @@
#define ldrhsb ldrbhs
#endif
-.align 5
-.Lsigma:
-.long 0x61707865,0x3320646e,0x79622d32,0x6b206574 @ endian-neutral
-.Lone:
-.long 1,0,0,0
-.word -1
-
-.align 5
-ENTRY(chacha20_arm)
- ldr r12,[sp,#0] @ pull pointer to counter and nonce
- stmdb sp!,{r0-r2,r4-r11,lr}
- cmp r2,#0 @ len==0?
-#ifdef __thumb2__
- itt eq
-#endif
- addeq sp,sp,#4*3
- beq .Lno_data_arm
- ldmia r12,{r4-r7} @ load counter and nonce
- sub sp,sp,#4*(16) @ off-load area
-#if __LINUX_ARM_ARCH__ < 7 && !defined(__thumb2__)
- sub r14,pc,#100 @ .Lsigma
-#else
- adr r14,.Lsigma @ .Lsigma
-#endif
- stmdb sp!,{r4-r7} @ copy counter and nonce
- ldmia r3,{r4-r11} @ load key
- ldmia r14,{r0-r3} @ load sigma
- stmdb sp!,{r4-r11} @ copy key
- stmdb sp!,{r0-r3} @ copy sigma
- str r10,[sp,#4*(16+10)] @ off-load "rx"
- str r11,[sp,#4*(16+11)] @ off-load "rx"
- b .Loop_outer_enter
-
.align 4
.Loop_outer:
ldmia sp,{r0-r9} @ load key material
@@ -748,11 +1178,8 @@ ENTRY(chacha20_arm)
.Ldone:
add sp,sp,#4*(32+3)
-.Lno_data_arm:
ldmia sp!,{r4-r11,pc}
-ENDPROC(chacha20_arm)
-#ifdef CONFIG_KERNEL_MODE_NEON
.align 5
.Lsigma2:
.long 0x61707865,0x3320646e,0x79622d32,0x6b206574 @ endian-neutral
diff --git a/lib/zinc/chacha20/chacha20-arm64-cryptogams.S b/lib/zinc/chacha20/chacha20-arm64.S
similarity index 100%
rename from lib/zinc/chacha20/chacha20-arm64-cryptogams.S
rename to lib/zinc/chacha20/chacha20-arm64.S
diff --git a/lib/zinc/chacha20/chacha20.c b/lib/zinc/chacha20/chacha20.c
index 22a21431c221..3698fcd8ae7f 100644
--- a/lib/zinc/chacha20/chacha20.c
+++ b/lib/zinc/chacha20/chacha20.c
@@ -18,6 +18,8 @@
#if defined(CONFIG_ZINC_ARCH_X86_64)
#include "chacha20-x86_64-glue.c"
+#elif defined(CONFIG_ZINC_ARCH_ARM) || defined(CONFIG_ZINC_ARCH_ARM64)
+#include "chacha20-arm-glue.c"
#else
static bool *const chacha20_nobs[] __initconst = { };
static void __init chacha20_fpu_init(void)
--
2.19.0
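
For readers skimming the chacha20.c hunk above: exactly one glue file is
pulled in at compile time, and the glue hooks in this series return a bool
that says whether the arch code handled the request. The following is only
a minimal user-space sketch of that fallback pattern. All demo_* names are
hypothetical and the XOR "cipher" is a placeholder, not anything from Zinc.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an arch glue hook: reports "not handled". */
static inline bool demo_xor_arch(uint8_t *dst, const uint8_t *src, size_t len)
{
	(void)dst;
	(void)src;
	(void)len;
	return false;
}

/* Hypothetical portable fallback; XOR with a fixed byte is a placeholder. */
static void demo_xor_generic(uint8_t *dst, const uint8_t *src, size_t len)
{
	for (size_t i = 0; i < len; i++)
		dst[i] = src[i] ^ 0x5a;
}

/* Entry point: try the arch path first, fall back if it declines. */
static void demo_xor(uint8_t *dst, const uint8_t *src, size_t len)
{
	if (!demo_xor_arch(dst, src, len))
		demo_xor_generic(dst, src, len);
}
```

Because the stub is a static inline that returns a constant, a compiler can
drop the branch entirely when no arch implementation is configured.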
* [PATCH net-next v7 10/28] zinc: ChaCha20 MIPS32r2 implementation
[not found] <20181006025709.4019-1-Jason@zx2c4.com>
` (6 preceding siblings ...)
2018-10-06 2:56 ` [PATCH net-next v7 09/28] zinc: " Jason A. Donenfeld
@ 2018-10-06 2:56 ` Jason A. Donenfeld
2018-10-06 2:56 ` [PATCH net-next v7 11/28] zinc: Poly1305 generic C implementations and selftest Jason A. Donenfeld
` (15 subsequent siblings)
23 siblings, 0 replies; 28+ messages in thread
From: Jason A. Donenfeld @ 2018-10-06 2:56 UTC (permalink / raw)
To: linux-kernel, netdev, davem, gregkh
Cc: Jason A. Donenfeld, René van Dorst, Ralf Baechle,
Paul Burton, James Hogan, linux-mips, Samuel Neves,
Jean-Philippe Aumasson, Andy Lutomirski, Andrew Morton,
Linus Torvalds, kernel-hardening, linux-crypto
This MIPS32r2 implementation comes from René van Dorst and me and
results in a nice speedup on the usual OpenWRT targets.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: René van Dorst <opensource@vdorst.com>
Co-developed-by: René van Dorst <opensource@vdorst.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@mips.com>
Cc: James Hogan <jhogan@kernel.org>
Cc: linux-mips@linux-mips.org
Cc: Samuel Neves <sneves@dei.uc.pt>
Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: kernel-hardening@lists.openwall.com
Cc: linux-crypto@vger.kernel.org
---
lib/zinc/Makefile | 2 +
lib/zinc/chacha20/chacha20-mips-glue.c | 28 ++
lib/zinc/chacha20/chacha20-mips.S | 424 +++++++++++++++++++++++++
lib/zinc/chacha20/chacha20.c | 2 +
4 files changed, 456 insertions(+)
create mode 100644 lib/zinc/chacha20/chacha20-mips-glue.c
create mode 100644 lib/zinc/chacha20/chacha20-mips.S
diff --git a/lib/zinc/Makefile b/lib/zinc/Makefile
index e47f64e12bbd..60d568cf5206 100644
--- a/lib/zinc/Makefile
+++ b/lib/zinc/Makefile
@@ -6,4 +6,6 @@ zinc_chacha20-y := chacha20/chacha20.o
zinc_chacha20-$(CONFIG_ZINC_ARCH_X86_64) += chacha20/chacha20-x86_64.o
zinc_chacha20-$(CONFIG_ZINC_ARCH_ARM) += chacha20/chacha20-arm.o
zinc_chacha20-$(CONFIG_ZINC_ARCH_ARM64) += chacha20/chacha20-arm64.o
+zinc_chacha20-$(CONFIG_ZINC_ARCH_MIPS) += chacha20/chacha20-mips.o
+AFLAGS_chacha20-mips.o += -O2 # This is required to fill the branch delay slots
obj-$(CONFIG_ZINC_CHACHA20) += zinc_chacha20.o
diff --git a/lib/zinc/chacha20/chacha20-mips-glue.c b/lib/zinc/chacha20/chacha20-mips-glue.c
new file mode 100644
index 000000000000..917d8fa8e3f4
--- /dev/null
+++ b/lib/zinc/chacha20/chacha20-mips-glue.c
@@ -0,0 +1,28 @@
+// SPDX-License-Identifier: GPL-2.0 OR MIT
+/*
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+asmlinkage void chacha20_mips(u32 state[16], u8 *out, const u8 *in,
+ const size_t len);
+static bool *const chacha20_nobs[] __initconst = { };
+static void __init chacha20_fpu_init(void)
+{
+}
+
+static inline bool chacha20_arch(struct chacha20_ctx *ctx, u8 *dst,
+ const u8 *src, size_t len,
+ simd_context_t *simd_context)
+{
+ chacha20_mips(ctx->state, dst, src, len);
+ return true;
+}
+
+
+static inline bool hchacha20_arch(u32 derived_key[CHACHA20_KEY_WORDS],
+ const u8 nonce[HCHACHA20_NONCE_SIZE],
+ const u8 key[HCHACHA20_KEY_SIZE],
+ simd_context_t *simd_context)
+{
+ return false;
+}
diff --git a/lib/zinc/chacha20/chacha20-mips.S b/lib/zinc/chacha20/chacha20-mips.S
new file mode 100644
index 000000000000..031ee5e794df
--- /dev/null
+++ b/lib/zinc/chacha20/chacha20-mips.S
@@ -0,0 +1,424 @@
+/* SPDX-License-Identifier: GPL-2.0 OR MIT */
+/*
+ * Copyright (C) 2016-2018 René van Dorst <opensource@vdorst.com>. All Rights Reserved.
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#define MASK_U32 0x3c
+#define CHACHA20_BLOCK_SIZE 64
+#define STACK_SIZE 32
+
+#define X0 $t0
+#define X1 $t1
+#define X2 $t2
+#define X3 $t3
+#define X4 $t4
+#define X5 $t5
+#define X6 $t6
+#define X7 $t7
+#define X8 $t8
+#define X9 $t9
+#define X10 $v1
+#define X11 $s6
+#define X12 $s5
+#define X13 $s4
+#define X14 $s3
+#define X15 $s2
+/* Use regs which are overwritten on exit for Tx so we don't leak clear data. */
+#define T0 $s1
+#define T1 $s0
+#define T(n) T ## n
+#define X(n) X ## n
+
+/* Input arguments */
+#define STATE $a0
+#define OUT $a1
+#define IN $a2
+#define BYTES $a3
+
+/* Output argument */
+/* NONCE[0] is kept in a register and not in memory.
+ * We don't want to touch original value in memory.
+ * Must be incremented every loop iteration.
+ */
+#define NONCE_0 $v0
+
+/* SAVED_X and SAVED_CA are set in the jump table.
+ * Use regs which are overwritten on exit so we don't leak clear data.
+ * They are used to handle the last bytes, which are not a multiple of 4.
+ */
+#define SAVED_X X15
+#define SAVED_CA $s7
+
+#define IS_UNALIGNED $s7
+
+#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+#define MSB 0
+#define LSB 3
+#define ROTx rotl
+#define ROTR(n) rotr n, 24
+#define CPU_TO_LE32(n) \
+ wsbh n; \
+ rotr n, 16;
+#else
+#define MSB 3
+#define LSB 0
+#define ROTx rotr
+#define CPU_TO_LE32(n)
+#define ROTR(n)
+#endif
+
+#define FOR_EACH_WORD(x) \
+ x( 0); \
+ x( 1); \
+ x( 2); \
+ x( 3); \
+ x( 4); \
+ x( 5); \
+ x( 6); \
+ x( 7); \
+ x( 8); \
+ x( 9); \
+ x(10); \
+ x(11); \
+ x(12); \
+ x(13); \
+ x(14); \
+ x(15);
+
+#define FOR_EACH_WORD_REV(x) \
+ x(15); \
+ x(14); \
+ x(13); \
+ x(12); \
+ x(11); \
+ x(10); \
+ x( 9); \
+ x( 8); \
+ x( 7); \
+ x( 6); \
+ x( 5); \
+ x( 4); \
+ x( 3); \
+ x( 2); \
+ x( 1); \
+ x( 0);
+
+#define PLUS_ONE_0 1
+#define PLUS_ONE_1 2
+#define PLUS_ONE_2 3
+#define PLUS_ONE_3 4
+#define PLUS_ONE_4 5
+#define PLUS_ONE_5 6
+#define PLUS_ONE_6 7
+#define PLUS_ONE_7 8
+#define PLUS_ONE_8 9
+#define PLUS_ONE_9 10
+#define PLUS_ONE_10 11
+#define PLUS_ONE_11 12
+#define PLUS_ONE_12 13
+#define PLUS_ONE_13 14
+#define PLUS_ONE_14 15
+#define PLUS_ONE_15 16
+#define PLUS_ONE(x) PLUS_ONE_ ## x
+#define _CONCAT3(a,b,c) a ## b ## c
+#define CONCAT3(a,b,c) _CONCAT3(a,b,c)
+
+#define STORE_UNALIGNED(x) \
+CONCAT3(.Lchacha20_mips_xor_unaligned_, PLUS_ONE(x), _b: ;) \
+ .if (x != 12); \
+ lw T0, (x*4)(STATE); \
+ .endif; \
+ lwl T1, (x*4)+MSB ## (IN); \
+ lwr T1, (x*4)+LSB ## (IN); \
+ .if (x == 12); \
+ addu X ## x, NONCE_0; \
+ .else; \
+ addu X ## x, T0; \
+ .endif; \
+ CPU_TO_LE32(X ## x); \
+ xor X ## x, T1; \
+ swl X ## x, (x*4)+MSB ## (OUT); \
+ swr X ## x, (x*4)+LSB ## (OUT);
+
+#define STORE_ALIGNED(x) \
+CONCAT3(.Lchacha20_mips_xor_aligned_, PLUS_ONE(x), _b: ;) \
+ .if (x != 12); \
+ lw T0, (x*4)(STATE); \
+ .endif; \
+ lw T1, (x*4) ## (IN); \
+ .if (x == 12); \
+ addu X ## x, NONCE_0; \
+ .else; \
+ addu X ## x, T0; \
+ .endif; \
+ CPU_TO_LE32(X ## x); \
+ xor X ## x, T1; \
+ sw X ## x, (x*4) ## (OUT);
+
+/* Jump table macro.
+ * Used for setup and for handling the last bytes, which are not a multiple of 4.
+ * X15 is free to store Xn
+ * Every jumptable entry must be equal in size.
+ */
+#define JMPTBL_ALIGNED(x) \
+.Lchacha20_mips_jmptbl_aligned_ ## x: ; \
+ .set noreorder; \
+ b .Lchacha20_mips_xor_aligned_ ## x ## _b; \
+ .if (x == 12); \
+ addu SAVED_X, X ## x, NONCE_0; \
+ .else; \
+ addu SAVED_X, X ## x, SAVED_CA; \
+ .endif; \
+ .set reorder
+
+#define JMPTBL_UNALIGNED(x) \
+.Lchacha20_mips_jmptbl_unaligned_ ## x: ; \
+ .set noreorder; \
+ b .Lchacha20_mips_xor_unaligned_ ## x ## _b; \
+ .if (x == 12); \
+ addu SAVED_X, X ## x, NONCE_0; \
+ .else; \
+ addu SAVED_X, X ## x, SAVED_CA; \
+ .endif; \
+ .set reorder
+
+#define AXR(A, B, C, D, K, L, M, N, V, W, Y, Z, S) \
+ addu X(A), X(K); \
+ addu X(B), X(L); \
+ addu X(C), X(M); \
+ addu X(D), X(N); \
+ xor X(V), X(A); \
+ xor X(W), X(B); \
+ xor X(Y), X(C); \
+ xor X(Z), X(D); \
+ rotl X(V), S; \
+ rotl X(W), S; \
+ rotl X(Y), S; \
+ rotl X(Z), S;
+
+.text
+.set reorder
+.set noat
+.globl chacha20_mips
+.ent chacha20_mips
+chacha20_mips:
+ .frame $sp, STACK_SIZE, $ra
+
+ addiu $sp, -STACK_SIZE
+
+ /* Return early if BYTES == 0. */
+ beqz BYTES, .Lchacha20_mips_end
+
+ lw NONCE_0, 48(STATE)
+
+ /* Save s0-s7 */
+ sw $s0, 0($sp)
+ sw $s1, 4($sp)
+ sw $s2, 8($sp)
+ sw $s3, 12($sp)
+ sw $s4, 16($sp)
+ sw $s5, 20($sp)
+ sw $s6, 24($sp)
+ sw $s7, 28($sp)
+
+ /* Test whether IN or OUT is unaligned.
+ * IS_UNALIGNED = ( IN | OUT ) & 0x00000003
+ */
+ or IS_UNALIGNED, IN, OUT
+ andi IS_UNALIGNED, 0x3
+
+ /* Set number of rounds */
+ li $at, 20
+
+ b .Lchacha20_rounds_start
+
+.align 4
+.Loop_chacha20_rounds:
+ addiu IN, CHACHA20_BLOCK_SIZE
+ addiu OUT, CHACHA20_BLOCK_SIZE
+ addiu NONCE_0, 1
+
+.Lchacha20_rounds_start:
+ lw X0, 0(STATE)
+ lw X1, 4(STATE)
+ lw X2, 8(STATE)
+ lw X3, 12(STATE)
+
+ lw X4, 16(STATE)
+ lw X5, 20(STATE)
+ lw X6, 24(STATE)
+ lw X7, 28(STATE)
+ lw X8, 32(STATE)
+ lw X9, 36(STATE)
+ lw X10, 40(STATE)
+ lw X11, 44(STATE)
+
+ move X12, NONCE_0
+ lw X13, 52(STATE)
+ lw X14, 56(STATE)
+ lw X15, 60(STATE)
+
+.Loop_chacha20_xor_rounds:
+ addiu $at, -2
+ AXR( 0, 1, 2, 3, 4, 5, 6, 7, 12,13,14,15, 16);
+ AXR( 8, 9,10,11, 12,13,14,15, 4, 5, 6, 7, 12);
+ AXR( 0, 1, 2, 3, 4, 5, 6, 7, 12,13,14,15, 8);
+ AXR( 8, 9,10,11, 12,13,14,15, 4, 5, 6, 7, 7);
+ AXR( 0, 1, 2, 3, 5, 6, 7, 4, 15,12,13,14, 16);
+ AXR(10,11, 8, 9, 15,12,13,14, 5, 6, 7, 4, 12);
+ AXR( 0, 1, 2, 3, 5, 6, 7, 4, 15,12,13,14, 8);
+ AXR(10,11, 8, 9, 15,12,13,14, 5, 6, 7, 4, 7);
+ bnez $at, .Loop_chacha20_xor_rounds
+
+ addiu BYTES, -(CHACHA20_BLOCK_SIZE)
+
+ /* Is data src/dst unaligned? Jump */
+ bnez IS_UNALIGNED, .Loop_chacha20_unaligned
+
+ /* Set number of rounds here to fill the delay slot. */
+ li $at, 20
+
+ /* BYTES < 0, it has no full block. */
+ bltz BYTES, .Lchacha20_mips_no_full_block_aligned
+
+ FOR_EACH_WORD_REV(STORE_ALIGNED)
+
+ /* BYTES > 0? Loop again. */
+ bgtz BYTES, .Loop_chacha20_rounds
+
+ /* Place this here to fill delay slot */
+ addiu NONCE_0, 1
+
+ /* BYTES < 0? Handle last bytes */
+ bltz BYTES, .Lchacha20_mips_xor_bytes
+
+.Lchacha20_mips_xor_done:
+ /* Restore used registers */
+ lw $s0, 0($sp)
+ lw $s1, 4($sp)
+ lw $s2, 8($sp)
+ lw $s3, 12($sp)
+ lw $s4, 16($sp)
+ lw $s5, 20($sp)
+ lw $s6, 24($sp)
+ lw $s7, 28($sp)
+
+ /* Write NONCE_0 back to right location in state */
+ sw NONCE_0, 48(STATE)
+
+.Lchacha20_mips_end:
+ addiu $sp, STACK_SIZE
+ jr $ra
+
+.Lchacha20_mips_no_full_block_aligned:
+ /* Restore the offset on BYTES */
+ addiu BYTES, CHACHA20_BLOCK_SIZE
+
+ /* Get number of full WORDS */
+ andi $at, BYTES, MASK_U32
+
+ /* Load upper half of jump table addr */
+ lui T0, %hi(.Lchacha20_mips_jmptbl_aligned_0)
+
+ /* Calculate lower half jump table offset */
+ ins T0, $at, 1, 6
+
+ /* Add offset to STATE */
+ addu T1, STATE, $at
+
+ /* Add lower half jump table addr */
+ addiu T0, %lo(.Lchacha20_mips_jmptbl_aligned_0)
+
+ /* Read value from STATE */
+ lw SAVED_CA, 0(T1)
+
+ /* Store remaining bytecounter as negative value */
+ subu BYTES, $at, BYTES
+
+ jr T0
+
+ /* Jump table */
+ FOR_EACH_WORD(JMPTBL_ALIGNED)
+
+
+.Loop_chacha20_unaligned:
+ /* Set number of rounds here to fill the delay slot. */
+ li $at, 20
+
+ /* BYTES < 0, it has no full block. */
+ bltz BYTES, .Lchacha20_mips_no_full_block_unaligned
+
+ FOR_EACH_WORD_REV(STORE_UNALIGNED)
+
+ /* BYTES > 0? Loop again. */
+ bgtz BYTES, .Loop_chacha20_rounds
+
+ /* Write NONCE_0 back to right location in state */
+ sw NONCE_0, 48(STATE)
+
+ .set noreorder
+ /* Fall through to byte handling */
+ bgez BYTES, .Lchacha20_mips_xor_done
+.Lchacha20_mips_xor_unaligned_0_b:
+.Lchacha20_mips_xor_aligned_0_b:
+ /* Place this here to fill delay slot */
+ addiu NONCE_0, 1
+ .set reorder
+
+.Lchacha20_mips_xor_bytes:
+ addu IN, $at
+ addu OUT, $at
+ /* First byte */
+ lbu T1, 0(IN)
+ addiu $at, BYTES, 1
+ CPU_TO_LE32(SAVED_X)
+ ROTR(SAVED_X)
+ xor T1, SAVED_X
+ sb T1, 0(OUT)
+ beqz $at, .Lchacha20_mips_xor_done
+ /* Second byte */
+ lbu T1, 1(IN)
+ addiu $at, BYTES, 2
+ ROTx SAVED_X, 8
+ xor T1, SAVED_X
+ sb T1, 1(OUT)
+ beqz $at, .Lchacha20_mips_xor_done
+ /* Third byte */
+ lbu T1, 2(IN)
+ ROTx SAVED_X, 8
+ xor T1, SAVED_X
+ sb T1, 2(OUT)
+ b .Lchacha20_mips_xor_done
+
+.Lchacha20_mips_no_full_block_unaligned:
+ /* Restore the offset on BYTES */
+ addiu BYTES, CHACHA20_BLOCK_SIZE
+
+ /* Get number of full WORDS */
+ andi $at, BYTES, MASK_U32
+
+ /* Load upper half of jump table addr */
+ lui T0, %hi(.Lchacha20_mips_jmptbl_unaligned_0)
+
+ /* Calculate lower half jump table offset */
+ ins T0, $at, 1, 6
+
+ /* Add offset to STATE */
+ addu T1, STATE, $at
+
+ /* Add lower half jump table addr */
+ addiu T0, %lo(.Lchacha20_mips_jmptbl_unaligned_0)
+
+ /* Read value from STATE */
+ lw SAVED_CA, 0(T1)
+
+ /* Store remaining bytecounter as negative value */
+ subu BYTES, $at, BYTES
+
+ jr T0
+
+ /* Jump table */
+ FOR_EACH_WORD(JMPTBL_UNALIGNED)
+.end chacha20_mips
+.set at
diff --git a/lib/zinc/chacha20/chacha20.c b/lib/zinc/chacha20/chacha20.c
index 3698fcd8ae7f..0b833310a7d8 100644
--- a/lib/zinc/chacha20/chacha20.c
+++ b/lib/zinc/chacha20/chacha20.c
@@ -20,6 +20,8 @@
#include "chacha20-x86_64-glue.c"
#elif defined(CONFIG_ZINC_ARCH_ARM) || defined(CONFIG_ZINC_ARCH_ARM64)
#include "chacha20-arm-glue.c"
+#elif defined(CONFIG_ZINC_ARCH_MIPS)
+#include "chacha20-mips-glue.c"
#else
static bool *const chacha20_nobs[] __initconst = { };
static void __init chacha20_fpu_init(void)
--
2.19.0
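
A note on the AXR macro in chacha20-mips.S above: each invocation performs
one add/xor/rotate step on four independent lanes, so eight invocations make
up one ChaCha double round (column round, then diagonal round). The sketch
below is a plain C model of that grouping, checked against the usual
quarter-round formulation from RFC 7539. The function and table names are
made up for illustration and are not part of the patch.

```c
#include <stdint.h>

static uint32_t rotl32(uint32_t v, unsigned int s)
{
	return (v << s) | (v >> (32 - s));
}

/* Reference quarter round, as in RFC 7539 section 2.1. */
static void quarter_round(uint32_t x[16], int a, int b, int c, int d)
{
	x[a] += x[b]; x[d] = rotl32(x[d] ^ x[a], 16);
	x[c] += x[d]; x[b] = rotl32(x[b] ^ x[c], 12);
	x[a] += x[b]; x[d] = rotl32(x[d] ^ x[a], 8);
	x[c] += x[d]; x[b] = rotl32(x[b] ^ x[c], 7);
}

static void double_round_ref(uint32_t x[16])
{
	quarter_round(x, 0, 4, 8, 12);
	quarter_round(x, 1, 5, 9, 13);
	quarter_round(x, 2, 6, 10, 14);
	quarter_round(x, 3, 7, 11, 15);
	quarter_round(x, 0, 5, 10, 15);
	quarter_round(x, 1, 6, 11, 12);
	quarter_round(x, 2, 7, 8, 13);
	quarter_round(x, 3, 4, 9, 14);
}

/* One AXR-style step on four lanes: a[i] += k[i]; v[i] ^= a[i]; v[i] <<<= s */
static void axr(uint32_t x[16], const int a[4], const int k[4],
		const int v[4], unsigned int s)
{
	for (int i = 0; i < 4; i++)
		x[a[i]] += x[k[i]];
	for (int i = 0; i < 4; i++)
		x[v[i]] ^= x[a[i]];
	for (int i = 0; i < 4; i++)
		x[v[i]] = rotl32(x[v[i]], s);
}

/* Index tables mirroring the operand lists passed to AXR in the .S file. */
static const int R0[4] = { 0, 1, 2, 3 }, R1[4] = { 4, 5, 6, 7 };
static const int R2[4] = { 8, 9, 10, 11 }, R3[4] = { 12, 13, 14, 15 };
static const int D1[4] = { 5, 6, 7, 4 }, D2[4] = { 10, 11, 8, 9 };
static const int D3[4] = { 15, 12, 13, 14 };

static void double_round_axr(uint32_t x[16])
{
	/* Column round */
	axr(x, R0, R1, R3, 16);
	axr(x, R2, R3, R1, 12);
	axr(x, R0, R1, R3, 8);
	axr(x, R2, R3, R1, 7);
	/* Diagonal round */
	axr(x, R0, D1, D3, 16);
	axr(x, D2, D3, D1, 12);
	axr(x, R0, D1, D3, 8);
	axr(x, D2, D3, D1, 7);
}
```

Since the four lanes of each step touch disjoint words, interleaving them
gives the same result as running the four quarter rounds one after another.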
* [PATCH net-next v7 11/28] zinc: Poly1305 generic C implementations and selftest
[not found] <20181006025709.4019-1-Jason@zx2c4.com>
` (7 preceding siblings ...)
2018-10-06 2:56 ` [PATCH net-next v7 10/28] zinc: ChaCha20 MIPS32r2 implementation Jason A. Donenfeld
@ 2018-10-06 2:56 ` Jason A. Donenfeld
2018-10-06 2:56 ` [PATCH net-next v7 12/28] zinc: import Andy Polyakov's Poly1305 x86_64 implementation Jason A. Donenfeld
` (14 subsequent siblings)
23 siblings, 0 replies; 28+ messages in thread
From: Jason A. Donenfeld @ 2018-10-06 2:56 UTC (permalink / raw)
To: linux-kernel, netdev, davem, gregkh
Cc: Jason A. Donenfeld, Samuel Neves, Jean-Philippe Aumasson,
Andy Lutomirski, Andrew Morton, Linus Torvalds, kernel-hardening,
linux-crypto
These two C implementations -- a 32x32 one and a 64x64 one, depending on
the platform -- come from Andrew Moon's public domain poly1305-donna
portable code, modified for usage in the kernel and for usage with
accelerated primitives.
Information: https://cr.yp.to/mac.html
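
To illustrate what "32x32" means here: the 32-bit variant keeps the 130-bit
accumulator in five 26-bit limbs, which leaves enough headroom for the
32x32->64 products and their sums to fit in a u64. The sketch below is a
user-space model of the limb split used when a 16-byte block is absorbed,
and of the repacking done when the tag is emitted. The helper names are
hypothetical, and load_le32() stands in for get_unaligned_le32().

```c
#include <stdint.h>

/* Stand-in for the kernel's get_unaligned_le32(). */
static uint32_t load_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Split a 16-byte little-endian block into five 26-bit limbs. */
static void block_to_limbs26(uint32_t h[5], const uint8_t m[16])
{
	h[0] = load_le32(&m[0]) & 0x3ffffff;         /* bits   0..25  */
	h[1] = (load_le32(&m[3]) >> 2) & 0x3ffffff;  /* bits  26..51  */
	h[2] = (load_le32(&m[6]) >> 4) & 0x3ffffff;  /* bits  52..77  */
	h[3] = (load_le32(&m[9]) >> 6) & 0x3ffffff;  /* bits  78..103 */
	h[4] = load_le32(&m[12]) >> 8;               /* bits 104..127 */
}

/* Pack the limbs back into four 32-bit words, as the emit step does. */
static void limbs26_to_words(uint32_t w[4], const uint32_t h[5])
{
	w[0] = h[0] | (h[1] << 26);
	w[1] = (h[1] >> 6) | (h[2] << 20);
	w[2] = (h[2] >> 12) | (h[3] << 14);
	w[3] = (h[3] >> 18) | (h[4] << 8);
}
```

The 64-bit variant follows the same idea with limbs of 44, 44 and 42 bits
and 128-bit intermediate products.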
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Samuel Neves <sneves@dei.uc.pt>
Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: kernel-hardening@lists.openwall.com
Cc: linux-crypto@vger.kernel.org
---
include/zinc/poly1305.h | 31 +
lib/zinc/Kconfig | 3 +
lib/zinc/Makefile | 3 +
lib/zinc/poly1305/poly1305-donna32.h | 205 +++++
lib/zinc/poly1305/poly1305-donna64.h | 182 +++++
lib/zinc/poly1305/poly1305.c | 154 ++++
lib/zinc/selftest/poly1305.c | 1107 ++++++++++++++++++++++++++
7 files changed, 1685 insertions(+)
create mode 100644 include/zinc/poly1305.h
create mode 100644 lib/zinc/poly1305/poly1305-donna32.h
create mode 100644 lib/zinc/poly1305/poly1305-donna64.h
create mode 100644 lib/zinc/poly1305/poly1305.c
create mode 100644 lib/zinc/selftest/poly1305.c
diff --git a/include/zinc/poly1305.h b/include/zinc/poly1305.h
new file mode 100644
index 000000000000..13fe0e50fc3c
--- /dev/null
+++ b/include/zinc/poly1305.h
@@ -0,0 +1,31 @@
+/* SPDX-License-Identifier: GPL-2.0 OR MIT */
+/*
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#ifndef _ZINC_POLY1305_H
+#define _ZINC_POLY1305_H
+
+#include <linux/simd.h>
+#include <linux/types.h>
+
+enum poly1305_lengths {
+ POLY1305_BLOCK_SIZE = 16,
+ POLY1305_KEY_SIZE = 32,
+ POLY1305_MAC_SIZE = 16
+};
+
+struct poly1305_ctx {
+ u8 opaque[24 * sizeof(u64)];
+ u32 nonce[4];
+ u8 data[POLY1305_BLOCK_SIZE];
+ size_t num;
+} __aligned(8);
+
+void poly1305_init(struct poly1305_ctx *ctx, const u8 key[POLY1305_KEY_SIZE]);
+void poly1305_update(struct poly1305_ctx *ctx, const u8 *input, size_t len,
+ simd_context_t *simd_context);
+void poly1305_final(struct poly1305_ctx *ctx, u8 mac[POLY1305_MAC_SIZE],
+ simd_context_t *simd_context);
+
+#endif /* _ZINC_POLY1305_H */
diff --git a/lib/zinc/Kconfig b/lib/zinc/Kconfig
index d271be37cecb..f08bf1eaa2a0 100644
--- a/lib/zinc/Kconfig
+++ b/lib/zinc/Kconfig
@@ -2,6 +2,9 @@ config ZINC_CHACHA20
tristate
select CRYPTO_ALGAPI
+config ZINC_POLY1305
+ tristate
+
config ZINC_SELFTEST
bool "Zinc cryptography library self-tests"
help
diff --git a/lib/zinc/Makefile b/lib/zinc/Makefile
index 60d568cf5206..6fc9626c55fa 100644
--- a/lib/zinc/Makefile
+++ b/lib/zinc/Makefile
@@ -9,3 +9,6 @@ zinc_chacha20-$(CONFIG_ZINC_ARCH_ARM64) += chacha20/chacha20-arm64.o
zinc_chacha20-$(CONFIG_ZINC_ARCH_MIPS) += chacha20/chacha20-mips.o
AFLAGS_chacha20-mips.o += -O2 # This is required to fill the branch delay slots
obj-$(CONFIG_ZINC_CHACHA20) += zinc_chacha20.o
+
+zinc_poly1305-y := poly1305/poly1305.o
+obj-$(CONFIG_ZINC_POLY1305) += zinc_poly1305.o
diff --git a/lib/zinc/poly1305/poly1305-donna32.h b/lib/zinc/poly1305/poly1305-donna32.h
new file mode 100644
index 000000000000..2e7565dd9679
--- /dev/null
+++ b/lib/zinc/poly1305/poly1305-donna32.h
@@ -0,0 +1,205 @@
+/* SPDX-License-Identifier: GPL-2.0 OR MIT */
+/*
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ *
+ * This is based in part on Andrew Moon's poly1305-donna, which is in the
+ * public domain.
+ */
+
+struct poly1305_internal {
+ u32 h[5];
+ u32 r[5];
+ u32 s[4];
+};
+
+static void poly1305_init_generic(void *ctx, const u8 key[16])
+{
+ struct poly1305_internal *st = (struct poly1305_internal *)ctx;
+
+ /* r &= 0xffffffc0ffffffc0ffffffc0fffffff */
+ st->r[0] = (get_unaligned_le32(&key[0])) & 0x3ffffff;
+ st->r[1] = (get_unaligned_le32(&key[3]) >> 2) & 0x3ffff03;
+ st->r[2] = (get_unaligned_le32(&key[6]) >> 4) & 0x3ffc0ff;
+ st->r[3] = (get_unaligned_le32(&key[9]) >> 6) & 0x3f03fff;
+ st->r[4] = (get_unaligned_le32(&key[12]) >> 8) & 0x00fffff;
+
+ /* s = 5*r */
+ st->s[0] = st->r[1] * 5;
+ st->s[1] = st->r[2] * 5;
+ st->s[2] = st->r[3] * 5;
+ st->s[3] = st->r[4] * 5;
+
+ /* h = 0 */
+ st->h[0] = 0;
+ st->h[1] = 0;
+ st->h[2] = 0;
+ st->h[3] = 0;
+ st->h[4] = 0;
+}
+
+static void poly1305_blocks_generic(void *ctx, const u8 *input, size_t len,
+ const u32 padbit)
+{
+ struct poly1305_internal *st = (struct poly1305_internal *)ctx;
+ const u32 hibit = padbit << 24;
+ u32 r0, r1, r2, r3, r4;
+ u32 s1, s2, s3, s4;
+ u32 h0, h1, h2, h3, h4;
+ u64 d0, d1, d2, d3, d4;
+ u32 c;
+
+ r0 = st->r[0];
+ r1 = st->r[1];
+ r2 = st->r[2];
+ r3 = st->r[3];
+ r4 = st->r[4];
+
+ s1 = st->s[0];
+ s2 = st->s[1];
+ s3 = st->s[2];
+ s4 = st->s[3];
+
+ h0 = st->h[0];
+ h1 = st->h[1];
+ h2 = st->h[2];
+ h3 = st->h[3];
+ h4 = st->h[4];
+
+ while (len >= POLY1305_BLOCK_SIZE) {
+ /* h += m[i] */
+ h0 += (get_unaligned_le32(&input[0])) & 0x3ffffff;
+ h1 += (get_unaligned_le32(&input[3]) >> 2) & 0x3ffffff;
+ h2 += (get_unaligned_le32(&input[6]) >> 4) & 0x3ffffff;
+ h3 += (get_unaligned_le32(&input[9]) >> 6) & 0x3ffffff;
+ h4 += (get_unaligned_le32(&input[12]) >> 8) | hibit;
+
+ /* h *= r */
+ d0 = ((u64)h0 * r0) + ((u64)h1 * s4) +
+ ((u64)h2 * s3) + ((u64)h3 * s2) +
+ ((u64)h4 * s1);
+ d1 = ((u64)h0 * r1) + ((u64)h1 * r0) +
+ ((u64)h2 * s4) + ((u64)h3 * s3) +
+ ((u64)h4 * s2);
+ d2 = ((u64)h0 * r2) + ((u64)h1 * r1) +
+ ((u64)h2 * r0) + ((u64)h3 * s4) +
+ ((u64)h4 * s3);
+ d3 = ((u64)h0 * r3) + ((u64)h1 * r2) +
+ ((u64)h2 * r1) + ((u64)h3 * r0) +
+ ((u64)h4 * s4);
+ d4 = ((u64)h0 * r4) + ((u64)h1 * r3) +
+ ((u64)h2 * r2) + ((u64)h3 * r1) +
+ ((u64)h4 * r0);
+
+ /* (partial) h %= p */
+ c = (u32)(d0 >> 26);
+ h0 = (u32)d0 & 0x3ffffff;
+ d1 += c;
+ c = (u32)(d1 >> 26);
+ h1 = (u32)d1 & 0x3ffffff;
+ d2 += c;
+ c = (u32)(d2 >> 26);
+ h2 = (u32)d2 & 0x3ffffff;
+ d3 += c;
+ c = (u32)(d3 >> 26);
+ h3 = (u32)d3 & 0x3ffffff;
+ d4 += c;
+ c = (u32)(d4 >> 26);
+ h4 = (u32)d4 & 0x3ffffff;
+ h0 += c * 5;
+ c = (h0 >> 26);
+ h0 = h0 & 0x3ffffff;
+ h1 += c;
+
+ input += POLY1305_BLOCK_SIZE;
+ len -= POLY1305_BLOCK_SIZE;
+ }
+
+ st->h[0] = h0;
+ st->h[1] = h1;
+ st->h[2] = h2;
+ st->h[3] = h3;
+ st->h[4] = h4;
+}
+
+static void poly1305_emit_generic(void *ctx, u8 mac[16], const u32 nonce[4])
+{
+ struct poly1305_internal *st = (struct poly1305_internal *)ctx;
+ u32 h0, h1, h2, h3, h4, c;
+ u32 g0, g1, g2, g3, g4;
+ u64 f;
+ u32 mask;
+
+ /* fully carry h */
+ h0 = st->h[0];
+ h1 = st->h[1];
+ h2 = st->h[2];
+ h3 = st->h[3];
+ h4 = st->h[4];
+
+ c = h1 >> 26;
+ h1 = h1 & 0x3ffffff;
+ h2 += c;
+ c = h2 >> 26;
+ h2 = h2 & 0x3ffffff;
+ h3 += c;
+ c = h3 >> 26;
+ h3 = h3 & 0x3ffffff;
+ h4 += c;
+ c = h4 >> 26;
+ h4 = h4 & 0x3ffffff;
+ h0 += c * 5;
+ c = h0 >> 26;
+ h0 = h0 & 0x3ffffff;
+ h1 += c;
+
+ /* compute h + -p */
+ g0 = h0 + 5;
+ c = g0 >> 26;
+ g0 &= 0x3ffffff;
+ g1 = h1 + c;
+ c = g1 >> 26;
+ g1 &= 0x3ffffff;
+ g2 = h2 + c;
+ c = g2 >> 26;
+ g2 &= 0x3ffffff;
+ g3 = h3 + c;
+ c = g3 >> 26;
+ g3 &= 0x3ffffff;
+ g4 = h4 + c - (1UL << 26);
+
+ /* select h if h < p, or h + -p if h >= p */
+ mask = (g4 >> ((sizeof(u32) * 8) - 1)) - 1;
+ g0 &= mask;
+ g1 &= mask;
+ g2 &= mask;
+ g3 &= mask;
+ g4 &= mask;
+ mask = ~mask;
+
+ h0 = (h0 & mask) | g0;
+ h1 = (h1 & mask) | g1;
+ h2 = (h2 & mask) | g2;
+ h3 = (h3 & mask) | g3;
+ h4 = (h4 & mask) | g4;
+
+ /* h = h % (2^128) */
+ h0 = ((h0) | (h1 << 26)) & 0xffffffff;
+ h1 = ((h1 >> 6) | (h2 << 20)) & 0xffffffff;
+ h2 = ((h2 >> 12) | (h3 << 14)) & 0xffffffff;
+ h3 = ((h3 >> 18) | (h4 << 8)) & 0xffffffff;
+
+ /* mac = (h + nonce) % (2^128) */
+ f = (u64)h0 + nonce[0];
+ h0 = (u32)f;
+ f = (u64)h1 + nonce[1] + (f >> 32);
+ h1 = (u32)f;
+ f = (u64)h2 + nonce[2] + (f >> 32);
+ h2 = (u32)f;
+ f = (u64)h3 + nonce[3] + (f >> 32);
+ h3 = (u32)f;
+
+ put_unaligned_le32(h0, &mac[0]);
+ put_unaligned_le32(h1, &mac[4]);
+ put_unaligned_le32(h2, &mac[8]);
+ put_unaligned_le32(h3, &mac[12]);
+}
diff --git a/lib/zinc/poly1305/poly1305-donna64.h b/lib/zinc/poly1305/poly1305-donna64.h
new file mode 100644
index 000000000000..7850e0a11600
--- /dev/null
+++ b/lib/zinc/poly1305/poly1305-donna64.h
@@ -0,0 +1,182 @@
+/* SPDX-License-Identifier: GPL-2.0 OR MIT */
+/*
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ *
+ * This is based in part on Andrew Moon's poly1305-donna, which is in the
+ * public domain.
+ */
+
+typedef __uint128_t u128;
+
+struct poly1305_internal {
+ u64 r[3];
+ u64 h[3];
+ u64 s[2];
+};
+
+static void poly1305_init_generic(void *ctx, const u8 key[16])
+{
+ struct poly1305_internal *st = (struct poly1305_internal *)ctx;
+ u64 t0, t1;
+
+ /* r &= 0xffffffc0ffffffc0ffffffc0fffffff */
+ t0 = get_unaligned_le64(&key[0]);
+ t1 = get_unaligned_le64(&key[8]);
+
+ st->r[0] = t0 & 0xffc0fffffff;
+ st->r[1] = ((t0 >> 44) | (t1 << 20)) & 0xfffffc0ffff;
+ st->r[2] = ((t1 >> 24)) & 0x00ffffffc0f;
+
+ /* s = 20*r */
+ st->s[0] = st->r[1] * 20;
+ st->s[1] = st->r[2] * 20;
+
+ /* h = 0 */
+ st->h[0] = 0;
+ st->h[1] = 0;
+ st->h[2] = 0;
+}
+
+static void poly1305_blocks_generic(void *ctx, const u8 *input, size_t len,
+ const u32 padbit)
+{
+ struct poly1305_internal *st = (struct poly1305_internal *)ctx;
+ const u64 hibit = ((u64)padbit) << 40;
+ u64 r0, r1, r2;
+ u64 s1, s2;
+ u64 h0, h1, h2;
+ u64 c;
+ u128 d0, d1, d2, d;
+
+ r0 = st->r[0];
+ r1 = st->r[1];
+ r2 = st->r[2];
+
+ h0 = st->h[0];
+ h1 = st->h[1];
+ h2 = st->h[2];
+
+ s1 = st->s[0];
+ s2 = st->s[1];
+
+ while (len >= POLY1305_BLOCK_SIZE) {
+ u64 t0, t1;
+
+ /* h += m[i] */
+ t0 = get_unaligned_le64(&input[0]);
+ t1 = get_unaligned_le64(&input[8]);
+
+ h0 += t0 & 0xfffffffffff;
+ h1 += ((t0 >> 44) | (t1 << 20)) & 0xfffffffffff;
+ h2 += (((t1 >> 24)) & 0x3ffffffffff) | hibit;
+
+ /* h *= r */
+ d0 = (u128)h0 * r0;
+ d = (u128)h1 * s2;
+ d0 += d;
+ d = (u128)h2 * s1;
+ d0 += d;
+ d1 = (u128)h0 * r1;
+ d = (u128)h1 * r0;
+ d1 += d;
+ d = (u128)h2 * s2;
+ d1 += d;
+ d2 = (u128)h0 * r2;
+ d = (u128)h1 * r1;
+ d2 += d;
+ d = (u128)h2 * r0;
+ d2 += d;
+
+ /* (partial) h %= p */
+ c = (u64)(d0 >> 44);
+ h0 = (u64)d0 & 0xfffffffffff;
+ d1 += c;
+ c = (u64)(d1 >> 44);
+ h1 = (u64)d1 & 0xfffffffffff;
+ d2 += c;
+ c = (u64)(d2 >> 42);
+ h2 = (u64)d2 & 0x3ffffffffff;
+ h0 += c * 5;
+ c = h0 >> 44;
+ h0 = h0 & 0xfffffffffff;
+ h1 += c;
+
+ input += POLY1305_BLOCK_SIZE;
+ len -= POLY1305_BLOCK_SIZE;
+ }
+
+ st->h[0] = h0;
+ st->h[1] = h1;
+ st->h[2] = h2;
+}
+
+static void poly1305_emit_generic(void *ctx, u8 mac[16], const u32 nonce[4])
+{
+ struct poly1305_internal *st = (struct poly1305_internal *)ctx;
+ u64 h0, h1, h2, c;
+ u64 g0, g1, g2;
+ u64 t0, t1;
+
+ /* fully carry h */
+ h0 = st->h[0];
+ h1 = st->h[1];
+ h2 = st->h[2];
+
+ c = h1 >> 44;
+ h1 &= 0xfffffffffff;
+ h2 += c;
+ c = h2 >> 42;
+ h2 &= 0x3ffffffffff;
+ h0 += c * 5;
+ c = h0 >> 44;
+ h0 &= 0xfffffffffff;
+ h1 += c;
+ c = h1 >> 44;
+ h1 &= 0xfffffffffff;
+ h2 += c;
+ c = h2 >> 42;
+ h2 &= 0x3ffffffffff;
+ h0 += c * 5;
+ c = h0 >> 44;
+ h0 &= 0xfffffffffff;
+ h1 += c;
+
+ /* compute h + -p */
+ g0 = h0 + 5;
+ c = g0 >> 44;
+ g0 &= 0xfffffffffff;
+ g1 = h1 + c;
+ c = g1 >> 44;
+ g1 &= 0xfffffffffff;
+ g2 = h2 + c - (1ULL << 42);
+
+ /* select h if h < p, or h + -p if h >= p */
+ c = (g2 >> ((sizeof(u64) * 8) - 1)) - 1;
+ g0 &= c;
+ g1 &= c;
+ g2 &= c;
+ c = ~c;
+ h0 = (h0 & c) | g0;
+ h1 = (h1 & c) | g1;
+ h2 = (h2 & c) | g2;
+
+ /* h = (h + nonce) */
+ t0 = ((u64)nonce[1] << 32) | nonce[0];
+ t1 = ((u64)nonce[3] << 32) | nonce[2];
+
+ h0 += t0 & 0xfffffffffff;
+ c = h0 >> 44;
+ h0 &= 0xfffffffffff;
+ h1 += (((t0 >> 44) | (t1 << 20)) & 0xfffffffffff) + c;
+ c = h1 >> 44;
+ h1 &= 0xfffffffffff;
+ h2 += (((t1 >> 24)) & 0x3ffffffffff) + c;
+ h2 &= 0x3ffffffffff;
+
+ /* mac = h % (2^128) */
+ h0 = h0 | (h1 << 44);
+ h1 = (h1 >> 20) | (h2 << 24);
+
+ put_unaligned_le64(h0, &mac[0]);
+ put_unaligned_le64(h1, &mac[8]);
+}
diff --git a/lib/zinc/poly1305/poly1305.c b/lib/zinc/poly1305/poly1305.c
new file mode 100644
index 000000000000..6c6c64035efb
--- /dev/null
+++ b/lib/zinc/poly1305/poly1305.c
@@ -0,0 +1,154 @@
+// SPDX-License-Identifier: GPL-2.0 OR MIT
+/*
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ *
+ * Implementation of the Poly1305 message authenticator.
+ *
+ * Information: https://cr.yp.to/mac.html
+ */
+
+#include <zinc/poly1305.h>
+#include "../selftest/run.h"
+
+#include <asm/unaligned.h>
+#include <linux/kernel.h>
+#include <linux/string.h>
+#include <linux/module.h>
+#include <linux/init.h>
+
+static inline bool poly1305_init_arch(void *ctx,
+ const u8 key[POLY1305_KEY_SIZE])
+{
+ return false;
+}
+static inline bool poly1305_blocks_arch(void *ctx, const u8 *input,
+ size_t len, const u32 padbit,
+ simd_context_t *simd_context)
+{
+ return false;
+}
+static inline bool poly1305_emit_arch(void *ctx, u8 mac[POLY1305_MAC_SIZE],
+ const u32 nonce[4],
+ simd_context_t *simd_context)
+{
+ return false;
+}
+static bool *const poly1305_nobs[] __initconst = { };
+static void __init poly1305_fpu_init(void)
+{
+}
+
+#if defined(CONFIG_ARCH_SUPPORTS_INT128) && defined(__SIZEOF_INT128__)
+#include "poly1305-donna64.h"
+#else
+#include "poly1305-donna32.h"
+#endif
+
+void poly1305_init(struct poly1305_ctx *ctx, const u8 key[POLY1305_KEY_SIZE])
+{
+ ctx->nonce[0] = get_unaligned_le32(&key[16]);
+ ctx->nonce[1] = get_unaligned_le32(&key[20]);
+ ctx->nonce[2] = get_unaligned_le32(&key[24]);
+ ctx->nonce[3] = get_unaligned_le32(&key[28]);
+
+ if (!poly1305_init_arch(ctx->opaque, key))
+ poly1305_init_generic(ctx->opaque, key);
+
+ ctx->num = 0;
+}
+EXPORT_SYMBOL(poly1305_init);
+
+static inline void poly1305_blocks(void *ctx, const u8 *input, const size_t len,
+ const u32 padbit,
+ simd_context_t *simd_context)
+{
+ if (!poly1305_blocks_arch(ctx, input, len, padbit, simd_context))
+ poly1305_blocks_generic(ctx, input, len, padbit);
+}
+
+static inline void poly1305_emit(void *ctx, u8 mac[POLY1305_MAC_SIZE],
+ const u32 nonce[4],
+ simd_context_t *simd_context)
+{
+ if (!poly1305_emit_arch(ctx, mac, nonce, simd_context))
+ poly1305_emit_generic(ctx, mac, nonce);
+}
+
+void poly1305_update(struct poly1305_ctx *ctx, const u8 *input, size_t len,
+ simd_context_t *simd_context)
+{
+ const size_t num = ctx->num;
+ size_t rem;
+
+ if (num) {
+ rem = POLY1305_BLOCK_SIZE - num;
+ if (len < rem) {
+ memcpy(ctx->data + num, input, len);
+ ctx->num = num + len;
+ return;
+ }
+ memcpy(ctx->data + num, input, rem);
+ poly1305_blocks(ctx->opaque, ctx->data, POLY1305_BLOCK_SIZE, 1,
+ simd_context);
+ input += rem;
+ len -= rem;
+ }
+
+ rem = len % POLY1305_BLOCK_SIZE;
+ len -= rem;
+
+ if (len >= POLY1305_BLOCK_SIZE) {
+ poly1305_blocks(ctx->opaque, input, len, 1, simd_context);
+ input += len;
+ }
+
+ if (rem)
+ memcpy(ctx->data, input, rem);
+
+ ctx->num = rem;
+}
+EXPORT_SYMBOL(poly1305_update);
+
+void poly1305_final(struct poly1305_ctx *ctx, u8 mac[POLY1305_MAC_SIZE],
+ simd_context_t *simd_context)
+{
+ size_t num = ctx->num;
+
+ if (num) {
+ ctx->data[num++] = 1;
+ while (num < POLY1305_BLOCK_SIZE)
+ ctx->data[num++] = 0;
+ poly1305_blocks(ctx->opaque, ctx->data, POLY1305_BLOCK_SIZE, 0,
+ simd_context);
+ }
+
+ poly1305_emit(ctx->opaque, mac, ctx->nonce, simd_context);
+
+ memzero_explicit(ctx, sizeof(*ctx));
+}
+EXPORT_SYMBOL(poly1305_final);
+
+#include "../selftest/poly1305.c"
+
+static bool nosimd __initdata = false;
+
+static int __init mod_init(void)
+{
+ if (!nosimd)
+ poly1305_fpu_init();
+ if (!selftest_run("poly1305", poly1305_selftest, poly1305_nobs,
+ ARRAY_SIZE(poly1305_nobs)))
+ return -ENOTRECOVERABLE;
+ return 0;
+}
+
+static void __exit mod_exit(void)
+{
+}
+
+module_param(nosimd, bool, 0);
+module_init(mod_init);
+module_exit(mod_exit);
+MODULE_LICENSE("GPL v2");
+MODULE_DESCRIPTION("Poly1305 one-time authenticator");
+MODULE_AUTHOR("Jason A. Donenfeld <Jason@zx2c4.com>");
diff --git a/lib/zinc/selftest/poly1305.c b/lib/zinc/selftest/poly1305.c
new file mode 100644
index 000000000000..6b1f87288490
--- /dev/null
+++ b/lib/zinc/selftest/poly1305.c
@@ -0,0 +1,1107 @@
+// SPDX-License-Identifier: GPL-2.0 OR MIT
+/*
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+struct poly1305_testvec {
+ const u8 *input, *output, *key;
+ size_t ilen;
+};
+
+/* RFC7539 */
+static const u8 input01[] __initconst = {
+ 0x43, 0x72, 0x79, 0x70, 0x74, 0x6f, 0x67, 0x72,
+ 0x61, 0x70, 0x68, 0x69, 0x63, 0x20, 0x46, 0x6f,
+ 0x72, 0x75, 0x6d, 0x20, 0x52, 0x65, 0x73, 0x65,
+ 0x61, 0x72, 0x63, 0x68, 0x20, 0x47, 0x72, 0x6f,
+ 0x75, 0x70
+};
+static const u8 output01[] __initconst = {
+ 0xa8, 0x06, 0x1d, 0xc1, 0x30, 0x51, 0x36, 0xc6,
+ 0xc2, 0x2b, 0x8b, 0xaf, 0x0c, 0x01, 0x27, 0xa9
+};
+static const u8 key01[] __initconst = {
+ 0x85, 0xd6, 0xbe, 0x78, 0x57, 0x55, 0x6d, 0x33,
+ 0x7f, 0x44, 0x52, 0xfe, 0x42, 0xd5, 0x06, 0xa8,
+ 0x01, 0x03, 0x80, 0x8a, 0xfb, 0x0d, 0xb2, 0xfd,
+ 0x4a, 0xbf, 0xf6, 0xaf, 0x41, 0x49, 0xf5, 0x1b
+};
+
+/* "The Poly1305-AES message-authentication code" */
+static const u8 input02[] __initconst = {
+ 0xf3, 0xf6
+};
+static const u8 output02[] __initconst = {
+ 0xf4, 0xc6, 0x33, 0xc3, 0x04, 0x4f, 0xc1, 0x45,
+ 0xf8, 0x4f, 0x33, 0x5c, 0xb8, 0x19, 0x53, 0xde
+};
+static const u8 key02[] __initconst = {
+ 0x85, 0x1f, 0xc4, 0x0c, 0x34, 0x67, 0xac, 0x0b,
+ 0xe0, 0x5c, 0xc2, 0x04, 0x04, 0xf3, 0xf7, 0x00,
+ 0x58, 0x0b, 0x3b, 0x0f, 0x94, 0x47, 0xbb, 0x1e,
+ 0x69, 0xd0, 0x95, 0xb5, 0x92, 0x8b, 0x6d, 0xbc
+};
+
+static const u8 input03[] __initconst = { };
+static const u8 output03[] __initconst = {
+ 0xdd, 0x3f, 0xab, 0x22, 0x51, 0xf1, 0x1a, 0xc7,
+ 0x59, 0xf0, 0x88, 0x71, 0x29, 0xcc, 0x2e, 0xe7
+};
+static const u8 key03[] __initconst = {
+ 0xa0, 0xf3, 0x08, 0x00, 0x00, 0xf4, 0x64, 0x00,
+ 0xd0, 0xc7, 0xe9, 0x07, 0x6c, 0x83, 0x44, 0x03,
+ 0xdd, 0x3f, 0xab, 0x22, 0x51, 0xf1, 0x1a, 0xc7,
+ 0x59, 0xf0, 0x88, 0x71, 0x29, 0xcc, 0x2e, 0xe7
+};
+
+static const u8 input04[] __initconst = {
+ 0x66, 0x3c, 0xea, 0x19, 0x0f, 0xfb, 0x83, 0xd8,
+ 0x95, 0x93, 0xf3, 0xf4, 0x76, 0xb6, 0xbc, 0x24,
+ 0xd7, 0xe6, 0x79, 0x10, 0x7e, 0xa2, 0x6a, 0xdb,
+ 0x8c, 0xaf, 0x66, 0x52, 0xd0, 0x65, 0x61, 0x36
+};
+static const u8 output04[] __initconst = {
+ 0x0e, 0xe1, 0xc1, 0x6b, 0xb7, 0x3f, 0x0f, 0x4f,
+ 0xd1, 0x98, 0x81, 0x75, 0x3c, 0x01, 0xcd, 0xbe
+};
+static const u8 key04[] __initconst = {
+ 0x48, 0x44, 0x3d, 0x0b, 0xb0, 0xd2, 0x11, 0x09,
+ 0xc8, 0x9a, 0x10, 0x0b, 0x5c, 0xe2, 0xc2, 0x08,
+ 0x83, 0x14, 0x9c, 0x69, 0xb5, 0x61, 0xdd, 0x88,
+ 0x29, 0x8a, 0x17, 0x98, 0xb1, 0x07, 0x16, 0xef
+};
+
+static const u8 input05[] __initconst = {
+ 0xab, 0x08, 0x12, 0x72, 0x4a, 0x7f, 0x1e, 0x34,
+ 0x27, 0x42, 0xcb, 0xed, 0x37, 0x4d, 0x94, 0xd1,
+ 0x36, 0xc6, 0xb8, 0x79, 0x5d, 0x45, 0xb3, 0x81,
+ 0x98, 0x30, 0xf2, 0xc0, 0x44, 0x91, 0xfa, 0xf0,
+ 0x99, 0x0c, 0x62, 0xe4, 0x8b, 0x80, 0x18, 0xb2,
+ 0xc3, 0xe4, 0xa0, 0xfa, 0x31, 0x34, 0xcb, 0x67,
+ 0xfa, 0x83, 0xe1, 0x58, 0xc9, 0x94, 0xd9, 0x61,
+ 0xc4, 0xcb, 0x21, 0x09, 0x5c, 0x1b, 0xf9
+};
+static const u8 output05[] __initconst = {
+ 0x51, 0x54, 0xad, 0x0d, 0x2c, 0xb2, 0x6e, 0x01,
+ 0x27, 0x4f, 0xc5, 0x11, 0x48, 0x49, 0x1f, 0x1b
+};
+static const u8 key05[] __initconst = {
+ 0x12, 0x97, 0x6a, 0x08, 0xc4, 0x42, 0x6d, 0x0c,
+ 0xe8, 0xa8, 0x24, 0x07, 0xc4, 0xf4, 0x82, 0x07,
+ 0x80, 0xf8, 0xc2, 0x0a, 0xa7, 0x12, 0x02, 0xd1,
+ 0xe2, 0x91, 0x79, 0xcb, 0xcb, 0x55, 0x5a, 0x57
+};
+
+/* self-generated vectors exercise "significant" lengths, such that they
+ * are handled by different code paths */
+static const u8 input06[] __initconst = {
+ 0xab, 0x08, 0x12, 0x72, 0x4a, 0x7f, 0x1e, 0x34,
+ 0x27, 0x42, 0xcb, 0xed, 0x37, 0x4d, 0x94, 0xd1,
+ 0x36, 0xc6, 0xb8, 0x79, 0x5d, 0x45, 0xb3, 0x81,
+ 0x98, 0x30, 0xf2, 0xc0, 0x44, 0x91, 0xfa, 0xf0,
+ 0x99, 0x0c, 0x62, 0xe4, 0x8b, 0x80, 0x18, 0xb2,
+ 0xc3, 0xe4, 0xa0, 0xfa, 0x31, 0x34, 0xcb, 0x67,
+ 0xfa, 0x83, 0xe1, 0x58, 0xc9, 0x94, 0xd9, 0x61,
+ 0xc4, 0xcb, 0x21, 0x09, 0x5c, 0x1b, 0xf9, 0xaf
+};
+static const u8 output06[] __initconst = {
+ 0x81, 0x20, 0x59, 0xa5, 0xda, 0x19, 0x86, 0x37,
+ 0xca, 0xc7, 0xc4, 0xa6, 0x31, 0xbe, 0xe4, 0x66
+};
+static const u8 key06[] __initconst = {
+ 0x12, 0x97, 0x6a, 0x08, 0xc4, 0x42, 0x6d, 0x0c,
+ 0xe8, 0xa8, 0x24, 0x07, 0xc4, 0xf4, 0x82, 0x07,
+ 0x80, 0xf8, 0xc2, 0x0a, 0xa7, 0x12, 0x02, 0xd1,
+ 0xe2, 0x91, 0x79, 0xcb, 0xcb, 0x55, 0x5a, 0x57
+};
+
+static const u8 input07[] __initconst = {
+ 0xab, 0x08, 0x12, 0x72, 0x4a, 0x7f, 0x1e, 0x34,
+ 0x27, 0x42, 0xcb, 0xed, 0x37, 0x4d, 0x94, 0xd1,
+ 0x36, 0xc6, 0xb8, 0x79, 0x5d, 0x45, 0xb3, 0x81,
+ 0x98, 0x30, 0xf2, 0xc0, 0x44, 0x91, 0xfa, 0xf0,
+ 0x99, 0x0c, 0x62, 0xe4, 0x8b, 0x80, 0x18, 0xb2,
+ 0xc3, 0xe4, 0xa0, 0xfa, 0x31, 0x34, 0xcb, 0x67
+};
+static const u8 output07[] __initconst = {
+ 0x5b, 0x88, 0xd7, 0xf6, 0x22, 0x8b, 0x11, 0xe2,
+ 0xe2, 0x85, 0x79, 0xa5, 0xc0, 0xc1, 0xf7, 0x61
+};
+static const u8 key07[] __initconst = {
+ 0x12, 0x97, 0x6a, 0x08, 0xc4, 0x42, 0x6d, 0x0c,
+ 0xe8, 0xa8, 0x24, 0x07, 0xc4, 0xf4, 0x82, 0x07,
+ 0x80, 0xf8, 0xc2, 0x0a, 0xa7, 0x12, 0x02, 0xd1,
+ 0xe2, 0x91, 0x79, 0xcb, 0xcb, 0x55, 0x5a, 0x57
+};
+
+static const u8 input08[] __initconst = {
+ 0xab, 0x08, 0x12, 0x72, 0x4a, 0x7f, 0x1e, 0x34,
+ 0x27, 0x42, 0xcb, 0xed, 0x37, 0x4d, 0x94, 0xd1,
+ 0x36, 0xc6, 0xb8, 0x79, 0x5d, 0x45, 0xb3, 0x81,
+ 0x98, 0x30, 0xf2, 0xc0, 0x44, 0x91, 0xfa, 0xf0,
+ 0x99, 0x0c, 0x62, 0xe4, 0x8b, 0x80, 0x18, 0xb2,
+ 0xc3, 0xe4, 0xa0, 0xfa, 0x31, 0x34, 0xcb, 0x67,
+ 0xfa, 0x83, 0xe1, 0x58, 0xc9, 0x94, 0xd9, 0x61,
+ 0xc4, 0xcb, 0x21, 0x09, 0x5c, 0x1b, 0xf9, 0xaf,
+ 0x66, 0x3c, 0xea, 0x19, 0x0f, 0xfb, 0x83, 0xd8,
+ 0x95, 0x93, 0xf3, 0xf4, 0x76, 0xb6, 0xbc, 0x24,
+ 0xd7, 0xe6, 0x79, 0x10, 0x7e, 0xa2, 0x6a, 0xdb,
+ 0x8c, 0xaf, 0x66, 0x52, 0xd0, 0x65, 0x61, 0x36
+};
+static const u8 output08[] __initconst = {
+ 0xbb, 0xb6, 0x13, 0xb2, 0xb6, 0xd7, 0x53, 0xba,
+ 0x07, 0x39, 0x5b, 0x91, 0x6a, 0xae, 0xce, 0x15
+};
+static const u8 key08[] __initconst = {
+ 0x12, 0x97, 0x6a, 0x08, 0xc4, 0x42, 0x6d, 0x0c,
+ 0xe8, 0xa8, 0x24, 0x07, 0xc4, 0xf4, 0x82, 0x07,
+ 0x80, 0xf8, 0xc2, 0x0a, 0xa7, 0x12, 0x02, 0xd1,
+ 0xe2, 0x91, 0x79, 0xcb, 0xcb, 0x55, 0x5a, 0x57
+};
+
+static const u8 input09[] __initconst = {
+ 0xab, 0x08, 0x12, 0x72, 0x4a, 0x7f, 0x1e, 0x34,
+ 0x27, 0x42, 0xcb, 0xed, 0x37, 0x4d, 0x94, 0xd1,
+ 0x36, 0xc6, 0xb8, 0x79, 0x5d, 0x45, 0xb3, 0x81,
+ 0x98, 0x30, 0xf2, 0xc0, 0x44, 0x91, 0xfa, 0xf0,
+ 0x99, 0x0c, 0x62, 0xe4, 0x8b, 0x80, 0x18, 0xb2,
+ 0xc3, 0xe4, 0xa0, 0xfa, 0x31, 0x34, 0xcb, 0x67,
+ 0xfa, 0x83, 0xe1, 0x58, 0xc9, 0x94, 0xd9, 0x61,
+ 0xc4, 0xcb, 0x21, 0x09, 0x5c, 0x1b, 0xf9, 0xaf,
+ 0x48, 0x44, 0x3d, 0x0b, 0xb0, 0xd2, 0x11, 0x09,
+ 0xc8, 0x9a, 0x10, 0x0b, 0x5c, 0xe2, 0xc2, 0x08,
+ 0x83, 0x14, 0x9c, 0x69, 0xb5, 0x61, 0xdd, 0x88,
+ 0x29, 0x8a, 0x17, 0x98, 0xb1, 0x07, 0x16, 0xef,
+ 0x66, 0x3c, 0xea, 0x19, 0x0f, 0xfb, 0x83, 0xd8,
+ 0x95, 0x93, 0xf3, 0xf4, 0x76, 0xb6, 0xbc, 0x24
+};
+static const u8 output09[] __initconst = {
+ 0xc7, 0x94, 0xd7, 0x05, 0x7d, 0x17, 0x78, 0xc4,
+ 0xbb, 0xee, 0x0a, 0x39, 0xb3, 0xd9, 0x73, 0x42
+};
+static const u8 key09[] __initconst = {
+ 0x12, 0x97, 0x6a, 0x08, 0xc4, 0x42, 0x6d, 0x0c,
+ 0xe8, 0xa8, 0x24, 0x07, 0xc4, 0xf4, 0x82, 0x07,
+ 0x80, 0xf8, 0xc2, 0x0a, 0xa7, 0x12, 0x02, 0xd1,
+ 0xe2, 0x91, 0x79, 0xcb, 0xcb, 0x55, 0x5a, 0x57
+};
+
+static const u8 input10[] __initconst = {
+ 0xab, 0x08, 0x12, 0x72, 0x4a, 0x7f, 0x1e, 0x34,
+ 0x27, 0x42, 0xcb, 0xed, 0x37, 0x4d, 0x94, 0xd1,
+ 0x36, 0xc6, 0xb8, 0x79, 0x5d, 0x45, 0xb3, 0x81,
+ 0x98, 0x30, 0xf2, 0xc0, 0x44, 0x91, 0xfa, 0xf0,
+ 0x99, 0x0c, 0x62, 0xe4, 0x8b, 0x80, 0x18, 0xb2,
+ 0xc3, 0xe4, 0xa0, 0xfa, 0x31, 0x34, 0xcb, 0x67,
+ 0xfa, 0x83, 0xe1, 0x58, 0xc9, 0x94, 0xd9, 0x61,
+ 0xc4, 0xcb, 0x21, 0x09, 0x5c, 0x1b, 0xf9, 0xaf,
+ 0x48, 0x44, 0x3d, 0x0b, 0xb0, 0xd2, 0x11, 0x09,
+ 0xc8, 0x9a, 0x10, 0x0b, 0x5c, 0xe2, 0xc2, 0x08,
+ 0x83, 0x14, 0x9c, 0x69, 0xb5, 0x61, 0xdd, 0x88,
+ 0x29, 0x8a, 0x17, 0x98, 0xb1, 0x07, 0x16, 0xef,
+ 0x66, 0x3c, 0xea, 0x19, 0x0f, 0xfb, 0x83, 0xd8,
+ 0x95, 0x93, 0xf3, 0xf4, 0x76, 0xb6, 0xbc, 0x24,
+ 0xd7, 0xe6, 0x79, 0x10, 0x7e, 0xa2, 0x6a, 0xdb,
+ 0x8c, 0xaf, 0x66, 0x52, 0xd0, 0x65, 0x61, 0x36
+};
+static const u8 output10[] __initconst = {
+ 0xff, 0xbc, 0xb9, 0xb3, 0x71, 0x42, 0x31, 0x52,
+ 0xd7, 0xfc, 0xa5, 0xad, 0x04, 0x2f, 0xba, 0xa9
+};
+static const u8 key10[] __initconst = {
+ 0x12, 0x97, 0x6a, 0x08, 0xc4, 0x42, 0x6d, 0x0c,
+ 0xe8, 0xa8, 0x24, 0x07, 0xc4, 0xf4, 0x82, 0x07,
+ 0x80, 0xf8, 0xc2, 0x0a, 0xa7, 0x12, 0x02, 0xd1,
+ 0xe2, 0x91, 0x79, 0xcb, 0xcb, 0x55, 0x5a, 0x57
+};
+
+static const u8 input11[] __initconst = {
+ 0xab, 0x08, 0x12, 0x72, 0x4a, 0x7f, 0x1e, 0x34,
+ 0x27, 0x42, 0xcb, 0xed, 0x37, 0x4d, 0x94, 0xd1,
+ 0x36, 0xc6, 0xb8, 0x79, 0x5d, 0x45, 0xb3, 0x81,
+ 0x98, 0x30, 0xf2, 0xc0, 0x44, 0x91, 0xfa, 0xf0,
+ 0x99, 0x0c, 0x62, 0xe4, 0x8b, 0x80, 0x18, 0xb2,
+ 0xc3, 0xe4, 0xa0, 0xfa, 0x31, 0x34, 0xcb, 0x67,
+ 0xfa, 0x83, 0xe1, 0x58, 0xc9, 0x94, 0xd9, 0x61,
+ 0xc4, 0xcb, 0x21, 0x09, 0x5c, 0x1b, 0xf9, 0xaf,
+ 0x48, 0x44, 0x3d, 0x0b, 0xb0, 0xd2, 0x11, 0x09,
+ 0xc8, 0x9a, 0x10, 0x0b, 0x5c, 0xe2, 0xc2, 0x08,
+ 0x83, 0x14, 0x9c, 0x69, 0xb5, 0x61, 0xdd, 0x88,
+ 0x29, 0x8a, 0x17, 0x98, 0xb1, 0x07, 0x16, 0xef,
+ 0x66, 0x3c, 0xea, 0x19, 0x0f, 0xfb, 0x83, 0xd8,
+ 0x95, 0x93, 0xf3, 0xf4, 0x76, 0xb6, 0xbc, 0x24,
+ 0xd7, 0xe6, 0x79, 0x10, 0x7e, 0xa2, 0x6a, 0xdb,
+ 0x8c, 0xaf, 0x66, 0x52, 0xd0, 0x65, 0x61, 0x36,
+ 0x81, 0x20, 0x59, 0xa5, 0xda, 0x19, 0x86, 0x37,
+ 0xca, 0xc7, 0xc4, 0xa6, 0x31, 0xbe, 0xe4, 0x66
+};
+static const u8 output11[] __initconst = {
+ 0x06, 0x9e, 0xd6, 0xb8, 0xef, 0x0f, 0x20, 0x7b,
+ 0x3e, 0x24, 0x3b, 0xb1, 0x01, 0x9f, 0xe6, 0x32
+};
+static const u8 key11[] __initconst = {
+ 0x12, 0x97, 0x6a, 0x08, 0xc4, 0x42, 0x6d, 0x0c,
+ 0xe8, 0xa8, 0x24, 0x07, 0xc4, 0xf4, 0x82, 0x07,
+ 0x80, 0xf8, 0xc2, 0x0a, 0xa7, 0x12, 0x02, 0xd1,
+ 0xe2, 0x91, 0x79, 0xcb, 0xcb, 0x55, 0x5a, 0x57
+};
+
+static const u8 input12[] __initconst = {
+ 0xab, 0x08, 0x12, 0x72, 0x4a, 0x7f, 0x1e, 0x34,
+ 0x27, 0x42, 0xcb, 0xed, 0x37, 0x4d, 0x94, 0xd1,
+ 0x36, 0xc6, 0xb8, 0x79, 0x5d, 0x45, 0xb3, 0x81,
+ 0x98, 0x30, 0xf2, 0xc0, 0x44, 0x91, 0xfa, 0xf0,
+ 0x99, 0x0c, 0x62, 0xe4, 0x8b, 0x80, 0x18, 0xb2,
+ 0xc3, 0xe4, 0xa0, 0xfa, 0x31, 0x34, 0xcb, 0x67,
+ 0xfa, 0x83, 0xe1, 0x58, 0xc9, 0x94, 0xd9, 0x61,
+ 0xc4, 0xcb, 0x21, 0x09, 0x5c, 0x1b, 0xf9, 0xaf,
+ 0x48, 0x44, 0x3d, 0x0b, 0xb0, 0xd2, 0x11, 0x09,
+ 0xc8, 0x9a, 0x10, 0x0b, 0x5c, 0xe2, 0xc2, 0x08,
+ 0x83, 0x14, 0x9c, 0x69, 0xb5, 0x61, 0xdd, 0x88,
+ 0x29, 0x8a, 0x17, 0x98, 0xb1, 0x07, 0x16, 0xef,
+ 0x66, 0x3c, 0xea, 0x19, 0x0f, 0xfb, 0x83, 0xd8,
+ 0x95, 0x93, 0xf3, 0xf4, 0x76, 0xb6, 0xbc, 0x24,
+ 0xd7, 0xe6, 0x79, 0x10, 0x7e, 0xa2, 0x6a, 0xdb,
+ 0x8c, 0xaf, 0x66, 0x52, 0xd0, 0x65, 0x61, 0x36,
+ 0x81, 0x20, 0x59, 0xa5, 0xda, 0x19, 0x86, 0x37,
+ 0xca, 0xc7, 0xc4, 0xa6, 0x31, 0xbe, 0xe4, 0x66,
+ 0x5b, 0x88, 0xd7, 0xf6, 0x22, 0x8b, 0x11, 0xe2,
+ 0xe2, 0x85, 0x79, 0xa5, 0xc0, 0xc1, 0xf7, 0x61
+};
+static const u8 output12[] __initconst = {
+ 0xcc, 0xa3, 0x39, 0xd9, 0xa4, 0x5f, 0xa2, 0x36,
+ 0x8c, 0x2c, 0x68, 0xb3, 0xa4, 0x17, 0x91, 0x33
+};
+static const u8 key12[] __initconst = {
+ 0x12, 0x97, 0x6a, 0x08, 0xc4, 0x42, 0x6d, 0x0c,
+ 0xe8, 0xa8, 0x24, 0x07, 0xc4, 0xf4, 0x82, 0x07,
+ 0x80, 0xf8, 0xc2, 0x0a, 0xa7, 0x12, 0x02, 0xd1,
+ 0xe2, 0x91, 0x79, 0xcb, 0xcb, 0x55, 0x5a, 0x57
+};
+
+static const u8 input13[] __initconst = {
+ 0xab, 0x08, 0x12, 0x72, 0x4a, 0x7f, 0x1e, 0x34,
+ 0x27, 0x42, 0xcb, 0xed, 0x37, 0x4d, 0x94, 0xd1,
+ 0x36, 0xc6, 0xb8, 0x79, 0x5d, 0x45, 0xb3, 0x81,
+ 0x98, 0x30, 0xf2, 0xc0, 0x44, 0x91, 0xfa, 0xf0,
+ 0x99, 0x0c, 0x62, 0xe4, 0x8b, 0x80, 0x18, 0xb2,
+ 0xc3, 0xe4, 0xa0, 0xfa, 0x31, 0x34, 0xcb, 0x67,
+ 0xfa, 0x83, 0xe1, 0x58, 0xc9, 0x94, 0xd9, 0x61,
+ 0xc4, 0xcb, 0x21, 0x09, 0x5c, 0x1b, 0xf9, 0xaf,
+ 0x48, 0x44, 0x3d, 0x0b, 0xb0, 0xd2, 0x11, 0x09,
+ 0xc8, 0x9a, 0x10, 0x0b, 0x5c, 0xe2, 0xc2, 0x08,
+ 0x83, 0x14, 0x9c, 0x69, 0xb5, 0x61, 0xdd, 0x88,
+ 0x29, 0x8a, 0x17, 0x98, 0xb1, 0x07, 0x16, 0xef,
+ 0x66, 0x3c, 0xea, 0x19, 0x0f, 0xfb, 0x83, 0xd8,
+ 0x95, 0x93, 0xf3, 0xf4, 0x76, 0xb6, 0xbc, 0x24,
+ 0xd7, 0xe6, 0x79, 0x10, 0x7e, 0xa2, 0x6a, 0xdb,
+ 0x8c, 0xaf, 0x66, 0x52, 0xd0, 0x65, 0x61, 0x36,
+ 0x81, 0x20, 0x59, 0xa5, 0xda, 0x19, 0x86, 0x37,
+ 0xca, 0xc7, 0xc4, 0xa6, 0x31, 0xbe, 0xe4, 0x66,
+ 0x5b, 0x88, 0xd7, 0xf6, 0x22, 0x8b, 0x11, 0xe2,
+ 0xe2, 0x85, 0x79, 0xa5, 0xc0, 0xc1, 0xf7, 0x61,
+ 0xab, 0x08, 0x12, 0x72, 0x4a, 0x7f, 0x1e, 0x34,
+ 0x27, 0x42, 0xcb, 0xed, 0x37, 0x4d, 0x94, 0xd1,
+ 0x36, 0xc6, 0xb8, 0x79, 0x5d, 0x45, 0xb3, 0x81,
+ 0x98, 0x30, 0xf2, 0xc0, 0x44, 0x91, 0xfa, 0xf0,
+ 0x99, 0x0c, 0x62, 0xe4, 0x8b, 0x80, 0x18, 0xb2,
+ 0xc3, 0xe4, 0xa0, 0xfa, 0x31, 0x34, 0xcb, 0x67,
+ 0xfa, 0x83, 0xe1, 0x58, 0xc9, 0x94, 0xd9, 0x61,
+ 0xc4, 0xcb, 0x21, 0x09, 0x5c, 0x1b, 0xf9, 0xaf,
+ 0x48, 0x44, 0x3d, 0x0b, 0xb0, 0xd2, 0x11, 0x09,
+ 0xc8, 0x9a, 0x10, 0x0b, 0x5c, 0xe2, 0xc2, 0x08,
+ 0x83, 0x14, 0x9c, 0x69, 0xb5, 0x61, 0xdd, 0x88,
+ 0x29, 0x8a, 0x17, 0x98, 0xb1, 0x07, 0x16, 0xef,
+ 0x66, 0x3c, 0xea, 0x19, 0x0f, 0xfb, 0x83, 0xd8,
+ 0x95, 0x93, 0xf3, 0xf4, 0x76, 0xb6, 0xbc, 0x24,
+ 0xd7, 0xe6, 0x79, 0x10, 0x7e, 0xa2, 0x6a, 0xdb,
+ 0x8c, 0xaf, 0x66, 0x52, 0xd0, 0x65, 0x61, 0x36
+};
+static const u8 output13[] __initconst = {
+ 0x53, 0xf6, 0xe8, 0x28, 0xa2, 0xf0, 0xfe, 0x0e,
+ 0xe8, 0x15, 0xbf, 0x0b, 0xd5, 0x84, 0x1a, 0x34
+};
+static const u8 key13[] __initconst = {
+ 0x12, 0x97, 0x6a, 0x08, 0xc4, 0x42, 0x6d, 0x0c,
+ 0xe8, 0xa8, 0x24, 0x07, 0xc4, 0xf4, 0x82, 0x07,
+ 0x80, 0xf8, 0xc2, 0x0a, 0xa7, 0x12, 0x02, 0xd1,
+ 0xe2, 0x91, 0x79, 0xcb, 0xcb, 0x55, 0x5a, 0x57
+};
+
+static const u8 input14[] __initconst = {
+ 0xab, 0x08, 0x12, 0x72, 0x4a, 0x7f, 0x1e, 0x34,
+ 0x27, 0x42, 0xcb, 0xed, 0x37, 0x4d, 0x94, 0xd1,
+ 0x36, 0xc6, 0xb8, 0x79, 0x5d, 0x45, 0xb3, 0x81,
+ 0x98, 0x30, 0xf2, 0xc0, 0x44, 0x91, 0xfa, 0xf0,
+ 0x99, 0x0c, 0x62, 0xe4, 0x8b, 0x80, 0x18, 0xb2,
+ 0xc3, 0xe4, 0xa0, 0xfa, 0x31, 0x34, 0xcb, 0x67,
+ 0xfa, 0x83, 0xe1, 0x58, 0xc9, 0x94, 0xd9, 0x61,
+ 0xc4, 0xcb, 0x21, 0x09, 0x5c, 0x1b, 0xf9, 0xaf,
+ 0x48, 0x44, 0x3d, 0x0b, 0xb0, 0xd2, 0x11, 0x09,
+ 0xc8, 0x9a, 0x10, 0x0b, 0x5c, 0xe2, 0xc2, 0x08,
+ 0x83, 0x14, 0x9c, 0x69, 0xb5, 0x61, 0xdd, 0x88,
+ 0x29, 0x8a, 0x17, 0x98, 0xb1, 0x07, 0x16, 0xef,
+ 0x66, 0x3c, 0xea, 0x19, 0x0f, 0xfb, 0x83, 0xd8,
+ 0x95, 0x93, 0xf3, 0xf4, 0x76, 0xb6, 0xbc, 0x24,
+ 0xd7, 0xe6, 0x79, 0x10, 0x7e, 0xa2, 0x6a, 0xdb,
+ 0x8c, 0xaf, 0x66, 0x52, 0xd0, 0x65, 0x61, 0x36,
+ 0x81, 0x20, 0x59, 0xa5, 0xda, 0x19, 0x86, 0x37,
+ 0xca, 0xc7, 0xc4, 0xa6, 0x31, 0xbe, 0xe4, 0x66,
+ 0x5b, 0x88, 0xd7, 0xf6, 0x22, 0x8b, 0x11, 0xe2,
+ 0xe2, 0x85, 0x79, 0xa5, 0xc0, 0xc1, 0xf7, 0x61,
+ 0xab, 0x08, 0x12, 0x72, 0x4a, 0x7f, 0x1e, 0x34,
+ 0x27, 0x42, 0xcb, 0xed, 0x37, 0x4d, 0x94, 0xd1,
+ 0x36, 0xc6, 0xb8, 0x79, 0x5d, 0x45, 0xb3, 0x81,
+ 0x98, 0x30, 0xf2, 0xc0, 0x44, 0x91, 0xfa, 0xf0,
+ 0x99, 0x0c, 0x62, 0xe4, 0x8b, 0x80, 0x18, 0xb2,
+ 0xc3, 0xe4, 0xa0, 0xfa, 0x31, 0x34, 0xcb, 0x67,
+ 0xfa, 0x83, 0xe1, 0x58, 0xc9, 0x94, 0xd9, 0x61,
+ 0xc4, 0xcb, 0x21, 0x09, 0x5c, 0x1b, 0xf9, 0xaf,
+ 0x48, 0x44, 0x3d, 0x0b, 0xb0, 0xd2, 0x11, 0x09,
+ 0xc8, 0x9a, 0x10, 0x0b, 0x5c, 0xe2, 0xc2, 0x08,
+ 0x83, 0x14, 0x9c, 0x69, 0xb5, 0x61, 0xdd, 0x88,
+ 0x29, 0x8a, 0x17, 0x98, 0xb1, 0x07, 0x16, 0xef,
+ 0x66, 0x3c, 0xea, 0x19, 0x0f, 0xfb, 0x83, 0xd8,
+ 0x95, 0x93, 0xf3, 0xf4, 0x76, 0xb6, 0xbc, 0x24,
+ 0xd7, 0xe6, 0x79, 0x10, 0x7e, 0xa2, 0x6a, 0xdb,
+ 0x8c, 0xaf, 0x66, 0x52, 0xd0, 0x65, 0x61, 0x36,
+ 0x81, 0x20, 0x59, 0xa5, 0xda, 0x19, 0x86, 0x37,
+ 0xca, 0xc7, 0xc4, 0xa6, 0x31, 0xbe, 0xe4, 0x66,
+ 0x5b, 0x88, 0xd7, 0xf6, 0x22, 0x8b, 0x11, 0xe2,
+ 0xe2, 0x85, 0x79, 0xa5, 0xc0, 0xc1, 0xf7, 0x61
+};
+static const u8 output14[] __initconst = {
+ 0xb8, 0x46, 0xd4, 0x4e, 0x9b, 0xbd, 0x53, 0xce,
+ 0xdf, 0xfb, 0xfb, 0xb6, 0xb7, 0xfa, 0x49, 0x33
+};
+static const u8 key14[] __initconst = {
+ 0x12, 0x97, 0x6a, 0x08, 0xc4, 0x42, 0x6d, 0x0c,
+ 0xe8, 0xa8, 0x24, 0x07, 0xc4, 0xf4, 0x82, 0x07,
+ 0x80, 0xf8, 0xc2, 0x0a, 0xa7, 0x12, 0x02, 0xd1,
+ 0xe2, 0x91, 0x79, 0xcb, 0xcb, 0x55, 0x5a, 0x57
+};
+
+/* 4th power of the key spills to 131st bit in SIMD key setup */
+static const u8 input15[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
+};
+static const u8 output15[] __initconst = {
+ 0x07, 0x14, 0x5a, 0x4c, 0x02, 0xfe, 0x5f, 0xa3,
+ 0x20, 0x36, 0xde, 0x68, 0xfa, 0xbe, 0x90, 0x66
+};
+static const u8 key15[] __initconst = {
+ 0xad, 0x62, 0x81, 0x07, 0xe8, 0x35, 0x1d, 0x0f,
+ 0x2c, 0x23, 0x1a, 0x05, 0xdc, 0x4a, 0x41, 0x06,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+};
+
+/* OpenSSL's poly1305_ieee754.c failed this in final stage */
+static const u8 input16[] __initconst = {
+ 0x84, 0x23, 0x64, 0xe1, 0x56, 0x33, 0x6c, 0x09,
+ 0x98, 0xb9, 0x33, 0xa6, 0x23, 0x77, 0x26, 0x18,
+ 0x0d, 0x9e, 0x3f, 0xdc, 0xbd, 0xe4, 0xcd, 0x5d,
+ 0x17, 0x08, 0x0f, 0xc3, 0xbe, 0xb4, 0x96, 0x14,
+ 0xd7, 0x12, 0x2c, 0x03, 0x74, 0x63, 0xff, 0x10,
+ 0x4d, 0x73, 0xf1, 0x9c, 0x12, 0x70, 0x46, 0x28,
+ 0xd4, 0x17, 0xc4, 0xc5, 0x4a, 0x3f, 0xe3, 0x0d,
+ 0x3c, 0x3d, 0x77, 0x14, 0x38, 0x2d, 0x43, 0xb0,
+ 0x38, 0x2a, 0x50, 0xa5, 0xde, 0xe5, 0x4b, 0xe8,
+ 0x44, 0xb0, 0x76, 0xe8, 0xdf, 0x88, 0x20, 0x1a,
+ 0x1c, 0xd4, 0x3b, 0x90, 0xeb, 0x21, 0x64, 0x3f,
+ 0xa9, 0x6f, 0x39, 0xb5, 0x18, 0xaa, 0x83, 0x40,
+ 0xc9, 0x42, 0xff, 0x3c, 0x31, 0xba, 0xf7, 0xc9,
+ 0xbd, 0xbf, 0x0f, 0x31, 0xae, 0x3f, 0xa0, 0x96,
+ 0xbf, 0x8c, 0x63, 0x03, 0x06, 0x09, 0x82, 0x9f,
+ 0xe7, 0x2e, 0x17, 0x98, 0x24, 0x89, 0x0b, 0xc8,
+ 0xe0, 0x8c, 0x31, 0x5c, 0x1c, 0xce, 0x2a, 0x83,
+ 0x14, 0x4d, 0xbb, 0xff, 0x09, 0xf7, 0x4e, 0x3e,
+ 0xfc, 0x77, 0x0b, 0x54, 0xd0, 0x98, 0x4a, 0x8f,
+ 0x19, 0xb1, 0x47, 0x19, 0xe6, 0x36, 0x35, 0x64,
+ 0x1d, 0x6b, 0x1e, 0xed, 0xf6, 0x3e, 0xfb, 0xf0,
+ 0x80, 0xe1, 0x78, 0x3d, 0x32, 0x44, 0x54, 0x12,
+ 0x11, 0x4c, 0x20, 0xde, 0x0b, 0x83, 0x7a, 0x0d,
+ 0xfa, 0x33, 0xd6, 0xb8, 0x28, 0x25, 0xff, 0xf4,
+ 0x4c, 0x9a, 0x70, 0xea, 0x54, 0xce, 0x47, 0xf0,
+ 0x7d, 0xf6, 0x98, 0xe6, 0xb0, 0x33, 0x23, 0xb5,
+ 0x30, 0x79, 0x36, 0x4a, 0x5f, 0xc3, 0xe9, 0xdd,
+ 0x03, 0x43, 0x92, 0xbd, 0xde, 0x86, 0xdc, 0xcd,
+ 0xda, 0x94, 0x32, 0x1c, 0x5e, 0x44, 0x06, 0x04,
+ 0x89, 0x33, 0x6c, 0xb6, 0x5b, 0xf3, 0x98, 0x9c,
+ 0x36, 0xf7, 0x28, 0x2c, 0x2f, 0x5d, 0x2b, 0x88,
+ 0x2c, 0x17, 0x1e, 0x74
+};
+static const u8 output16[] __initconst = {
+ 0xf2, 0x48, 0x31, 0x2e, 0x57, 0x8d, 0x9d, 0x58,
+ 0xf8, 0xb7, 0xbb, 0x4d, 0x19, 0x10, 0x54, 0x31
+};
+static const u8 key16[] __initconst = {
+ 0x95, 0xd5, 0xc0, 0x05, 0x50, 0x3e, 0x51, 0x0d,
+ 0x8c, 0xd0, 0xaa, 0x07, 0x2c, 0x4a, 0x4d, 0x06,
+ 0x6e, 0xab, 0xc5, 0x2d, 0x11, 0x65, 0x3d, 0xf4,
+ 0x7f, 0xbf, 0x63, 0xab, 0x19, 0x8b, 0xcc, 0x26
+};
+
+/* AVX2 in OpenSSL's poly1305-x86.pl failed this with 176+32 split */
+static const u8 input17[] __initconst = {
+ 0x24, 0x8a, 0xc3, 0x10, 0x85, 0xb6, 0xc2, 0xad,
+ 0xaa, 0xa3, 0x82, 0x59, 0xa0, 0xd7, 0x19, 0x2c,
+ 0x5c, 0x35, 0xd1, 0xbb, 0x4e, 0xf3, 0x9a, 0xd9,
+ 0x4c, 0x38, 0xd1, 0xc8, 0x24, 0x79, 0xe2, 0xdd,
+ 0x21, 0x59, 0xa0, 0x77, 0x02, 0x4b, 0x05, 0x89,
+ 0xbc, 0x8a, 0x20, 0x10, 0x1b, 0x50, 0x6f, 0x0a,
+ 0x1a, 0xd0, 0xbb, 0xab, 0x76, 0xe8, 0x3a, 0x83,
+ 0xf1, 0xb9, 0x4b, 0xe6, 0xbe, 0xae, 0x74, 0xe8,
+ 0x74, 0xca, 0xb6, 0x92, 0xc5, 0x96, 0x3a, 0x75,
+ 0x43, 0x6b, 0x77, 0x61, 0x21, 0xec, 0x9f, 0x62,
+ 0x39, 0x9a, 0x3e, 0x66, 0xb2, 0xd2, 0x27, 0x07,
+ 0xda, 0xe8, 0x19, 0x33, 0xb6, 0x27, 0x7f, 0x3c,
+ 0x85, 0x16, 0xbc, 0xbe, 0x26, 0xdb, 0xbd, 0x86,
+ 0xf3, 0x73, 0x10, 0x3d, 0x7c, 0xf4, 0xca, 0xd1,
+ 0x88, 0x8c, 0x95, 0x21, 0x18, 0xfb, 0xfb, 0xd0,
+ 0xd7, 0xb4, 0xbe, 0xdc, 0x4a, 0xe4, 0x93, 0x6a,
+ 0xff, 0x91, 0x15, 0x7e, 0x7a, 0xa4, 0x7c, 0x54,
+ 0x44, 0x2e, 0xa7, 0x8d, 0x6a, 0xc2, 0x51, 0xd3,
+ 0x24, 0xa0, 0xfb, 0xe4, 0x9d, 0x89, 0xcc, 0x35,
+ 0x21, 0xb6, 0x6d, 0x16, 0xe9, 0xc6, 0x6a, 0x37,
+ 0x09, 0x89, 0x4e, 0x4e, 0xb0, 0xa4, 0xee, 0xdc,
+ 0x4a, 0xe1, 0x94, 0x68, 0xe6, 0x6b, 0x81, 0xf2,
+ 0x71, 0x35, 0x1b, 0x1d, 0x92, 0x1e, 0xa5, 0x51,
+ 0x04, 0x7a, 0xbc, 0xc6, 0xb8, 0x7a, 0x90, 0x1f,
+ 0xde, 0x7d, 0xb7, 0x9f, 0xa1, 0x81, 0x8c, 0x11,
+ 0x33, 0x6d, 0xbc, 0x07, 0x24, 0x4a, 0x40, 0xeb
+};
+static const u8 output17[] __initconst = {
+ 0xbc, 0x93, 0x9b, 0xc5, 0x28, 0x14, 0x80, 0xfa,
+ 0x99, 0xc6, 0xd6, 0x8c, 0x25, 0x8e, 0xc4, 0x2f
+};
+static const u8 key17[] __initconst = {
+ 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
+ 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+};
+
+/* test vectors from Google */
+static const u8 input18[] __initconst = { };
+static const u8 output18[] __initconst = {
+ 0x47, 0x10, 0x13, 0x0e, 0x9f, 0x6f, 0xea, 0x8d,
+ 0x72, 0x29, 0x38, 0x50, 0xa6, 0x67, 0xd8, 0x6c
+};
+static const u8 key18[] __initconst = {
+ 0xc8, 0xaf, 0xaa, 0xc3, 0x31, 0xee, 0x37, 0x2c,
+ 0xd6, 0x08, 0x2d, 0xe1, 0x34, 0x94, 0x3b, 0x17,
+ 0x47, 0x10, 0x13, 0x0e, 0x9f, 0x6f, 0xea, 0x8d,
+ 0x72, 0x29, 0x38, 0x50, 0xa6, 0x67, 0xd8, 0x6c
+};
+
+static const u8 input19[] __initconst = {
+ 0x48, 0x65, 0x6c, 0x6c, 0x6f, 0x20, 0x77, 0x6f,
+ 0x72, 0x6c, 0x64, 0x21
+};
+static const u8 output19[] __initconst = {
+ 0xa6, 0xf7, 0x45, 0x00, 0x8f, 0x81, 0xc9, 0x16,
+ 0xa2, 0x0d, 0xcc, 0x74, 0xee, 0xf2, 0xb2, 0xf0
+};
+static const u8 key19[] __initconst = {
+ 0x74, 0x68, 0x69, 0x73, 0x20, 0x69, 0x73, 0x20,
+ 0x33, 0x32, 0x2d, 0x62, 0x79, 0x74, 0x65, 0x20,
+ 0x6b, 0x65, 0x79, 0x20, 0x66, 0x6f, 0x72, 0x20,
+ 0x50, 0x6f, 0x6c, 0x79, 0x31, 0x33, 0x30, 0x35
+};
+
+static const u8 input20[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+};
+static const u8 output20[] __initconst = {
+ 0x49, 0xec, 0x78, 0x09, 0x0e, 0x48, 0x1e, 0xc6,
+ 0xc2, 0x6b, 0x33, 0xb9, 0x1c, 0xcc, 0x03, 0x07
+};
+static const u8 key20[] __initconst = {
+ 0x74, 0x68, 0x69, 0x73, 0x20, 0x69, 0x73, 0x20,
+ 0x33, 0x32, 0x2d, 0x62, 0x79, 0x74, 0x65, 0x20,
+ 0x6b, 0x65, 0x79, 0x20, 0x66, 0x6f, 0x72, 0x20,
+ 0x50, 0x6f, 0x6c, 0x79, 0x31, 0x33, 0x30, 0x35
+};
+
+static const u8 input21[] __initconst = {
+ 0x89, 0xda, 0xb8, 0x0b, 0x77, 0x17, 0xc1, 0xdb,
+ 0x5d, 0xb4, 0x37, 0x86, 0x0a, 0x3f, 0x70, 0x21,
+ 0x8e, 0x93, 0xe1, 0xb8, 0xf4, 0x61, 0xfb, 0x67,
+ 0x7f, 0x16, 0xf3, 0x5f, 0x6f, 0x87, 0xe2, 0xa9,
+ 0x1c, 0x99, 0xbc, 0x3a, 0x47, 0xac, 0xe4, 0x76,
+ 0x40, 0xcc, 0x95, 0xc3, 0x45, 0xbe, 0x5e, 0xcc,
+ 0xa5, 0xa3, 0x52, 0x3c, 0x35, 0xcc, 0x01, 0x89,
+ 0x3a, 0xf0, 0xb6, 0x4a, 0x62, 0x03, 0x34, 0x27,
+ 0x03, 0x72, 0xec, 0x12, 0x48, 0x2d, 0x1b, 0x1e,
+ 0x36, 0x35, 0x61, 0x69, 0x8a, 0x57, 0x8b, 0x35,
+ 0x98, 0x03, 0x49, 0x5b, 0xb4, 0xe2, 0xef, 0x19,
+ 0x30, 0xb1, 0x7a, 0x51, 0x90, 0xb5, 0x80, 0xf1,
+ 0x41, 0x30, 0x0d, 0xf3, 0x0a, 0xdb, 0xec, 0xa2,
+ 0x8f, 0x64, 0x27, 0xa8, 0xbc, 0x1a, 0x99, 0x9f,
+ 0xd5, 0x1c, 0x55, 0x4a, 0x01, 0x7d, 0x09, 0x5d,
+ 0x8c, 0x3e, 0x31, 0x27, 0xda, 0xf9, 0xf5, 0x95
+};
+static const u8 output21[] __initconst = {
+ 0xc8, 0x5d, 0x15, 0xed, 0x44, 0xc3, 0x78, 0xd6,
+ 0xb0, 0x0e, 0x23, 0x06, 0x4c, 0x7b, 0xcd, 0x51
+};
+static const u8 key21[] __initconst = {
+ 0x2d, 0x77, 0x3b, 0xe3, 0x7a, 0xdb, 0x1e, 0x4d,
+ 0x68, 0x3b, 0xf0, 0x07, 0x5e, 0x79, 0xc4, 0xee,
+ 0x03, 0x79, 0x18, 0x53, 0x5a, 0x7f, 0x99, 0xcc,
+ 0xb7, 0x04, 0x0f, 0xb5, 0xf5, 0xf4, 0x3a, 0xea
+};
+
+static const u8 input22[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x0b,
+ 0x17, 0x03, 0x03, 0x02, 0x00, 0x00, 0x00, 0x00,
+ 0x06, 0xdb, 0x1f, 0x1f, 0x36, 0x8d, 0x69, 0x6a,
+ 0x81, 0x0a, 0x34, 0x9c, 0x0c, 0x71, 0x4c, 0x9a,
+ 0x5e, 0x78, 0x50, 0xc2, 0x40, 0x7d, 0x72, 0x1a,
+ 0xcd, 0xed, 0x95, 0xe0, 0x18, 0xd7, 0xa8, 0x52,
+ 0x66, 0xa6, 0xe1, 0x28, 0x9c, 0xdb, 0x4a, 0xeb,
+ 0x18, 0xda, 0x5a, 0xc8, 0xa2, 0xb0, 0x02, 0x6d,
+ 0x24, 0xa5, 0x9a, 0xd4, 0x85, 0x22, 0x7f, 0x3e,
+ 0xae, 0xdb, 0xb2, 0xe7, 0xe3, 0x5e, 0x1c, 0x66,
+ 0xcd, 0x60, 0xf9, 0xab, 0xf7, 0x16, 0xdc, 0xc9,
+ 0xac, 0x42, 0x68, 0x2d, 0xd7, 0xda, 0xb2, 0x87,
+ 0xa7, 0x02, 0x4c, 0x4e, 0xef, 0xc3, 0x21, 0xcc,
+ 0x05, 0x74, 0xe1, 0x67, 0x93, 0xe3, 0x7c, 0xec,
+ 0x03, 0xc5, 0xbd, 0xa4, 0x2b, 0x54, 0xc1, 0x14,
+ 0xa8, 0x0b, 0x57, 0xaf, 0x26, 0x41, 0x6c, 0x7b,
+ 0xe7, 0x42, 0x00, 0x5e, 0x20, 0x85, 0x5c, 0x73,
+ 0xe2, 0x1d, 0xc8, 0xe2, 0xed, 0xc9, 0xd4, 0x35,
+ 0xcb, 0x6f, 0x60, 0x59, 0x28, 0x00, 0x11, 0xc2,
+ 0x70, 0xb7, 0x15, 0x70, 0x05, 0x1c, 0x1c, 0x9b,
+ 0x30, 0x52, 0x12, 0x66, 0x20, 0xbc, 0x1e, 0x27,
+ 0x30, 0xfa, 0x06, 0x6c, 0x7a, 0x50, 0x9d, 0x53,
+ 0xc6, 0x0e, 0x5a, 0xe1, 0xb4, 0x0a, 0xa6, 0xe3,
+ 0x9e, 0x49, 0x66, 0x92, 0x28, 0xc9, 0x0e, 0xec,
+ 0xb4, 0xa5, 0x0d, 0xb3, 0x2a, 0x50, 0xbc, 0x49,
+ 0xe9, 0x0b, 0x4f, 0x4b, 0x35, 0x9a, 0x1d, 0xfd,
+ 0x11, 0x74, 0x9c, 0xd3, 0x86, 0x7f, 0xcf, 0x2f,
+ 0xb7, 0xbb, 0x6c, 0xd4, 0x73, 0x8f, 0x6a, 0x4a,
+ 0xd6, 0xf7, 0xca, 0x50, 0x58, 0xf7, 0x61, 0x88,
+ 0x45, 0xaf, 0x9f, 0x02, 0x0f, 0x6c, 0x3b, 0x96,
+ 0x7b, 0x8f, 0x4c, 0xd4, 0xa9, 0x1e, 0x28, 0x13,
+ 0xb5, 0x07, 0xae, 0x66, 0xf2, 0xd3, 0x5c, 0x18,
+ 0x28, 0x4f, 0x72, 0x92, 0x18, 0x60, 0x62, 0xe1,
+ 0x0f, 0xd5, 0x51, 0x0d, 0x18, 0x77, 0x53, 0x51,
+ 0xef, 0x33, 0x4e, 0x76, 0x34, 0xab, 0x47, 0x43,
+ 0xf5, 0xb6, 0x8f, 0x49, 0xad, 0xca, 0xb3, 0x84,
+ 0xd3, 0xfd, 0x75, 0xf7, 0x39, 0x0f, 0x40, 0x06,
+ 0xef, 0x2a, 0x29, 0x5c, 0x8c, 0x7a, 0x07, 0x6a,
+ 0xd5, 0x45, 0x46, 0xcd, 0x25, 0xd2, 0x10, 0x7f,
+ 0xbe, 0x14, 0x36, 0xc8, 0x40, 0x92, 0x4a, 0xae,
+ 0xbe, 0x5b, 0x37, 0x08, 0x93, 0xcd, 0x63, 0xd1,
+ 0x32, 0x5b, 0x86, 0x16, 0xfc, 0x48, 0x10, 0x88,
+ 0x6b, 0xc1, 0x52, 0xc5, 0x32, 0x21, 0xb6, 0xdf,
+ 0x37, 0x31, 0x19, 0x39, 0x32, 0x55, 0xee, 0x72,
+ 0xbc, 0xaa, 0x88, 0x01, 0x74, 0xf1, 0x71, 0x7f,
+ 0x91, 0x84, 0xfa, 0x91, 0x64, 0x6f, 0x17, 0xa2,
+ 0x4a, 0xc5, 0x5d, 0x16, 0xbf, 0xdd, 0xca, 0x95,
+ 0x81, 0xa9, 0x2e, 0xda, 0x47, 0x92, 0x01, 0xf0,
+ 0xed, 0xbf, 0x63, 0x36, 0x00, 0xd6, 0x06, 0x6d,
+ 0x1a, 0xb3, 0x6d, 0x5d, 0x24, 0x15, 0xd7, 0x13,
+ 0x51, 0xbb, 0xcd, 0x60, 0x8a, 0x25, 0x10, 0x8d,
+ 0x25, 0x64, 0x19, 0x92, 0xc1, 0xf2, 0x6c, 0x53,
+ 0x1c, 0xf9, 0xf9, 0x02, 0x03, 0xbc, 0x4c, 0xc1,
+ 0x9f, 0x59, 0x27, 0xd8, 0x34, 0xb0, 0xa4, 0x71,
+ 0x16, 0xd3, 0x88, 0x4b, 0xbb, 0x16, 0x4b, 0x8e,
+ 0xc8, 0x83, 0xd1, 0xac, 0x83, 0x2e, 0x56, 0xb3,
+ 0x91, 0x8a, 0x98, 0x60, 0x1a, 0x08, 0xd1, 0x71,
+ 0x88, 0x15, 0x41, 0xd5, 0x94, 0xdb, 0x39, 0x9c,
+ 0x6a, 0xe6, 0x15, 0x12, 0x21, 0x74, 0x5a, 0xec,
+ 0x81, 0x4c, 0x45, 0xb0, 0xb0, 0x5b, 0x56, 0x54,
+ 0x36, 0xfd, 0x6f, 0x13, 0x7a, 0xa1, 0x0a, 0x0c,
+ 0x0b, 0x64, 0x37, 0x61, 0xdb, 0xd6, 0xf9, 0xa9,
+ 0xdc, 0xb9, 0x9b, 0x1a, 0x6e, 0x69, 0x08, 0x54,
+ 0xce, 0x07, 0x69, 0xcd, 0xe3, 0x97, 0x61, 0xd8,
+ 0x2f, 0xcd, 0xec, 0x15, 0xf0, 0xd9, 0x2d, 0x7d,
+ 0x8e, 0x94, 0xad, 0xe8, 0xeb, 0x83, 0xfb, 0xe0
+};
+static const u8 output22[] __initconst = {
+ 0x26, 0x37, 0x40, 0x8f, 0xe1, 0x30, 0x86, 0xea,
+ 0x73, 0xf9, 0x71, 0xe3, 0x42, 0x5e, 0x28, 0x20
+};
+static const u8 key22[] __initconst = {
+ 0x99, 0xe5, 0x82, 0x2d, 0xd4, 0x17, 0x3c, 0x99,
+ 0x5e, 0x3d, 0xae, 0x0d, 0xde, 0xfb, 0x97, 0x74,
+ 0x3f, 0xde, 0x3b, 0x08, 0x01, 0x34, 0xb3, 0x9f,
+ 0x76, 0xe9, 0xbf, 0x8d, 0x0e, 0x88, 0xd5, 0x46
+};
+
+/* test vectors from Hanno Böck */
+static const u8 input23[] __initconst = {
+ 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc,
+ 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc,
+ 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc,
+ 0xcc, 0x80, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc,
+ 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc,
+ 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc,
+ 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc,
+ 0xcc, 0xcc, 0xcc, 0xcc, 0xce, 0xcc, 0xcc, 0xcc,
+ 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc,
+ 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xc5,
+ 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc,
+ 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc,
+ 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xe3, 0xcc, 0xcc,
+ 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc,
+ 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc,
+ 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc,
+ 0xcc, 0xcc, 0xcc, 0xcc, 0xac, 0xcc, 0xcc, 0xcc,
+ 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xe6,
+ 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0x00, 0x00, 0x00,
+ 0xaf, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc,
+ 0xcc, 0xcc, 0xff, 0xff, 0xff, 0xf5, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0xff, 0xff, 0xff, 0xe7, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x71, 0x92, 0x05, 0xa8, 0x52, 0x1d,
+ 0xfc
+};
+static const u8 output23[] __initconst = {
+ 0x85, 0x59, 0xb8, 0x76, 0xec, 0xee, 0xd6, 0x6e,
+ 0xb3, 0x77, 0x98, 0xc0, 0x45, 0x7b, 0xaf, 0xf9
+};
+static const u8 key23[] __initconst = {
+ 0x7f, 0x1b, 0x02, 0x64, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc
+};
+
+static const u8 input24[] __initconst = {
+ 0xaa, 0xaa, 0xaa, 0xaa, 0xaa, 0xaa, 0xaa, 0xaa,
+ 0xaa, 0xaa, 0xaa, 0xaa, 0xaa, 0xaa, 0xaa, 0xaa,
+ 0xaa, 0xaa, 0xaa, 0xaa, 0xaa, 0xaa, 0xaa, 0xaa,
+ 0xaa, 0xaa, 0xaa, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x80, 0x02, 0x64
+};
+static const u8 output24[] __initconst = {
+ 0x00, 0xbd, 0x12, 0x58, 0x97, 0x8e, 0x20, 0x54,
+ 0x44, 0xc9, 0xaa, 0xaa, 0x82, 0x00, 0x6f, 0xed
+};
+static const u8 key24[] __initconst = {
+ 0xe0, 0x00, 0x16, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0xaa, 0xaa, 0xaa, 0xaa, 0xaa, 0xaa,
+ 0xaa, 0xaa, 0xaa, 0xaa, 0xaa, 0xaa, 0xaa, 0xaa
+};
+
+static const u8 input25[] __initconst = {
+ 0x02, 0xfc
+};
+static const u8 output25[] __initconst = {
+ 0x06, 0x12, 0x0c, 0x0c, 0x0c, 0x0c, 0x0c, 0x0c,
+ 0x0c, 0x0c, 0x0c, 0x0c, 0x0c, 0x0c, 0x0c, 0x0c
+};
+static const u8 key25[] __initconst = {
+ 0x0c, 0x0c, 0x0c, 0x0c, 0x0c, 0x0c, 0x0c, 0x0c,
+ 0x0c, 0x0c, 0x0c, 0x0c, 0x0c, 0x0c, 0x0c, 0x0c,
+ 0x0c, 0x0c, 0x0c, 0x0c, 0x0c, 0x0c, 0x0c, 0x0c,
+ 0x0c, 0x0c, 0x0c, 0x0c, 0x0c, 0x0c, 0x0c, 0x0c
+};
+
+static const u8 input26[] __initconst = {
+ 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b,
+ 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b,
+ 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b,
+ 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b,
+ 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7a, 0x7b,
+ 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b,
+ 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b,
+ 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b,
+ 0x7b, 0x7b, 0x5c, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b,
+ 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b,
+ 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b,
+ 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b,
+ 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b,
+ 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b,
+ 0x7b, 0x7b, 0x7b, 0x7b, 0x6e, 0x7b, 0x00, 0x7b,
+ 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b,
+ 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b,
+ 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b,
+ 0x7b, 0x7b, 0x7b, 0x7a, 0x7b, 0x7b, 0x7b, 0x7b,
+ 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b,
+ 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b,
+ 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x5c,
+ 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b,
+ 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b,
+ 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b,
+ 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b,
+ 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b,
+ 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b, 0x7b,
+ 0x7b, 0x6e, 0x7b, 0x00, 0x13, 0x00, 0x00, 0x00,
+ 0x00, 0xb3, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0xf2, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x20, 0x00, 0xef, 0xff, 0x00,
+ 0x09, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x10, 0x00, 0x00,
+ 0x00, 0x00, 0x09, 0x00, 0x00, 0x00, 0x64, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x13, 0x00, 0x00, 0x00, 0x00,
+ 0xb3, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xf2,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x20, 0x00, 0xef, 0xff, 0x00, 0x09,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x7a, 0x00, 0x00, 0x10, 0x00, 0x00, 0x00,
+ 0x00, 0x09, 0x00, 0x00, 0x00, 0x64, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xfc
+};
+static const u8 output26[] __initconst = {
+ 0x33, 0x20, 0x5b, 0xbf, 0x9e, 0x9f, 0x8f, 0x72,
+ 0x12, 0xab, 0x9e, 0x2a, 0xb9, 0xb7, 0xe4, 0xa5
+};
+static const u8 key26[] __initconst = {
+ 0x00, 0xff, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x1e, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x7b, 0x7b
+};
+
+static const u8 input27[] __initconst = {
+ 0x77, 0x77, 0x77, 0x77, 0x77, 0x77, 0x77, 0x77,
+ 0x77, 0x77, 0x77, 0x77, 0x77, 0x77, 0x77, 0x77,
+ 0x77, 0x77, 0x77, 0x77, 0x77, 0x77, 0x77, 0x77,
+ 0x77, 0x77, 0x77, 0x77, 0x77, 0x77, 0x77, 0x77,
+ 0x77, 0x77, 0x77, 0x77, 0x77, 0x77, 0x77, 0x77,
+ 0x77, 0x77, 0x77, 0x77, 0x77, 0x77, 0x77, 0x77,
+ 0x77, 0x77, 0x77, 0x77, 0x77, 0x77, 0x77, 0x77,
+ 0x77, 0x77, 0x77, 0x77, 0x77, 0x77, 0x77, 0x77,
+ 0x77, 0x77, 0x77, 0x77, 0x77, 0x77, 0x77, 0x77,
+ 0x77, 0x77, 0x77, 0x77, 0xff, 0xff, 0xff, 0xe9,
+ 0xe9, 0xac, 0xac, 0xac, 0xac, 0xac, 0xac, 0xac,
+ 0xac, 0xac, 0xac, 0xac, 0x00, 0x00, 0xac, 0xac,
+ 0xec, 0x01, 0x00, 0xac, 0xac, 0xac, 0x2c, 0xac,
+ 0xa2, 0xac, 0xac, 0xac, 0xac, 0xac, 0xac, 0xac,
+ 0xac, 0xac, 0xac, 0xac, 0x64, 0xf2
+};
+static const u8 output27[] __initconst = {
+ 0x02, 0xee, 0x7c, 0x8c, 0x54, 0x6d, 0xde, 0xb1,
+ 0xa4, 0x67, 0xe4, 0xc3, 0x98, 0x11, 0x58, 0xb9
+};
+static const u8 key27[] __initconst = {
+ 0x00, 0x00, 0x00, 0x7f, 0x00, 0x00, 0x00, 0x7f,
+ 0x01, 0x00, 0x00, 0x20, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0xcf, 0x77, 0x77, 0x77, 0x77, 0x77,
+ 0x77, 0x77, 0x77, 0x77, 0x77, 0x77, 0x77, 0x77
+};
+
+/* nacl */
+static const u8 input28[] __initconst = {
+ 0x8e, 0x99, 0x3b, 0x9f, 0x48, 0x68, 0x12, 0x73,
+ 0xc2, 0x96, 0x50, 0xba, 0x32, 0xfc, 0x76, 0xce,
+ 0x48, 0x33, 0x2e, 0xa7, 0x16, 0x4d, 0x96, 0xa4,
+ 0x47, 0x6f, 0xb8, 0xc5, 0x31, 0xa1, 0x18, 0x6a,
+ 0xc0, 0xdf, 0xc1, 0x7c, 0x98, 0xdc, 0xe8, 0x7b,
+ 0x4d, 0xa7, 0xf0, 0x11, 0xec, 0x48, 0xc9, 0x72,
+ 0x71, 0xd2, 0xc2, 0x0f, 0x9b, 0x92, 0x8f, 0xe2,
+ 0x27, 0x0d, 0x6f, 0xb8, 0x63, 0xd5, 0x17, 0x38,
+ 0xb4, 0x8e, 0xee, 0xe3, 0x14, 0xa7, 0xcc, 0x8a,
+ 0xb9, 0x32, 0x16, 0x45, 0x48, 0xe5, 0x26, 0xae,
+ 0x90, 0x22, 0x43, 0x68, 0x51, 0x7a, 0xcf, 0xea,
+ 0xbd, 0x6b, 0xb3, 0x73, 0x2b, 0xc0, 0xe9, 0xda,
+ 0x99, 0x83, 0x2b, 0x61, 0xca, 0x01, 0xb6, 0xde,
+ 0x56, 0x24, 0x4a, 0x9e, 0x88, 0xd5, 0xf9, 0xb3,
+ 0x79, 0x73, 0xf6, 0x22, 0xa4, 0x3d, 0x14, 0xa6,
+ 0x59, 0x9b, 0x1f, 0x65, 0x4c, 0xb4, 0x5a, 0x74,
+ 0xe3, 0x55, 0xa5
+};
+static const u8 output28[] __initconst = {
+ 0xf3, 0xff, 0xc7, 0x70, 0x3f, 0x94, 0x00, 0xe5,
+ 0x2a, 0x7d, 0xfb, 0x4b, 0x3d, 0x33, 0x05, 0xd9
+};
+static const u8 key28[] __initconst = {
+ 0xee, 0xa6, 0xa7, 0x25, 0x1c, 0x1e, 0x72, 0x91,
+ 0x6d, 0x11, 0xc2, 0xcb, 0x21, 0x4d, 0x3c, 0x25,
+ 0x25, 0x39, 0x12, 0x1d, 0x8e, 0x23, 0x4e, 0x65,
+ 0x2d, 0x65, 0x1f, 0xa4, 0xc8, 0xcf, 0xf8, 0x80
+};
+
+/* wrap 2^130-5 */
+static const u8 input29[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
+};
+static const u8 output29[] __initconst = {
+ 0x03, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+};
+static const u8 key29[] __initconst = {
+ 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+};
+
+/* wrap 2^128 */
+static const u8 input30[] __initconst = {
+ 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+};
+static const u8 output30[] __initconst = {
+ 0x03, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+};
+static const u8 key30[] __initconst = {
+ 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
+};
+
+/* limb carry */
+static const u8 input31[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xf0, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x11, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+};
+static const u8 output31[] __initconst = {
+ 0x05, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+};
+static const u8 key31[] __initconst = {
+ 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+};
+
+/* 2^130-5 */
+static const u8 input32[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xfb, 0xfe, 0xfe, 0xfe, 0xfe, 0xfe, 0xfe, 0xfe,
+ 0xfe, 0xfe, 0xfe, 0xfe, 0xfe, 0xfe, 0xfe, 0xfe,
+ 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01,
+ 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01
+};
+static const u8 output32[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+};
+static const u8 key32[] __initconst = {
+ 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+};
+
+/* 2^130-6 */
+static const u8 input33[] __initconst = {
+ 0xfd, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
+};
+static const u8 output33[] __initconst = {
+ 0xfa, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
+};
+static const u8 key33[] __initconst = {
+ 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+};
+
+/* 5*H+L reduction intermediate */
+static const u8 input34[] __initconst = {
+ 0xe3, 0x35, 0x94, 0xd7, 0x50, 0x5e, 0x43, 0xb9,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x33, 0x94, 0xd7, 0x50, 0x5e, 0x43, 0x79, 0xcd,
+ 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+};
+static const u8 output34[] __initconst = {
+ 0x14, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x55, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+};
+static const u8 key34[] __initconst = {
+ 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x04, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+};
+
+/* 5*H+L reduction final */
+static const u8 input35[] __initconst = {
+ 0xe3, 0x35, 0x94, 0xd7, 0x50, 0x5e, 0x43, 0xb9,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x33, 0x94, 0xd7, 0x50, 0x5e, 0x43, 0x79, 0xcd,
+ 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+};
+static const u8 output35[] __initconst = {
+ 0x13, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+};
+static const u8 key35[] __initconst = {
+ 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x04, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+};
+
+static const struct poly1305_testvec poly1305_testvecs[] __initconst = {
+ { input01, output01, key01, sizeof(input01) },
+ { input02, output02, key02, sizeof(input02) },
+ { input03, output03, key03, sizeof(input03) },
+ { input04, output04, key04, sizeof(input04) },
+ { input05, output05, key05, sizeof(input05) },
+ { input06, output06, key06, sizeof(input06) },
+ { input07, output07, key07, sizeof(input07) },
+ { input08, output08, key08, sizeof(input08) },
+ { input09, output09, key09, sizeof(input09) },
+ { input10, output10, key10, sizeof(input10) },
+ { input11, output11, key11, sizeof(input11) },
+ { input12, output12, key12, sizeof(input12) },
+ { input13, output13, key13, sizeof(input13) },
+ { input14, output14, key14, sizeof(input14) },
+ { input15, output15, key15, sizeof(input15) },
+ { input16, output16, key16, sizeof(input16) },
+ { input17, output17, key17, sizeof(input17) },
+ { input18, output18, key18, sizeof(input18) },
+ { input19, output19, key19, sizeof(input19) },
+ { input20, output20, key20, sizeof(input20) },
+ { input21, output21, key21, sizeof(input21) },
+ { input22, output22, key22, sizeof(input22) },
+ { input23, output23, key23, sizeof(input23) },
+ { input24, output24, key24, sizeof(input24) },
+ { input25, output25, key25, sizeof(input25) },
+ { input26, output26, key26, sizeof(input26) },
+ { input27, output27, key27, sizeof(input27) },
+ { input28, output28, key28, sizeof(input28) },
+ { input29, output29, key29, sizeof(input29) },
+ { input30, output30, key30, sizeof(input30) },
+ { input31, output31, key31, sizeof(input31) },
+ { input32, output32, key32, sizeof(input32) },
+ { input33, output33, key33, sizeof(input33) },
+ { input34, output34, key34, sizeof(input34) },
+ { input35, output35, key35, sizeof(input35) }
+};
+
+static bool __init poly1305_selftest(void)
+{
+ simd_context_t simd_context;
+ bool success = true;
+ size_t i, j;
+
+ simd_get(&simd_context);
+ for (i = 0; i < ARRAY_SIZE(poly1305_testvecs); ++i) {
+ struct poly1305_ctx poly1305;
+ u8 out[POLY1305_MAC_SIZE];
+
+ memset(out, 0, sizeof(out));
+ memset(&poly1305, 0, sizeof(poly1305));
+ poly1305_init(&poly1305, poly1305_testvecs[i].key);
+ poly1305_update(&poly1305, poly1305_testvecs[i].input,
+ poly1305_testvecs[i].ilen, &simd_context);
+ poly1305_final(&poly1305, out, &simd_context);
+ if (memcmp(out, poly1305_testvecs[i].output,
+ POLY1305_MAC_SIZE)) {
+ pr_err("poly1305 self-test %zu: FAIL\n", i + 1);
+ success = false;
+ }
+ simd_relax(&simd_context);
+
+ if (poly1305_testvecs[i].ilen <= 1)
+ continue;
+
+ for (j = 1; j < poly1305_testvecs[i].ilen - 1; ++j) {
+ memset(out, 0, sizeof(out));
+ memset(&poly1305, 0, sizeof(poly1305));
+ poly1305_init(&poly1305, poly1305_testvecs[i].key);
+ poly1305_update(&poly1305, poly1305_testvecs[i].input,
+ j, &simd_context);
+ poly1305_update(&poly1305,
+ poly1305_testvecs[i].input + j,
+ poly1305_testvecs[i].ilen - j,
+ &simd_context);
+ poly1305_final(&poly1305, out, &simd_context);
+ if (memcmp(out, poly1305_testvecs[i].output,
+ POLY1305_MAC_SIZE)) {
+ pr_err("poly1305 self-test %zu (split %zu): FAIL\n",
+ i + 1, j);
+ success = false;
+ }
+
+ memset(out, 0, sizeof(out));
+ memset(&poly1305, 0, sizeof(poly1305));
+ poly1305_init(&poly1305, poly1305_testvecs[i].key);
+ poly1305_update(&poly1305, poly1305_testvecs[i].input,
+ j, &simd_context);
+ poly1305_update(&poly1305,
+ poly1305_testvecs[i].input + j,
+ poly1305_testvecs[i].ilen - j,
+ DONT_USE_SIMD);
+ poly1305_final(&poly1305, out, &simd_context);
+ if (memcmp(out, poly1305_testvecs[i].output,
+ POLY1305_MAC_SIZE)) {
+ pr_err("poly1305 self-test %zu (split %zu, mixed simd): FAIL\n",
+ i + 1, j);
+ success = false;
+ }
+ simd_relax(&simd_context);
+ }
+ }
+ simd_put(&simd_context);
+
+ return success;
+}
--
2.19.0
* [PATCH net-next v7 12/28] zinc: import Andy Polyakov's Poly1305 x86_64 implementation
[not found] <20181006025709.4019-1-Jason@zx2c4.com>
` (8 preceding siblings ...)
2018-10-06 2:56 ` [PATCH net-next v7 11/28] zinc: Poly1305 generic C implementations and selftest Jason A. Donenfeld
@ 2018-10-06 2:56 ` Jason A. Donenfeld
2018-10-06 2:56 ` [PATCH net-next v7 13/28] zinc: " Jason A. Donenfeld
` (13 subsequent siblings)
23 siblings, 0 replies; 28+ messages in thread
From: Jason A. Donenfeld @ 2018-10-06 2:56 UTC (permalink / raw)
To: linux-kernel, netdev, davem, gregkh
Cc: Jason A. Donenfeld, Andy Polyakov, Thomas Gleixner, Ingo Molnar,
x86, Samuel Neves, Jean-Philippe Aumasson, Andy Lutomirski,
Andrew Morton, Linus Torvalds, kernel-hardening, linux-crypto
These x86_64 vectorized implementations come from Andy Polyakov's
CRYPTOGAMS code and are included here in raw form, without modification,
so that subsequent commits that fix them up for the kernel show exactly
what has changed.
While this is CRYPTOGAMS code, the originating code happens to be the
same as that in OpenSSL's commit 4dfe4310c31c4483705991d9a798ce9be1ed1c68.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Based-on-code-from: Andy Polyakov <appro@openssl.org>
Cc: Andy Polyakov <appro@openssl.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: x86@kernel.org
Cc: Samuel Neves <sneves@dei.uc.pt>
Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: kernel-hardening@lists.openwall.com
Cc: linux-crypto@vger.kernel.org
---
.../poly1305/poly1305-x86_64-cryptogams.S | 3565 +++++++++++++++++
1 file changed, 3565 insertions(+)
create mode 100644 lib/zinc/poly1305/poly1305-x86_64-cryptogams.S
diff --git a/lib/zinc/poly1305/poly1305-x86_64-cryptogams.S b/lib/zinc/poly1305/poly1305-x86_64-cryptogams.S
new file mode 100644
index 000000000000..ed634757354b
--- /dev/null
+++ b/lib/zinc/poly1305/poly1305-x86_64-cryptogams.S
@@ -0,0 +1,3565 @@
+/* SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause */
+/*
+ * Copyright (C) 2006-2017 CRYPTOGAMS by <appro@openssl.org>. All Rights Reserved.
+ */
+
+.text
+
+
+
+.globl poly1305_init
+.hidden poly1305_init
+.globl poly1305_blocks
+.hidden poly1305_blocks
+.globl poly1305_emit
+.hidden poly1305_emit
+
+.type poly1305_init,@function
+.align 32
+poly1305_init:
+ xorq %rax,%rax
+ movq %rax,0(%rdi)
+ movq %rax,8(%rdi)
+ movq %rax,16(%rdi)
+
+ cmpq $0,%rsi
+ je .Lno_key
+
+ leaq poly1305_blocks(%rip),%r10
+ leaq poly1305_emit(%rip),%r11
+ movq OPENSSL_ia32cap_P+4(%rip),%r9
+ leaq poly1305_blocks_avx(%rip),%rax
+ leaq poly1305_emit_avx(%rip),%rcx
+ btq $28,%r9
+ cmovcq %rax,%r10
+ cmovcq %rcx,%r11
+ leaq poly1305_blocks_avx2(%rip),%rax
+ btq $37,%r9
+ cmovcq %rax,%r10
+ movq $2149646336,%rax
+ shrq $32,%r9
+ andq %rax,%r9
+ cmpq %rax,%r9
+ je .Linit_base2_44
+ movq $0x0ffffffc0fffffff,%rax
+ movq $0x0ffffffc0ffffffc,%rcx
+ andq 0(%rsi),%rax
+ andq 8(%rsi),%rcx
+ movq %rax,24(%rdi)
+ movq %rcx,32(%rdi)
+ movq %r10,0(%rdx)
+ movq %r11,8(%rdx)
+ movl $1,%eax
+.Lno_key:
+ .byte 0xf3,0xc3
+.size poly1305_init,.-poly1305_init
+
+.type poly1305_blocks,@function
+.align 32
+poly1305_blocks:
+.cfi_startproc
+.Lblocks:
+ shrq $4,%rdx
+ jz .Lno_data
+
+ pushq %rbx
+.cfi_adjust_cfa_offset 8
+.cfi_offset %rbx,-16
+ pushq %rbp
+.cfi_adjust_cfa_offset 8
+.cfi_offset %rbp,-24
+ pushq %r12
+.cfi_adjust_cfa_offset 8
+.cfi_offset %r12,-32
+ pushq %r13
+.cfi_adjust_cfa_offset 8
+.cfi_offset %r13,-40
+ pushq %r14
+.cfi_adjust_cfa_offset 8
+.cfi_offset %r14,-48
+ pushq %r15
+.cfi_adjust_cfa_offset 8
+.cfi_offset %r15,-56
+.Lblocks_body:
+
+ movq %rdx,%r15
+
+ movq 24(%rdi),%r11
+ movq 32(%rdi),%r13
+
+ movq 0(%rdi),%r14
+ movq 8(%rdi),%rbx
+ movq 16(%rdi),%rbp
+
+ movq %r13,%r12
+ shrq $2,%r13
+ movq %r12,%rax
+ addq %r12,%r13
+ jmp .Loop
+
+.align 32
+.Loop:
+ addq 0(%rsi),%r14
+ adcq 8(%rsi),%rbx
+ leaq 16(%rsi),%rsi
+ adcq %rcx,%rbp
+ mulq %r14
+ movq %rax,%r9
+ movq %r11,%rax
+ movq %rdx,%r10
+
+ mulq %r14
+ movq %rax,%r14
+ movq %r11,%rax
+ movq %rdx,%r8
+
+ mulq %rbx
+ addq %rax,%r9
+ movq %r13,%rax
+ adcq %rdx,%r10
+
+ mulq %rbx
+ movq %rbp,%rbx
+ addq %rax,%r14
+ adcq %rdx,%r8
+
+ imulq %r13,%rbx
+ addq %rbx,%r9
+ movq %r8,%rbx
+ adcq $0,%r10
+
+ imulq %r11,%rbp
+ addq %r9,%rbx
+ movq $-4,%rax
+ adcq %rbp,%r10
+
+ andq %r10,%rax
+ movq %r10,%rbp
+ shrq $2,%r10
+ andq $3,%rbp
+ addq %r10,%rax
+ addq %rax,%r14
+ adcq $0,%rbx
+ adcq $0,%rbp
+ movq %r12,%rax
+ decq %r15
+ jnz .Loop
+
+ movq %r14,0(%rdi)
+ movq %rbx,8(%rdi)
+ movq %rbp,16(%rdi)
+
+ movq 0(%rsp),%r15
+.cfi_restore %r15
+ movq 8(%rsp),%r14
+.cfi_restore %r14
+ movq 16(%rsp),%r13
+.cfi_restore %r13
+ movq 24(%rsp),%r12
+.cfi_restore %r12
+ movq 32(%rsp),%rbp
+.cfi_restore %rbp
+ movq 40(%rsp),%rbx
+.cfi_restore %rbx
+ leaq 48(%rsp),%rsp
+.cfi_adjust_cfa_offset -48
+.Lno_data:
+.Lblocks_epilogue:
+ .byte 0xf3,0xc3
+.cfi_endproc
+.size poly1305_blocks,.-poly1305_blocks
+
+.type poly1305_emit,@function
+.align 32
+poly1305_emit:
+.Lemit:
+ movq 0(%rdi),%r8
+ movq 8(%rdi),%r9
+ movq 16(%rdi),%r10
+
+ movq %r8,%rax
+ addq $5,%r8
+ movq %r9,%rcx
+ adcq $0,%r9
+ adcq $0,%r10
+ shrq $2,%r10
+ cmovnzq %r8,%rax
+ cmovnzq %r9,%rcx
+
+ addq 0(%rdx),%rax
+ adcq 8(%rdx),%rcx
+ movq %rax,0(%rsi)
+ movq %rcx,8(%rsi)
+
+ .byte 0xf3,0xc3
+.size poly1305_emit,.-poly1305_emit
+.type __poly1305_block,@function
+.align 32
+__poly1305_block:
+ mulq %r14
+ movq %rax,%r9
+ movq %r11,%rax
+ movq %rdx,%r10
+
+ mulq %r14
+ movq %rax,%r14
+ movq %r11,%rax
+ movq %rdx,%r8
+
+ mulq %rbx
+ addq %rax,%r9
+ movq %r13,%rax
+ adcq %rdx,%r10
+
+ mulq %rbx
+ movq %rbp,%rbx
+ addq %rax,%r14
+ adcq %rdx,%r8
+
+ imulq %r13,%rbx
+ addq %rbx,%r9
+ movq %r8,%rbx
+ adcq $0,%r10
+
+ imulq %r11,%rbp
+ addq %r9,%rbx
+ movq $-4,%rax
+ adcq %rbp,%r10
+
+ andq %r10,%rax
+ movq %r10,%rbp
+ shrq $2,%r10
+ andq $3,%rbp
+ addq %r10,%rax
+ addq %rax,%r14
+ adcq $0,%rbx
+ adcq $0,%rbp
+ .byte 0xf3,0xc3
+.size __poly1305_block,.-__poly1305_block
+
+.type __poly1305_init_avx,@function
+.align 32
+__poly1305_init_avx:
+ movq %r11,%r14
+ movq %r12,%rbx
+ xorq %rbp,%rbp
+
+ leaq 48+64(%rdi),%rdi
+
+ movq %r12,%rax
+ call __poly1305_block
+
+ movl $0x3ffffff,%eax
+ movl $0x3ffffff,%edx
+ movq %r14,%r8
+ andl %r14d,%eax
+ movq %r11,%r9
+ andl %r11d,%edx
+ movl %eax,-64(%rdi)
+ shrq $26,%r8
+ movl %edx,-60(%rdi)
+ shrq $26,%r9
+
+ movl $0x3ffffff,%eax
+ movl $0x3ffffff,%edx
+ andl %r8d,%eax
+ andl %r9d,%edx
+ movl %eax,-48(%rdi)
+ leal (%rax,%rax,4),%eax
+ movl %edx,-44(%rdi)
+ leal (%rdx,%rdx,4),%edx
+ movl %eax,-32(%rdi)
+ shrq $26,%r8
+ movl %edx,-28(%rdi)
+ shrq $26,%r9
+
+ movq %rbx,%rax
+ movq %r12,%rdx
+ shlq $12,%rax
+ shlq $12,%rdx
+ orq %r8,%rax
+ orq %r9,%rdx
+ andl $0x3ffffff,%eax
+ andl $0x3ffffff,%edx
+ movl %eax,-16(%rdi)
+ leal (%rax,%rax,4),%eax
+ movl %edx,-12(%rdi)
+ leal (%rdx,%rdx,4),%edx
+ movl %eax,0(%rdi)
+ movq %rbx,%r8
+ movl %edx,4(%rdi)
+ movq %r12,%r9
+
+ movl $0x3ffffff,%eax
+ movl $0x3ffffff,%edx
+ shrq $14,%r8
+ shrq $14,%r9
+ andl %r8d,%eax
+ andl %r9d,%edx
+ movl %eax,16(%rdi)
+ leal (%rax,%rax,4),%eax
+ movl %edx,20(%rdi)
+ leal (%rdx,%rdx,4),%edx
+ movl %eax,32(%rdi)
+ shrq $26,%r8
+ movl %edx,36(%rdi)
+ shrq $26,%r9
+
+ movq %rbp,%rax
+ shlq $24,%rax
+ orq %rax,%r8
+ movl %r8d,48(%rdi)
+ leaq (%r8,%r8,4),%r8
+ movl %r9d,52(%rdi)
+ leaq (%r9,%r9,4),%r9
+ movl %r8d,64(%rdi)
+ movl %r9d,68(%rdi)
+
+ movq %r12,%rax
+ call __poly1305_block
+
+ movl $0x3ffffff,%eax
+ movq %r14,%r8
+ andl %r14d,%eax
+ shrq $26,%r8
+ movl %eax,-52(%rdi)
+
+ movl $0x3ffffff,%edx
+ andl %r8d,%edx
+ movl %edx,-36(%rdi)
+ leal (%rdx,%rdx,4),%edx
+ shrq $26,%r8
+ movl %edx,-20(%rdi)
+
+ movq %rbx,%rax
+ shlq $12,%rax
+ orq %r8,%rax
+ andl $0x3ffffff,%eax
+ movl %eax,-4(%rdi)
+ leal (%rax,%rax,4),%eax
+ movq %rbx,%r8
+ movl %eax,12(%rdi)
+
+ movl $0x3ffffff,%edx
+ shrq $14,%r8
+ andl %r8d,%edx
+ movl %edx,28(%rdi)
+ leal (%rdx,%rdx,4),%edx
+ shrq $26,%r8
+ movl %edx,44(%rdi)
+
+ movq %rbp,%rax
+ shlq $24,%rax
+ orq %rax,%r8
+ movl %r8d,60(%rdi)
+ leaq (%r8,%r8,4),%r8
+ movl %r8d,76(%rdi)
+
+ movq %r12,%rax
+ call __poly1305_block
+
+ movl $0x3ffffff,%eax
+ movq %r14,%r8
+ andl %r14d,%eax
+ shrq $26,%r8
+ movl %eax,-56(%rdi)
+
+ movl $0x3ffffff,%edx
+ andl %r8d,%edx
+ movl %edx,-40(%rdi)
+ leal (%rdx,%rdx,4),%edx
+ shrq $26,%r8
+ movl %edx,-24(%rdi)
+
+ movq %rbx,%rax
+ shlq $12,%rax
+ orq %r8,%rax
+ andl $0x3ffffff,%eax
+ movl %eax,-8(%rdi)
+ leal (%rax,%rax,4),%eax
+ movq %rbx,%r8
+ movl %eax,8(%rdi)
+
+ movl $0x3ffffff,%edx
+ shrq $14,%r8
+ andl %r8d,%edx
+ movl %edx,24(%rdi)
+ leal (%rdx,%rdx,4),%edx
+ shrq $26,%r8
+ movl %edx,40(%rdi)
+
+ movq %rbp,%rax
+ shlq $24,%rax
+ orq %rax,%r8
+ movl %r8d,56(%rdi)
+ leaq (%r8,%r8,4),%r8
+ movl %r8d,72(%rdi)
+
+ leaq -48-64(%rdi),%rdi
+ .byte 0xf3,0xc3
+.size __poly1305_init_avx,.-__poly1305_init_avx
+
+.type poly1305_blocks_avx,@function
+.align 32
+poly1305_blocks_avx:
+.cfi_startproc
+ movl 20(%rdi),%r8d
+ cmpq $128,%rdx
+ jae .Lblocks_avx
+ testl %r8d,%r8d
+ jz .Lblocks
+
+.Lblocks_avx:
+ andq $-16,%rdx
+ jz .Lno_data_avx
+
+ vzeroupper
+
+ testl %r8d,%r8d
+ jz .Lbase2_64_avx
+
+ testq $31,%rdx
+ jz .Leven_avx
+
+ pushq %rbx
+.cfi_adjust_cfa_offset 8
+.cfi_offset %rbx,-16
+ pushq %rbp
+.cfi_adjust_cfa_offset 8
+.cfi_offset %rbp,-24
+ pushq %r12
+.cfi_adjust_cfa_offset 8
+.cfi_offset %r12,-32
+ pushq %r13
+.cfi_adjust_cfa_offset 8
+.cfi_offset %r13,-40
+ pushq %r14
+.cfi_adjust_cfa_offset 8
+.cfi_offset %r14,-48
+ pushq %r15
+.cfi_adjust_cfa_offset 8
+.cfi_offset %r15,-56
+.Lblocks_avx_body:
+
+ movq %rdx,%r15
+
+ movq 0(%rdi),%r8
+ movq 8(%rdi),%r9
+ movl 16(%rdi),%ebp
+
+ movq 24(%rdi),%r11
+ movq 32(%rdi),%r13
+
+
+ movl %r8d,%r14d
+ andq $-2147483648,%r8
+ movq %r9,%r12
+ movl %r9d,%ebx
+ andq $-2147483648,%r9
+
+ shrq $6,%r8
+ shlq $52,%r12
+ addq %r8,%r14
+ shrq $12,%rbx
+ shrq $18,%r9
+ addq %r12,%r14
+ adcq %r9,%rbx
+
+ movq %rbp,%r8
+ shlq $40,%r8
+ shrq $24,%rbp
+ addq %r8,%rbx
+ adcq $0,%rbp
+
+ movq $-4,%r9
+ movq %rbp,%r8
+ andq %rbp,%r9
+ shrq $2,%r8
+ andq $3,%rbp
+ addq %r9,%r8
+ addq %r8,%r14
+ adcq $0,%rbx
+ adcq $0,%rbp
+
+ movq %r13,%r12
+ movq %r13,%rax
+ shrq $2,%r13
+ addq %r12,%r13
+
+ addq 0(%rsi),%r14
+ adcq 8(%rsi),%rbx
+ leaq 16(%rsi),%rsi
+ adcq %rcx,%rbp
+
+ call __poly1305_block
+
+ testq %rcx,%rcx
+ jz .Lstore_base2_64_avx
+
+
+ movq %r14,%rax
+ movq %r14,%rdx
+ shrq $52,%r14
+ movq %rbx,%r11
+ movq %rbx,%r12
+ shrq $26,%rdx
+ andq $0x3ffffff,%rax
+ shlq $12,%r11
+ andq $0x3ffffff,%rdx
+ shrq $14,%rbx
+ orq %r11,%r14
+ shlq $24,%rbp
+ andq $0x3ffffff,%r14
+ shrq $40,%r12
+ andq $0x3ffffff,%rbx
+ orq %r12,%rbp
+
+ subq $16,%r15
+ jz .Lstore_base2_26_avx
+
+ vmovd %eax,%xmm0
+ vmovd %edx,%xmm1
+ vmovd %r14d,%xmm2
+ vmovd %ebx,%xmm3
+ vmovd %ebp,%xmm4
+ jmp .Lproceed_avx
+
+.align 32
+.Lstore_base2_64_avx:
+ movq %r14,0(%rdi)
+ movq %rbx,8(%rdi)
+ movq %rbp,16(%rdi)
+ jmp .Ldone_avx
+
+.align 16
+.Lstore_base2_26_avx:
+ movl %eax,0(%rdi)
+ movl %edx,4(%rdi)
+ movl %r14d,8(%rdi)
+ movl %ebx,12(%rdi)
+ movl %ebp,16(%rdi)
+.align 16
+.Ldone_avx:
+ movq 0(%rsp),%r15
+.cfi_restore %r15
+ movq 8(%rsp),%r14
+.cfi_restore %r14
+ movq 16(%rsp),%r13
+.cfi_restore %r13
+ movq 24(%rsp),%r12
+.cfi_restore %r12
+ movq 32(%rsp),%rbp
+.cfi_restore %rbp
+ movq 40(%rsp),%rbx
+.cfi_restore %rbx
+ leaq 48(%rsp),%rsp
+.cfi_adjust_cfa_offset -48
+.Lno_data_avx:
+.Lblocks_avx_epilogue:
+ .byte 0xf3,0xc3
+.cfi_endproc
+
+.align 32
+.Lbase2_64_avx:
+.cfi_startproc
+ pushq %rbx
+.cfi_adjust_cfa_offset 8
+.cfi_offset %rbx,-16
+ pushq %rbp
+.cfi_adjust_cfa_offset 8
+.cfi_offset %rbp,-24
+ pushq %r12
+.cfi_adjust_cfa_offset 8
+.cfi_offset %r12,-32
+ pushq %r13
+.cfi_adjust_cfa_offset 8
+.cfi_offset %r13,-40
+ pushq %r14
+.cfi_adjust_cfa_offset 8
+.cfi_offset %r14,-48
+ pushq %r15
+.cfi_adjust_cfa_offset 8
+.cfi_offset %r15,-56
+.Lbase2_64_avx_body:
+
+ movq %rdx,%r15
+
+ movq 24(%rdi),%r11
+ movq 32(%rdi),%r13
+
+ movq 0(%rdi),%r14
+ movq 8(%rdi),%rbx
+ movl 16(%rdi),%ebp
+
+ movq %r13,%r12
+ movq %r13,%rax
+ shrq $2,%r13
+ addq %r12,%r13
+
+ testq $31,%rdx
+ jz .Linit_avx
+
+ addq 0(%rsi),%r14
+ adcq 8(%rsi),%rbx
+ leaq 16(%rsi),%rsi
+ adcq %rcx,%rbp
+ subq $16,%r15
+
+ call __poly1305_block
+
+.Linit_avx:
+
+ movq %r14,%rax
+ movq %r14,%rdx
+ shrq $52,%r14
+ movq %rbx,%r8
+ movq %rbx,%r9
+ shrq $26,%rdx
+ andq $0x3ffffff,%rax
+ shlq $12,%r8
+ andq $0x3ffffff,%rdx
+ shrq $14,%rbx
+ orq %r8,%r14
+ shlq $24,%rbp
+ andq $0x3ffffff,%r14
+ shrq $40,%r9
+ andq $0x3ffffff,%rbx
+ orq %r9,%rbp
+
+ vmovd %eax,%xmm0
+ vmovd %edx,%xmm1
+ vmovd %r14d,%xmm2
+ vmovd %ebx,%xmm3
+ vmovd %ebp,%xmm4
+ movl $1,20(%rdi)
+
+ call __poly1305_init_avx
+
+.Lproceed_avx:
+ movq %r15,%rdx
+
+ movq 0(%rsp),%r15
+.cfi_restore %r15
+ movq 8(%rsp),%r14
+.cfi_restore %r14
+ movq 16(%rsp),%r13
+.cfi_restore %r13
+ movq 24(%rsp),%r12
+.cfi_restore %r12
+ movq 32(%rsp),%rbp
+.cfi_restore %rbp
+ movq 40(%rsp),%rbx
+.cfi_restore %rbx
+ leaq 48(%rsp),%rax
+ leaq 48(%rsp),%rsp
+.cfi_adjust_cfa_offset -48
+.Lbase2_64_avx_epilogue:
+ jmp .Ldo_avx
+.cfi_endproc
+
+.align 32
+.Leven_avx:
+.cfi_startproc
+ vmovd 0(%rdi),%xmm0
+ vmovd 4(%rdi),%xmm1
+ vmovd 8(%rdi),%xmm2
+ vmovd 12(%rdi),%xmm3
+ vmovd 16(%rdi),%xmm4
+
+.Ldo_avx:
+ leaq -88(%rsp),%r11
+.cfi_def_cfa %r11,0x60
+ subq $0x178,%rsp
+ subq $64,%rdx
+ leaq -32(%rsi),%rax
+ cmovcq %rax,%rsi
+
+ vmovdqu 48(%rdi),%xmm14
+ leaq 112(%rdi),%rdi
+ leaq .Lconst(%rip),%rcx
+
+
+
+ vmovdqu 32(%rsi),%xmm5
+ vmovdqu 48(%rsi),%xmm6
+ vmovdqa 64(%rcx),%xmm15
+
+ vpsrldq $6,%xmm5,%xmm7
+ vpsrldq $6,%xmm6,%xmm8
+ vpunpckhqdq %xmm6,%xmm5,%xmm9
+ vpunpcklqdq %xmm6,%xmm5,%xmm5
+ vpunpcklqdq %xmm8,%xmm7,%xmm8
+
+ vpsrlq $40,%xmm9,%xmm9
+ vpsrlq $26,%xmm5,%xmm6
+ vpand %xmm15,%xmm5,%xmm5
+ vpsrlq $4,%xmm8,%xmm7
+ vpand %xmm15,%xmm6,%xmm6
+ vpsrlq $30,%xmm8,%xmm8
+ vpand %xmm15,%xmm7,%xmm7
+ vpand %xmm15,%xmm8,%xmm8
+ vpor 32(%rcx),%xmm9,%xmm9
+
+ jbe .Lskip_loop_avx
+
+
+ vmovdqu -48(%rdi),%xmm11
+ vmovdqu -32(%rdi),%xmm12
+ vpshufd $0xEE,%xmm14,%xmm13
+ vpshufd $0x44,%xmm14,%xmm10
+ vmovdqa %xmm13,-144(%r11)
+ vmovdqa %xmm10,0(%rsp)
+ vpshufd $0xEE,%xmm11,%xmm14
+ vmovdqu -16(%rdi),%xmm10
+ vpshufd $0x44,%xmm11,%xmm11
+ vmovdqa %xmm14,-128(%r11)
+ vmovdqa %xmm11,16(%rsp)
+ vpshufd $0xEE,%xmm12,%xmm13
+ vmovdqu 0(%rdi),%xmm11
+ vpshufd $0x44,%xmm12,%xmm12
+ vmovdqa %xmm13,-112(%r11)
+ vmovdqa %xmm12,32(%rsp)
+ vpshufd $0xEE,%xmm10,%xmm14
+ vmovdqu 16(%rdi),%xmm12
+ vpshufd $0x44,%xmm10,%xmm10
+ vmovdqa %xmm14,-96(%r11)
+ vmovdqa %xmm10,48(%rsp)
+ vpshufd $0xEE,%xmm11,%xmm13
+ vmovdqu 32(%rdi),%xmm10
+ vpshufd $0x44,%xmm11,%xmm11
+ vmovdqa %xmm13,-80(%r11)
+ vmovdqa %xmm11,64(%rsp)
+ vpshufd $0xEE,%xmm12,%xmm14
+ vmovdqu 48(%rdi),%xmm11
+ vpshufd $0x44,%xmm12,%xmm12
+ vmovdqa %xmm14,-64(%r11)
+ vmovdqa %xmm12,80(%rsp)
+ vpshufd $0xEE,%xmm10,%xmm13
+ vmovdqu 64(%rdi),%xmm12
+ vpshufd $0x44,%xmm10,%xmm10
+ vmovdqa %xmm13,-48(%r11)
+ vmovdqa %xmm10,96(%rsp)
+ vpshufd $0xEE,%xmm11,%xmm14
+ vpshufd $0x44,%xmm11,%xmm11
+ vmovdqa %xmm14,-32(%r11)
+ vmovdqa %xmm11,112(%rsp)
+ vpshufd $0xEE,%xmm12,%xmm13
+ vmovdqa 0(%rsp),%xmm14
+ vpshufd $0x44,%xmm12,%xmm12
+ vmovdqa %xmm13,-16(%r11)
+ vmovdqa %xmm12,128(%rsp)
+
+ jmp .Loop_avx
+
+.align 32
+.Loop_avx:
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ vpmuludq %xmm5,%xmm14,%xmm10
+ vpmuludq %xmm6,%xmm14,%xmm11
+ vmovdqa %xmm2,32(%r11)
+ vpmuludq %xmm7,%xmm14,%xmm12
+ vmovdqa 16(%rsp),%xmm2
+ vpmuludq %xmm8,%xmm14,%xmm13
+ vpmuludq %xmm9,%xmm14,%xmm14
+
+ vmovdqa %xmm0,0(%r11)
+ vpmuludq 32(%rsp),%xmm9,%xmm0
+ vmovdqa %xmm1,16(%r11)
+ vpmuludq %xmm8,%xmm2,%xmm1
+ vpaddq %xmm0,%xmm10,%xmm10
+ vpaddq %xmm1,%xmm14,%xmm14
+ vmovdqa %xmm3,48(%r11)
+ vpmuludq %xmm7,%xmm2,%xmm0
+ vpmuludq %xmm6,%xmm2,%xmm1
+ vpaddq %xmm0,%xmm13,%xmm13
+ vmovdqa 48(%rsp),%xmm3
+ vpaddq %xmm1,%xmm12,%xmm12
+ vmovdqa %xmm4,64(%r11)
+ vpmuludq %xmm5,%xmm2,%xmm2
+ vpmuludq %xmm7,%xmm3,%xmm0
+ vpaddq %xmm2,%xmm11,%xmm11
+
+ vmovdqa 64(%rsp),%xmm4
+ vpaddq %xmm0,%xmm14,%xmm14
+ vpmuludq %xmm6,%xmm3,%xmm1
+ vpmuludq %xmm5,%xmm3,%xmm3
+ vpaddq %xmm1,%xmm13,%xmm13
+ vmovdqa 80(%rsp),%xmm2
+ vpaddq %xmm3,%xmm12,%xmm12
+ vpmuludq %xmm9,%xmm4,%xmm0
+ vpmuludq %xmm8,%xmm4,%xmm4
+ vpaddq %xmm0,%xmm11,%xmm11
+ vmovdqa 96(%rsp),%xmm3
+ vpaddq %xmm4,%xmm10,%xmm10
+
+ vmovdqa 128(%rsp),%xmm4
+ vpmuludq %xmm6,%xmm2,%xmm1
+ vpmuludq %xmm5,%xmm2,%xmm2
+ vpaddq %xmm1,%xmm14,%xmm14
+ vpaddq %xmm2,%xmm13,%xmm13
+ vpmuludq %xmm9,%xmm3,%xmm0
+ vpmuludq %xmm8,%xmm3,%xmm1
+ vpaddq %xmm0,%xmm12,%xmm12
+ vmovdqu 0(%rsi),%xmm0
+ vpaddq %xmm1,%xmm11,%xmm11
+ vpmuludq %xmm7,%xmm3,%xmm3
+ vpmuludq %xmm7,%xmm4,%xmm7
+ vpaddq %xmm3,%xmm10,%xmm10
+
+ vmovdqu 16(%rsi),%xmm1
+ vpaddq %xmm7,%xmm11,%xmm11
+ vpmuludq %xmm8,%xmm4,%xmm8
+ vpmuludq %xmm9,%xmm4,%xmm9
+ vpsrldq $6,%xmm0,%xmm2
+ vpaddq %xmm8,%xmm12,%xmm12
+ vpaddq %xmm9,%xmm13,%xmm13
+ vpsrldq $6,%xmm1,%xmm3
+ vpmuludq 112(%rsp),%xmm5,%xmm9
+ vpmuludq %xmm6,%xmm4,%xmm5
+ vpunpckhqdq %xmm1,%xmm0,%xmm4
+ vpaddq %xmm9,%xmm14,%xmm14
+ vmovdqa -144(%r11),%xmm9
+ vpaddq %xmm5,%xmm10,%xmm10
+
+ vpunpcklqdq %xmm1,%xmm0,%xmm0
+ vpunpcklqdq %xmm3,%xmm2,%xmm3
+
+
+ vpsrldq $5,%xmm4,%xmm4
+ vpsrlq $26,%xmm0,%xmm1
+ vpand %xmm15,%xmm0,%xmm0
+ vpsrlq $4,%xmm3,%xmm2
+ vpand %xmm15,%xmm1,%xmm1
+ vpand 0(%rcx),%xmm4,%xmm4
+ vpsrlq $30,%xmm3,%xmm3
+ vpand %xmm15,%xmm2,%xmm2
+ vpand %xmm15,%xmm3,%xmm3
+ vpor 32(%rcx),%xmm4,%xmm4
+
+ vpaddq 0(%r11),%xmm0,%xmm0
+ vpaddq 16(%r11),%xmm1,%xmm1
+ vpaddq 32(%r11),%xmm2,%xmm2
+ vpaddq 48(%r11),%xmm3,%xmm3
+ vpaddq 64(%r11),%xmm4,%xmm4
+
+ leaq 32(%rsi),%rax
+ leaq 64(%rsi),%rsi
+ subq $64,%rdx
+ cmovcq %rax,%rsi
+
+
+
+
+
+
+
+
+
+
+ vpmuludq %xmm0,%xmm9,%xmm5
+ vpmuludq %xmm1,%xmm9,%xmm6
+ vpaddq %xmm5,%xmm10,%xmm10
+ vpaddq %xmm6,%xmm11,%xmm11
+ vmovdqa -128(%r11),%xmm7
+ vpmuludq %xmm2,%xmm9,%xmm5
+ vpmuludq %xmm3,%xmm9,%xmm6
+ vpaddq %xmm5,%xmm12,%xmm12
+ vpaddq %xmm6,%xmm13,%xmm13
+ vpmuludq %xmm4,%xmm9,%xmm9
+ vpmuludq -112(%r11),%xmm4,%xmm5
+ vpaddq %xmm9,%xmm14,%xmm14
+
+ vpaddq %xmm5,%xmm10,%xmm10
+ vpmuludq %xmm2,%xmm7,%xmm6
+ vpmuludq %xmm3,%xmm7,%xmm5
+ vpaddq %xmm6,%xmm13,%xmm13
+ vmovdqa -96(%r11),%xmm8
+ vpaddq %xmm5,%xmm14,%xmm14
+ vpmuludq %xmm1,%xmm7,%xmm6
+ vpmuludq %xmm0,%xmm7,%xmm7
+ vpaddq %xmm6,%xmm12,%xmm12
+ vpaddq %xmm7,%xmm11,%xmm11
+
+ vmovdqa -80(%r11),%xmm9
+ vpmuludq %xmm2,%xmm8,%xmm5
+ vpmuludq %xmm1,%xmm8,%xmm6
+ vpaddq %xmm5,%xmm14,%xmm14
+ vpaddq %xmm6,%xmm13,%xmm13
+ vmovdqa -64(%r11),%xmm7
+ vpmuludq %xmm0,%xmm8,%xmm8
+ vpmuludq %xmm4,%xmm9,%xmm5
+ vpaddq %xmm8,%xmm12,%xmm12
+ vpaddq %xmm5,%xmm11,%xmm11
+ vmovdqa -48(%r11),%xmm8
+ vpmuludq %xmm3,%xmm9,%xmm9
+ vpmuludq %xmm1,%xmm7,%xmm6
+ vpaddq %xmm9,%xmm10,%xmm10
+
+ vmovdqa -16(%r11),%xmm9
+ vpaddq %xmm6,%xmm14,%xmm14
+ vpmuludq %xmm0,%xmm7,%xmm7
+ vpmuludq %xmm4,%xmm8,%xmm5
+ vpaddq %xmm7,%xmm13,%xmm13
+ vpaddq %xmm5,%xmm12,%xmm12
+ vmovdqu 32(%rsi),%xmm5
+ vpmuludq %xmm3,%xmm8,%xmm7
+ vpmuludq %xmm2,%xmm8,%xmm8
+ vpaddq %xmm7,%xmm11,%xmm11
+ vmovdqu 48(%rsi),%xmm6
+ vpaddq %xmm8,%xmm10,%xmm10
+
+ vpmuludq %xmm2,%xmm9,%xmm2
+ vpmuludq %xmm3,%xmm9,%xmm3
+ vpsrldq $6,%xmm5,%xmm7
+ vpaddq %xmm2,%xmm11,%xmm11
+ vpmuludq %xmm4,%xmm9,%xmm4
+ vpsrldq $6,%xmm6,%xmm8
+ vpaddq %xmm3,%xmm12,%xmm2
+ vpaddq %xmm4,%xmm13,%xmm3
+ vpmuludq -32(%r11),%xmm0,%xmm4
+ vpmuludq %xmm1,%xmm9,%xmm0
+ vpunpckhqdq %xmm6,%xmm5,%xmm9
+ vpaddq %xmm4,%xmm14,%xmm4
+ vpaddq %xmm0,%xmm10,%xmm0
+
+ vpunpcklqdq %xmm6,%xmm5,%xmm5
+ vpunpcklqdq %xmm8,%xmm7,%xmm8
+
+
+ vpsrldq $5,%xmm9,%xmm9
+ vpsrlq $26,%xmm5,%xmm6
+ vmovdqa 0(%rsp),%xmm14
+ vpand %xmm15,%xmm5,%xmm5
+ vpsrlq $4,%xmm8,%xmm7
+ vpand %xmm15,%xmm6,%xmm6
+ vpand 0(%rcx),%xmm9,%xmm9
+ vpsrlq $30,%xmm8,%xmm8
+ vpand %xmm15,%xmm7,%xmm7
+ vpand %xmm15,%xmm8,%xmm8
+ vpor 32(%rcx),%xmm9,%xmm9
+
+
+
+
+
+ vpsrlq $26,%xmm3,%xmm13
+ vpand %xmm15,%xmm3,%xmm3
+ vpaddq %xmm13,%xmm4,%xmm4
+
+ vpsrlq $26,%xmm0,%xmm10
+ vpand %xmm15,%xmm0,%xmm0
+ vpaddq %xmm10,%xmm11,%xmm1
+
+ vpsrlq $26,%xmm4,%xmm10
+ vpand %xmm15,%xmm4,%xmm4
+
+ vpsrlq $26,%xmm1,%xmm11
+ vpand %xmm15,%xmm1,%xmm1
+ vpaddq %xmm11,%xmm2,%xmm2
+
+ vpaddq %xmm10,%xmm0,%xmm0
+ vpsllq $2,%xmm10,%xmm10
+ vpaddq %xmm10,%xmm0,%xmm0
+
+ vpsrlq $26,%xmm2,%xmm12
+ vpand %xmm15,%xmm2,%xmm2
+ vpaddq %xmm12,%xmm3,%xmm3
+
+ vpsrlq $26,%xmm0,%xmm10
+ vpand %xmm15,%xmm0,%xmm0
+ vpaddq %xmm10,%xmm1,%xmm1
+
+ vpsrlq $26,%xmm3,%xmm13
+ vpand %xmm15,%xmm3,%xmm3
+ vpaddq %xmm13,%xmm4,%xmm4
+
+ ja .Loop_avx
+
+.Lskip_loop_avx:
+
+
+
+ vpshufd $0x10,%xmm14,%xmm14
+ addq $32,%rdx
+ jnz .Long_tail_avx
+
+ vpaddq %xmm2,%xmm7,%xmm7
+ vpaddq %xmm0,%xmm5,%xmm5
+ vpaddq %xmm1,%xmm6,%xmm6
+ vpaddq %xmm3,%xmm8,%xmm8
+ vpaddq %xmm4,%xmm9,%xmm9
+
+.Long_tail_avx:
+ vmovdqa %xmm2,32(%r11)
+ vmovdqa %xmm0,0(%r11)
+ vmovdqa %xmm1,16(%r11)
+ vmovdqa %xmm3,48(%r11)
+ vmovdqa %xmm4,64(%r11)
+
+
+
+
+
+
+
+ vpmuludq %xmm7,%xmm14,%xmm12
+ vpmuludq %xmm5,%xmm14,%xmm10
+ vpshufd $0x10,-48(%rdi),%xmm2
+ vpmuludq %xmm6,%xmm14,%xmm11
+ vpmuludq %xmm8,%xmm14,%xmm13
+ vpmuludq %xmm9,%xmm14,%xmm14
+
+ vpmuludq %xmm8,%xmm2,%xmm0
+ vpaddq %xmm0,%xmm14,%xmm14
+ vpshufd $0x10,-32(%rdi),%xmm3
+ vpmuludq %xmm7,%xmm2,%xmm1
+ vpaddq %xmm1,%xmm13,%xmm13
+ vpshufd $0x10,-16(%rdi),%xmm4
+ vpmuludq %xmm6,%xmm2,%xmm0
+ vpaddq %xmm0,%xmm12,%xmm12
+ vpmuludq %xmm5,%xmm2,%xmm2
+ vpaddq %xmm2,%xmm11,%xmm11
+ vpmuludq %xmm9,%xmm3,%xmm3
+ vpaddq %xmm3,%xmm10,%xmm10
+
+ vpshufd $0x10,0(%rdi),%xmm2
+ vpmuludq %xmm7,%xmm4,%xmm1
+ vpaddq %xmm1,%xmm14,%xmm14
+ vpmuludq %xmm6,%xmm4,%xmm0
+ vpaddq %xmm0,%xmm13,%xmm13
+ vpshufd $0x10,16(%rdi),%xmm3
+ vpmuludq %xmm5,%xmm4,%xmm4
+ vpaddq %xmm4,%xmm12,%xmm12
+ vpmuludq %xmm9,%xmm2,%xmm1
+ vpaddq %xmm1,%xmm11,%xmm11
+ vpshufd $0x10,32(%rdi),%xmm4
+ vpmuludq %xmm8,%xmm2,%xmm2
+ vpaddq %xmm2,%xmm10,%xmm10
+
+ vpmuludq %xmm6,%xmm3,%xmm0
+ vpaddq %xmm0,%xmm14,%xmm14
+ vpmuludq %xmm5,%xmm3,%xmm3
+ vpaddq %xmm3,%xmm13,%xmm13
+ vpshufd $0x10,48(%rdi),%xmm2
+ vpmuludq %xmm9,%xmm4,%xmm1
+ vpaddq %xmm1,%xmm12,%xmm12
+ vpshufd $0x10,64(%rdi),%xmm3
+ vpmuludq %xmm8,%xmm4,%xmm0
+ vpaddq %xmm0,%xmm11,%xmm11
+ vpmuludq %xmm7,%xmm4,%xmm4
+ vpaddq %xmm4,%xmm10,%xmm10
+
+ vpmuludq %xmm5,%xmm2,%xmm2
+ vpaddq %xmm2,%xmm14,%xmm14
+ vpmuludq %xmm9,%xmm3,%xmm1
+ vpaddq %xmm1,%xmm13,%xmm13
+ vpmuludq %xmm8,%xmm3,%xmm0
+ vpaddq %xmm0,%xmm12,%xmm12
+ vpmuludq %xmm7,%xmm3,%xmm1
+ vpaddq %xmm1,%xmm11,%xmm11
+ vpmuludq %xmm6,%xmm3,%xmm3
+ vpaddq %xmm3,%xmm10,%xmm10
+
+ jz .Lshort_tail_avx
+
+ vmovdqu 0(%rsi),%xmm0
+ vmovdqu 16(%rsi),%xmm1
+
+ vpsrldq $6,%xmm0,%xmm2
+ vpsrldq $6,%xmm1,%xmm3
+ vpunpckhqdq %xmm1,%xmm0,%xmm4
+ vpunpcklqdq %xmm1,%xmm0,%xmm0
+ vpunpcklqdq %xmm3,%xmm2,%xmm3
+
+ vpsrlq $40,%xmm4,%xmm4
+ vpsrlq $26,%xmm0,%xmm1
+ vpand %xmm15,%xmm0,%xmm0
+ vpsrlq $4,%xmm3,%xmm2
+ vpand %xmm15,%xmm1,%xmm1
+ vpsrlq $30,%xmm3,%xmm3
+ vpand %xmm15,%xmm2,%xmm2
+ vpand %xmm15,%xmm3,%xmm3
+ vpor 32(%rcx),%xmm4,%xmm4
+
+ vpshufd $0x32,-64(%rdi),%xmm9
+ vpaddq 0(%r11),%xmm0,%xmm0
+ vpaddq 16(%r11),%xmm1,%xmm1
+ vpaddq 32(%r11),%xmm2,%xmm2
+ vpaddq 48(%r11),%xmm3,%xmm3
+ vpaddq 64(%r11),%xmm4,%xmm4
+
+
+
+
+ vpmuludq %xmm0,%xmm9,%xmm5
+ vpaddq %xmm5,%xmm10,%xmm10
+ vpmuludq %xmm1,%xmm9,%xmm6
+ vpaddq %xmm6,%xmm11,%xmm11
+ vpmuludq %xmm2,%xmm9,%xmm5
+ vpaddq %xmm5,%xmm12,%xmm12
+ vpshufd $0x32,-48(%rdi),%xmm7
+ vpmuludq %xmm3,%xmm9,%xmm6
+ vpaddq %xmm6,%xmm13,%xmm13
+ vpmuludq %xmm4,%xmm9,%xmm9
+ vpaddq %xmm9,%xmm14,%xmm14
+
+ vpmuludq %xmm3,%xmm7,%xmm5
+ vpaddq %xmm5,%xmm14,%xmm14
+ vpshufd $0x32,-32(%rdi),%xmm8
+ vpmuludq %xmm2,%xmm7,%xmm6
+ vpaddq %xmm6,%xmm13,%xmm13
+ vpshufd $0x32,-16(%rdi),%xmm9
+ vpmuludq %xmm1,%xmm7,%xmm5
+ vpaddq %xmm5,%xmm12,%xmm12
+ vpmuludq %xmm0,%xmm7,%xmm7
+ vpaddq %xmm7,%xmm11,%xmm11
+ vpmuludq %xmm4,%xmm8,%xmm8
+ vpaddq %xmm8,%xmm10,%xmm10
+
+ vpshufd $0x32,0(%rdi),%xmm7
+ vpmuludq %xmm2,%xmm9,%xmm6
+ vpaddq %xmm6,%xmm14,%xmm14
+ vpmuludq %xmm1,%xmm9,%xmm5
+ vpaddq %xmm5,%xmm13,%xmm13
+ vpshufd $0x32,16(%rdi),%xmm8
+ vpmuludq %xmm0,%xmm9,%xmm9
+ vpaddq %xmm9,%xmm12,%xmm12
+ vpmuludq %xmm4,%xmm7,%xmm6
+ vpaddq %xmm6,%xmm11,%xmm11
+ vpshufd $0x32,32(%rdi),%xmm9
+ vpmuludq %xmm3,%xmm7,%xmm7
+ vpaddq %xmm7,%xmm10,%xmm10
+
+ vpmuludq %xmm1,%xmm8,%xmm5
+ vpaddq %xmm5,%xmm14,%xmm14
+ vpmuludq %xmm0,%xmm8,%xmm8
+ vpaddq %xmm8,%xmm13,%xmm13
+ vpshufd $0x32,48(%rdi),%xmm7
+ vpmuludq %xmm4,%xmm9,%xmm6
+ vpaddq %xmm6,%xmm12,%xmm12
+ vpshufd $0x32,64(%rdi),%xmm8
+ vpmuludq %xmm3,%xmm9,%xmm5
+ vpaddq %xmm5,%xmm11,%xmm11
+ vpmuludq %xmm2,%xmm9,%xmm9
+ vpaddq %xmm9,%xmm10,%xmm10
+
+ vpmuludq %xmm0,%xmm7,%xmm7
+ vpaddq %xmm7,%xmm14,%xmm14
+ vpmuludq %xmm4,%xmm8,%xmm6
+ vpaddq %xmm6,%xmm13,%xmm13
+ vpmuludq %xmm3,%xmm8,%xmm5
+ vpaddq %xmm5,%xmm12,%xmm12
+ vpmuludq %xmm2,%xmm8,%xmm6
+ vpaddq %xmm6,%xmm11,%xmm11
+ vpmuludq %xmm1,%xmm8,%xmm8
+ vpaddq %xmm8,%xmm10,%xmm10
+
+.Lshort_tail_avx:
+
+
+
+ vpsrldq $8,%xmm14,%xmm9
+ vpsrldq $8,%xmm13,%xmm8
+ vpsrldq $8,%xmm11,%xmm6
+ vpsrldq $8,%xmm10,%xmm5
+ vpsrldq $8,%xmm12,%xmm7
+ vpaddq %xmm8,%xmm13,%xmm13
+ vpaddq %xmm9,%xmm14,%xmm14
+ vpaddq %xmm5,%xmm10,%xmm10
+ vpaddq %xmm6,%xmm11,%xmm11
+ vpaddq %xmm7,%xmm12,%xmm12
+
+
+
+
+ vpsrlq $26,%xmm13,%xmm3
+ vpand %xmm15,%xmm13,%xmm13
+ vpaddq %xmm3,%xmm14,%xmm14
+
+ vpsrlq $26,%xmm10,%xmm0
+ vpand %xmm15,%xmm10,%xmm10
+ vpaddq %xmm0,%xmm11,%xmm11
+
+ vpsrlq $26,%xmm14,%xmm4
+ vpand %xmm15,%xmm14,%xmm14
+
+ vpsrlq $26,%xmm11,%xmm1
+ vpand %xmm15,%xmm11,%xmm11
+ vpaddq %xmm1,%xmm12,%xmm12
+
+ vpaddq %xmm4,%xmm10,%xmm10
+ vpsllq $2,%xmm4,%xmm4
+ vpaddq %xmm4,%xmm10,%xmm10
+
+ vpsrlq $26,%xmm12,%xmm2
+ vpand %xmm15,%xmm12,%xmm12
+ vpaddq %xmm2,%xmm13,%xmm13
+
+ vpsrlq $26,%xmm10,%xmm0
+ vpand %xmm15,%xmm10,%xmm10
+ vpaddq %xmm0,%xmm11,%xmm11
+
+ vpsrlq $26,%xmm13,%xmm3
+ vpand %xmm15,%xmm13,%xmm13
+ vpaddq %xmm3,%xmm14,%xmm14
+
+ vmovd %xmm10,-112(%rdi)
+ vmovd %xmm11,-108(%rdi)
+ vmovd %xmm12,-104(%rdi)
+ vmovd %xmm13,-100(%rdi)
+ vmovd %xmm14,-96(%rdi)
+ leaq 88(%r11),%rsp
+.cfi_def_cfa %rsp,8
+ vzeroupper
+ .byte 0xf3,0xc3
+.cfi_endproc
+.size poly1305_blocks_avx,.-poly1305_blocks_avx
+
+.type poly1305_emit_avx,@function
+.align 32
+poly1305_emit_avx:
+ cmpl $0,20(%rdi)
+ je .Lemit
+
+ movl 0(%rdi),%eax
+ movl 4(%rdi),%ecx
+ movl 8(%rdi),%r8d
+ movl 12(%rdi),%r11d
+ movl 16(%rdi),%r10d
+
+ shlq $26,%rcx
+ movq %r8,%r9
+ shlq $52,%r8
+ addq %rcx,%rax
+ shrq $12,%r9
+ addq %rax,%r8
+ adcq $0,%r9
+
+ shlq $14,%r11
+ movq %r10,%rax
+ shrq $24,%r10
+ addq %r11,%r9
+ shlq $40,%rax
+ addq %rax,%r9
+ adcq $0,%r10
+
+ movq %r10,%rax
+ movq %r10,%rcx
+ andq $3,%r10
+ shrq $2,%rax
+ andq $-4,%rcx
+ addq %rcx,%rax
+ addq %rax,%r8
+ adcq $0,%r9
+ adcq $0,%r10
+
+ movq %r8,%rax
+ addq $5,%r8
+ movq %r9,%rcx
+ adcq $0,%r9
+ adcq $0,%r10
+ shrq $2,%r10
+ cmovnzq %r8,%rax
+ cmovnzq %r9,%rcx
+
+ addq 0(%rdx),%rax
+ adcq 8(%rdx),%rcx
+ movq %rax,0(%rsi)
+ movq %rcx,8(%rsi)
+
+ .byte 0xf3,0xc3
+.size poly1305_emit_avx,.-poly1305_emit_avx
+.type poly1305_blocks_avx2,@function
+.align 32
+poly1305_blocks_avx2:
+.cfi_startproc
+ movl 20(%rdi),%r8d
+ cmpq $128,%rdx
+ jae .Lblocks_avx2
+ testl %r8d,%r8d
+ jz .Lblocks
+
+.Lblocks_avx2:
+ andq $-16,%rdx
+ jz .Lno_data_avx2
+
+ vzeroupper
+
+ testl %r8d,%r8d
+ jz .Lbase2_64_avx2
+
+ testq $63,%rdx
+ jz .Leven_avx2
+
+ pushq %rbx
+.cfi_adjust_cfa_offset 8
+.cfi_offset %rbx,-16
+ pushq %rbp
+.cfi_adjust_cfa_offset 8
+.cfi_offset %rbp,-24
+ pushq %r12
+.cfi_adjust_cfa_offset 8
+.cfi_offset %r12,-32
+ pushq %r13
+.cfi_adjust_cfa_offset 8
+.cfi_offset %r13,-40
+ pushq %r14
+.cfi_adjust_cfa_offset 8
+.cfi_offset %r14,-48
+ pushq %r15
+.cfi_adjust_cfa_offset 8
+.cfi_offset %r15,-56
+.Lblocks_avx2_body:
+
+ movq %rdx,%r15
+
+ movq 0(%rdi),%r8
+ movq 8(%rdi),%r9
+ movl 16(%rdi),%ebp
+
+ movq 24(%rdi),%r11
+ movq 32(%rdi),%r13
+
+
+ movl %r8d,%r14d
+ andq $-2147483648,%r8
+ movq %r9,%r12
+ movl %r9d,%ebx
+ andq $-2147483648,%r9
+
+ shrq $6,%r8
+ shlq $52,%r12
+ addq %r8,%r14
+ shrq $12,%rbx
+ shrq $18,%r9
+ addq %r12,%r14
+ adcq %r9,%rbx
+
+ movq %rbp,%r8
+ shlq $40,%r8
+ shrq $24,%rbp
+ addq %r8,%rbx
+ adcq $0,%rbp
+
+ movq $-4,%r9
+ movq %rbp,%r8
+ andq %rbp,%r9
+ shrq $2,%r8
+ andq $3,%rbp
+ addq %r9,%r8
+ addq %r8,%r14
+ adcq $0,%rbx
+ adcq $0,%rbp
+
+ movq %r13,%r12
+ movq %r13,%rax
+ shrq $2,%r13
+ addq %r12,%r13
+
+.Lbase2_26_pre_avx2:
+ addq 0(%rsi),%r14
+ adcq 8(%rsi),%rbx
+ leaq 16(%rsi),%rsi
+ adcq %rcx,%rbp
+ subq $16,%r15
+
+ call __poly1305_block
+ movq %r12,%rax
+
+ testq $63,%r15
+ jnz .Lbase2_26_pre_avx2
+
+ testq %rcx,%rcx
+ jz .Lstore_base2_64_avx2
+
+
+ movq %r14,%rax
+ movq %r14,%rdx
+ shrq $52,%r14
+ movq %rbx,%r11
+ movq %rbx,%r12
+ shrq $26,%rdx
+ andq $0x3ffffff,%rax
+ shlq $12,%r11
+ andq $0x3ffffff,%rdx
+ shrq $14,%rbx
+ orq %r11,%r14
+ shlq $24,%rbp
+ andq $0x3ffffff,%r14
+ shrq $40,%r12
+ andq $0x3ffffff,%rbx
+ orq %r12,%rbp
+
+ testq %r15,%r15
+ jz .Lstore_base2_26_avx2
+
+ vmovd %eax,%xmm0
+ vmovd %edx,%xmm1
+ vmovd %r14d,%xmm2
+ vmovd %ebx,%xmm3
+ vmovd %ebp,%xmm4
+ jmp .Lproceed_avx2
+
+.align 32
+.Lstore_base2_64_avx2:
+ movq %r14,0(%rdi)
+ movq %rbx,8(%rdi)
+ movq %rbp,16(%rdi)
+ jmp .Ldone_avx2
+
+.align 16
+.Lstore_base2_26_avx2:
+ movl %eax,0(%rdi)
+ movl %edx,4(%rdi)
+ movl %r14d,8(%rdi)
+ movl %ebx,12(%rdi)
+ movl %ebp,16(%rdi)
+.align 16
+.Ldone_avx2:
+ movq 0(%rsp),%r15
+.cfi_restore %r15
+ movq 8(%rsp),%r14
+.cfi_restore %r14
+ movq 16(%rsp),%r13
+.cfi_restore %r13
+ movq 24(%rsp),%r12
+.cfi_restore %r12
+ movq 32(%rsp),%rbp
+.cfi_restore %rbp
+ movq 40(%rsp),%rbx
+.cfi_restore %rbx
+ leaq 48(%rsp),%rsp
+.cfi_adjust_cfa_offset -48
+.Lno_data_avx2:
+.Lblocks_avx2_epilogue:
+ .byte 0xf3,0xc3
+.cfi_endproc
+
+.align 32
+.Lbase2_64_avx2:
+.cfi_startproc
+ pushq %rbx
+.cfi_adjust_cfa_offset 8
+.cfi_offset %rbx,-16
+ pushq %rbp
+.cfi_adjust_cfa_offset 8
+.cfi_offset %rbp,-24
+ pushq %r12
+.cfi_adjust_cfa_offset 8
+.cfi_offset %r12,-32
+ pushq %r13
+.cfi_adjust_cfa_offset 8
+.cfi_offset %r13,-40
+ pushq %r14
+.cfi_adjust_cfa_offset 8
+.cfi_offset %r14,-48
+ pushq %r15
+.cfi_adjust_cfa_offset 8
+.cfi_offset %r15,-56
+.Lbase2_64_avx2_body:
+
+ movq %rdx,%r15
+
+ movq 24(%rdi),%r11
+ movq 32(%rdi),%r13
+
+ movq 0(%rdi),%r14
+ movq 8(%rdi),%rbx
+ movl 16(%rdi),%ebp
+
+ movq %r13,%r12
+ movq %r13,%rax
+ shrq $2,%r13
+ addq %r12,%r13
+
+ testq $63,%rdx
+ jz .Linit_avx2
+
+.Lbase2_64_pre_avx2:
+ addq 0(%rsi),%r14
+ adcq 8(%rsi),%rbx
+ leaq 16(%rsi),%rsi
+ adcq %rcx,%rbp
+ subq $16,%r15
+
+ call __poly1305_block
+ movq %r12,%rax
+
+ testq $63,%r15
+ jnz .Lbase2_64_pre_avx2
+
+.Linit_avx2:
+
+ movq %r14,%rax
+ movq %r14,%rdx
+ shrq $52,%r14
+ movq %rbx,%r8
+ movq %rbx,%r9
+ shrq $26,%rdx
+ andq $0x3ffffff,%rax
+ shlq $12,%r8
+ andq $0x3ffffff,%rdx
+ shrq $14,%rbx
+ orq %r8,%r14
+ shlq $24,%rbp
+ andq $0x3ffffff,%r14
+ shrq $40,%r9
+ andq $0x3ffffff,%rbx
+ orq %r9,%rbp
+
+ vmovd %eax,%xmm0
+ vmovd %edx,%xmm1
+ vmovd %r14d,%xmm2
+ vmovd %ebx,%xmm3
+ vmovd %ebp,%xmm4
+ movl $1,20(%rdi)
+
+ call __poly1305_init_avx
+
+.Lproceed_avx2:
+ movq %r15,%rdx
+ movl OPENSSL_ia32cap_P+8(%rip),%r10d
+ movl $3221291008,%r11d
+
+ movq 0(%rsp),%r15
+.cfi_restore %r15
+ movq 8(%rsp),%r14
+.cfi_restore %r14
+ movq 16(%rsp),%r13
+.cfi_restore %r13
+ movq 24(%rsp),%r12
+.cfi_restore %r12
+ movq 32(%rsp),%rbp
+.cfi_restore %rbp
+ movq 40(%rsp),%rbx
+.cfi_restore %rbx
+ leaq 48(%rsp),%rax
+ leaq 48(%rsp),%rsp
+.cfi_adjust_cfa_offset -48
+.Lbase2_64_avx2_epilogue:
+ jmp .Ldo_avx2
+.cfi_endproc
+
+.align 32
+.Leven_avx2:
+.cfi_startproc
+ movl OPENSSL_ia32cap_P+8(%rip),%r10d
+ vmovd 0(%rdi),%xmm0
+ vmovd 4(%rdi),%xmm1
+ vmovd 8(%rdi),%xmm2
+ vmovd 12(%rdi),%xmm3
+ vmovd 16(%rdi),%xmm4
+
+.Ldo_avx2:
+ cmpq $512,%rdx
+ jb .Lskip_avx512
+ andl %r11d,%r10d
+ testl $65536,%r10d
+ jnz .Lblocks_avx512
+.Lskip_avx512:
+ leaq -8(%rsp),%r11
+.cfi_def_cfa %r11,16
+ subq $0x128,%rsp
+ leaq .Lconst(%rip),%rcx
+ leaq 48+64(%rdi),%rdi
+ vmovdqa 96(%rcx),%ymm7
+
+
+ vmovdqu -64(%rdi),%xmm9
+ andq $-512,%rsp
+ vmovdqu -48(%rdi),%xmm10
+ vmovdqu -32(%rdi),%xmm6
+ vmovdqu -16(%rdi),%xmm11
+ vmovdqu 0(%rdi),%xmm12
+ vmovdqu 16(%rdi),%xmm13
+ leaq 144(%rsp),%rax
+ vmovdqu 32(%rdi),%xmm14
+ vpermd %ymm9,%ymm7,%ymm9
+ vmovdqu 48(%rdi),%xmm15
+ vpermd %ymm10,%ymm7,%ymm10
+ vmovdqu 64(%rdi),%xmm5
+ vpermd %ymm6,%ymm7,%ymm6
+ vmovdqa %ymm9,0(%rsp)
+ vpermd %ymm11,%ymm7,%ymm11
+ vmovdqa %ymm10,32-144(%rax)
+ vpermd %ymm12,%ymm7,%ymm12
+ vmovdqa %ymm6,64-144(%rax)
+ vpermd %ymm13,%ymm7,%ymm13
+ vmovdqa %ymm11,96-144(%rax)
+ vpermd %ymm14,%ymm7,%ymm14
+ vmovdqa %ymm12,128-144(%rax)
+ vpermd %ymm15,%ymm7,%ymm15
+ vmovdqa %ymm13,160-144(%rax)
+ vpermd %ymm5,%ymm7,%ymm5
+ vmovdqa %ymm14,192-144(%rax)
+ vmovdqa %ymm15,224-144(%rax)
+ vmovdqa %ymm5,256-144(%rax)
+ vmovdqa 64(%rcx),%ymm5
+
+
+
+ vmovdqu 0(%rsi),%xmm7
+ vmovdqu 16(%rsi),%xmm8
+ vinserti128 $1,32(%rsi),%ymm7,%ymm7
+ vinserti128 $1,48(%rsi),%ymm8,%ymm8
+ leaq 64(%rsi),%rsi
+
+ vpsrldq $6,%ymm7,%ymm9
+ vpsrldq $6,%ymm8,%ymm10
+ vpunpckhqdq %ymm8,%ymm7,%ymm6
+ vpunpcklqdq %ymm10,%ymm9,%ymm9
+ vpunpcklqdq %ymm8,%ymm7,%ymm7
+
+ vpsrlq $30,%ymm9,%ymm10
+ vpsrlq $4,%ymm9,%ymm9
+ vpsrlq $26,%ymm7,%ymm8
+ vpsrlq $40,%ymm6,%ymm6
+ vpand %ymm5,%ymm9,%ymm9
+ vpand %ymm5,%ymm7,%ymm7
+ vpand %ymm5,%ymm8,%ymm8
+ vpand %ymm5,%ymm10,%ymm10
+ vpor 32(%rcx),%ymm6,%ymm6
+
+ vpaddq %ymm2,%ymm9,%ymm2
+ subq $64,%rdx
+ jz .Ltail_avx2
+ jmp .Loop_avx2
+
+.align 32
+.Loop_avx2:
+
+
+
+
+
+
+
+
+ vpaddq %ymm0,%ymm7,%ymm0
+ vmovdqa 0(%rsp),%ymm7
+ vpaddq %ymm1,%ymm8,%ymm1
+ vmovdqa 32(%rsp),%ymm8
+ vpaddq %ymm3,%ymm10,%ymm3
+ vmovdqa 96(%rsp),%ymm9
+ vpaddq %ymm4,%ymm6,%ymm4
+ vmovdqa 48(%rax),%ymm10
+ vmovdqa 112(%rax),%ymm5
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ vpmuludq %ymm2,%ymm7,%ymm13
+ vpmuludq %ymm2,%ymm8,%ymm14
+ vpmuludq %ymm2,%ymm9,%ymm15
+ vpmuludq %ymm2,%ymm10,%ymm11
+ vpmuludq %ymm2,%ymm5,%ymm12
+
+ vpmuludq %ymm0,%ymm8,%ymm6
+ vpmuludq %ymm1,%ymm8,%ymm2
+ vpaddq %ymm6,%ymm12,%ymm12
+ vpaddq %ymm2,%ymm13,%ymm13
+ vpmuludq %ymm3,%ymm8,%ymm6
+ vpmuludq 64(%rsp),%ymm4,%ymm2
+ vpaddq %ymm6,%ymm15,%ymm15
+ vpaddq %ymm2,%ymm11,%ymm11
+ vmovdqa -16(%rax),%ymm8
+
+ vpmuludq %ymm0,%ymm7,%ymm6
+ vpmuludq %ymm1,%ymm7,%ymm2
+ vpaddq %ymm6,%ymm11,%ymm11
+ vpaddq %ymm2,%ymm12,%ymm12
+ vpmuludq %ymm3,%ymm7,%ymm6
+ vpmuludq %ymm4,%ymm7,%ymm2
+ vmovdqu 0(%rsi),%xmm7
+ vpaddq %ymm6,%ymm14,%ymm14
+ vpaddq %ymm2,%ymm15,%ymm15
+ vinserti128 $1,32(%rsi),%ymm7,%ymm7
+
+ vpmuludq %ymm3,%ymm8,%ymm6
+ vpmuludq %ymm4,%ymm8,%ymm2
+ vmovdqu 16(%rsi),%xmm8
+ vpaddq %ymm6,%ymm11,%ymm11
+ vpaddq %ymm2,%ymm12,%ymm12
+ vmovdqa 16(%rax),%ymm2
+ vpmuludq %ymm1,%ymm9,%ymm6
+ vpmuludq %ymm0,%ymm9,%ymm9
+ vpaddq %ymm6,%ymm14,%ymm14
+ vpaddq %ymm9,%ymm13,%ymm13
+ vinserti128 $1,48(%rsi),%ymm8,%ymm8
+ leaq 64(%rsi),%rsi
+
+ vpmuludq %ymm1,%ymm2,%ymm6
+ vpmuludq %ymm0,%ymm2,%ymm2
+ vpsrldq $6,%ymm7,%ymm9
+ vpaddq %ymm6,%ymm15,%ymm15
+ vpaddq %ymm2,%ymm14,%ymm14
+ vpmuludq %ymm3,%ymm10,%ymm6
+ vpmuludq %ymm4,%ymm10,%ymm2
+ vpsrldq $6,%ymm8,%ymm10
+ vpaddq %ymm6,%ymm12,%ymm12
+ vpaddq %ymm2,%ymm13,%ymm13
+ vpunpckhqdq %ymm8,%ymm7,%ymm6
+
+ vpmuludq %ymm3,%ymm5,%ymm3
+ vpmuludq %ymm4,%ymm5,%ymm4
+ vpunpcklqdq %ymm8,%ymm7,%ymm7
+ vpaddq %ymm3,%ymm13,%ymm2
+ vpaddq %ymm4,%ymm14,%ymm3
+ vpunpcklqdq %ymm10,%ymm9,%ymm10
+ vpmuludq 80(%rax),%ymm0,%ymm4
+ vpmuludq %ymm1,%ymm5,%ymm0
+ vmovdqa 64(%rcx),%ymm5
+ vpaddq %ymm4,%ymm15,%ymm4
+ vpaddq %ymm0,%ymm11,%ymm0
+
+
+
+
+ vpsrlq $26,%ymm3,%ymm14
+ vpand %ymm5,%ymm3,%ymm3
+ vpaddq %ymm14,%ymm4,%ymm4
+
+ vpsrlq $26,%ymm0,%ymm11
+ vpand %ymm5,%ymm0,%ymm0
+ vpaddq %ymm11,%ymm12,%ymm1
+
+ vpsrlq $26,%ymm4,%ymm15
+ vpand %ymm5,%ymm4,%ymm4
+
+ vpsrlq $4,%ymm10,%ymm9
+
+ vpsrlq $26,%ymm1,%ymm12
+ vpand %ymm5,%ymm1,%ymm1
+ vpaddq %ymm12,%ymm2,%ymm2
+
+ vpaddq %ymm15,%ymm0,%ymm0
+ vpsllq $2,%ymm15,%ymm15
+ vpaddq %ymm15,%ymm0,%ymm0
+
+ vpand %ymm5,%ymm9,%ymm9
+ vpsrlq $26,%ymm7,%ymm8
+
+ vpsrlq $26,%ymm2,%ymm13
+ vpand %ymm5,%ymm2,%ymm2
+ vpaddq %ymm13,%ymm3,%ymm3
+
+ vpaddq %ymm9,%ymm2,%ymm2
+ vpsrlq $30,%ymm10,%ymm10
+
+ vpsrlq $26,%ymm0,%ymm11
+ vpand %ymm5,%ymm0,%ymm0
+ vpaddq %ymm11,%ymm1,%ymm1
+
+ vpsrlq $40,%ymm6,%ymm6
+
+ vpsrlq $26,%ymm3,%ymm14
+ vpand %ymm5,%ymm3,%ymm3
+ vpaddq %ymm14,%ymm4,%ymm4
+
+ vpand %ymm5,%ymm7,%ymm7
+ vpand %ymm5,%ymm8,%ymm8
+ vpand %ymm5,%ymm10,%ymm10
+ vpor 32(%rcx),%ymm6,%ymm6
+
+ subq $64,%rdx
+ jnz .Loop_avx2
+
+.byte 0x66,0x90
+.Ltail_avx2:
+
+
+
+
+
+
+
+ vpaddq %ymm0,%ymm7,%ymm0
+ vmovdqu 4(%rsp),%ymm7
+ vpaddq %ymm1,%ymm8,%ymm1
+ vmovdqu 36(%rsp),%ymm8
+ vpaddq %ymm3,%ymm10,%ymm3
+ vmovdqu 100(%rsp),%ymm9
+ vpaddq %ymm4,%ymm6,%ymm4
+ vmovdqu 52(%rax),%ymm10
+ vmovdqu 116(%rax),%ymm5
+
+ vpmuludq %ymm2,%ymm7,%ymm13
+ vpmuludq %ymm2,%ymm8,%ymm14
+ vpmuludq %ymm2,%ymm9,%ymm15
+ vpmuludq %ymm2,%ymm10,%ymm11
+ vpmuludq %ymm2,%ymm5,%ymm12
+
+ vpmuludq %ymm0,%ymm8,%ymm6
+ vpmuludq %ymm1,%ymm8,%ymm2
+ vpaddq %ymm6,%ymm12,%ymm12
+ vpaddq %ymm2,%ymm13,%ymm13
+ vpmuludq %ymm3,%ymm8,%ymm6
+ vpmuludq 68(%rsp),%ymm4,%ymm2
+ vpaddq %ymm6,%ymm15,%ymm15
+ vpaddq %ymm2,%ymm11,%ymm11
+
+ vpmuludq %ymm0,%ymm7,%ymm6
+ vpmuludq %ymm1,%ymm7,%ymm2
+ vpaddq %ymm6,%ymm11,%ymm11
+ vmovdqu -12(%rax),%ymm8
+ vpaddq %ymm2,%ymm12,%ymm12
+ vpmuludq %ymm3,%ymm7,%ymm6
+ vpmuludq %ymm4,%ymm7,%ymm2
+ vpaddq %ymm6,%ymm14,%ymm14
+ vpaddq %ymm2,%ymm15,%ymm15
+
+ vpmuludq %ymm3,%ymm8,%ymm6
+ vpmuludq %ymm4,%ymm8,%ymm2
+ vpaddq %ymm6,%ymm11,%ymm11
+ vpaddq %ymm2,%ymm12,%ymm12
+ vmovdqu 20(%rax),%ymm2
+ vpmuludq %ymm1,%ymm9,%ymm6
+ vpmuludq %ymm0,%ymm9,%ymm9
+ vpaddq %ymm6,%ymm14,%ymm14
+ vpaddq %ymm9,%ymm13,%ymm13
+
+ vpmuludq %ymm1,%ymm2,%ymm6
+ vpmuludq %ymm0,%ymm2,%ymm2
+ vpaddq %ymm6,%ymm15,%ymm15
+ vpaddq %ymm2,%ymm14,%ymm14
+ vpmuludq %ymm3,%ymm10,%ymm6
+ vpmuludq %ymm4,%ymm10,%ymm2
+ vpaddq %ymm6,%ymm12,%ymm12
+ vpaddq %ymm2,%ymm13,%ymm13
+
+ vpmuludq %ymm3,%ymm5,%ymm3
+ vpmuludq %ymm4,%ymm5,%ymm4
+ vpaddq %ymm3,%ymm13,%ymm2
+ vpaddq %ymm4,%ymm14,%ymm3
+ vpmuludq 84(%rax),%ymm0,%ymm4
+ vpmuludq %ymm1,%ymm5,%ymm0
+ vmovdqa 64(%rcx),%ymm5
+ vpaddq %ymm4,%ymm15,%ymm4
+ vpaddq %ymm0,%ymm11,%ymm0
+
+
+
+
+ vpsrldq $8,%ymm12,%ymm8
+ vpsrldq $8,%ymm2,%ymm9
+ vpsrldq $8,%ymm3,%ymm10
+ vpsrldq $8,%ymm4,%ymm6
+ vpsrldq $8,%ymm0,%ymm7
+ vpaddq %ymm8,%ymm12,%ymm12
+ vpaddq %ymm9,%ymm2,%ymm2
+ vpaddq %ymm10,%ymm3,%ymm3
+ vpaddq %ymm6,%ymm4,%ymm4
+ vpaddq %ymm7,%ymm0,%ymm0
+
+ vpermq $0x2,%ymm3,%ymm10
+ vpermq $0x2,%ymm4,%ymm6
+ vpermq $0x2,%ymm0,%ymm7
+ vpermq $0x2,%ymm12,%ymm8
+ vpermq $0x2,%ymm2,%ymm9
+ vpaddq %ymm10,%ymm3,%ymm3
+ vpaddq %ymm6,%ymm4,%ymm4
+ vpaddq %ymm7,%ymm0,%ymm0
+ vpaddq %ymm8,%ymm12,%ymm12
+ vpaddq %ymm9,%ymm2,%ymm2
+
+
+
+
+ vpsrlq $26,%ymm3,%ymm14
+ vpand %ymm5,%ymm3,%ymm3
+ vpaddq %ymm14,%ymm4,%ymm4
+
+ vpsrlq $26,%ymm0,%ymm11
+ vpand %ymm5,%ymm0,%ymm0
+ vpaddq %ymm11,%ymm12,%ymm1
+
+ vpsrlq $26,%ymm4,%ymm15
+ vpand %ymm5,%ymm4,%ymm4
+
+ vpsrlq $26,%ymm1,%ymm12
+ vpand %ymm5,%ymm1,%ymm1
+ vpaddq %ymm12,%ymm2,%ymm2
+
+ vpaddq %ymm15,%ymm0,%ymm0
+ vpsllq $2,%ymm15,%ymm15
+ vpaddq %ymm15,%ymm0,%ymm0
+
+ vpsrlq $26,%ymm2,%ymm13
+ vpand %ymm5,%ymm2,%ymm2
+ vpaddq %ymm13,%ymm3,%ymm3
+
+ vpsrlq $26,%ymm0,%ymm11
+ vpand %ymm5,%ymm0,%ymm0
+ vpaddq %ymm11,%ymm1,%ymm1
+
+ vpsrlq $26,%ymm3,%ymm14
+ vpand %ymm5,%ymm3,%ymm3
+ vpaddq %ymm14,%ymm4,%ymm4
+
+ vmovd %xmm0,-112(%rdi)
+ vmovd %xmm1,-108(%rdi)
+ vmovd %xmm2,-104(%rdi)
+ vmovd %xmm3,-100(%rdi)
+ vmovd %xmm4,-96(%rdi)
+ leaq 8(%r11),%rsp
+.cfi_def_cfa %rsp,8
+ vzeroupper
+ .byte 0xf3,0xc3
+.cfi_endproc
+.size poly1305_blocks_avx2,.-poly1305_blocks_avx2
+.type poly1305_blocks_avx512,@function
+.align 32
+poly1305_blocks_avx512:
+.cfi_startproc
+.Lblocks_avx512:
+ movl $15,%eax
+ kmovw %eax,%k2
+ leaq -8(%rsp),%r11
+.cfi_def_cfa %r11,16
+ subq $0x128,%rsp
+ leaq .Lconst(%rip),%rcx
+ leaq 48+64(%rdi),%rdi
+ vmovdqa 96(%rcx),%ymm9
+
+
+ vmovdqu -64(%rdi),%xmm11
+ andq $-512,%rsp
+ vmovdqu -48(%rdi),%xmm12
+ movq $0x20,%rax
+ vmovdqu -32(%rdi),%xmm7
+ vmovdqu -16(%rdi),%xmm13
+ vmovdqu 0(%rdi),%xmm8
+ vmovdqu 16(%rdi),%xmm14
+ vmovdqu 32(%rdi),%xmm10
+ vmovdqu 48(%rdi),%xmm15
+ vmovdqu 64(%rdi),%xmm6
+ vpermd %zmm11,%zmm9,%zmm16
+ vpbroadcastq 64(%rcx),%zmm5
+ vpermd %zmm12,%zmm9,%zmm17
+ vpermd %zmm7,%zmm9,%zmm21
+ vpermd %zmm13,%zmm9,%zmm18
+ vmovdqa64 %zmm16,0(%rsp){%k2}
+ vpsrlq $32,%zmm16,%zmm7
+ vpermd %zmm8,%zmm9,%zmm22
+ vmovdqu64 %zmm17,0(%rsp,%rax,1){%k2}
+ vpsrlq $32,%zmm17,%zmm8
+ vpermd %zmm14,%zmm9,%zmm19
+ vmovdqa64 %zmm21,64(%rsp){%k2}
+ vpermd %zmm10,%zmm9,%zmm23
+ vpermd %zmm15,%zmm9,%zmm20
+ vmovdqu64 %zmm18,64(%rsp,%rax,1){%k2}
+ vpermd %zmm6,%zmm9,%zmm24
+ vmovdqa64 %zmm22,128(%rsp){%k2}
+ vmovdqu64 %zmm19,128(%rsp,%rax,1){%k2}
+ vmovdqa64 %zmm23,192(%rsp){%k2}
+ vmovdqu64 %zmm20,192(%rsp,%rax,1){%k2}
+ vmovdqa64 %zmm24,256(%rsp){%k2}
+
+
+
+
+
+
+
+
+
+
+ vpmuludq %zmm7,%zmm16,%zmm11
+ vpmuludq %zmm7,%zmm17,%zmm12
+ vpmuludq %zmm7,%zmm18,%zmm13
+ vpmuludq %zmm7,%zmm19,%zmm14
+ vpmuludq %zmm7,%zmm20,%zmm15
+ vpsrlq $32,%zmm18,%zmm9
+
+ vpmuludq %zmm8,%zmm24,%zmm25
+ vpmuludq %zmm8,%zmm16,%zmm26
+ vpmuludq %zmm8,%zmm17,%zmm27
+ vpmuludq %zmm8,%zmm18,%zmm28
+ vpmuludq %zmm8,%zmm19,%zmm29
+ vpsrlq $32,%zmm19,%zmm10
+ vpaddq %zmm25,%zmm11,%zmm11
+ vpaddq %zmm26,%zmm12,%zmm12
+ vpaddq %zmm27,%zmm13,%zmm13
+ vpaddq %zmm28,%zmm14,%zmm14
+ vpaddq %zmm29,%zmm15,%zmm15
+
+ vpmuludq %zmm9,%zmm23,%zmm25
+ vpmuludq %zmm9,%zmm24,%zmm26
+ vpmuludq %zmm9,%zmm17,%zmm28
+ vpmuludq %zmm9,%zmm18,%zmm29
+ vpmuludq %zmm9,%zmm16,%zmm27
+ vpsrlq $32,%zmm20,%zmm6
+ vpaddq %zmm25,%zmm11,%zmm11
+ vpaddq %zmm26,%zmm12,%zmm12
+ vpaddq %zmm28,%zmm14,%zmm14
+ vpaddq %zmm29,%zmm15,%zmm15
+ vpaddq %zmm27,%zmm13,%zmm13
+
+ vpmuludq %zmm10,%zmm22,%zmm25
+ vpmuludq %zmm10,%zmm16,%zmm28
+ vpmuludq %zmm10,%zmm17,%zmm29
+ vpmuludq %zmm10,%zmm23,%zmm26
+ vpmuludq %zmm10,%zmm24,%zmm27
+ vpaddq %zmm25,%zmm11,%zmm11
+ vpaddq %zmm28,%zmm14,%zmm14
+ vpaddq %zmm29,%zmm15,%zmm15
+ vpaddq %zmm26,%zmm12,%zmm12
+ vpaddq %zmm27,%zmm13,%zmm13
+
+ vpmuludq %zmm6,%zmm24,%zmm28
+ vpmuludq %zmm6,%zmm16,%zmm29
+ vpmuludq %zmm6,%zmm21,%zmm25
+ vpmuludq %zmm6,%zmm22,%zmm26
+ vpmuludq %zmm6,%zmm23,%zmm27
+ vpaddq %zmm28,%zmm14,%zmm14
+ vpaddq %zmm29,%zmm15,%zmm15
+ vpaddq %zmm25,%zmm11,%zmm11
+ vpaddq %zmm26,%zmm12,%zmm12
+ vpaddq %zmm27,%zmm13,%zmm13
+
+
+
+ vmovdqu64 0(%rsi),%zmm10
+ vmovdqu64 64(%rsi),%zmm6
+ leaq 128(%rsi),%rsi
+
+
+
+
+ vpsrlq $26,%zmm14,%zmm28
+ vpandq %zmm5,%zmm14,%zmm14
+ vpaddq %zmm28,%zmm15,%zmm15
+
+ vpsrlq $26,%zmm11,%zmm25
+ vpandq %zmm5,%zmm11,%zmm11
+ vpaddq %zmm25,%zmm12,%zmm12
+
+ vpsrlq $26,%zmm15,%zmm29
+ vpandq %zmm5,%zmm15,%zmm15
+
+ vpsrlq $26,%zmm12,%zmm26
+ vpandq %zmm5,%zmm12,%zmm12
+ vpaddq %zmm26,%zmm13,%zmm13
+
+ vpaddq %zmm29,%zmm11,%zmm11
+ vpsllq $2,%zmm29,%zmm29
+ vpaddq %zmm29,%zmm11,%zmm11
+
+ vpsrlq $26,%zmm13,%zmm27
+ vpandq %zmm5,%zmm13,%zmm13
+ vpaddq %zmm27,%zmm14,%zmm14
+
+ vpsrlq $26,%zmm11,%zmm25
+ vpandq %zmm5,%zmm11,%zmm11
+ vpaddq %zmm25,%zmm12,%zmm12
+
+ vpsrlq $26,%zmm14,%zmm28
+ vpandq %zmm5,%zmm14,%zmm14
+ vpaddq %zmm28,%zmm15,%zmm15
+
+
+
+
+
+ vpunpcklqdq %zmm6,%zmm10,%zmm7
+ vpunpckhqdq %zmm6,%zmm10,%zmm6
+
+
+
+
+
+
+ vmovdqa32 128(%rcx),%zmm25
+ movl $0x7777,%eax
+ kmovw %eax,%k1
+
+ vpermd %zmm16,%zmm25,%zmm16
+ vpermd %zmm17,%zmm25,%zmm17
+ vpermd %zmm18,%zmm25,%zmm18
+ vpermd %zmm19,%zmm25,%zmm19
+ vpermd %zmm20,%zmm25,%zmm20
+
+ vpermd %zmm11,%zmm25,%zmm16{%k1}
+ vpermd %zmm12,%zmm25,%zmm17{%k1}
+ vpermd %zmm13,%zmm25,%zmm18{%k1}
+ vpermd %zmm14,%zmm25,%zmm19{%k1}
+ vpermd %zmm15,%zmm25,%zmm20{%k1}
+
+ vpslld $2,%zmm17,%zmm21
+ vpslld $2,%zmm18,%zmm22
+ vpslld $2,%zmm19,%zmm23
+ vpslld $2,%zmm20,%zmm24
+ vpaddd %zmm17,%zmm21,%zmm21
+ vpaddd %zmm18,%zmm22,%zmm22
+ vpaddd %zmm19,%zmm23,%zmm23
+ vpaddd %zmm20,%zmm24,%zmm24
+
+ vpbroadcastq 32(%rcx),%zmm30
+
+ vpsrlq $52,%zmm7,%zmm9
+ vpsllq $12,%zmm6,%zmm10
+ vporq %zmm10,%zmm9,%zmm9
+ vpsrlq $26,%zmm7,%zmm8
+ vpsrlq $14,%zmm6,%zmm10
+ vpsrlq $40,%zmm6,%zmm6
+ vpandq %zmm5,%zmm9,%zmm9
+ vpandq %zmm5,%zmm7,%zmm7
+
+
+
+
+ vpaddq %zmm2,%zmm9,%zmm2
+ subq $192,%rdx
+ jbe .Ltail_avx512
+ jmp .Loop_avx512
+
+.align 32
+.Loop_avx512:
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ vpmuludq %zmm2,%zmm17,%zmm14
+ vpaddq %zmm0,%zmm7,%zmm0
+ vpmuludq %zmm2,%zmm18,%zmm15
+ vpandq %zmm5,%zmm8,%zmm8
+ vpmuludq %zmm2,%zmm23,%zmm11
+ vpandq %zmm5,%zmm10,%zmm10
+ vpmuludq %zmm2,%zmm24,%zmm12
+ vporq %zmm30,%zmm6,%zmm6
+ vpmuludq %zmm2,%zmm16,%zmm13
+ vpaddq %zmm1,%zmm8,%zmm1
+ vpaddq %zmm3,%zmm10,%zmm3
+ vpaddq %zmm4,%zmm6,%zmm4
+
+ vmovdqu64 0(%rsi),%zmm10
+ vmovdqu64 64(%rsi),%zmm6
+ leaq 128(%rsi),%rsi
+ vpmuludq %zmm0,%zmm19,%zmm28
+ vpmuludq %zmm0,%zmm20,%zmm29
+ vpmuludq %zmm0,%zmm16,%zmm25
+ vpmuludq %zmm0,%zmm17,%zmm26
+ vpaddq %zmm28,%zmm14,%zmm14
+ vpaddq %zmm29,%zmm15,%zmm15
+ vpaddq %zmm25,%zmm11,%zmm11
+ vpaddq %zmm26,%zmm12,%zmm12
+
+ vpmuludq %zmm1,%zmm18,%zmm28
+ vpmuludq %zmm1,%zmm19,%zmm29
+ vpmuludq %zmm1,%zmm24,%zmm25
+ vpmuludq %zmm0,%zmm18,%zmm27
+ vpaddq %zmm28,%zmm14,%zmm14
+ vpaddq %zmm29,%zmm15,%zmm15
+ vpaddq %zmm25,%zmm11,%zmm11
+ vpaddq %zmm27,%zmm13,%zmm13
+
+ vpunpcklqdq %zmm6,%zmm10,%zmm7
+ vpunpckhqdq %zmm6,%zmm10,%zmm6
+
+ vpmuludq %zmm3,%zmm16,%zmm28
+ vpmuludq %zmm3,%zmm17,%zmm29
+ vpmuludq %zmm1,%zmm16,%zmm26
+ vpmuludq %zmm1,%zmm17,%zmm27
+ vpaddq %zmm28,%zmm14,%zmm14
+ vpaddq %zmm29,%zmm15,%zmm15
+ vpaddq %zmm26,%zmm12,%zmm12
+ vpaddq %zmm27,%zmm13,%zmm13
+
+ vpmuludq %zmm4,%zmm24,%zmm28
+ vpmuludq %zmm4,%zmm16,%zmm29
+ vpmuludq %zmm3,%zmm22,%zmm25
+ vpmuludq %zmm3,%zmm23,%zmm26
+ vpaddq %zmm28,%zmm14,%zmm14
+ vpmuludq %zmm3,%zmm24,%zmm27
+ vpaddq %zmm29,%zmm15,%zmm15
+ vpaddq %zmm25,%zmm11,%zmm11
+ vpaddq %zmm26,%zmm12,%zmm12
+ vpaddq %zmm27,%zmm13,%zmm13
+
+ vpmuludq %zmm4,%zmm21,%zmm25
+ vpmuludq %zmm4,%zmm22,%zmm26
+ vpmuludq %zmm4,%zmm23,%zmm27
+ vpaddq %zmm25,%zmm11,%zmm0
+ vpaddq %zmm26,%zmm12,%zmm1
+ vpaddq %zmm27,%zmm13,%zmm2
+
+
+
+
+ vpsrlq $52,%zmm7,%zmm9
+ vpsllq $12,%zmm6,%zmm10
+
+ vpsrlq $26,%zmm14,%zmm3
+ vpandq %zmm5,%zmm14,%zmm14
+ vpaddq %zmm3,%zmm15,%zmm4
+
+ vporq %zmm10,%zmm9,%zmm9
+
+ vpsrlq $26,%zmm0,%zmm11
+ vpandq %zmm5,%zmm0,%zmm0
+ vpaddq %zmm11,%zmm1,%zmm1
+
+ vpandq %zmm5,%zmm9,%zmm9
+
+ vpsrlq $26,%zmm4,%zmm15
+ vpandq %zmm5,%zmm4,%zmm4
+
+ vpsrlq $26,%zmm1,%zmm12
+ vpandq %zmm5,%zmm1,%zmm1
+ vpaddq %zmm12,%zmm2,%zmm2
+
+ vpaddq %zmm15,%zmm0,%zmm0
+ vpsllq $2,%zmm15,%zmm15
+ vpaddq %zmm15,%zmm0,%zmm0
+
+ vpaddq %zmm9,%zmm2,%zmm2
+ vpsrlq $26,%zmm7,%zmm8
+
+ vpsrlq $26,%zmm2,%zmm13
+ vpandq %zmm5,%zmm2,%zmm2
+ vpaddq %zmm13,%zmm14,%zmm3
+
+ vpsrlq $14,%zmm6,%zmm10
+
+ vpsrlq $26,%zmm0,%zmm11
+ vpandq %zmm5,%zmm0,%zmm0
+ vpaddq %zmm11,%zmm1,%zmm1
+
+ vpsrlq $40,%zmm6,%zmm6
+
+ vpsrlq $26,%zmm3,%zmm14
+ vpandq %zmm5,%zmm3,%zmm3
+ vpaddq %zmm14,%zmm4,%zmm4
+
+ vpandq %zmm5,%zmm7,%zmm7
+
+
+
+
+ subq $128,%rdx
+ ja .Loop_avx512
+
+.Ltail_avx512:
+
+
+
+
+
+ vpsrlq $32,%zmm16,%zmm16
+ vpsrlq $32,%zmm17,%zmm17
+ vpsrlq $32,%zmm18,%zmm18
+ vpsrlq $32,%zmm23,%zmm23
+ vpsrlq $32,%zmm24,%zmm24
+ vpsrlq $32,%zmm19,%zmm19
+ vpsrlq $32,%zmm20,%zmm20
+ vpsrlq $32,%zmm21,%zmm21
+ vpsrlq $32,%zmm22,%zmm22
+
+
+
+ leaq (%rsi,%rdx,1),%rsi
+
+
+ vpaddq %zmm0,%zmm7,%zmm0
+
+ vpmuludq %zmm2,%zmm17,%zmm14
+ vpmuludq %zmm2,%zmm18,%zmm15
+ vpmuludq %zmm2,%zmm23,%zmm11
+ vpandq %zmm5,%zmm8,%zmm8
+ vpmuludq %zmm2,%zmm24,%zmm12
+ vpandq %zmm5,%zmm10,%zmm10
+ vpmuludq %zmm2,%zmm16,%zmm13
+ vporq %zmm30,%zmm6,%zmm6
+ vpaddq %zmm1,%zmm8,%zmm1
+ vpaddq %zmm3,%zmm10,%zmm3
+ vpaddq %zmm4,%zmm6,%zmm4
+
+ vmovdqu 0(%rsi),%xmm7
+ vpmuludq %zmm0,%zmm19,%zmm28
+ vpmuludq %zmm0,%zmm20,%zmm29
+ vpmuludq %zmm0,%zmm16,%zmm25
+ vpmuludq %zmm0,%zmm17,%zmm26
+ vpaddq %zmm28,%zmm14,%zmm14
+ vpaddq %zmm29,%zmm15,%zmm15
+ vpaddq %zmm25,%zmm11,%zmm11
+ vpaddq %zmm26,%zmm12,%zmm12
+
+ vmovdqu 16(%rsi),%xmm8
+ vpmuludq %zmm1,%zmm18,%zmm28
+ vpmuludq %zmm1,%zmm19,%zmm29
+ vpmuludq %zmm1,%zmm24,%zmm25
+ vpmuludq %zmm0,%zmm18,%zmm27
+ vpaddq %zmm28,%zmm14,%zmm14
+ vpaddq %zmm29,%zmm15,%zmm15
+ vpaddq %zmm25,%zmm11,%zmm11
+ vpaddq %zmm27,%zmm13,%zmm13
+
+ vinserti128 $1,32(%rsi),%ymm7,%ymm7
+ vpmuludq %zmm3,%zmm16,%zmm28
+ vpmuludq %zmm3,%zmm17,%zmm29
+ vpmuludq %zmm1,%zmm16,%zmm26
+ vpmuludq %zmm1,%zmm17,%zmm27
+ vpaddq %zmm28,%zmm14,%zmm14
+ vpaddq %zmm29,%zmm15,%zmm15
+ vpaddq %zmm26,%zmm12,%zmm12
+ vpaddq %zmm27,%zmm13,%zmm13
+
+ vinserti128 $1,48(%rsi),%ymm8,%ymm8
+ vpmuludq %zmm4,%zmm24,%zmm28
+ vpmuludq %zmm4,%zmm16,%zmm29
+ vpmuludq %zmm3,%zmm22,%zmm25
+ vpmuludq %zmm3,%zmm23,%zmm26
+ vpmuludq %zmm3,%zmm24,%zmm27
+ vpaddq %zmm28,%zmm14,%zmm3
+ vpaddq %zmm29,%zmm15,%zmm15
+ vpaddq %zmm25,%zmm11,%zmm11
+ vpaddq %zmm26,%zmm12,%zmm12
+ vpaddq %zmm27,%zmm13,%zmm13
+
+ vpmuludq %zmm4,%zmm21,%zmm25
+ vpmuludq %zmm4,%zmm22,%zmm26
+ vpmuludq %zmm4,%zmm23,%zmm27
+ vpaddq %zmm25,%zmm11,%zmm0
+ vpaddq %zmm26,%zmm12,%zmm1
+ vpaddq %zmm27,%zmm13,%zmm2
+
+
+
+
+ movl $1,%eax
+ vpermq $0xb1,%zmm3,%zmm14
+ vpermq $0xb1,%zmm15,%zmm4
+ vpermq $0xb1,%zmm0,%zmm11
+ vpermq $0xb1,%zmm1,%zmm12
+ vpermq $0xb1,%zmm2,%zmm13
+ vpaddq %zmm14,%zmm3,%zmm3
+ vpaddq %zmm15,%zmm4,%zmm4
+ vpaddq %zmm11,%zmm0,%zmm0
+ vpaddq %zmm12,%zmm1,%zmm1
+ vpaddq %zmm13,%zmm2,%zmm2
+
+ kmovw %eax,%k3
+ vpermq $0x2,%zmm3,%zmm14
+ vpermq $0x2,%zmm4,%zmm15
+ vpermq $0x2,%zmm0,%zmm11
+ vpermq $0x2,%zmm1,%zmm12
+ vpermq $0x2,%zmm2,%zmm13
+ vpaddq %zmm14,%zmm3,%zmm3
+ vpaddq %zmm15,%zmm4,%zmm4
+ vpaddq %zmm11,%zmm0,%zmm0
+ vpaddq %zmm12,%zmm1,%zmm1
+ vpaddq %zmm13,%zmm2,%zmm2
+
+ vextracti64x4 $0x1,%zmm3,%ymm14
+ vextracti64x4 $0x1,%zmm4,%ymm15
+ vextracti64x4 $0x1,%zmm0,%ymm11
+ vextracti64x4 $0x1,%zmm1,%ymm12
+ vextracti64x4 $0x1,%zmm2,%ymm13
+ vpaddq %zmm14,%zmm3,%zmm3{%k3}{z}
+ vpaddq %zmm15,%zmm4,%zmm4{%k3}{z}
+ vpaddq %zmm11,%zmm0,%zmm0{%k3}{z}
+ vpaddq %zmm12,%zmm1,%zmm1{%k3}{z}
+ vpaddq %zmm13,%zmm2,%zmm2{%k3}{z}
+
+
+
+ vpsrlq $26,%ymm3,%ymm14
+ vpand %ymm5,%ymm3,%ymm3
+ vpsrldq $6,%ymm7,%ymm9
+ vpsrldq $6,%ymm8,%ymm10
+ vpunpckhqdq %ymm8,%ymm7,%ymm6
+ vpaddq %ymm14,%ymm4,%ymm4
+
+ vpsrlq $26,%ymm0,%ymm11
+ vpand %ymm5,%ymm0,%ymm0
+ vpunpcklqdq %ymm10,%ymm9,%ymm9
+ vpunpcklqdq %ymm8,%ymm7,%ymm7
+ vpaddq %ymm11,%ymm1,%ymm1
+
+ vpsrlq $26,%ymm4,%ymm15
+ vpand %ymm5,%ymm4,%ymm4
+
+ vpsrlq $26,%ymm1,%ymm12
+ vpand %ymm5,%ymm1,%ymm1
+ vpsrlq $30,%ymm9,%ymm10
+ vpsrlq $4,%ymm9,%ymm9
+ vpaddq %ymm12,%ymm2,%ymm2
+
+ vpaddq %ymm15,%ymm0,%ymm0
+ vpsllq $2,%ymm15,%ymm15
+ vpsrlq $26,%ymm7,%ymm8
+ vpsrlq $40,%ymm6,%ymm6
+ vpaddq %ymm15,%ymm0,%ymm0
+
+ vpsrlq $26,%ymm2,%ymm13
+ vpand %ymm5,%ymm2,%ymm2
+ vpand %ymm5,%ymm9,%ymm9
+ vpand %ymm5,%ymm7,%ymm7
+ vpaddq %ymm13,%ymm3,%ymm3
+
+ vpsrlq $26,%ymm0,%ymm11
+ vpand %ymm5,%ymm0,%ymm0
+ vpaddq %ymm2,%ymm9,%ymm2
+ vpand %ymm5,%ymm8,%ymm8
+ vpaddq %ymm11,%ymm1,%ymm1
+
+ vpsrlq $26,%ymm3,%ymm14
+ vpand %ymm5,%ymm3,%ymm3
+ vpand %ymm5,%ymm10,%ymm10
+ vpor 32(%rcx),%ymm6,%ymm6
+ vpaddq %ymm14,%ymm4,%ymm4
+
+ leaq 144(%rsp),%rax
+ addq $64,%rdx
+ jnz .Ltail_avx2
+
+ vpsubq %ymm9,%ymm2,%ymm2
+ vmovd %xmm0,-112(%rdi)
+ vmovd %xmm1,-108(%rdi)
+ vmovd %xmm2,-104(%rdi)
+ vmovd %xmm3,-100(%rdi)
+ vmovd %xmm4,-96(%rdi)
+ vzeroall
+ leaq 8(%r11),%rsp
+.cfi_def_cfa %rsp,8
+ .byte 0xf3,0xc3
+.cfi_endproc
+.size poly1305_blocks_avx512,.-poly1305_blocks_avx512
+.type poly1305_init_base2_44,@function
+.align 32
+poly1305_init_base2_44:
+ xorq %rax,%rax
+ movq %rax,0(%rdi)
+ movq %rax,8(%rdi)
+ movq %rax,16(%rdi)
+
+.Linit_base2_44:
+ leaq poly1305_blocks_vpmadd52(%rip),%r10
+ leaq poly1305_emit_base2_44(%rip),%r11
+
+ movq $0x0ffffffc0fffffff,%rax
+ movq $0x0ffffffc0ffffffc,%rcx
+ andq 0(%rsi),%rax
+ movq $0x00000fffffffffff,%r8
+ andq 8(%rsi),%rcx
+ movq $0x00000fffffffffff,%r9
+ andq %rax,%r8
+ shrdq $44,%rcx,%rax
+ movq %r8,40(%rdi)
+ andq %r9,%rax
+ shrq $24,%rcx
+ movq %rax,48(%rdi)
+ leaq (%rax,%rax,4),%rax
+ movq %rcx,56(%rdi)
+ shlq $2,%rax
+ leaq (%rcx,%rcx,4),%rcx
+ shlq $2,%rcx
+ movq %rax,24(%rdi)
+ movq %rcx,32(%rdi)
+ movq $-1,64(%rdi)
+ movq %r10,0(%rdx)
+ movq %r11,8(%rdx)
+ movl $1,%eax
+ .byte 0xf3,0xc3
+.size poly1305_init_base2_44,.-poly1305_init_base2_44
+.type poly1305_blocks_vpmadd52,@function
+.align 32
+poly1305_blocks_vpmadd52:
+ shrq $4,%rdx
+ jz .Lno_data_vpmadd52
+
+ shlq $40,%rcx
+ movq 64(%rdi),%r8
+
+
+
+
+
+
+ movq $3,%rax
+ movq $1,%r10
+ cmpq $4,%rdx
+ cmovaeq %r10,%rax
+ testq %r8,%r8
+ cmovnsq %r10,%rax
+
+ andq %rdx,%rax
+ jz .Lblocks_vpmadd52_4x
+
+ subq %rax,%rdx
+ movl $7,%r10d
+ movl $1,%r11d
+ kmovw %r10d,%k7
+ leaq .L2_44_inp_permd(%rip),%r10
+ kmovw %r11d,%k1
+
+ vmovq %rcx,%xmm21
+ vmovdqa64 0(%r10),%ymm19
+ vmovdqa64 32(%r10),%ymm20
+ vpermq $0xcf,%ymm21,%ymm21
+ vmovdqa64 64(%r10),%ymm22
+
+ vmovdqu64 0(%rdi),%ymm16{%k7}{z}
+ vmovdqu64 40(%rdi),%ymm3{%k7}{z}
+ vmovdqu64 32(%rdi),%ymm4{%k7}{z}
+ vmovdqu64 24(%rdi),%ymm5{%k7}{z}
+
+ vmovdqa64 96(%r10),%ymm23
+ vmovdqa64 128(%r10),%ymm24
+
+ jmp .Loop_vpmadd52
+
+.align 32
+.Loop_vpmadd52:
+ vmovdqu32 0(%rsi),%xmm18
+ leaq 16(%rsi),%rsi
+
+ vpermd %ymm18,%ymm19,%ymm18
+ vpsrlvq %ymm20,%ymm18,%ymm18
+ vpandq %ymm22,%ymm18,%ymm18
+ vporq %ymm21,%ymm18,%ymm18
+
+ vpaddq %ymm18,%ymm16,%ymm16
+
+ vpermq $0,%ymm16,%ymm0{%k7}{z}
+ vpermq $85,%ymm16,%ymm1{%k7}{z}
+ vpermq $170,%ymm16,%ymm2{%k7}{z}
+
+ vpxord %ymm16,%ymm16,%ymm16
+ vpxord %ymm17,%ymm17,%ymm17
+
+ vpmadd52luq %ymm3,%ymm0,%ymm16
+ vpmadd52huq %ymm3,%ymm0,%ymm17
+
+ vpmadd52luq %ymm4,%ymm1,%ymm16
+ vpmadd52huq %ymm4,%ymm1,%ymm17
+
+ vpmadd52luq %ymm5,%ymm2,%ymm16
+ vpmadd52huq %ymm5,%ymm2,%ymm17
+
+ vpsrlvq %ymm23,%ymm16,%ymm18
+ vpsllvq %ymm24,%ymm17,%ymm17
+ vpandq %ymm22,%ymm16,%ymm16
+
+ vpaddq %ymm18,%ymm17,%ymm17
+
+ vpermq $147,%ymm17,%ymm17
+
+ vpaddq %ymm17,%ymm16,%ymm16
+
+ vpsrlvq %ymm23,%ymm16,%ymm18
+ vpandq %ymm22,%ymm16,%ymm16
+
+ vpermq $147,%ymm18,%ymm18
+
+ vpaddq %ymm18,%ymm16,%ymm16
+
+ vpermq $147,%ymm16,%ymm18{%k1}{z}
+
+ vpaddq %ymm18,%ymm16,%ymm16
+ vpsllq $2,%ymm18,%ymm18
+
+ vpaddq %ymm18,%ymm16,%ymm16
+
+ decq %rax
+ jnz .Loop_vpmadd52
+
+ vmovdqu64 %ymm16,0(%rdi){%k7}
+
+ testq %rdx,%rdx
+ jnz .Lblocks_vpmadd52_4x
+
+.Lno_data_vpmadd52:
+ .byte 0xf3,0xc3
+.size poly1305_blocks_vpmadd52,.-poly1305_blocks_vpmadd52
+.type poly1305_blocks_vpmadd52_4x,@function
+.align 32
+poly1305_blocks_vpmadd52_4x:
+ shrq $4,%rdx
+ jz .Lno_data_vpmadd52_4x
+
+ shlq $40,%rcx
+ movq 64(%rdi),%r8
+
+.Lblocks_vpmadd52_4x:
+ vpbroadcastq %rcx,%ymm31
+
+ vmovdqa64 .Lx_mask44(%rip),%ymm28
+ movl $5,%eax
+ vmovdqa64 .Lx_mask42(%rip),%ymm29
+ kmovw %eax,%k1
+
+ testq %r8,%r8
+ js .Linit_vpmadd52
+
+ vmovq 0(%rdi),%xmm0
+ vmovq 8(%rdi),%xmm1
+ vmovq 16(%rdi),%xmm2
+
+ testq $3,%rdx
+ jnz .Lblocks_vpmadd52_2x_do
+
+.Lblocks_vpmadd52_4x_do:
+ vpbroadcastq 64(%rdi),%ymm3
+ vpbroadcastq 96(%rdi),%ymm4
+ vpbroadcastq 128(%rdi),%ymm5
+ vpbroadcastq 160(%rdi),%ymm16
+
+.Lblocks_vpmadd52_4x_key_loaded:
+ vpsllq $2,%ymm5,%ymm17
+ vpaddq %ymm5,%ymm17,%ymm17
+ vpsllq $2,%ymm17,%ymm17
+
+ testq $7,%rdx
+ jz .Lblocks_vpmadd52_8x
+
+ vmovdqu64 0(%rsi),%ymm26
+ vmovdqu64 32(%rsi),%ymm27
+ leaq 64(%rsi),%rsi
+
+ vpunpcklqdq %ymm27,%ymm26,%ymm25
+ vpunpckhqdq %ymm27,%ymm26,%ymm27
+
+
+
+ vpsrlq $24,%ymm27,%ymm26
+ vporq %ymm31,%ymm26,%ymm26
+ vpaddq %ymm26,%ymm2,%ymm2
+ vpandq %ymm28,%ymm25,%ymm24
+ vpsrlq $44,%ymm25,%ymm25
+ vpsllq $20,%ymm27,%ymm27
+ vporq %ymm27,%ymm25,%ymm25
+ vpandq %ymm28,%ymm25,%ymm25
+
+ subq $4,%rdx
+ jz .Ltail_vpmadd52_4x
+ jmp .Loop_vpmadd52_4x
+ ud2
+
+.align 32
+.Linit_vpmadd52:
+ vmovq 24(%rdi),%xmm16
+ vmovq 56(%rdi),%xmm2
+ vmovq 32(%rdi),%xmm17
+ vmovq 40(%rdi),%xmm3
+ vmovq 48(%rdi),%xmm4
+
+ vmovdqa %ymm3,%ymm0
+ vmovdqa %ymm4,%ymm1
+ vmovdqa %ymm2,%ymm5
+
+ movl $2,%eax
+
+.Lmul_init_vpmadd52:
+ vpxorq %ymm18,%ymm18,%ymm18
+ vpmadd52luq %ymm2,%ymm16,%ymm18
+ vpxorq %ymm19,%ymm19,%ymm19
+ vpmadd52huq %ymm2,%ymm16,%ymm19
+ vpxorq %ymm20,%ymm20,%ymm20
+ vpmadd52luq %ymm2,%ymm17,%ymm20
+ vpxorq %ymm21,%ymm21,%ymm21
+ vpmadd52huq %ymm2,%ymm17,%ymm21
+ vpxorq %ymm22,%ymm22,%ymm22
+ vpmadd52luq %ymm2,%ymm3,%ymm22
+ vpxorq %ymm23,%ymm23,%ymm23
+ vpmadd52huq %ymm2,%ymm3,%ymm23
+
+ vpmadd52luq %ymm0,%ymm3,%ymm18
+ vpmadd52huq %ymm0,%ymm3,%ymm19
+ vpmadd52luq %ymm0,%ymm4,%ymm20
+ vpmadd52huq %ymm0,%ymm4,%ymm21
+ vpmadd52luq %ymm0,%ymm5,%ymm22
+ vpmadd52huq %ymm0,%ymm5,%ymm23
+
+ vpmadd52luq %ymm1,%ymm17,%ymm18
+ vpmadd52huq %ymm1,%ymm17,%ymm19
+ vpmadd52luq %ymm1,%ymm3,%ymm20
+ vpmadd52huq %ymm1,%ymm3,%ymm21
+ vpmadd52luq %ymm1,%ymm4,%ymm22
+ vpmadd52huq %ymm1,%ymm4,%ymm23
+
+
+
+ vpsrlq $44,%ymm18,%ymm30
+ vpsllq $8,%ymm19,%ymm19
+ vpandq %ymm28,%ymm18,%ymm0
+ vpaddq %ymm30,%ymm19,%ymm19
+
+ vpaddq %ymm19,%ymm20,%ymm20
+
+ vpsrlq $44,%ymm20,%ymm30
+ vpsllq $8,%ymm21,%ymm21
+ vpandq %ymm28,%ymm20,%ymm1
+ vpaddq %ymm30,%ymm21,%ymm21
+
+ vpaddq %ymm21,%ymm22,%ymm22
+
+ vpsrlq $42,%ymm22,%ymm30
+ vpsllq $10,%ymm23,%ymm23
+ vpandq %ymm29,%ymm22,%ymm2
+ vpaddq %ymm30,%ymm23,%ymm23
+
+ vpaddq %ymm23,%ymm0,%ymm0
+ vpsllq $2,%ymm23,%ymm23
+
+ vpaddq %ymm23,%ymm0,%ymm0
+
+ vpsrlq $44,%ymm0,%ymm30
+ vpandq %ymm28,%ymm0,%ymm0
+
+ vpaddq %ymm30,%ymm1,%ymm1
+
+ decl %eax
+ jz .Ldone_init_vpmadd52
+
+ vpunpcklqdq %ymm4,%ymm1,%ymm4
+ vpbroadcastq %xmm1,%xmm1
+ vpunpcklqdq %ymm5,%ymm2,%ymm5
+ vpbroadcastq %xmm2,%xmm2
+ vpunpcklqdq %ymm3,%ymm0,%ymm3
+ vpbroadcastq %xmm0,%xmm0
+
+ vpsllq $2,%ymm4,%ymm16
+ vpsllq $2,%ymm5,%ymm17
+ vpaddq %ymm4,%ymm16,%ymm16
+ vpaddq %ymm5,%ymm17,%ymm17
+ vpsllq $2,%ymm16,%ymm16
+ vpsllq $2,%ymm17,%ymm17
+
+ jmp .Lmul_init_vpmadd52
+ ud2
+
+.align 32
+.Ldone_init_vpmadd52:
+ vinserti128 $1,%xmm4,%ymm1,%ymm4
+ vinserti128 $1,%xmm5,%ymm2,%ymm5
+ vinserti128 $1,%xmm3,%ymm0,%ymm3
+
+ vpermq $216,%ymm4,%ymm4
+ vpermq $216,%ymm5,%ymm5
+ vpermq $216,%ymm3,%ymm3
+
+ vpsllq $2,%ymm4,%ymm16
+ vpaddq %ymm4,%ymm16,%ymm16
+ vpsllq $2,%ymm16,%ymm16
+
+ vmovq 0(%rdi),%xmm0
+ vmovq 8(%rdi),%xmm1
+ vmovq 16(%rdi),%xmm2
+
+ testq $3,%rdx
+ jnz .Ldone_init_vpmadd52_2x
+
+ vmovdqu64 %ymm3,64(%rdi)
+ vpbroadcastq %xmm3,%ymm3
+ vmovdqu64 %ymm4,96(%rdi)
+ vpbroadcastq %xmm4,%ymm4
+ vmovdqu64 %ymm5,128(%rdi)
+ vpbroadcastq %xmm5,%ymm5
+ vmovdqu64 %ymm16,160(%rdi)
+ vpbroadcastq %xmm16,%ymm16
+
+ jmp .Lblocks_vpmadd52_4x_key_loaded
+ ud2
+
+.align 32
+.Ldone_init_vpmadd52_2x:
+ vmovdqu64 %ymm3,64(%rdi)
+ vpsrldq $8,%ymm3,%ymm3
+ vmovdqu64 %ymm4,96(%rdi)
+ vpsrldq $8,%ymm4,%ymm4
+ vmovdqu64 %ymm5,128(%rdi)
+ vpsrldq $8,%ymm5,%ymm5
+ vmovdqu64 %ymm16,160(%rdi)
+ vpsrldq $8,%ymm16,%ymm16
+ jmp .Lblocks_vpmadd52_2x_key_loaded
+ ud2
+
+.align 32
+.Lblocks_vpmadd52_2x_do:
+ vmovdqu64 128+8(%rdi),%ymm5{%k1}{z}
+ vmovdqu64 160+8(%rdi),%ymm16{%k1}{z}
+ vmovdqu64 64+8(%rdi),%ymm3{%k1}{z}
+ vmovdqu64 96+8(%rdi),%ymm4{%k1}{z}
+
+.Lblocks_vpmadd52_2x_key_loaded:
+ vmovdqu64 0(%rsi),%ymm26
+ vpxorq %ymm27,%ymm27,%ymm27
+ leaq 32(%rsi),%rsi
+
+ vpunpcklqdq %ymm27,%ymm26,%ymm25
+ vpunpckhqdq %ymm27,%ymm26,%ymm27
+
+
+
+ vpsrlq $24,%ymm27,%ymm26
+ vporq %ymm31,%ymm26,%ymm26
+ vpaddq %ymm26,%ymm2,%ymm2
+ vpandq %ymm28,%ymm25,%ymm24
+ vpsrlq $44,%ymm25,%ymm25
+ vpsllq $20,%ymm27,%ymm27
+ vporq %ymm27,%ymm25,%ymm25
+ vpandq %ymm28,%ymm25,%ymm25
+
+ jmp .Ltail_vpmadd52_2x
+ ud2
+
+.align 32
+.Loop_vpmadd52_4x:
+
+ vpaddq %ymm24,%ymm0,%ymm0
+ vpaddq %ymm25,%ymm1,%ymm1
+
+ vpxorq %ymm18,%ymm18,%ymm18
+ vpmadd52luq %ymm2,%ymm16,%ymm18
+ vpxorq %ymm19,%ymm19,%ymm19
+ vpmadd52huq %ymm2,%ymm16,%ymm19
+ vpxorq %ymm20,%ymm20,%ymm20
+ vpmadd52luq %ymm2,%ymm17,%ymm20
+ vpxorq %ymm21,%ymm21,%ymm21
+ vpmadd52huq %ymm2,%ymm17,%ymm21
+ vpxorq %ymm22,%ymm22,%ymm22
+ vpmadd52luq %ymm2,%ymm3,%ymm22
+ vpxorq %ymm23,%ymm23,%ymm23
+ vpmadd52huq %ymm2,%ymm3,%ymm23
+
+ vmovdqu64 0(%rsi),%ymm26
+ vmovdqu64 32(%rsi),%ymm27
+ leaq 64(%rsi),%rsi
+ vpmadd52luq %ymm0,%ymm3,%ymm18
+ vpmadd52huq %ymm0,%ymm3,%ymm19
+ vpmadd52luq %ymm0,%ymm4,%ymm20
+ vpmadd52huq %ymm0,%ymm4,%ymm21
+ vpmadd52luq %ymm0,%ymm5,%ymm22
+ vpmadd52huq %ymm0,%ymm5,%ymm23
+
+ vpunpcklqdq %ymm27,%ymm26,%ymm25
+ vpunpckhqdq %ymm27,%ymm26,%ymm27
+ vpmadd52luq %ymm1,%ymm17,%ymm18
+ vpmadd52huq %ymm1,%ymm17,%ymm19
+ vpmadd52luq %ymm1,%ymm3,%ymm20
+ vpmadd52huq %ymm1,%ymm3,%ymm21
+ vpmadd52luq %ymm1,%ymm4,%ymm22
+ vpmadd52huq %ymm1,%ymm4,%ymm23
+
+
+
+ vpsrlq $44,%ymm18,%ymm30
+ vpsllq $8,%ymm19,%ymm19
+ vpandq %ymm28,%ymm18,%ymm0
+ vpaddq %ymm30,%ymm19,%ymm19
+
+ vpsrlq $24,%ymm27,%ymm26
+ vporq %ymm31,%ymm26,%ymm26
+ vpaddq %ymm19,%ymm20,%ymm20
+
+ vpsrlq $44,%ymm20,%ymm30
+ vpsllq $8,%ymm21,%ymm21
+ vpandq %ymm28,%ymm20,%ymm1
+ vpaddq %ymm30,%ymm21,%ymm21
+
+ vpandq %ymm28,%ymm25,%ymm24
+ vpsrlq $44,%ymm25,%ymm25
+ vpsllq $20,%ymm27,%ymm27
+ vpaddq %ymm21,%ymm22,%ymm22
+
+ vpsrlq $42,%ymm22,%ymm30
+ vpsllq $10,%ymm23,%ymm23
+ vpandq %ymm29,%ymm22,%ymm2
+ vpaddq %ymm30,%ymm23,%ymm23
+
+ vpaddq %ymm26,%ymm2,%ymm2
+ vpaddq %ymm23,%ymm0,%ymm0
+ vpsllq $2,%ymm23,%ymm23
+
+ vpaddq %ymm23,%ymm0,%ymm0
+ vporq %ymm27,%ymm25,%ymm25
+ vpandq %ymm28,%ymm25,%ymm25
+
+ vpsrlq $44,%ymm0,%ymm30
+ vpandq %ymm28,%ymm0,%ymm0
+
+ vpaddq %ymm30,%ymm1,%ymm1
+
+ subq $4,%rdx
+ jnz .Loop_vpmadd52_4x
+
+.Ltail_vpmadd52_4x:
+ vmovdqu64 128(%rdi),%ymm5
+ vmovdqu64 160(%rdi),%ymm16
+ vmovdqu64 64(%rdi),%ymm3
+ vmovdqu64 96(%rdi),%ymm4
+
+.Ltail_vpmadd52_2x:
+ vpsllq $2,%ymm5,%ymm17
+ vpaddq %ymm5,%ymm17,%ymm17
+ vpsllq $2,%ymm17,%ymm17
+
+
+ vpaddq %ymm24,%ymm0,%ymm0
+ vpaddq %ymm25,%ymm1,%ymm1
+
+ vpxorq %ymm18,%ymm18,%ymm18
+ vpmadd52luq %ymm2,%ymm16,%ymm18
+ vpxorq %ymm19,%ymm19,%ymm19
+ vpmadd52huq %ymm2,%ymm16,%ymm19
+ vpxorq %ymm20,%ymm20,%ymm20
+ vpmadd52luq %ymm2,%ymm17,%ymm20
+ vpxorq %ymm21,%ymm21,%ymm21
+ vpmadd52huq %ymm2,%ymm17,%ymm21
+ vpxorq %ymm22,%ymm22,%ymm22
+ vpmadd52luq %ymm2,%ymm3,%ymm22
+ vpxorq %ymm23,%ymm23,%ymm23
+ vpmadd52huq %ymm2,%ymm3,%ymm23
+
+ vpmadd52luq %ymm0,%ymm3,%ymm18
+ vpmadd52huq %ymm0,%ymm3,%ymm19
+ vpmadd52luq %ymm0,%ymm4,%ymm20
+ vpmadd52huq %ymm0,%ymm4,%ymm21
+ vpmadd52luq %ymm0,%ymm5,%ymm22
+ vpmadd52huq %ymm0,%ymm5,%ymm23
+
+ vpmadd52luq %ymm1,%ymm17,%ymm18
+ vpmadd52huq %ymm1,%ymm17,%ymm19
+ vpmadd52luq %ymm1,%ymm3,%ymm20
+ vpmadd52huq %ymm1,%ymm3,%ymm21
+ vpmadd52luq %ymm1,%ymm4,%ymm22
+ vpmadd52huq %ymm1,%ymm4,%ymm23
+
+
+
+
+ movl $1,%eax
+ kmovw %eax,%k1
+ vpsrldq $8,%ymm18,%ymm24
+ vpsrldq $8,%ymm19,%ymm0
+ vpsrldq $8,%ymm20,%ymm25
+ vpsrldq $8,%ymm21,%ymm1
+ vpaddq %ymm24,%ymm18,%ymm18
+ vpaddq %ymm0,%ymm19,%ymm19
+ vpsrldq $8,%ymm22,%ymm26
+ vpsrldq $8,%ymm23,%ymm2
+ vpaddq %ymm25,%ymm20,%ymm20
+ vpaddq %ymm1,%ymm21,%ymm21
+ vpermq $0x2,%ymm18,%ymm24
+ vpermq $0x2,%ymm19,%ymm0
+ vpaddq %ymm26,%ymm22,%ymm22
+ vpaddq %ymm2,%ymm23,%ymm23
+
+ vpermq $0x2,%ymm20,%ymm25
+ vpermq $0x2,%ymm21,%ymm1
+ vpaddq %ymm24,%ymm18,%ymm18{%k1}{z}
+ vpaddq %ymm0,%ymm19,%ymm19{%k1}{z}
+ vpermq $0x2,%ymm22,%ymm26
+ vpermq $0x2,%ymm23,%ymm2
+ vpaddq %ymm25,%ymm20,%ymm20{%k1}{z}
+ vpaddq %ymm1,%ymm21,%ymm21{%k1}{z}
+ vpaddq %ymm26,%ymm22,%ymm22{%k1}{z}
+ vpaddq %ymm2,%ymm23,%ymm23{%k1}{z}
+
+
+
+ vpsrlq $44,%ymm18,%ymm30
+ vpsllq $8,%ymm19,%ymm19
+ vpandq %ymm28,%ymm18,%ymm0
+ vpaddq %ymm30,%ymm19,%ymm19
+
+ vpaddq %ymm19,%ymm20,%ymm20
+
+ vpsrlq $44,%ymm20,%ymm30
+ vpsllq $8,%ymm21,%ymm21
+ vpandq %ymm28,%ymm20,%ymm1
+ vpaddq %ymm30,%ymm21,%ymm21
+
+ vpaddq %ymm21,%ymm22,%ymm22
+
+ vpsrlq $42,%ymm22,%ymm30
+ vpsllq $10,%ymm23,%ymm23
+ vpandq %ymm29,%ymm22,%ymm2
+ vpaddq %ymm30,%ymm23,%ymm23
+
+ vpaddq %ymm23,%ymm0,%ymm0
+ vpsllq $2,%ymm23,%ymm23
+
+ vpaddq %ymm23,%ymm0,%ymm0
+
+ vpsrlq $44,%ymm0,%ymm30
+ vpandq %ymm28,%ymm0,%ymm0
+
+ vpaddq %ymm30,%ymm1,%ymm1
+
+
+ subq $2,%rdx
+ ja .Lblocks_vpmadd52_4x_do
+
+ vmovq %xmm0,0(%rdi)
+ vmovq %xmm1,8(%rdi)
+ vmovq %xmm2,16(%rdi)
+ vzeroall
+
+.Lno_data_vpmadd52_4x:
+ .byte 0xf3,0xc3
+.size poly1305_blocks_vpmadd52_4x,.-poly1305_blocks_vpmadd52_4x
+.type poly1305_blocks_vpmadd52_8x,@function
+.align 32
+poly1305_blocks_vpmadd52_8x:
+ shrq $4,%rdx
+ jz .Lno_data_vpmadd52_8x
+
+ shlq $40,%rcx
+ movq 64(%rdi),%r8
+
+ vmovdqa64 .Lx_mask44(%rip),%ymm28
+ vmovdqa64 .Lx_mask42(%rip),%ymm29
+
+ testq %r8,%r8
+ js .Linit_vpmadd52
+
+ vmovq 0(%rdi),%xmm0
+ vmovq 8(%rdi),%xmm1
+ vmovq 16(%rdi),%xmm2
+
+.Lblocks_vpmadd52_8x:
+
+
+
+ vmovdqu64 128(%rdi),%ymm5
+ vmovdqu64 160(%rdi),%ymm16
+ vmovdqu64 64(%rdi),%ymm3
+ vmovdqu64 96(%rdi),%ymm4
+
+ vpsllq $2,%ymm5,%ymm17
+ vpaddq %ymm5,%ymm17,%ymm17
+ vpsllq $2,%ymm17,%ymm17
+
+ vpbroadcastq %xmm5,%ymm8
+ vpbroadcastq %xmm3,%ymm6
+ vpbroadcastq %xmm4,%ymm7
+
+ vpxorq %ymm18,%ymm18,%ymm18
+ vpmadd52luq %ymm8,%ymm16,%ymm18
+ vpxorq %ymm19,%ymm19,%ymm19
+ vpmadd52huq %ymm8,%ymm16,%ymm19
+ vpxorq %ymm20,%ymm20,%ymm20
+ vpmadd52luq %ymm8,%ymm17,%ymm20
+ vpxorq %ymm21,%ymm21,%ymm21
+ vpmadd52huq %ymm8,%ymm17,%ymm21
+ vpxorq %ymm22,%ymm22,%ymm22
+ vpmadd52luq %ymm8,%ymm3,%ymm22
+ vpxorq %ymm23,%ymm23,%ymm23
+ vpmadd52huq %ymm8,%ymm3,%ymm23
+
+ vpmadd52luq %ymm6,%ymm3,%ymm18
+ vpmadd52huq %ymm6,%ymm3,%ymm19
+ vpmadd52luq %ymm6,%ymm4,%ymm20
+ vpmadd52huq %ymm6,%ymm4,%ymm21
+ vpmadd52luq %ymm6,%ymm5,%ymm22
+ vpmadd52huq %ymm6,%ymm5,%ymm23
+
+ vpmadd52luq %ymm7,%ymm17,%ymm18
+ vpmadd52huq %ymm7,%ymm17,%ymm19
+ vpmadd52luq %ymm7,%ymm3,%ymm20
+ vpmadd52huq %ymm7,%ymm3,%ymm21
+ vpmadd52luq %ymm7,%ymm4,%ymm22
+ vpmadd52huq %ymm7,%ymm4,%ymm23
+
+
+
+ vpsrlq $44,%ymm18,%ymm30
+ vpsllq $8,%ymm19,%ymm19
+ vpandq %ymm28,%ymm18,%ymm6
+ vpaddq %ymm30,%ymm19,%ymm19
+
+ vpaddq %ymm19,%ymm20,%ymm20
+
+ vpsrlq $44,%ymm20,%ymm30
+ vpsllq $8,%ymm21,%ymm21
+ vpandq %ymm28,%ymm20,%ymm7
+ vpaddq %ymm30,%ymm21,%ymm21
+
+ vpaddq %ymm21,%ymm22,%ymm22
+
+ vpsrlq $42,%ymm22,%ymm30
+ vpsllq $10,%ymm23,%ymm23
+ vpandq %ymm29,%ymm22,%ymm8
+ vpaddq %ymm30,%ymm23,%ymm23
+
+ vpaddq %ymm23,%ymm6,%ymm6
+ vpsllq $2,%ymm23,%ymm23
+
+ vpaddq %ymm23,%ymm6,%ymm6
+
+ vpsrlq $44,%ymm6,%ymm30
+ vpandq %ymm28,%ymm6,%ymm6
+
+ vpaddq %ymm30,%ymm7,%ymm7
+
+
+
+
+
+ vpunpcklqdq %ymm5,%ymm8,%ymm26
+ vpunpckhqdq %ymm5,%ymm8,%ymm5
+ vpunpcklqdq %ymm3,%ymm6,%ymm24
+ vpunpckhqdq %ymm3,%ymm6,%ymm3
+ vpunpcklqdq %ymm4,%ymm7,%ymm25
+ vpunpckhqdq %ymm4,%ymm7,%ymm4
+ vshufi64x2 $0x44,%zmm5,%zmm26,%zmm8
+ vshufi64x2 $0x44,%zmm3,%zmm24,%zmm6
+ vshufi64x2 $0x44,%zmm4,%zmm25,%zmm7
+
+ vmovdqu64 0(%rsi),%zmm26
+ vmovdqu64 64(%rsi),%zmm27
+ leaq 128(%rsi),%rsi
+
+ vpsllq $2,%zmm8,%zmm10
+ vpsllq $2,%zmm7,%zmm9
+ vpaddq %zmm8,%zmm10,%zmm10
+ vpaddq %zmm7,%zmm9,%zmm9
+ vpsllq $2,%zmm10,%zmm10
+ vpsllq $2,%zmm9,%zmm9
+
+ vpbroadcastq %rcx,%zmm31
+ vpbroadcastq %xmm28,%zmm28
+ vpbroadcastq %xmm29,%zmm29
+
+ vpbroadcastq %xmm9,%zmm16
+ vpbroadcastq %xmm10,%zmm17
+ vpbroadcastq %xmm6,%zmm3
+ vpbroadcastq %xmm7,%zmm4
+ vpbroadcastq %xmm8,%zmm5
+
+ vpunpcklqdq %zmm27,%zmm26,%zmm25
+ vpunpckhqdq %zmm27,%zmm26,%zmm27
+
+
+
+ vpsrlq $24,%zmm27,%zmm26
+ vporq %zmm31,%zmm26,%zmm26
+ vpaddq %zmm26,%zmm2,%zmm2
+ vpandq %zmm28,%zmm25,%zmm24
+ vpsrlq $44,%zmm25,%zmm25
+ vpsllq $20,%zmm27,%zmm27
+ vporq %zmm27,%zmm25,%zmm25
+ vpandq %zmm28,%zmm25,%zmm25
+
+ subq $8,%rdx
+ jz .Ltail_vpmadd52_8x
+ jmp .Loop_vpmadd52_8x
+
+.align 32
+.Loop_vpmadd52_8x:
+
+ vpaddq %zmm24,%zmm0,%zmm0
+ vpaddq %zmm25,%zmm1,%zmm1
+
+ vpxorq %zmm18,%zmm18,%zmm18
+ vpmadd52luq %zmm2,%zmm16,%zmm18
+ vpxorq %zmm19,%zmm19,%zmm19
+ vpmadd52huq %zmm2,%zmm16,%zmm19
+ vpxorq %zmm20,%zmm20,%zmm20
+ vpmadd52luq %zmm2,%zmm17,%zmm20
+ vpxorq %zmm21,%zmm21,%zmm21
+ vpmadd52huq %zmm2,%zmm17,%zmm21
+ vpxorq %zmm22,%zmm22,%zmm22
+ vpmadd52luq %zmm2,%zmm3,%zmm22
+ vpxorq %zmm23,%zmm23,%zmm23
+ vpmadd52huq %zmm2,%zmm3,%zmm23
+
+ vmovdqu64 0(%rsi),%zmm26
+ vmovdqu64 64(%rsi),%zmm27
+ leaq 128(%rsi),%rsi
+ vpmadd52luq %zmm0,%zmm3,%zmm18
+ vpmadd52huq %zmm0,%zmm3,%zmm19
+ vpmadd52luq %zmm0,%zmm4,%zmm20
+ vpmadd52huq %zmm0,%zmm4,%zmm21
+ vpmadd52luq %zmm0,%zmm5,%zmm22
+ vpmadd52huq %zmm0,%zmm5,%zmm23
+
+ vpunpcklqdq %zmm27,%zmm26,%zmm25
+ vpunpckhqdq %zmm27,%zmm26,%zmm27
+ vpmadd52luq %zmm1,%zmm17,%zmm18
+ vpmadd52huq %zmm1,%zmm17,%zmm19
+ vpmadd52luq %zmm1,%zmm3,%zmm20
+ vpmadd52huq %zmm1,%zmm3,%zmm21
+ vpmadd52luq %zmm1,%zmm4,%zmm22
+ vpmadd52huq %zmm1,%zmm4,%zmm23
+
+
+
+ vpsrlq $44,%zmm18,%zmm30
+ vpsllq $8,%zmm19,%zmm19
+ vpandq %zmm28,%zmm18,%zmm0
+ vpaddq %zmm30,%zmm19,%zmm19
+
+ vpsrlq $24,%zmm27,%zmm26
+ vporq %zmm31,%zmm26,%zmm26
+ vpaddq %zmm19,%zmm20,%zmm20
+
+ vpsrlq $44,%zmm20,%zmm30
+ vpsllq $8,%zmm21,%zmm21
+ vpandq %zmm28,%zmm20,%zmm1
+ vpaddq %zmm30,%zmm21,%zmm21
+
+ vpandq %zmm28,%zmm25,%zmm24
+ vpsrlq $44,%zmm25,%zmm25
+ vpsllq $20,%zmm27,%zmm27
+ vpaddq %zmm21,%zmm22,%zmm22
+
+ vpsrlq $42,%zmm22,%zmm30
+ vpsllq $10,%zmm23,%zmm23
+ vpandq %zmm29,%zmm22,%zmm2
+ vpaddq %zmm30,%zmm23,%zmm23
+
+ vpaddq %zmm26,%zmm2,%zmm2
+ vpaddq %zmm23,%zmm0,%zmm0
+ vpsllq $2,%zmm23,%zmm23
+
+ vpaddq %zmm23,%zmm0,%zmm0
+ vporq %zmm27,%zmm25,%zmm25
+ vpandq %zmm28,%zmm25,%zmm25
+
+ vpsrlq $44,%zmm0,%zmm30
+ vpandq %zmm28,%zmm0,%zmm0
+
+ vpaddq %zmm30,%zmm1,%zmm1
+
+ subq $8,%rdx
+ jnz .Loop_vpmadd52_8x
+
+.Ltail_vpmadd52_8x:
+
+ vpaddq %zmm24,%zmm0,%zmm0
+ vpaddq %zmm25,%zmm1,%zmm1
+
+ vpxorq %zmm18,%zmm18,%zmm18
+ vpmadd52luq %zmm2,%zmm9,%zmm18
+ vpxorq %zmm19,%zmm19,%zmm19
+ vpmadd52huq %zmm2,%zmm9,%zmm19
+ vpxorq %zmm20,%zmm20,%zmm20
+ vpmadd52luq %zmm2,%zmm10,%zmm20
+ vpxorq %zmm21,%zmm21,%zmm21
+ vpmadd52huq %zmm2,%zmm10,%zmm21
+ vpxorq %zmm22,%zmm22,%zmm22
+ vpmadd52luq %zmm2,%zmm6,%zmm22
+ vpxorq %zmm23,%zmm23,%zmm23
+ vpmadd52huq %zmm2,%zmm6,%zmm23
+
+ vpmadd52luq %zmm0,%zmm6,%zmm18
+ vpmadd52huq %zmm0,%zmm6,%zmm19
+ vpmadd52luq %zmm0,%zmm7,%zmm20
+ vpmadd52huq %zmm0,%zmm7,%zmm21
+ vpmadd52luq %zmm0,%zmm8,%zmm22
+ vpmadd52huq %zmm0,%zmm8,%zmm23
+
+ vpmadd52luq %zmm1,%zmm10,%zmm18
+ vpmadd52huq %zmm1,%zmm10,%zmm19
+ vpmadd52luq %zmm1,%zmm6,%zmm20
+ vpmadd52huq %zmm1,%zmm6,%zmm21
+ vpmadd52luq %zmm1,%zmm7,%zmm22
+ vpmadd52huq %zmm1,%zmm7,%zmm23
+
+
+
+
+ movl $1,%eax
+ kmovw %eax,%k1
+ vpsrldq $8,%zmm18,%zmm24
+ vpsrldq $8,%zmm19,%zmm0
+ vpsrldq $8,%zmm20,%zmm25
+ vpsrldq $8,%zmm21,%zmm1
+ vpaddq %zmm24,%zmm18,%zmm18
+ vpaddq %zmm0,%zmm19,%zmm19
+ vpsrldq $8,%zmm22,%zmm26
+ vpsrldq $8,%zmm23,%zmm2
+ vpaddq %zmm25,%zmm20,%zmm20
+ vpaddq %zmm1,%zmm21,%zmm21
+ vpermq $0x2,%zmm18,%zmm24
+ vpermq $0x2,%zmm19,%zmm0
+ vpaddq %zmm26,%zmm22,%zmm22
+ vpaddq %zmm2,%zmm23,%zmm23
+
+ vpermq $0x2,%zmm20,%zmm25
+ vpermq $0x2,%zmm21,%zmm1
+ vpaddq %zmm24,%zmm18,%zmm18
+ vpaddq %zmm0,%zmm19,%zmm19
+ vpermq $0x2,%zmm22,%zmm26
+ vpermq $0x2,%zmm23,%zmm2
+ vpaddq %zmm25,%zmm20,%zmm20
+ vpaddq %zmm1,%zmm21,%zmm21
+ vextracti64x4 $1,%zmm18,%ymm24
+ vextracti64x4 $1,%zmm19,%ymm0
+ vpaddq %zmm26,%zmm22,%zmm22
+ vpaddq %zmm2,%zmm23,%zmm23
+
+ vextracti64x4 $1,%zmm20,%ymm25
+ vextracti64x4 $1,%zmm21,%ymm1
+ vextracti64x4 $1,%zmm22,%ymm26
+ vextracti64x4 $1,%zmm23,%ymm2
+ vpaddq %ymm24,%ymm18,%ymm18{%k1}{z}
+ vpaddq %ymm0,%ymm19,%ymm19{%k1}{z}
+ vpaddq %ymm25,%ymm20,%ymm20{%k1}{z}
+ vpaddq %ymm1,%ymm21,%ymm21{%k1}{z}
+ vpaddq %ymm26,%ymm22,%ymm22{%k1}{z}
+ vpaddq %ymm2,%ymm23,%ymm23{%k1}{z}
+
+
+
+ vpsrlq $44,%ymm18,%ymm30
+ vpsllq $8,%ymm19,%ymm19
+ vpandq %ymm28,%ymm18,%ymm0
+ vpaddq %ymm30,%ymm19,%ymm19
+
+ vpaddq %ymm19,%ymm20,%ymm20
+
+ vpsrlq $44,%ymm20,%ymm30
+ vpsllq $8,%ymm21,%ymm21
+ vpandq %ymm28,%ymm20,%ymm1
+ vpaddq %ymm30,%ymm21,%ymm21
+
+ vpaddq %ymm21,%ymm22,%ymm22
+
+ vpsrlq $42,%ymm22,%ymm30
+ vpsllq $10,%ymm23,%ymm23
+ vpandq %ymm29,%ymm22,%ymm2
+ vpaddq %ymm30,%ymm23,%ymm23
+
+ vpaddq %ymm23,%ymm0,%ymm0
+ vpsllq $2,%ymm23,%ymm23
+
+ vpaddq %ymm23,%ymm0,%ymm0
+
+ vpsrlq $44,%ymm0,%ymm30
+ vpandq %ymm28,%ymm0,%ymm0
+
+ vpaddq %ymm30,%ymm1,%ymm1
+
+
+
+ vmovq %xmm0,0(%rdi)
+ vmovq %xmm1,8(%rdi)
+ vmovq %xmm2,16(%rdi)
+ vzeroall
+
+.Lno_data_vpmadd52_8x:
+ .byte 0xf3,0xc3
+.size poly1305_blocks_vpmadd52_8x,.-poly1305_blocks_vpmadd52_8x
+.type poly1305_emit_base2_44,@function
+.align 32
+poly1305_emit_base2_44:
+ movq 0(%rdi),%r8
+ movq 8(%rdi),%r9
+ movq 16(%rdi),%r10
+
+ movq %r9,%rax
+ shrq $20,%r9
+ shlq $44,%rax
+ movq %r10,%rcx
+ shrq $40,%r10
+ shlq $24,%rcx
+
+ addq %rax,%r8
+ adcq %rcx,%r9
+ adcq $0,%r10
+
+ movq %r8,%rax
+ addq $5,%r8
+ movq %r9,%rcx
+ adcq $0,%r9
+ adcq $0,%r10
+ shrq $2,%r10
+ cmovnzq %r8,%rax
+ cmovnzq %r9,%rcx
+
+ addq 0(%rdx),%rax
+ adcq 8(%rdx),%rcx
+ movq %rax,0(%rsi)
+ movq %rcx,8(%rsi)
+
+ .byte 0xf3,0xc3
+.size poly1305_emit_base2_44,.-poly1305_emit_base2_44
+.align 64
+.Lconst:
+.Lmask24:
+.long 0x0ffffff,0,0x0ffffff,0,0x0ffffff,0,0x0ffffff,0
+.L129:
+.long 16777216,0,16777216,0,16777216,0,16777216,0
+.Lmask26:
+.long 0x3ffffff,0,0x3ffffff,0,0x3ffffff,0,0x3ffffff,0
+.Lpermd_avx2:
+.long 2,2,2,3,2,0,2,1
+.Lpermd_avx512:
+.long 0,0,0,1, 0,2,0,3, 0,4,0,5, 0,6,0,7
+
+.L2_44_inp_permd:
+.long 0,1,1,2,2,3,7,7
+.L2_44_inp_shift:
+.quad 0,12,24,64
+.L2_44_mask:
+.quad 0xfffffffffff,0xfffffffffff,0x3ffffffffff,0xffffffffffffffff
+.L2_44_shift_rgt:
+.quad 44,44,42,64
+.L2_44_shift_lft:
+.quad 8,8,10,64
+
+.align 64
+.Lx_mask44:
+.quad 0xfffffffffff,0xfffffffffff,0xfffffffffff,0xfffffffffff
+.quad 0xfffffffffff,0xfffffffffff,0xfffffffffff,0xfffffffffff
+.Lx_mask42:
+.quad 0x3ffffffffff,0x3ffffffffff,0x3ffffffffff,0x3ffffffffff
+.quad 0x3ffffffffff,0x3ffffffffff,0x3ffffffffff,0x3ffffffffff
+.byte 80,111,108,121,49,51,48,53,32,102,111,114,32,120,56,54,95,54,52,44,32,67,82,89,80,84,79,71,65,77,83,32,98,121,32,60,97,112,112,114,111,64,111,112,101,110,115,115,108,46,111,114,103,62,0
+.align 16
+.globl xor128_encrypt_n_pad
+.type xor128_encrypt_n_pad,@function
+.align 16
+xor128_encrypt_n_pad:
+ subq %rdx,%rsi
+ subq %rdx,%rdi
+ movq %rcx,%r10
+ shrq $4,%rcx
+ jz .Ltail_enc
+ nop
+.Loop_enc_xmm:
+ movdqu (%rsi,%rdx,1),%xmm0
+ pxor (%rdx),%xmm0
+ movdqu %xmm0,(%rdi,%rdx,1)
+ movdqa %xmm0,(%rdx)
+ leaq 16(%rdx),%rdx
+ decq %rcx
+ jnz .Loop_enc_xmm
+
+ andq $15,%r10
+ jz .Ldone_enc
+
+.Ltail_enc:
+ movq $16,%rcx
+ subq %r10,%rcx
+ xorl %eax,%eax
+.Loop_enc_byte:
+ movb (%rsi,%rdx,1),%al
+ xorb (%rdx),%al
+ movb %al,(%rdi,%rdx,1)
+ movb %al,(%rdx)
+ leaq 1(%rdx),%rdx
+ decq %r10
+ jnz .Loop_enc_byte
+
+ xorl %eax,%eax
+.Loop_enc_pad:
+ movb %al,(%rdx)
+ leaq 1(%rdx),%rdx
+ decq %rcx
+ jnz .Loop_enc_pad
+
+.Ldone_enc:
+ movq %rdx,%rax
+ .byte 0xf3,0xc3
+.size xor128_encrypt_n_pad,.-xor128_encrypt_n_pad
+
+.globl xor128_decrypt_n_pad
+.type xor128_decrypt_n_pad,@function
+.align 16
+xor128_decrypt_n_pad:
+ subq %rdx,%rsi
+ subq %rdx,%rdi
+ movq %rcx,%r10
+ shrq $4,%rcx
+ jz .Ltail_dec
+ nop
+.Loop_dec_xmm:
+ movdqu (%rsi,%rdx,1),%xmm0
+ movdqa (%rdx),%xmm1
+ pxor %xmm0,%xmm1
+ movdqu %xmm1,(%rdi,%rdx,1)
+ movdqa %xmm0,(%rdx)
+ leaq 16(%rdx),%rdx
+ decq %rcx
+ jnz .Loop_dec_xmm
+
+ pxor %xmm1,%xmm1
+ andq $15,%r10
+ jz .Ldone_dec
+
+.Ltail_dec:
+ movq $16,%rcx
+ subq %r10,%rcx
+ xorl %eax,%eax
+ xorq %r11,%r11
+.Loop_dec_byte:
+ movb (%rsi,%rdx,1),%r11b
+ movb (%rdx),%al
+ xorb %r11b,%al
+ movb %al,(%rdi,%rdx,1)
+ movb %r11b,(%rdx)
+ leaq 1(%rdx),%rdx
+ decq %r10
+ jnz .Loop_dec_byte
+
+ xorl %eax,%eax
+.Loop_dec_pad:
+ movb %al,(%rdx)
+ leaq 1(%rdx),%rdx
+ decq %rcx
+ jnz .Loop_dec_pad
+
+.Ldone_dec:
+ movq %rdx,%rax
+ .byte 0xf3,0xc3
+.size xor128_decrypt_n_pad,.-xor128_decrypt_n_pad
--
2.19.0
* [PATCH net-next v7 13/28] zinc: Poly1305 x86_64 implementation
From: Jason A. Donenfeld @ 2018-10-06 2:56 UTC (permalink / raw)
To: linux-kernel, netdev, davem, gregkh
Cc: Jason A. Donenfeld, Samuel Neves, Thomas Gleixner, Ingo Molnar,
x86, Jean-Philippe Aumasson, Andy Lutomirski, Andrew Morton,
Linus Torvalds, kernel-hardening, linux-crypto
This ports AVX, AVX-2, and AVX-512F implementations for Poly1305.
The AVX-512F implementation is disabled on Skylake, due to throttling.
These come from Andy Polyakov's implementation, with the following
modifications from Samuel Neves:
- Some cosmetic changes, like renaming labels to .Lname, constants,
and other Linux conventions.
- CPU feature checking is done in C by the glue code, so that has been
removed from the assembly.
- poly1305_blocks_avx512 jumped into the middle of poly1305_blocks_avx2
for the final blocks. To appease objtool, the relevant avx2 tail code
was duplicated for the avx512 function.
- The original uses %rbp as a scratch register. However, the kernel
expects %rbp to be a valid frame pointer at any given time in order
to do proper unwinding. Thus we need to alter the code in order to
preserve it. The most straightforward manner in which this was
accomplished was by replacing $d3, formerly %r10, by %rdi, and
replacing %rbp by %r10. Because %rdi, a pointer to the context
structure, does not change and is not used by poly1305_iteration,
it is safe to use it here, and the overhead of saving and restoring
it should be minimal.
- The original hardcodes returns as .byte 0xf3,0xc3, aka "rep ret".
We replace this by "ret". "rep ret" was meant to help with AMD K8
chips, cf. http://repzret.org/p/repzret. It makes no sense to
continue to use this kludge for code that won't even run on ancient
AMD chips.
The AVX code uses base 2^26, while the scalar code uses base 2^64. If we hit
the unfortunate situation of using AVX and then having to go back to scalar
-- because the user is silly and has called the update function from two
separate contexts -- then we need to convert back to the original base before
proceeding. It is possible to reason that the initial reduction below is
sufficient given the implementation invariants. However, for the avoidance of
doubt, and because this is not performance critical, we do the full reduction
anyway. This conversion is found in the glue code, and a proof of
correctness may be easily obtained from Z3: <https://xn--4db.cc/ltPtHCKN/py>.
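For readers who want to experiment with the conversion described above outside
the kernel, here is a minimal userspace sketch. It is not the glue code itself:
the function name base2_26_to_base2_64 and the separate input/output arrays are
assumptions made for illustration (the kernel version works in place on a
union), and it uses plain "<" comparisons for carry detection where the glue
code uses a branchless ULT() macro. It assumes the input limbs are lazily
reduced, i.e. small enough that adding a carry to the top limb cannot overflow
32 bits.

```c
#include <stdint.h>

/*
 * Illustrative sketch only (hypothetical name and signature).
 * Input:  five limbs in base 2^26, value = sum of h_in[i] * 2^(26*i).
 * Output: three 64-bit words in base 2^64, partially reduced mod 2^130 - 5.
 */
static void base2_26_to_base2_64(const uint32_t h_in[5], uint64_t hs[3])
{
	uint32_t h[5];
	uint64_t cy;
	int i;

	for (i = 0; i < 5; i++)
		h[i] = h_in[i];

	/* Propagate carries so that h[0..3] each fit in 26 bits. */
	for (i = 0; i < 4; i++) {
		uint32_t c = h[i] >> 26;

		h[i] &= 0x3ffffff;
		h[i + 1] += c;
	}

	/* Repack the 26-bit limbs into 64-bit words. */
	hs[0] = ((uint64_t)h[2] << 52) | ((uint64_t)h[1] << 26) | h[0];
	hs[1] = ((uint64_t)h[4] << 40) | ((uint64_t)h[3] << 14) | (h[2] >> 12);
	hs[2] = h[4] >> 24;

	/*
	 * Fold everything at or above bit 130 back in. Since
	 * 2^130 = 5 (mod 2^130 - 5), add 5 * (hs[2] >> 2), computed here
	 * as (hs[2] >> 2) + 4 * (hs[2] >> 2).
	 */
	cy = (hs[2] >> 2) + (hs[2] & ~3ULL);
	hs[2] &= 3;
	hs[0] += cy;
	cy = hs[0] < cy;	/* carry out of word 0 */
	hs[1] += cy;
	cy = hs[1] < cy;	/* carry out of word 1 */
	hs[2] += cy;
}
```

A plain comparison keeps the sketch easy to read, but it gives no
constant-time guarantee, which is presumably why the glue code spells the
comparison out with bitwise operations.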
Cycle counts on a Core i7 6700HQ using the AVX-2 codepath, comparing
this implementation ("new") to the implementation in the current crypto
API ("old"):
size old new
---- ---- ----
0 70 68
16 92 90
32 134 104
48 172 120
64 218 136
80 254 158
96 298 174
112 342 192
128 388 212
144 428 228
160 466 246
176 510 264
192 550 282
208 594 302
224 628 316
240 676 334
256 716 354
272 764 374
288 802 352
304 420 366
320 428 360
336 484 378
352 426 384
368 478 400
384 488 394
400 542 408
416 486 416
432 534 430
448 544 422
464 600 438
480 540 448
496 594 464
512 602 456
528 656 476
544 600 480
560 650 494
576 664 490
592 714 508
608 656 514
624 708 532
640 716 524
656 770 536
672 716 548
688 770 562
704 774 552
720 826 568
736 768 574
752 822 592
768 830 584
784 884 602
800 828 610
816 884 628
832 888 618
848 942 632
864 884 644
880 936 660
896 948 652
912 1000 664
928 942 676
944 994 690
960 1002 680
976 1054 694
992 1002 706
1008 1052 720
Cycle counts on a Xeon Gold 5120 using the AVX-512 codepath:
size old new
---- ---- ----
0 74 70
16 96 92
32 136 106
48 184 124
64 218 138
80 260 160
96 300 176
112 342 194
128 384 212
144 420 226
160 464 248
176 504 264
192 544 282
208 582 300
224 624 318
240 662 338
256 708 358
272 748 372
288 788 358
304 422 370
320 432 364
336 486 380
352 434 390
368 480 408
384 490 398
400 542 412
416 492 426
432 538 436
448 546 432
464 600 448
480 548 456
496 594 476
512 606 470
528 656 480
544 606 498
560 652 512
576 662 508
592 716 522
608 664 538
624 710 552
640 720 516
656 772 526
672 722 544
688 768 556
704 778 556
720 832 568
736 780 584
752 826 600
768 836 560
784 888 572
800 838 588
816 884 604
832 894 598
848 946 612
864 896 628
880 942 644
896 952 608
912 1004 616
928 954 634
944 1000 646
960 1008 646
976 1062 658
992 1012 674
1008 1058 690
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Samuel Neves <sneves@dei.uc.pt>
Co-developed-by: Samuel Neves <sneves@dei.uc.pt>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: x86@kernel.org
Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: kernel-hardening@lists.openwall.com
Cc: linux-crypto@vger.kernel.org
---
lib/zinc/Makefile | 1 +
lib/zinc/poly1305/poly1305-x86_64-glue.c | 154 ++
...-x86_64-cryptogams.S => poly1305-x86_64.S} | 2459 ++++++-----------
lib/zinc/poly1305/poly1305.c | 4 +
4 files changed, 1002 insertions(+), 1616 deletions(-)
create mode 100644 lib/zinc/poly1305/poly1305-x86_64-glue.c
rename lib/zinc/poly1305/{poly1305-x86_64-cryptogams.S => poly1305-x86_64.S} (58%)
diff --git a/lib/zinc/Makefile b/lib/zinc/Makefile
index 6fc9626c55fa..a8943d960b6a 100644
--- a/lib/zinc/Makefile
+++ b/lib/zinc/Makefile
@@ -11,4 +11,5 @@ AFLAGS_chacha20-mips.o += -O2 # This is required to fill the branch delay slots
obj-$(CONFIG_ZINC_CHACHA20) += zinc_chacha20.o
zinc_poly1305-y := poly1305/poly1305.o
+zinc_poly1305-$(CONFIG_ZINC_ARCH_X86_64) += poly1305/poly1305-x86_64.o
obj-$(CONFIG_ZINC_POLY1305) += zinc_poly1305.o
diff --git a/lib/zinc/poly1305/poly1305-x86_64-glue.c b/lib/zinc/poly1305/poly1305-x86_64-glue.c
new file mode 100644
index 000000000000..ccf5f1952503
--- /dev/null
+++ b/lib/zinc/poly1305/poly1305-x86_64-glue.c
@@ -0,0 +1,154 @@
+// SPDX-License-Identifier: GPL-2.0 OR MIT
+/*
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#include <asm/cpufeature.h>
+#include <asm/processor.h>
+#include <asm/intel-family.h>
+
+asmlinkage void poly1305_init_x86_64(void *ctx,
+ const u8 key[POLY1305_KEY_SIZE]);
+asmlinkage void poly1305_blocks_x86_64(void *ctx, const u8 *inp,
+ const size_t len, const u32 padbit);
+asmlinkage void poly1305_emit_x86_64(void *ctx, u8 mac[POLY1305_MAC_SIZE],
+ const u32 nonce[4]);
+asmlinkage void poly1305_emit_avx(void *ctx, u8 mac[POLY1305_MAC_SIZE],
+ const u32 nonce[4]);
+asmlinkage void poly1305_blocks_avx(void *ctx, const u8 *inp, const size_t len,
+ const u32 padbit);
+asmlinkage void poly1305_blocks_avx2(void *ctx, const u8 *inp, const size_t len,
+ const u32 padbit);
+asmlinkage void poly1305_blocks_avx512(void *ctx, const u8 *inp,
+ const size_t len, const u32 padbit);
+
+static bool poly1305_use_avx __ro_after_init;
+static bool poly1305_use_avx2 __ro_after_init;
+static bool poly1305_use_avx512 __ro_after_init;
+static bool *const poly1305_nobs[] __initconst = {
+ &poly1305_use_avx, &poly1305_use_avx2, &poly1305_use_avx512 };
+
+static void __init poly1305_fpu_init(void)
+{
+ poly1305_use_avx =
+ boot_cpu_has(X86_FEATURE_AVX) &&
+ cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL);
+ poly1305_use_avx2 =
+ boot_cpu_has(X86_FEATURE_AVX) &&
+ boot_cpu_has(X86_FEATURE_AVX2) &&
+ cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL);
+ poly1305_use_avx512 =
+ boot_cpu_has(X86_FEATURE_AVX) &&
+ boot_cpu_has(X86_FEATURE_AVX2) &&
+ boot_cpu_has(X86_FEATURE_AVX512F) &&
+ cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM |
+ XFEATURE_MASK_AVX512, NULL) &&
+ /* Skylake downclocks unacceptably much when using zmm. */
+ boot_cpu_data.x86_model != INTEL_FAM6_SKYLAKE_X;
+}
+
+static inline bool poly1305_init_arch(void *ctx,
+ const u8 key[POLY1305_KEY_SIZE])
+{
+ poly1305_init_x86_64(ctx, key);
+ return true;
+}
+
+struct poly1305_arch_internal {
+ union {
+ struct {
+ u32 h[5];
+ u32 is_base2_26;
+ };
+ u64 hs[3];
+ };
+ u64 r[2];
+ u64 pad;
+ struct { u32 r2, r1, r4, r3; } rn[9];
+};
+
+/* The AVX code uses base 2^26, while the scalar code uses base 2^64. If we hit
+ * the unfortunate situation of using AVX and then having to go back to scalar
+ * -- because the user is silly and has called the update function from two
+ * separate contexts -- then we need to convert back to the original base before
+ * proceeding. It is possible to reason that the initial reduction below is
+ * sufficient given the implementation invariants. However, for the avoidance of
+ * doubt and because this is not performance critical, we do the full reduction
+ * anyway.
+ */
+static void convert_to_base2_64(void *ctx)
+{
+ struct poly1305_arch_internal *state = ctx;
+ u32 cy;
+
+ if (!state->is_base2_26)
+ return;
+
+ cy = state->h[0] >> 26; state->h[0] &= 0x3ffffff; state->h[1] += cy;
+ cy = state->h[1] >> 26; state->h[1] &= 0x3ffffff; state->h[2] += cy;
+ cy = state->h[2] >> 26; state->h[2] &= 0x3ffffff; state->h[3] += cy;
+ cy = state->h[3] >> 26; state->h[3] &= 0x3ffffff; state->h[4] += cy;
+ state->hs[0] = ((u64)state->h[2] << 52) | ((u64)state->h[1] << 26) | state->h[0];
+ state->hs[1] = ((u64)state->h[4] << 40) | ((u64)state->h[3] << 14) | (state->h[2] >> 12);
+ state->hs[2] = state->h[4] >> 24;
+#define ULT(a, b) ((a ^ ((a ^ b) | ((a - b) ^ b))) >> (sizeof(a) * 8 - 1))
+ cy = (state->hs[2] >> 2) + (state->hs[2] & ~3ULL);
+ state->hs[2] &= 3;
+ state->hs[0] += cy;
+ state->hs[1] += (cy = ULT(state->hs[0], cy));
+ state->hs[2] += ULT(state->hs[1], cy);
+#undef ULT
+ state->is_base2_26 = 0;
+}
+
+static inline bool poly1305_blocks_arch(void *ctx, const u8 *inp,
+ size_t len, const u32 padbit,
+ simd_context_t *simd_context)
+{
+ struct poly1305_arch_internal *state = ctx;
+
+ /* SIMD disables preemption, so relax after processing each page. */
+ BUILD_BUG_ON(PAGE_SIZE < POLY1305_BLOCK_SIZE ||
+ PAGE_SIZE % POLY1305_BLOCK_SIZE);
+
+ if (!IS_ENABLED(CONFIG_AS_AVX) || !poly1305_use_avx ||
+ (len < (POLY1305_BLOCK_SIZE * 18) && !state->is_base2_26) ||
+ !simd_use(simd_context)) {
+ convert_to_base2_64(ctx);
+ poly1305_blocks_x86_64(ctx, inp, len, padbit);
+ return true;
+ }
+
+ for (;;) {
+ const size_t bytes = min_t(size_t, len, PAGE_SIZE);
+
+ if (IS_ENABLED(CONFIG_AS_AVX512) && poly1305_use_avx512)
+ poly1305_blocks_avx512(ctx, inp, bytes, padbit);
+ else if (IS_ENABLED(CONFIG_AS_AVX2) && poly1305_use_avx2)
+ poly1305_blocks_avx2(ctx, inp, bytes, padbit);
+ else
+ poly1305_blocks_avx(ctx, inp, bytes, padbit);
+ len -= bytes;
+ if (!len)
+ break;
+ inp += bytes;
+ simd_relax(simd_context);
+ }
+
+ return true;
+}
+
+static inline bool poly1305_emit_arch(void *ctx, u8 mac[POLY1305_MAC_SIZE],
+ const u32 nonce[4],
+ simd_context_t *simd_context)
+{
+ struct poly1305_arch_internal *state = ctx;
+
+ if (!IS_ENABLED(CONFIG_AS_AVX) || !poly1305_use_avx ||
+ !state->is_base2_26 || !simd_use(simd_context)) {
+ convert_to_base2_64(ctx);
+ poly1305_emit_x86_64(ctx, mac, nonce);
+ } else
+ poly1305_emit_avx(ctx, mac, nonce);
+ return true;
+}
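
For readers following the base conversion in convert_to_base2_64() above, here
is a minimal user-space sketch of the same arithmetic. It is not part of the
patch: the names sketch_to_base2_64 and sketch_limb are hypothetical, the
accumulator is passed as plain arrays rather than through the union in struct
poly1305_arch_internal, and the carries use ordinary comparisons instead of the
branch-free ULT macro, so it makes no constant-time claim. It assumes the input
limbs are small enough that adding a carry does not overflow 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative only: convert five base 2^26 limbs into three base 2^64
 * limbs, then fold everything at or above bit 130 back in, using
 * 2^130 == 5 (mod 2^130 - 5). All names here are hypothetical.
 */
static void sketch_to_base2_64(const uint32_t in[5], uint64_t hs[3])
{
	uint32_t h[5] = { in[0], in[1], in[2], in[3], in[4] };
	uint64_t c, carry;
	uint32_t cy;

	/* Propagate carries so that h[0..3] each fit in 26 bits. */
	cy = h[0] >> 26; h[0] &= 0x3ffffff; h[1] += cy;
	cy = h[1] >> 26; h[1] &= 0x3ffffff; h[2] += cy;
	cy = h[2] >> 26; h[2] &= 0x3ffffff; h[3] += cy;
	cy = h[3] >> 26; h[3] &= 0x3ffffff; h[4] += cy;

	/* Pack the limbs at bit offsets 0, 26, 52, 78 and 104. */
	hs[0] = ((uint64_t)h[2] << 52) | ((uint64_t)h[1] << 26) | h[0];
	hs[1] = ((uint64_t)h[4] << 40) | ((uint64_t)h[3] << 14) | (h[2] >> 12);
	hs[2] = h[4] >> 24;

	/* c = 5 * (hs[2] >> 2): the part above bit 130, multiplied by 5. */
	c = (hs[2] >> 2) + (hs[2] & ~3ULL);
	hs[2] &= 3;

	/* Add c to the low limb and ripple the carry upward. */
	hs[0] += c;
	carry = hs[0] < c;	/* plain comparison, not constant time */
	hs[1] += carry;
	carry = hs[1] < carry;
	hs[2] += carry;
}

/* Convenience wrapper for quick checks: returns limb i of the result. */
static uint64_t sketch_limb(uint32_t h0, uint32_t h1, uint32_t h2,
			    uint32_t h3, uint32_t h4, int i)
{
	const uint32_t in[5] = { h0, h1, h2, h3, h4 };
	uint64_t hs[3];

	assert(i >= 0 && i < 3);
	sketch_to_base2_64(in, hs);
	return hs[i];
}
```

As a sanity check of the layout, a lone h[4] value of 1 << 26 represents 2^130,
which the fold turns into 5 in the low limb with the top limb cleared, and an
h[2] value of 0x1000 represents 2^64, which lands as 1 in the middle limb.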
diff --git a/lib/zinc/poly1305/poly1305-x86_64-cryptogams.S b/lib/zinc/poly1305/poly1305-x86_64.S
similarity index 58%
rename from lib/zinc/poly1305/poly1305-x86_64-cryptogams.S
rename to lib/zinc/poly1305/poly1305-x86_64.S
index ed634757354b..3c3f2b4d880b 100644
--- a/lib/zinc/poly1305/poly1305-x86_64-cryptogams.S
+++ b/lib/zinc/poly1305/poly1305-x86_64.S
@@ -1,22 +1,27 @@
/* SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause */
/*
+ * Copyright (C) 2017 Samuel Neves <sneves@dei.uc.pt>. All Rights Reserved.
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
* Copyright (C) 2006-2017 CRYPTOGAMS by <appro@openssl.org>. All Rights Reserved.
+ *
+ * This is based in part on Andy Polyakov's implementation from CRYPTOGAMS.
*/
-.text
-
+#include <linux/linkage.h>
+.section .rodata.cst192.Lconst, "aM", @progbits, 192
+.align 64
+.Lconst:
+.long 0x0ffffff,0,0x0ffffff,0,0x0ffffff,0,0x0ffffff,0
+.long 16777216,0,16777216,0,16777216,0,16777216,0
+.long 0x3ffffff,0,0x3ffffff,0,0x3ffffff,0,0x3ffffff,0
+.long 2,2,2,3,2,0,2,1
+.long 0,0,0,1, 0,2,0,3, 0,4,0,5, 0,6,0,7
-.globl poly1305_init
-.hidden poly1305_init
-.globl poly1305_blocks
-.hidden poly1305_blocks
-.globl poly1305_emit
-.hidden poly1305_emit
+.text
-.type poly1305_init,@function
.align 32
-poly1305_init:
+ENTRY(poly1305_init_x86_64)
xorq %rax,%rax
movq %rax,0(%rdi)
movq %rax,8(%rdi)
@@ -25,61 +30,30 @@ poly1305_init:
cmpq $0,%rsi
je .Lno_key
- leaq poly1305_blocks(%rip),%r10
- leaq poly1305_emit(%rip),%r11
- movq OPENSSL_ia32cap_P+4(%rip),%r9
- leaq poly1305_blocks_avx(%rip),%rax
- leaq poly1305_emit_avx(%rip),%rcx
- btq $28,%r9
- cmovcq %rax,%r10
- cmovcq %rcx,%r11
- leaq poly1305_blocks_avx2(%rip),%rax
- btq $37,%r9
- cmovcq %rax,%r10
- movq $2149646336,%rax
- shrq $32,%r9
- andq %rax,%r9
- cmpq %rax,%r9
- je .Linit_base2_44
movq $0x0ffffffc0fffffff,%rax
movq $0x0ffffffc0ffffffc,%rcx
andq 0(%rsi),%rax
andq 8(%rsi),%rcx
movq %rax,24(%rdi)
movq %rcx,32(%rdi)
- movq %r10,0(%rdx)
- movq %r11,8(%rdx)
movl $1,%eax
.Lno_key:
- .byte 0xf3,0xc3
-.size poly1305_init,.-poly1305_init
+ ret
+ENDPROC(poly1305_init_x86_64)
-.type poly1305_blocks,@function
.align 32
-poly1305_blocks:
-.cfi_startproc
+ENTRY(poly1305_blocks_x86_64)
.Lblocks:
shrq $4,%rdx
jz .Lno_data
pushq %rbx
-.cfi_adjust_cfa_offset 8
-.cfi_offset %rbx,-16
- pushq %rbp
-.cfi_adjust_cfa_offset 8
-.cfi_offset %rbp,-24
pushq %r12
-.cfi_adjust_cfa_offset 8
-.cfi_offset %r12,-32
pushq %r13
-.cfi_adjust_cfa_offset 8
-.cfi_offset %r13,-40
pushq %r14
-.cfi_adjust_cfa_offset 8
-.cfi_offset %r14,-48
pushq %r15
-.cfi_adjust_cfa_offset 8
-.cfi_offset %r15,-56
+ pushq %rdi
+
.Lblocks_body:
movq %rdx,%r15
@@ -89,7 +63,7 @@ poly1305_blocks:
movq 0(%rdi),%r14
movq 8(%rdi),%rbx
- movq 16(%rdi),%rbp
+ movq 16(%rdi),%r10
movq %r13,%r12
shrq $2,%r13
@@ -99,14 +73,15 @@ poly1305_blocks:
.align 32
.Loop:
+
addq 0(%rsi),%r14
adcq 8(%rsi),%rbx
leaq 16(%rsi),%rsi
- adcq %rcx,%rbp
+ adcq %rcx,%r10
mulq %r14
movq %rax,%r9
movq %r11,%rax
- movq %rdx,%r10
+ movq %rdx,%rdi
mulq %r14
movq %rax,%r14
@@ -116,62 +91,55 @@ poly1305_blocks:
mulq %rbx
addq %rax,%r9
movq %r13,%rax
- adcq %rdx,%r10
+ adcq %rdx,%rdi
mulq %rbx
- movq %rbp,%rbx
+ movq %r10,%rbx
addq %rax,%r14
adcq %rdx,%r8
imulq %r13,%rbx
addq %rbx,%r9
movq %r8,%rbx
- adcq $0,%r10
+ adcq $0,%rdi
- imulq %r11,%rbp
+ imulq %r11,%r10
addq %r9,%rbx
movq $-4,%rax
- adcq %rbp,%r10
+ adcq %r10,%rdi
- andq %r10,%rax
- movq %r10,%rbp
- shrq $2,%r10
- andq $3,%rbp
- addq %r10,%rax
+ andq %rdi,%rax
+ movq %rdi,%r10
+ shrq $2,%rdi
+ andq $3,%r10
+ addq %rdi,%rax
addq %rax,%r14
adcq $0,%rbx
- adcq $0,%rbp
+ adcq $0,%r10
+
movq %r12,%rax
decq %r15
jnz .Loop
+ movq 0(%rsp),%rdi
+
movq %r14,0(%rdi)
movq %rbx,8(%rdi)
- movq %rbp,16(%rdi)
-
- movq 0(%rsp),%r15
-.cfi_restore %r15
- movq 8(%rsp),%r14
-.cfi_restore %r14
- movq 16(%rsp),%r13
-.cfi_restore %r13
- movq 24(%rsp),%r12
-.cfi_restore %r12
- movq 32(%rsp),%rbp
-.cfi_restore %rbp
+ movq %r10,16(%rdi)
+
+ movq 8(%rsp),%r15
+ movq 16(%rsp),%r14
+ movq 24(%rsp),%r13
+ movq 32(%rsp),%r12
movq 40(%rsp),%rbx
-.cfi_restore %rbx
leaq 48(%rsp),%rsp
-.cfi_adjust_cfa_offset -48
.Lno_data:
.Lblocks_epilogue:
- .byte 0xf3,0xc3
-.cfi_endproc
-.size poly1305_blocks,.-poly1305_blocks
+ ret
+ENDPROC(poly1305_blocks_x86_64)
-.type poly1305_emit,@function
.align 32
-poly1305_emit:
+ENTRY(poly1305_emit_x86_64)
.Lemit:
movq 0(%rdi),%r8
movq 8(%rdi),%r9
@@ -191,15 +159,14 @@ poly1305_emit:
movq %rax,0(%rsi)
movq %rcx,8(%rsi)
- .byte 0xf3,0xc3
-.size poly1305_emit,.-poly1305_emit
-.type __poly1305_block,@function
-.align 32
-__poly1305_block:
+ ret
+ENDPROC(poly1305_emit_x86_64)
+
+.macro __poly1305_block
mulq %r14
movq %rax,%r9
movq %r11,%rax
- movq %rdx,%r10
+ movq %rdx,%rdi
mulq %r14
movq %rax,%r14
@@ -209,45 +176,44 @@ __poly1305_block:
mulq %rbx
addq %rax,%r9
movq %r13,%rax
- adcq %rdx,%r10
+ adcq %rdx,%rdi
mulq %rbx
- movq %rbp,%rbx
+ movq %r10,%rbx
addq %rax,%r14
adcq %rdx,%r8
imulq %r13,%rbx
addq %rbx,%r9
movq %r8,%rbx
- adcq $0,%r10
+ adcq $0,%rdi
- imulq %r11,%rbp
+ imulq %r11,%r10
addq %r9,%rbx
movq $-4,%rax
- adcq %rbp,%r10
+ adcq %r10,%rdi
- andq %r10,%rax
- movq %r10,%rbp
- shrq $2,%r10
- andq $3,%rbp
- addq %r10,%rax
+ andq %rdi,%rax
+ movq %rdi,%r10
+ shrq $2,%rdi
+ andq $3,%r10
+ addq %rdi,%rax
addq %rax,%r14
adcq $0,%rbx
- adcq $0,%rbp
- .byte 0xf3,0xc3
-.size __poly1305_block,.-__poly1305_block
+ adcq $0,%r10
+.endm
-.type __poly1305_init_avx,@function
-.align 32
-__poly1305_init_avx:
+.macro __poly1305_init_avx
movq %r11,%r14
movq %r12,%rbx
- xorq %rbp,%rbp
+ xorq %r10,%r10
leaq 48+64(%rdi),%rdi
movq %r12,%rax
- call __poly1305_block
+ movq %rdi,0(%rsp)
+ __poly1305_block
+ movq 0(%rsp),%rdi
movl $0x3ffffff,%eax
movl $0x3ffffff,%edx
@@ -305,7 +271,7 @@ __poly1305_init_avx:
movl %edx,36(%rdi)
shrq $26,%r9
- movq %rbp,%rax
+ movq %r10,%rax
shlq $24,%rax
orq %rax,%r8
movl %r8d,48(%rdi)
@@ -316,7 +282,9 @@ __poly1305_init_avx:
movl %r9d,68(%rdi)
movq %r12,%rax
- call __poly1305_block
+ movq %rdi,0(%rsp)
+ __poly1305_block
+ movq 0(%rsp),%rdi
movl $0x3ffffff,%eax
movq %r14,%r8
@@ -348,7 +316,7 @@ __poly1305_init_avx:
shrq $26,%r8
movl %edx,44(%rdi)
- movq %rbp,%rax
+ movq %r10,%rax
shlq $24,%rax
orq %rax,%r8
movl %r8d,60(%rdi)
@@ -356,7 +324,9 @@ __poly1305_init_avx:
movl %r8d,76(%rdi)
movq %r12,%rax
- call __poly1305_block
+ movq %rdi,0(%rsp)
+ __poly1305_block
+ movq 0(%rsp),%rdi
movl $0x3ffffff,%eax
movq %r14,%r8
@@ -388,7 +358,7 @@ __poly1305_init_avx:
shrq $26,%r8
movl %edx,40(%rdi)
- movq %rbp,%rax
+ movq %r10,%rax
shlq $24,%rax
orq %rax,%r8
movl %r8d,56(%rdi)
@@ -396,13 +366,12 @@ __poly1305_init_avx:
movl %r8d,72(%rdi)
leaq -48-64(%rdi),%rdi
- .byte 0xf3,0xc3
-.size __poly1305_init_avx,.-__poly1305_init_avx
+.endm
-.type poly1305_blocks_avx,@function
+#ifdef CONFIG_AS_AVX
.align 32
-poly1305_blocks_avx:
-.cfi_startproc
+ENTRY(poly1305_blocks_avx)
+
movl 20(%rdi),%r8d
cmpq $128,%rdx
jae .Lblocks_avx
@@ -422,30 +391,19 @@ poly1305_blocks_avx:
jz .Leven_avx
pushq %rbx
-.cfi_adjust_cfa_offset 8
-.cfi_offset %rbx,-16
- pushq %rbp
-.cfi_adjust_cfa_offset 8
-.cfi_offset %rbp,-24
pushq %r12
-.cfi_adjust_cfa_offset 8
-.cfi_offset %r12,-32
pushq %r13
-.cfi_adjust_cfa_offset 8
-.cfi_offset %r13,-40
pushq %r14
-.cfi_adjust_cfa_offset 8
-.cfi_offset %r14,-48
pushq %r15
-.cfi_adjust_cfa_offset 8
-.cfi_offset %r15,-56
+ pushq %rdi
+
.Lblocks_avx_body:
movq %rdx,%r15
movq 0(%rdi),%r8
movq 8(%rdi),%r9
- movl 16(%rdi),%ebp
+ movl 16(%rdi),%r10d
movq 24(%rdi),%r11
movq 32(%rdi),%r13
@@ -465,21 +423,21 @@ poly1305_blocks_avx:
addq %r12,%r14
adcq %r9,%rbx
- movq %rbp,%r8
+ movq %r10,%r8
shlq $40,%r8
- shrq $24,%rbp
+ shrq $24,%r10
addq %r8,%rbx
- adcq $0,%rbp
+ adcq $0,%r10
movq $-4,%r9
- movq %rbp,%r8
- andq %rbp,%r9
+ movq %r10,%r8
+ andq %r10,%r9
shrq $2,%r8
- andq $3,%rbp
+ andq $3,%r10
addq %r9,%r8
addq %r8,%r14
adcq $0,%rbx
- adcq $0,%rbp
+ adcq $0,%r10
movq %r13,%r12
movq %r13,%rax
@@ -489,9 +447,11 @@ poly1305_blocks_avx:
addq 0(%rsi),%r14
adcq 8(%rsi),%rbx
leaq 16(%rsi),%rsi
- adcq %rcx,%rbp
+ adcq %rcx,%r10
- call __poly1305_block
+ movq %rdi,0(%rsp)
+ __poly1305_block
+ movq 0(%rsp),%rdi
testq %rcx,%rcx
jz .Lstore_base2_64_avx
@@ -508,11 +468,11 @@ poly1305_blocks_avx:
andq $0x3ffffff,%rdx
shrq $14,%rbx
orq %r11,%r14
- shlq $24,%rbp
+ shlq $24,%r10
andq $0x3ffffff,%r14
shrq $40,%r12
andq $0x3ffffff,%rbx
- orq %r12,%rbp
+ orq %r12,%r10
subq $16,%r15
jz .Lstore_base2_26_avx
@@ -521,14 +481,14 @@ poly1305_blocks_avx:
vmovd %edx,%xmm1
vmovd %r14d,%xmm2
vmovd %ebx,%xmm3
- vmovd %ebp,%xmm4
+ vmovd %r10d,%xmm4
jmp .Lproceed_avx
.align 32
.Lstore_base2_64_avx:
movq %r14,0(%rdi)
movq %rbx,8(%rdi)
- movq %rbp,16(%rdi)
+ movq %r10,16(%rdi)
jmp .Ldone_avx
.align 16
@@ -537,49 +497,30 @@ poly1305_blocks_avx:
movl %edx,4(%rdi)
movl %r14d,8(%rdi)
movl %ebx,12(%rdi)
- movl %ebp,16(%rdi)
+ movl %r10d,16(%rdi)
.align 16
.Ldone_avx:
- movq 0(%rsp),%r15
-.cfi_restore %r15
- movq 8(%rsp),%r14
-.cfi_restore %r14
- movq 16(%rsp),%r13
-.cfi_restore %r13
- movq 24(%rsp),%r12
-.cfi_restore %r12
- movq 32(%rsp),%rbp
-.cfi_restore %rbp
+ movq 8(%rsp),%r15
+ movq 16(%rsp),%r14
+ movq 24(%rsp),%r13
+ movq 32(%rsp),%r12
movq 40(%rsp),%rbx
-.cfi_restore %rbx
leaq 48(%rsp),%rsp
-.cfi_adjust_cfa_offset -48
+
.Lno_data_avx:
.Lblocks_avx_epilogue:
- .byte 0xf3,0xc3
-.cfi_endproc
+ ret
.align 32
.Lbase2_64_avx:
-.cfi_startproc
+
pushq %rbx
-.cfi_adjust_cfa_offset 8
-.cfi_offset %rbx,-16
- pushq %rbp
-.cfi_adjust_cfa_offset 8
-.cfi_offset %rbp,-24
pushq %r12
-.cfi_adjust_cfa_offset 8
-.cfi_offset %r12,-32
pushq %r13
-.cfi_adjust_cfa_offset 8
-.cfi_offset %r13,-40
pushq %r14
-.cfi_adjust_cfa_offset 8
-.cfi_offset %r14,-48
pushq %r15
-.cfi_adjust_cfa_offset 8
-.cfi_offset %r15,-56
+ pushq %rdi
+
.Lbase2_64_avx_body:
movq %rdx,%r15
@@ -589,7 +530,7 @@ poly1305_blocks_avx:
movq 0(%rdi),%r14
movq 8(%rdi),%rbx
- movl 16(%rdi),%ebp
+ movl 16(%rdi),%r10d
movq %r13,%r12
movq %r13,%rax
@@ -602,10 +543,12 @@ poly1305_blocks_avx:
addq 0(%rsi),%r14
adcq 8(%rsi),%rbx
leaq 16(%rsi),%rsi
- adcq %rcx,%rbp
+ adcq %rcx,%r10
subq $16,%r15
- call __poly1305_block
+ movq %rdi,0(%rsp)
+ __poly1305_block
+ movq 0(%rsp),%rdi
.Linit_avx:
@@ -620,46 +563,38 @@ poly1305_blocks_avx:
andq $0x3ffffff,%rdx
shrq $14,%rbx
orq %r8,%r14
- shlq $24,%rbp
+ shlq $24,%r10
andq $0x3ffffff,%r14
shrq $40,%r9
andq $0x3ffffff,%rbx
- orq %r9,%rbp
+ orq %r9,%r10
vmovd %eax,%xmm0
vmovd %edx,%xmm1
vmovd %r14d,%xmm2
vmovd %ebx,%xmm3
- vmovd %ebp,%xmm4
+ vmovd %r10d,%xmm4
movl $1,20(%rdi)
- call __poly1305_init_avx
+ __poly1305_init_avx
.Lproceed_avx:
movq %r15,%rdx
- movq 0(%rsp),%r15
-.cfi_restore %r15
- movq 8(%rsp),%r14
-.cfi_restore %r14
- movq 16(%rsp),%r13
-.cfi_restore %r13
- movq 24(%rsp),%r12
-.cfi_restore %r12
- movq 32(%rsp),%rbp
-.cfi_restore %rbp
+ movq 8(%rsp),%r15
+ movq 16(%rsp),%r14
+ movq 24(%rsp),%r13
+ movq 32(%rsp),%r12
movq 40(%rsp),%rbx
-.cfi_restore %rbx
leaq 48(%rsp),%rax
leaq 48(%rsp),%rsp
-.cfi_adjust_cfa_offset -48
+
.Lbase2_64_avx_epilogue:
jmp .Ldo_avx
-.cfi_endproc
+
.align 32
.Leven_avx:
-.cfi_startproc
vmovd 0(%rdi),%xmm0
vmovd 4(%rdi),%xmm1
vmovd 8(%rdi),%xmm2
@@ -667,8 +602,10 @@ poly1305_blocks_avx:
vmovd 16(%rdi),%xmm4
.Ldo_avx:
+ leaq 8(%rsp),%r10
+ andq $-32,%rsp
+ subq $8,%rsp
leaq -88(%rsp),%r11
-.cfi_def_cfa %r11,0x60
subq $0x178,%rsp
subq $64,%rdx
leaq -32(%rsi),%rax
@@ -678,8 +615,6 @@ poly1305_blocks_avx:
leaq 112(%rdi),%rdi
leaq .Lconst(%rip),%rcx
-
-
vmovdqu 32(%rsi),%xmm5
vmovdqu 48(%rsi),%xmm6
vmovdqa 64(%rcx),%xmm15
@@ -754,25 +689,6 @@ poly1305_blocks_avx:
.align 32
.Loop_avx:
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
vpmuludq %xmm5,%xmm14,%xmm10
vpmuludq %xmm6,%xmm14,%xmm11
vmovdqa %xmm2,32(%r11)
@@ -866,15 +782,6 @@ poly1305_blocks_avx:
subq $64,%rdx
cmovcq %rax,%rsi
-
-
-
-
-
-
-
-
-
vpmuludq %xmm0,%xmm9,%xmm5
vpmuludq %xmm1,%xmm9,%xmm6
vpaddq %xmm5,%xmm10,%xmm10
@@ -957,10 +864,6 @@ poly1305_blocks_avx:
vpand %xmm15,%xmm8,%xmm8
vpor 32(%rcx),%xmm9,%xmm9
-
-
-
-
vpsrlq $26,%xmm3,%xmm13
vpand %xmm15,%xmm3,%xmm3
vpaddq %xmm13,%xmm4,%xmm4
@@ -995,9 +898,6 @@ poly1305_blocks_avx:
ja .Loop_avx
.Lskip_loop_avx:
-
-
-
vpshufd $0x10,%xmm14,%xmm14
addq $32,%rdx
jnz .Long_tail_avx
@@ -1015,12 +915,6 @@ poly1305_blocks_avx:
vmovdqa %xmm3,48(%r11)
vmovdqa %xmm4,64(%r11)
-
-
-
-
-
-
vpmuludq %xmm7,%xmm14,%xmm12
vpmuludq %xmm5,%xmm14,%xmm10
vpshufd $0x10,-48(%rdi),%xmm2
@@ -1107,9 +1001,6 @@ poly1305_blocks_avx:
vpaddq 48(%r11),%xmm3,%xmm3
vpaddq 64(%r11),%xmm4,%xmm4
-
-
-
vpmuludq %xmm0,%xmm9,%xmm5
vpaddq %xmm5,%xmm10,%xmm10
vpmuludq %xmm1,%xmm9,%xmm6
@@ -1175,8 +1066,6 @@ poly1305_blocks_avx:
.Lshort_tail_avx:
-
-
vpsrldq $8,%xmm14,%xmm9
vpsrldq $8,%xmm13,%xmm8
vpsrldq $8,%xmm11,%xmm6
@@ -1188,9 +1077,6 @@ poly1305_blocks_avx:
vpaddq %xmm6,%xmm11,%xmm11
vpaddq %xmm7,%xmm12,%xmm12
-
-
-
vpsrlq $26,%xmm13,%xmm3
vpand %xmm15,%xmm13,%xmm13
vpaddq %xmm3,%xmm14,%xmm14
@@ -1227,16 +1113,14 @@ poly1305_blocks_avx:
vmovd %xmm12,-104(%rdi)
vmovd %xmm13,-100(%rdi)
vmovd %xmm14,-96(%rdi)
- leaq 88(%r11),%rsp
-.cfi_def_cfa %rsp,8
+ leaq -8(%r10),%rsp
+
vzeroupper
- .byte 0xf3,0xc3
-.cfi_endproc
-.size poly1305_blocks_avx,.-poly1305_blocks_avx
+ ret
+ENDPROC(poly1305_blocks_avx)
-.type poly1305_emit_avx,@function
.align 32
-poly1305_emit_avx:
+ENTRY(poly1305_emit_avx)
cmpl $0,20(%rdi)
je .Lemit
@@ -1286,12 +1170,14 @@ poly1305_emit_avx:
movq %rax,0(%rsi)
movq %rcx,8(%rsi)
- .byte 0xf3,0xc3
-.size poly1305_emit_avx,.-poly1305_emit_avx
-.type poly1305_blocks_avx2,@function
+ ret
+ENDPROC(poly1305_emit_avx)
+#endif /* CONFIG_AS_AVX */
+
+#ifdef CONFIG_AS_AVX2
.align 32
-poly1305_blocks_avx2:
-.cfi_startproc
+ENTRY(poly1305_blocks_avx2)
+
movl 20(%rdi),%r8d
cmpq $128,%rdx
jae .Lblocks_avx2
@@ -1311,30 +1197,19 @@ poly1305_blocks_avx2:
jz .Leven_avx2
pushq %rbx
-.cfi_adjust_cfa_offset 8
-.cfi_offset %rbx,-16
- pushq %rbp
-.cfi_adjust_cfa_offset 8
-.cfi_offset %rbp,-24
pushq %r12
-.cfi_adjust_cfa_offset 8
-.cfi_offset %r12,-32
pushq %r13
-.cfi_adjust_cfa_offset 8
-.cfi_offset %r13,-40
pushq %r14
-.cfi_adjust_cfa_offset 8
-.cfi_offset %r14,-48
pushq %r15
-.cfi_adjust_cfa_offset 8
-.cfi_offset %r15,-56
+ pushq %rdi
+
.Lblocks_avx2_body:
movq %rdx,%r15
movq 0(%rdi),%r8
movq 8(%rdi),%r9
- movl 16(%rdi),%ebp
+ movl 16(%rdi),%r10d
movq 24(%rdi),%r11
movq 32(%rdi),%r13
@@ -1354,21 +1229,21 @@ poly1305_blocks_avx2:
addq %r12,%r14
adcq %r9,%rbx
- movq %rbp,%r8
+ movq %r10,%r8
shlq $40,%r8
- shrq $24,%rbp
+ shrq $24,%r10
addq %r8,%rbx
- adcq $0,%rbp
+ adcq $0,%r10
movq $-4,%r9
- movq %rbp,%r8
- andq %rbp,%r9
+ movq %r10,%r8
+ andq %r10,%r9
shrq $2,%r8
- andq $3,%rbp
+ andq $3,%r10
addq %r9,%r8
addq %r8,%r14
adcq $0,%rbx
- adcq $0,%rbp
+ adcq $0,%r10
movq %r13,%r12
movq %r13,%rax
@@ -1379,10 +1254,12 @@ poly1305_blocks_avx2:
addq 0(%rsi),%r14
adcq 8(%rsi),%rbx
leaq 16(%rsi),%rsi
- adcq %rcx,%rbp
+ adcq %rcx,%r10
subq $16,%r15
- call __poly1305_block
+ movq %rdi,0(%rsp)
+ __poly1305_block
+ movq 0(%rsp),%rdi
movq %r12,%rax
testq $63,%r15
@@ -1403,11 +1280,11 @@ poly1305_blocks_avx2:
andq $0x3ffffff,%rdx
shrq $14,%rbx
orq %r11,%r14
- shlq $24,%rbp
+ shlq $24,%r10
andq $0x3ffffff,%r14
shrq $40,%r12
andq $0x3ffffff,%rbx
- orq %r12,%rbp
+ orq %r12,%r10
testq %r15,%r15
jz .Lstore_base2_26_avx2
@@ -1416,14 +1293,14 @@ poly1305_blocks_avx2:
vmovd %edx,%xmm1
vmovd %r14d,%xmm2
vmovd %ebx,%xmm3
- vmovd %ebp,%xmm4
+ vmovd %r10d,%xmm4
jmp .Lproceed_avx2
.align 32
.Lstore_base2_64_avx2:
movq %r14,0(%rdi)
movq %rbx,8(%rdi)
- movq %rbp,16(%rdi)
+ movq %r10,16(%rdi)
jmp .Ldone_avx2
.align 16
@@ -1432,49 +1309,32 @@ poly1305_blocks_avx2:
movl %edx,4(%rdi)
movl %r14d,8(%rdi)
movl %ebx,12(%rdi)
- movl %ebp,16(%rdi)
+ movl %r10d,16(%rdi)
.align 16
.Ldone_avx2:
- movq 0(%rsp),%r15
-.cfi_restore %r15
- movq 8(%rsp),%r14
-.cfi_restore %r14
- movq 16(%rsp),%r13
-.cfi_restore %r13
- movq 24(%rsp),%r12
-.cfi_restore %r12
- movq 32(%rsp),%rbp
-.cfi_restore %rbp
+ movq 8(%rsp),%r15
+ movq 16(%rsp),%r14
+ movq 24(%rsp),%r13
+ movq 32(%rsp),%r12
movq 40(%rsp),%rbx
-.cfi_restore %rbx
leaq 48(%rsp),%rsp
-.cfi_adjust_cfa_offset -48
+
.Lno_data_avx2:
.Lblocks_avx2_epilogue:
- .byte 0xf3,0xc3
-.cfi_endproc
+ ret
+
.align 32
.Lbase2_64_avx2:
-.cfi_startproc
+
+
pushq %rbx
-.cfi_adjust_cfa_offset 8
-.cfi_offset %rbx,-16
- pushq %rbp
-.cfi_adjust_cfa_offset 8
-.cfi_offset %rbp,-24
pushq %r12
-.cfi_adjust_cfa_offset 8
-.cfi_offset %r12,-32
pushq %r13
-.cfi_adjust_cfa_offset 8
-.cfi_offset %r13,-40
pushq %r14
-.cfi_adjust_cfa_offset 8
-.cfi_offset %r14,-48
pushq %r15
-.cfi_adjust_cfa_offset 8
-.cfi_offset %r15,-56
+ pushq %rdi
+
.Lbase2_64_avx2_body:
movq %rdx,%r15
@@ -1484,7 +1344,7 @@ poly1305_blocks_avx2:
movq 0(%rdi),%r14
movq 8(%rdi),%rbx
- movl 16(%rdi),%ebp
+ movl 16(%rdi),%r10d
movq %r13,%r12
movq %r13,%rax
@@ -1498,10 +1358,12 @@ poly1305_blocks_avx2:
addq 0(%rsi),%r14
adcq 8(%rsi),%rbx
leaq 16(%rsi),%rsi
- adcq %rcx,%rbp
+ adcq %rcx,%r10
subq $16,%r15
- call __poly1305_block
+ movq %rdi,0(%rsp)
+ __poly1305_block
+ movq 0(%rsp),%rdi
movq %r12,%rax
testq $63,%r15
@@ -1520,49 +1382,39 @@ poly1305_blocks_avx2:
andq $0x3ffffff,%rdx
shrq $14,%rbx
orq %r8,%r14
- shlq $24,%rbp
+ shlq $24,%r10
andq $0x3ffffff,%r14
shrq $40,%r9
andq $0x3ffffff,%rbx
- orq %r9,%rbp
+ orq %r9,%r10
vmovd %eax,%xmm0
vmovd %edx,%xmm1
vmovd %r14d,%xmm2
vmovd %ebx,%xmm3
- vmovd %ebp,%xmm4
+ vmovd %r10d,%xmm4
movl $1,20(%rdi)
- call __poly1305_init_avx
+ __poly1305_init_avx
.Lproceed_avx2:
movq %r15,%rdx
- movl OPENSSL_ia32cap_P+8(%rip),%r10d
- movl $3221291008,%r11d
-
- movq 0(%rsp),%r15
-.cfi_restore %r15
- movq 8(%rsp),%r14
-.cfi_restore %r14
- movq 16(%rsp),%r13
-.cfi_restore %r13
- movq 24(%rsp),%r12
-.cfi_restore %r12
- movq 32(%rsp),%rbp
-.cfi_restore %rbp
+
+ movq 8(%rsp),%r15
+ movq 16(%rsp),%r14
+ movq 24(%rsp),%r13
+ movq 32(%rsp),%r12
movq 40(%rsp),%rbx
-.cfi_restore %rbx
leaq 48(%rsp),%rax
leaq 48(%rsp),%rsp
-.cfi_adjust_cfa_offset -48
+
.Lbase2_64_avx2_epilogue:
jmp .Ldo_avx2
-.cfi_endproc
+
.align 32
.Leven_avx2:
-.cfi_startproc
- movl OPENSSL_ia32cap_P+8(%rip),%r10d
+
vmovd 0(%rdi),%xmm0
vmovd 4(%rdi),%xmm1
vmovd 8(%rdi),%xmm2
@@ -1570,14 +1422,7 @@ poly1305_blocks_avx2:
vmovd 16(%rdi),%xmm4
.Ldo_avx2:
- cmpq $512,%rdx
- jb .Lskip_avx512
- andl %r11d,%r10d
- testl $65536,%r10d
- jnz .Lblocks_avx512
-.Lskip_avx512:
- leaq -8(%rsp),%r11
-.cfi_def_cfa %r11,16
+ leaq 8(%rsp),%r10
subq $0x128,%rsp
leaq .Lconst(%rip),%rcx
leaq 48+64(%rdi),%rdi
@@ -1647,13 +1492,6 @@ poly1305_blocks_avx2:
.align 32
.Loop_avx2:
-
-
-
-
-
-
-
vpaddq %ymm0,%ymm7,%ymm0
vmovdqa 0(%rsp),%ymm7
vpaddq %ymm1,%ymm8,%ymm1
@@ -1664,21 +1502,6 @@ poly1305_blocks_avx2:
vmovdqa 48(%rax),%ymm10
vmovdqa 112(%rax),%ymm5
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
vpmuludq %ymm2,%ymm7,%ymm13
vpmuludq %ymm2,%ymm8,%ymm14
vpmuludq %ymm2,%ymm9,%ymm15
@@ -1743,9 +1566,6 @@ poly1305_blocks_avx2:
vpaddq %ymm4,%ymm15,%ymm4
vpaddq %ymm0,%ymm11,%ymm0
-
-
-
vpsrlq $26,%ymm3,%ymm14
vpand %ymm5,%ymm3,%ymm3
vpaddq %ymm14,%ymm4,%ymm4
@@ -1798,12 +1618,6 @@ poly1305_blocks_avx2:
.byte 0x66,0x90
.Ltail_avx2:
-
-
-
-
-
-
vpaddq %ymm0,%ymm7,%ymm0
vmovdqu 4(%rsp),%ymm7
vpaddq %ymm1,%ymm8,%ymm1
@@ -1868,9 +1682,6 @@ poly1305_blocks_avx2:
vpaddq %ymm4,%ymm15,%ymm4
vpaddq %ymm0,%ymm11,%ymm0
-
-
-
vpsrldq $8,%ymm12,%ymm8
vpsrldq $8,%ymm2,%ymm9
vpsrldq $8,%ymm3,%ymm10
@@ -1893,9 +1704,6 @@ poly1305_blocks_avx2:
vpaddq %ymm8,%ymm12,%ymm12
vpaddq %ymm9,%ymm2,%ymm2
-
-
-
vpsrlq $26,%ymm3,%ymm14
vpand %ymm5,%ymm3,%ymm3
vpaddq %ymm14,%ymm4,%ymm4
@@ -1932,110 +1740,673 @@ poly1305_blocks_avx2:
vmovd %xmm2,-104(%rdi)
vmovd %xmm3,-100(%rdi)
vmovd %xmm4,-96(%rdi)
- leaq 8(%r11),%rsp
-.cfi_def_cfa %rsp,8
+ leaq -8(%r10),%rsp
+
vzeroupper
- .byte 0xf3,0xc3
-.cfi_endproc
-.size poly1305_blocks_avx2,.-poly1305_blocks_avx2
-.type poly1305_blocks_avx512,@function
+ ret
+
+ENDPROC(poly1305_blocks_avx2)
+#endif /* CONFIG_AS_AVX2 */
+
+#ifdef CONFIG_AS_AVX512
.align 32
-poly1305_blocks_avx512:
-.cfi_startproc
-.Lblocks_avx512:
- movl $15,%eax
- kmovw %eax,%k2
- leaq -8(%rsp),%r11
-.cfi_def_cfa %r11,16
- subq $0x128,%rsp
- leaq .Lconst(%rip),%rcx
- leaq 48+64(%rdi),%rdi
- vmovdqa 96(%rcx),%ymm9
+ENTRY(poly1305_blocks_avx512)
+ movl 20(%rdi),%r8d
+ cmpq $128,%rdx
+ jae .Lblocks_avx2_512
+ testl %r8d,%r8d
+ jz .Lblocks
- vmovdqu -64(%rdi),%xmm11
- andq $-512,%rsp
- vmovdqu -48(%rdi),%xmm12
- movq $0x20,%rax
- vmovdqu -32(%rdi),%xmm7
- vmovdqu -16(%rdi),%xmm13
- vmovdqu 0(%rdi),%xmm8
- vmovdqu 16(%rdi),%xmm14
- vmovdqu 32(%rdi),%xmm10
- vmovdqu 48(%rdi),%xmm15
- vmovdqu 64(%rdi),%xmm6
- vpermd %zmm11,%zmm9,%zmm16
- vpbroadcastq 64(%rcx),%zmm5
- vpermd %zmm12,%zmm9,%zmm17
- vpermd %zmm7,%zmm9,%zmm21
- vpermd %zmm13,%zmm9,%zmm18
- vmovdqa64 %zmm16,0(%rsp){%k2}
- vpsrlq $32,%zmm16,%zmm7
- vpermd %zmm8,%zmm9,%zmm22
- vmovdqu64 %zmm17,0(%rsp,%rax,1){%k2}
- vpsrlq $32,%zmm17,%zmm8
- vpermd %zmm14,%zmm9,%zmm19
- vmovdqa64 %zmm21,64(%rsp){%k2}
- vpermd %zmm10,%zmm9,%zmm23
- vpermd %zmm15,%zmm9,%zmm20
- vmovdqu64 %zmm18,64(%rsp,%rax,1){%k2}
- vpermd %zmm6,%zmm9,%zmm24
- vmovdqa64 %zmm22,128(%rsp){%k2}
- vmovdqu64 %zmm19,128(%rsp,%rax,1){%k2}
- vmovdqa64 %zmm23,192(%rsp){%k2}
- vmovdqu64 %zmm20,192(%rsp,%rax,1){%k2}
- vmovdqa64 %zmm24,256(%rsp){%k2}
+.Lblocks_avx2_512:
+ andq $-16,%rdx
+ jz .Lno_data_avx2_512
+ vzeroupper
+ testl %r8d,%r8d
+ jz .Lbase2_64_avx2_512
+ testq $63,%rdx
+ jz .Leven_avx2_512
+ pushq %rbx
+ pushq %r12
+ pushq %r13
+ pushq %r14
+ pushq %r15
+ pushq %rdi
+.Lblocks_avx2_body_512:
+ movq %rdx,%r15
+ movq 0(%rdi),%r8
+ movq 8(%rdi),%r9
+ movl 16(%rdi),%r10d
+ movq 24(%rdi),%r11
+ movq 32(%rdi),%r13
- vpmuludq %zmm7,%zmm16,%zmm11
- vpmuludq %zmm7,%zmm17,%zmm12
- vpmuludq %zmm7,%zmm18,%zmm13
- vpmuludq %zmm7,%zmm19,%zmm14
- vpmuludq %zmm7,%zmm20,%zmm15
- vpsrlq $32,%zmm18,%zmm9
+ movl %r8d,%r14d
+ andq $-2147483648,%r8
+ movq %r9,%r12
+ movl %r9d,%ebx
+ andq $-2147483648,%r9
- vpmuludq %zmm8,%zmm24,%zmm25
- vpmuludq %zmm8,%zmm16,%zmm26
- vpmuludq %zmm8,%zmm17,%zmm27
- vpmuludq %zmm8,%zmm18,%zmm28
- vpmuludq %zmm8,%zmm19,%zmm29
- vpsrlq $32,%zmm19,%zmm10
- vpaddq %zmm25,%zmm11,%zmm11
- vpaddq %zmm26,%zmm12,%zmm12
- vpaddq %zmm27,%zmm13,%zmm13
- vpaddq %zmm28,%zmm14,%zmm14
- vpaddq %zmm29,%zmm15,%zmm15
+ shrq $6,%r8
+ shlq $52,%r12
+ addq %r8,%r14
+ shrq $12,%rbx
+ shrq $18,%r9
+ addq %r12,%r14
+ adcq %r9,%rbx
- vpmuludq %zmm9,%zmm23,%zmm25
- vpmuludq %zmm9,%zmm24,%zmm26
- vpmuludq %zmm9,%zmm17,%zmm28
- vpmuludq %zmm9,%zmm18,%zmm29
- vpmuludq %zmm9,%zmm16,%zmm27
- vpsrlq $32,%zmm20,%zmm6
- vpaddq %zmm25,%zmm11,%zmm11
- vpaddq %zmm26,%zmm12,%zmm12
- vpaddq %zmm28,%zmm14,%zmm14
- vpaddq %zmm29,%zmm15,%zmm15
- vpaddq %zmm27,%zmm13,%zmm13
+ movq %r10,%r8
+ shlq $40,%r8
+ shrq $24,%r10
+ addq %r8,%rbx
+ adcq $0,%r10
- vpmuludq %zmm10,%zmm22,%zmm25
- vpmuludq %zmm10,%zmm16,%zmm28
- vpmuludq %zmm10,%zmm17,%zmm29
- vpmuludq %zmm10,%zmm23,%zmm26
- vpmuludq %zmm10,%zmm24,%zmm27
- vpaddq %zmm25,%zmm11,%zmm11
- vpaddq %zmm28,%zmm14,%zmm14
- vpaddq %zmm29,%zmm15,%zmm15
- vpaddq %zmm26,%zmm12,%zmm12
- vpaddq %zmm27,%zmm13,%zmm13
+ movq $-4,%r9
+ movq %r10,%r8
+ andq %r10,%r9
+ shrq $2,%r8
+ andq $3,%r10
+ addq %r9,%r8
+ addq %r8,%r14
+ adcq $0,%rbx
+ adcq $0,%r10
+
+ movq %r13,%r12
+ movq %r13,%rax
+ shrq $2,%r13
+ addq %r12,%r13
+
+.Lbase2_26_pre_avx2_512:
+ addq 0(%rsi),%r14
+ adcq 8(%rsi),%rbx
+ leaq 16(%rsi),%rsi
+ adcq %rcx,%r10
+ subq $16,%r15
+
+ movq %rdi,0(%rsp)
+ __poly1305_block
+ movq 0(%rsp),%rdi
+ movq %r12,%rax
+
+ testq $63,%r15
+ jnz .Lbase2_26_pre_avx2_512
+
+ testq %rcx,%rcx
+ jz .Lstore_base2_64_avx2_512
+
+
+ movq %r14,%rax
+ movq %r14,%rdx
+ shrq $52,%r14
+ movq %rbx,%r11
+ movq %rbx,%r12
+ shrq $26,%rdx
+ andq $0x3ffffff,%rax
+ shlq $12,%r11
+ andq $0x3ffffff,%rdx
+ shrq $14,%rbx
+ orq %r11,%r14
+ shlq $24,%r10
+ andq $0x3ffffff,%r14
+ shrq $40,%r12
+ andq $0x3ffffff,%rbx
+ orq %r12,%r10
+
+ testq %r15,%r15
+ jz .Lstore_base2_26_avx2_512
+
+ vmovd %eax,%xmm0
+ vmovd %edx,%xmm1
+ vmovd %r14d,%xmm2
+ vmovd %ebx,%xmm3
+ vmovd %r10d,%xmm4
+ jmp .Lproceed_avx2_512
+
+.align 32
+.Lstore_base2_64_avx2_512:
+ movq %r14,0(%rdi)
+ movq %rbx,8(%rdi)
+ movq %r10,16(%rdi)
+ jmp .Ldone_avx2_512
+
+.align 16
+.Lstore_base2_26_avx2_512:
+ movl %eax,0(%rdi)
+ movl %edx,4(%rdi)
+ movl %r14d,8(%rdi)
+ movl %ebx,12(%rdi)
+ movl %r10d,16(%rdi)
+.align 16
+.Ldone_avx2_512:
+ movq 8(%rsp),%r15
+ movq 16(%rsp),%r14
+ movq 24(%rsp),%r13
+ movq 32(%rsp),%r12
+ movq 40(%rsp),%rbx
+ leaq 48(%rsp),%rsp
+
+.Lno_data_avx2_512:
+.Lblocks_avx2_epilogue_512:
+ ret
+
+
+.align 32
+.Lbase2_64_avx2_512:
+
+ pushq %rbx
+ pushq %r12
+ pushq %r13
+ pushq %r14
+ pushq %r15
+ pushq %rdi
+
+.Lbase2_64_avx2_body_512:
+
+ movq %rdx,%r15
+
+ movq 24(%rdi),%r11
+ movq 32(%rdi),%r13
+
+ movq 0(%rdi),%r14
+ movq 8(%rdi),%rbx
+ movl 16(%rdi),%r10d
+
+ movq %r13,%r12
+ movq %r13,%rax
+ shrq $2,%r13
+ addq %r12,%r13
+
+ testq $63,%rdx
+ jz .Linit_avx2_512
+
+.Lbase2_64_pre_avx2_512:
+ addq 0(%rsi),%r14
+ adcq 8(%rsi),%rbx
+ leaq 16(%rsi),%rsi
+ adcq %rcx,%r10
+ subq $16,%r15
+
+ movq %rdi,0(%rsp)
+ __poly1305_block
+ movq 0(%rsp),%rdi
+ movq %r12,%rax
+
+ testq $63,%r15
+ jnz .Lbase2_64_pre_avx2_512
+
+.Linit_avx2_512:
+
+ movq %r14,%rax
+ movq %r14,%rdx
+ shrq $52,%r14
+ movq %rbx,%r8
+ movq %rbx,%r9
+ shrq $26,%rdx
+ andq $0x3ffffff,%rax
+ shlq $12,%r8
+ andq $0x3ffffff,%rdx
+ shrq $14,%rbx
+ orq %r8,%r14
+ shlq $24,%r10
+ andq $0x3ffffff,%r14
+ shrq $40,%r9
+ andq $0x3ffffff,%rbx
+ orq %r9,%r10
+
+ vmovd %eax,%xmm0
+ vmovd %edx,%xmm1
+ vmovd %r14d,%xmm2
+ vmovd %ebx,%xmm3
+ vmovd %r10d,%xmm4
+ movl $1,20(%rdi)
+
+ __poly1305_init_avx
+
+.Lproceed_avx2_512:
+ movq %r15,%rdx
+
+ movq 8(%rsp),%r15
+ movq 16(%rsp),%r14
+ movq 24(%rsp),%r13
+ movq 32(%rsp),%r12
+ movq 40(%rsp),%rbx
+ leaq 48(%rsp),%rax
+ leaq 48(%rsp),%rsp
+
+.Lbase2_64_avx2_epilogue_512:
+ jmp .Ldo_avx2_512
+
+
+.align 32
+.Leven_avx2_512:
+
+ vmovd 0(%rdi),%xmm0
+ vmovd 4(%rdi),%xmm1
+ vmovd 8(%rdi),%xmm2
+ vmovd 12(%rdi),%xmm3
+ vmovd 16(%rdi),%xmm4
+
+.Ldo_avx2_512:
+ cmpq $512,%rdx
+ jae .Lblocks_avx512
+.Lskip_avx512:
+ leaq 8(%rsp),%r10
+
+ subq $0x128,%rsp
+ leaq .Lconst(%rip),%rcx
+ leaq 48+64(%rdi),%rdi
+ vmovdqa 96(%rcx),%ymm7
+
+
+ vmovdqu -64(%rdi),%xmm9
+ andq $-512,%rsp
+ vmovdqu -48(%rdi),%xmm10
+ vmovdqu -32(%rdi),%xmm6
+ vmovdqu -16(%rdi),%xmm11
+ vmovdqu 0(%rdi),%xmm12
+ vmovdqu 16(%rdi),%xmm13
+ leaq 144(%rsp),%rax
+ vmovdqu 32(%rdi),%xmm14
+ vpermd %ymm9,%ymm7,%ymm9
+ vmovdqu 48(%rdi),%xmm15
+ vpermd %ymm10,%ymm7,%ymm10
+ vmovdqu 64(%rdi),%xmm5
+ vpermd %ymm6,%ymm7,%ymm6
+ vmovdqa %ymm9,0(%rsp)
+ vpermd %ymm11,%ymm7,%ymm11
+ vmovdqa %ymm10,32-144(%rax)
+ vpermd %ymm12,%ymm7,%ymm12
+ vmovdqa %ymm6,64-144(%rax)
+ vpermd %ymm13,%ymm7,%ymm13
+ vmovdqa %ymm11,96-144(%rax)
+ vpermd %ymm14,%ymm7,%ymm14
+ vmovdqa %ymm12,128-144(%rax)
+ vpermd %ymm15,%ymm7,%ymm15
+ vmovdqa %ymm13,160-144(%rax)
+ vpermd %ymm5,%ymm7,%ymm5
+ vmovdqa %ymm14,192-144(%rax)
+ vmovdqa %ymm15,224-144(%rax)
+ vmovdqa %ymm5,256-144(%rax)
+ vmovdqa 64(%rcx),%ymm5
+
+
+
+ vmovdqu 0(%rsi),%xmm7
+ vmovdqu 16(%rsi),%xmm8
+ vinserti128 $1,32(%rsi),%ymm7,%ymm7
+ vinserti128 $1,48(%rsi),%ymm8,%ymm8
+ leaq 64(%rsi),%rsi
+
+ vpsrldq $6,%ymm7,%ymm9
+ vpsrldq $6,%ymm8,%ymm10
+ vpunpckhqdq %ymm8,%ymm7,%ymm6
+ vpunpcklqdq %ymm10,%ymm9,%ymm9
+ vpunpcklqdq %ymm8,%ymm7,%ymm7
+
+ vpsrlq $30,%ymm9,%ymm10
+ vpsrlq $4,%ymm9,%ymm9
+ vpsrlq $26,%ymm7,%ymm8
+ vpsrlq $40,%ymm6,%ymm6
+ vpand %ymm5,%ymm9,%ymm9
+ vpand %ymm5,%ymm7,%ymm7
+ vpand %ymm5,%ymm8,%ymm8
+ vpand %ymm5,%ymm10,%ymm10
+ vpor 32(%rcx),%ymm6,%ymm6
+
+ vpaddq %ymm2,%ymm9,%ymm2
+ subq $64,%rdx
+ jz .Ltail_avx2_512
+ jmp .Loop_avx2_512
+
+.align 32
+.Loop_avx2_512:
+
+ vpaddq %ymm0,%ymm7,%ymm0
+ vmovdqa 0(%rsp),%ymm7
+ vpaddq %ymm1,%ymm8,%ymm1
+ vmovdqa 32(%rsp),%ymm8
+ vpaddq %ymm3,%ymm10,%ymm3
+ vmovdqa 96(%rsp),%ymm9
+ vpaddq %ymm4,%ymm6,%ymm4
+ vmovdqa 48(%rax),%ymm10
+ vmovdqa 112(%rax),%ymm5
+
+ vpmuludq %ymm2,%ymm7,%ymm13
+ vpmuludq %ymm2,%ymm8,%ymm14
+ vpmuludq %ymm2,%ymm9,%ymm15
+ vpmuludq %ymm2,%ymm10,%ymm11
+ vpmuludq %ymm2,%ymm5,%ymm12
+
+ vpmuludq %ymm0,%ymm8,%ymm6
+ vpmuludq %ymm1,%ymm8,%ymm2
+ vpaddq %ymm6,%ymm12,%ymm12
+ vpaddq %ymm2,%ymm13,%ymm13
+ vpmuludq %ymm3,%ymm8,%ymm6
+ vpmuludq 64(%rsp),%ymm4,%ymm2
+ vpaddq %ymm6,%ymm15,%ymm15
+ vpaddq %ymm2,%ymm11,%ymm11
+ vmovdqa -16(%rax),%ymm8
+
+ vpmuludq %ymm0,%ymm7,%ymm6
+ vpmuludq %ymm1,%ymm7,%ymm2
+ vpaddq %ymm6,%ymm11,%ymm11
+ vpaddq %ymm2,%ymm12,%ymm12
+ vpmuludq %ymm3,%ymm7,%ymm6
+ vpmuludq %ymm4,%ymm7,%ymm2
+ vmovdqu 0(%rsi),%xmm7
+ vpaddq %ymm6,%ymm14,%ymm14
+ vpaddq %ymm2,%ymm15,%ymm15
+ vinserti128 $1,32(%rsi),%ymm7,%ymm7
+
+ vpmuludq %ymm3,%ymm8,%ymm6
+ vpmuludq %ymm4,%ymm8,%ymm2
+ vmovdqu 16(%rsi),%xmm8
+ vpaddq %ymm6,%ymm11,%ymm11
+ vpaddq %ymm2,%ymm12,%ymm12
+ vmovdqa 16(%rax),%ymm2
+ vpmuludq %ymm1,%ymm9,%ymm6
+ vpmuludq %ymm0,%ymm9,%ymm9
+ vpaddq %ymm6,%ymm14,%ymm14
+ vpaddq %ymm9,%ymm13,%ymm13
+ vinserti128 $1,48(%rsi),%ymm8,%ymm8
+ leaq 64(%rsi),%rsi
+
+ vpmuludq %ymm1,%ymm2,%ymm6
+ vpmuludq %ymm0,%ymm2,%ymm2
+ vpsrldq $6,%ymm7,%ymm9
+ vpaddq %ymm6,%ymm15,%ymm15
+ vpaddq %ymm2,%ymm14,%ymm14
+ vpmuludq %ymm3,%ymm10,%ymm6
+ vpmuludq %ymm4,%ymm10,%ymm2
+ vpsrldq $6,%ymm8,%ymm10
+ vpaddq %ymm6,%ymm12,%ymm12
+ vpaddq %ymm2,%ymm13,%ymm13
+ vpunpckhqdq %ymm8,%ymm7,%ymm6
+
+ vpmuludq %ymm3,%ymm5,%ymm3
+ vpmuludq %ymm4,%ymm5,%ymm4
+ vpunpcklqdq %ymm8,%ymm7,%ymm7
+ vpaddq %ymm3,%ymm13,%ymm2
+ vpaddq %ymm4,%ymm14,%ymm3
+ vpunpcklqdq %ymm10,%ymm9,%ymm10
+ vpmuludq 80(%rax),%ymm0,%ymm4
+ vpmuludq %ymm1,%ymm5,%ymm0
+ vmovdqa 64(%rcx),%ymm5
+ vpaddq %ymm4,%ymm15,%ymm4
+ vpaddq %ymm0,%ymm11,%ymm0
+
+ vpsrlq $26,%ymm3,%ymm14
+ vpand %ymm5,%ymm3,%ymm3
+ vpaddq %ymm14,%ymm4,%ymm4
+
+ vpsrlq $26,%ymm0,%ymm11
+ vpand %ymm5,%ymm0,%ymm0
+ vpaddq %ymm11,%ymm12,%ymm1
+
+ vpsrlq $26,%ymm4,%ymm15
+ vpand %ymm5,%ymm4,%ymm4
+
+ vpsrlq $4,%ymm10,%ymm9
+
+ vpsrlq $26,%ymm1,%ymm12
+ vpand %ymm5,%ymm1,%ymm1
+ vpaddq %ymm12,%ymm2,%ymm2
+
+ vpaddq %ymm15,%ymm0,%ymm0
+ vpsllq $2,%ymm15,%ymm15
+ vpaddq %ymm15,%ymm0,%ymm0
+
+ vpand %ymm5,%ymm9,%ymm9
+ vpsrlq $26,%ymm7,%ymm8
+
+ vpsrlq $26,%ymm2,%ymm13
+ vpand %ymm5,%ymm2,%ymm2
+ vpaddq %ymm13,%ymm3,%ymm3
+
+ vpaddq %ymm9,%ymm2,%ymm2
+ vpsrlq $30,%ymm10,%ymm10
+
+ vpsrlq $26,%ymm0,%ymm11
+ vpand %ymm5,%ymm0,%ymm0
+ vpaddq %ymm11,%ymm1,%ymm1
+
+ vpsrlq $40,%ymm6,%ymm6
+
+ vpsrlq $26,%ymm3,%ymm14
+ vpand %ymm5,%ymm3,%ymm3
+ vpaddq %ymm14,%ymm4,%ymm4
+
+ vpand %ymm5,%ymm7,%ymm7
+ vpand %ymm5,%ymm8,%ymm8
+ vpand %ymm5,%ymm10,%ymm10
+ vpor 32(%rcx),%ymm6,%ymm6
+
+ subq $64,%rdx
+ jnz .Loop_avx2_512
+
+.byte 0x66,0x90
+.Ltail_avx2_512:
+
+ vpaddq %ymm0,%ymm7,%ymm0
+ vmovdqu 4(%rsp),%ymm7
+ vpaddq %ymm1,%ymm8,%ymm1
+ vmovdqu 36(%rsp),%ymm8
+ vpaddq %ymm3,%ymm10,%ymm3
+ vmovdqu 100(%rsp),%ymm9
+ vpaddq %ymm4,%ymm6,%ymm4
+ vmovdqu 52(%rax),%ymm10
+ vmovdqu 116(%rax),%ymm5
+
+ vpmuludq %ymm2,%ymm7,%ymm13
+ vpmuludq %ymm2,%ymm8,%ymm14
+ vpmuludq %ymm2,%ymm9,%ymm15
+ vpmuludq %ymm2,%ymm10,%ymm11
+ vpmuludq %ymm2,%ymm5,%ymm12
+
+ vpmuludq %ymm0,%ymm8,%ymm6
+ vpmuludq %ymm1,%ymm8,%ymm2
+ vpaddq %ymm6,%ymm12,%ymm12
+ vpaddq %ymm2,%ymm13,%ymm13
+ vpmuludq %ymm3,%ymm8,%ymm6
+ vpmuludq 68(%rsp),%ymm4,%ymm2
+ vpaddq %ymm6,%ymm15,%ymm15
+ vpaddq %ymm2,%ymm11,%ymm11
+
+ vpmuludq %ymm0,%ymm7,%ymm6
+ vpmuludq %ymm1,%ymm7,%ymm2
+ vpaddq %ymm6,%ymm11,%ymm11
+ vmovdqu -12(%rax),%ymm8
+ vpaddq %ymm2,%ymm12,%ymm12
+ vpmuludq %ymm3,%ymm7,%ymm6
+ vpmuludq %ymm4,%ymm7,%ymm2
+ vpaddq %ymm6,%ymm14,%ymm14
+ vpaddq %ymm2,%ymm15,%ymm15
+
+ vpmuludq %ymm3,%ymm8,%ymm6
+ vpmuludq %ymm4,%ymm8,%ymm2
+ vpaddq %ymm6,%ymm11,%ymm11
+ vpaddq %ymm2,%ymm12,%ymm12
+ vmovdqu 20(%rax),%ymm2
+ vpmuludq %ymm1,%ymm9,%ymm6
+ vpmuludq %ymm0,%ymm9,%ymm9
+ vpaddq %ymm6,%ymm14,%ymm14
+ vpaddq %ymm9,%ymm13,%ymm13
+
+ vpmuludq %ymm1,%ymm2,%ymm6
+ vpmuludq %ymm0,%ymm2,%ymm2
+ vpaddq %ymm6,%ymm15,%ymm15
+ vpaddq %ymm2,%ymm14,%ymm14
+ vpmuludq %ymm3,%ymm10,%ymm6
+ vpmuludq %ymm4,%ymm10,%ymm2
+ vpaddq %ymm6,%ymm12,%ymm12
+ vpaddq %ymm2,%ymm13,%ymm13
+
+ vpmuludq %ymm3,%ymm5,%ymm3
+ vpmuludq %ymm4,%ymm5,%ymm4
+ vpaddq %ymm3,%ymm13,%ymm2
+ vpaddq %ymm4,%ymm14,%ymm3
+ vpmuludq 84(%rax),%ymm0,%ymm4
+ vpmuludq %ymm1,%ymm5,%ymm0
+ vmovdqa 64(%rcx),%ymm5
+ vpaddq %ymm4,%ymm15,%ymm4
+ vpaddq %ymm0,%ymm11,%ymm0
+
+ vpsrldq $8,%ymm12,%ymm8
+ vpsrldq $8,%ymm2,%ymm9
+ vpsrldq $8,%ymm3,%ymm10
+ vpsrldq $8,%ymm4,%ymm6
+ vpsrldq $8,%ymm0,%ymm7
+ vpaddq %ymm8,%ymm12,%ymm12
+ vpaddq %ymm9,%ymm2,%ymm2
+ vpaddq %ymm10,%ymm3,%ymm3
+ vpaddq %ymm6,%ymm4,%ymm4
+ vpaddq %ymm7,%ymm0,%ymm0
+
+ vpermq $0x2,%ymm3,%ymm10
+ vpermq $0x2,%ymm4,%ymm6
+ vpermq $0x2,%ymm0,%ymm7
+ vpermq $0x2,%ymm12,%ymm8
+ vpermq $0x2,%ymm2,%ymm9
+ vpaddq %ymm10,%ymm3,%ymm3
+ vpaddq %ymm6,%ymm4,%ymm4
+ vpaddq %ymm7,%ymm0,%ymm0
+ vpaddq %ymm8,%ymm12,%ymm12
+ vpaddq %ymm9,%ymm2,%ymm2
+
+ vpsrlq $26,%ymm3,%ymm14
+ vpand %ymm5,%ymm3,%ymm3
+ vpaddq %ymm14,%ymm4,%ymm4
+
+ vpsrlq $26,%ymm0,%ymm11
+ vpand %ymm5,%ymm0,%ymm0
+ vpaddq %ymm11,%ymm12,%ymm1
+
+ vpsrlq $26,%ymm4,%ymm15
+ vpand %ymm5,%ymm4,%ymm4
+
+ vpsrlq $26,%ymm1,%ymm12
+ vpand %ymm5,%ymm1,%ymm1
+ vpaddq %ymm12,%ymm2,%ymm2
+
+ vpaddq %ymm15,%ymm0,%ymm0
+ vpsllq $2,%ymm15,%ymm15
+ vpaddq %ymm15,%ymm0,%ymm0
+
+ vpsrlq $26,%ymm2,%ymm13
+ vpand %ymm5,%ymm2,%ymm2
+ vpaddq %ymm13,%ymm3,%ymm3
+
+ vpsrlq $26,%ymm0,%ymm11
+ vpand %ymm5,%ymm0,%ymm0
+ vpaddq %ymm11,%ymm1,%ymm1
+
+ vpsrlq $26,%ymm3,%ymm14
+ vpand %ymm5,%ymm3,%ymm3
+ vpaddq %ymm14,%ymm4,%ymm4
+
+ vmovd %xmm0,-112(%rdi)
+ vmovd %xmm1,-108(%rdi)
+ vmovd %xmm2,-104(%rdi)
+ vmovd %xmm3,-100(%rdi)
+ vmovd %xmm4,-96(%rdi)
+ leaq -8(%r10),%rsp
+
+ vzeroupper
+ ret
+
+.Lblocks_avx512:
+
+ movl $15,%eax
+ kmovw %eax,%k2
+ leaq 8(%rsp),%r10
+
+ subq $0x128,%rsp
+ leaq .Lconst(%rip),%rcx
+ leaq 48+64(%rdi),%rdi
+ vmovdqa 96(%rcx),%ymm9
+
+ vmovdqu32 -64(%rdi),%zmm16{%k2}{z}
+ andq $-512,%rsp
+ vmovdqu32 -48(%rdi),%zmm17{%k2}{z}
+ movq $0x20,%rax
+ vmovdqu32 -32(%rdi),%zmm21{%k2}{z}
+ vmovdqu32 -16(%rdi),%zmm18{%k2}{z}
+ vmovdqu32 0(%rdi),%zmm22{%k2}{z}
+ vmovdqu32 16(%rdi),%zmm19{%k2}{z}
+ vmovdqu32 32(%rdi),%zmm23{%k2}{z}
+ vmovdqu32 48(%rdi),%zmm20{%k2}{z}
+ vmovdqu32 64(%rdi),%zmm24{%k2}{z}
+ vpermd %zmm16,%zmm9,%zmm16
+ vpbroadcastq 64(%rcx),%zmm5
+ vpermd %zmm17,%zmm9,%zmm17
+ vpermd %zmm21,%zmm9,%zmm21
+ vpermd %zmm18,%zmm9,%zmm18
+ vmovdqa64 %zmm16,0(%rsp){%k2}
+ vpsrlq $32,%zmm16,%zmm7
+ vpermd %zmm22,%zmm9,%zmm22
+ vmovdqu64 %zmm17,0(%rsp,%rax,1){%k2}
+ vpsrlq $32,%zmm17,%zmm8
+ vpermd %zmm19,%zmm9,%zmm19
+ vmovdqa64 %zmm21,64(%rsp){%k2}
+ vpermd %zmm23,%zmm9,%zmm23
+ vpermd %zmm20,%zmm9,%zmm20
+ vmovdqu64 %zmm18,64(%rsp,%rax,1){%k2}
+ vpermd %zmm24,%zmm9,%zmm24
+ vmovdqa64 %zmm22,128(%rsp){%k2}
+ vmovdqu64 %zmm19,128(%rsp,%rax,1){%k2}
+ vmovdqa64 %zmm23,192(%rsp){%k2}
+ vmovdqu64 %zmm20,192(%rsp,%rax,1){%k2}
+ vmovdqa64 %zmm24,256(%rsp){%k2}
+
+ vpmuludq %zmm7,%zmm16,%zmm11
+ vpmuludq %zmm7,%zmm17,%zmm12
+ vpmuludq %zmm7,%zmm18,%zmm13
+ vpmuludq %zmm7,%zmm19,%zmm14
+ vpmuludq %zmm7,%zmm20,%zmm15
+ vpsrlq $32,%zmm18,%zmm9
+
+ vpmuludq %zmm8,%zmm24,%zmm25
+ vpmuludq %zmm8,%zmm16,%zmm26
+ vpmuludq %zmm8,%zmm17,%zmm27
+ vpmuludq %zmm8,%zmm18,%zmm28
+ vpmuludq %zmm8,%zmm19,%zmm29
+ vpsrlq $32,%zmm19,%zmm10
+ vpaddq %zmm25,%zmm11,%zmm11
+ vpaddq %zmm26,%zmm12,%zmm12
+ vpaddq %zmm27,%zmm13,%zmm13
+ vpaddq %zmm28,%zmm14,%zmm14
+ vpaddq %zmm29,%zmm15,%zmm15
+
+ vpmuludq %zmm9,%zmm23,%zmm25
+ vpmuludq %zmm9,%zmm24,%zmm26
+ vpmuludq %zmm9,%zmm17,%zmm28
+ vpmuludq %zmm9,%zmm18,%zmm29
+ vpmuludq %zmm9,%zmm16,%zmm27
+ vpsrlq $32,%zmm20,%zmm6
+ vpaddq %zmm25,%zmm11,%zmm11
+ vpaddq %zmm26,%zmm12,%zmm12
+ vpaddq %zmm28,%zmm14,%zmm14
+ vpaddq %zmm29,%zmm15,%zmm15
+ vpaddq %zmm27,%zmm13,%zmm13
+
+ vpmuludq %zmm10,%zmm22,%zmm25
+ vpmuludq %zmm10,%zmm16,%zmm28
+ vpmuludq %zmm10,%zmm17,%zmm29
+ vpmuludq %zmm10,%zmm23,%zmm26
+ vpmuludq %zmm10,%zmm24,%zmm27
+ vpaddq %zmm25,%zmm11,%zmm11
+ vpaddq %zmm28,%zmm14,%zmm14
+ vpaddq %zmm29,%zmm15,%zmm15
+ vpaddq %zmm26,%zmm12,%zmm12
+ vpaddq %zmm27,%zmm13,%zmm13
vpmuludq %zmm6,%zmm24,%zmm28
vpmuludq %zmm6,%zmm16,%zmm29
@@ -2048,15 +2419,10 @@ poly1305_blocks_avx512:
vpaddq %zmm26,%zmm12,%zmm12
vpaddq %zmm27,%zmm13,%zmm13
-
-
vmovdqu64 0(%rsi),%zmm10
vmovdqu64 64(%rsi),%zmm6
leaq 128(%rsi),%rsi
-
-
-
vpsrlq $26,%zmm14,%zmm28
vpandq %zmm5,%zmm14,%zmm14
vpaddq %zmm28,%zmm15,%zmm15
@@ -2088,18 +2454,9 @@ poly1305_blocks_avx512:
vpandq %zmm5,%zmm14,%zmm14
vpaddq %zmm28,%zmm15,%zmm15
-
-
-
-
vpunpcklqdq %zmm6,%zmm10,%zmm7
vpunpckhqdq %zmm6,%zmm10,%zmm6
-
-
-
-
-
vmovdqa32 128(%rcx),%zmm25
movl $0x7777,%eax
kmovw %eax,%k1
@@ -2136,9 +2493,6 @@ poly1305_blocks_avx512:
vpandq %zmm5,%zmm9,%zmm9
vpandq %zmm5,%zmm7,%zmm7
-
-
-
vpaddq %zmm2,%zmm9,%zmm2
subq $192,%rdx
jbe .Ltail_avx512
@@ -2147,33 +2501,6 @@ poly1305_blocks_avx512:
.align 32
.Loop_avx512:
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
vpmuludq %zmm2,%zmm17,%zmm14
vpaddq %zmm0,%zmm7,%zmm0
vpmuludq %zmm2,%zmm18,%zmm15
@@ -2238,9 +2565,6 @@ poly1305_blocks_avx512:
vpaddq %zmm26,%zmm12,%zmm1
vpaddq %zmm27,%zmm13,%zmm2
-
-
-
vpsrlq $52,%zmm7,%zmm9
vpsllq $12,%zmm6,%zmm10
@@ -2288,18 +2612,11 @@ poly1305_blocks_avx512:
vpandq %zmm5,%zmm7,%zmm7
-
-
-
subq $128,%rdx
ja .Loop_avx512
.Ltail_avx512:
-
-
-
-
vpsrlq $32,%zmm16,%zmm16
vpsrlq $32,%zmm17,%zmm17
vpsrlq $32,%zmm18,%zmm18
@@ -2310,11 +2627,8 @@ poly1305_blocks_avx512:
vpsrlq $32,%zmm21,%zmm21
vpsrlq $32,%zmm22,%zmm22
-
-
leaq (%rsi,%rdx,1),%rsi
-
vpaddq %zmm0,%zmm7,%zmm0
vpmuludq %zmm2,%zmm17,%zmm14
@@ -2378,9 +2692,6 @@ poly1305_blocks_avx512:
vpaddq %zmm26,%zmm12,%zmm1
vpaddq %zmm27,%zmm13,%zmm2
-
-
-
movl $1,%eax
vpermq $0xb1,%zmm3,%zmm14
vpermq $0xb1,%zmm15,%zmm4
@@ -2416,8 +2727,6 @@ poly1305_blocks_avx512:
vpaddq %zmm12,%zmm1,%zmm1{%k3}{z}
vpaddq %zmm13,%zmm2,%zmm2{%k3}{z}
-
-
vpsrlq $26,%ymm3,%ymm14
vpand %ymm5,%ymm3,%ymm3
vpsrldq $6,%ymm7,%ymm9
@@ -2466,7 +2775,7 @@ poly1305_blocks_avx512:
leaq 144(%rsp),%rax
addq $64,%rdx
- jnz .Ltail_avx2
+ jnz .Ltail_avx2_512
vpsubq %ymm9,%ymm2,%ymm2
vmovd %xmm0,-112(%rdi)
@@ -2475,1091 +2784,9 @@ poly1305_blocks_avx512:
vmovd %xmm3,-100(%rdi)
vmovd %xmm4,-96(%rdi)
vzeroall
- leaq 8(%r11),%rsp
-.cfi_def_cfa %rsp,8
- .byte 0xf3,0xc3
-.cfi_endproc
-.size poly1305_blocks_avx512,.-poly1305_blocks_avx512
-.type poly1305_init_base2_44,@function
-.align 32
-poly1305_init_base2_44:
- xorq %rax,%rax
- movq %rax,0(%rdi)
- movq %rax,8(%rdi)
- movq %rax,16(%rdi)
-
-.Linit_base2_44:
- leaq poly1305_blocks_vpmadd52(%rip),%r10
- leaq poly1305_emit_base2_44(%rip),%r11
-
- movq $0x0ffffffc0fffffff,%rax
- movq $0x0ffffffc0ffffffc,%rcx
- andq 0(%rsi),%rax
- movq $0x00000fffffffffff,%r8
- andq 8(%rsi),%rcx
- movq $0x00000fffffffffff,%r9
- andq %rax,%r8
- shrdq $44,%rcx,%rax
- movq %r8,40(%rdi)
- andq %r9,%rax
- shrq $24,%rcx
- movq %rax,48(%rdi)
- leaq (%rax,%rax,4),%rax
- movq %rcx,56(%rdi)
- shlq $2,%rax
- leaq (%rcx,%rcx,4),%rcx
- shlq $2,%rcx
- movq %rax,24(%rdi)
- movq %rcx,32(%rdi)
- movq $-1,64(%rdi)
- movq %r10,0(%rdx)
- movq %r11,8(%rdx)
- movl $1,%eax
- .byte 0xf3,0xc3
-.size poly1305_init_base2_44,.-poly1305_init_base2_44
-.type poly1305_blocks_vpmadd52,@function
-.align 32
-poly1305_blocks_vpmadd52:
- shrq $4,%rdx
- jz .Lno_data_vpmadd52
-
- shlq $40,%rcx
- movq 64(%rdi),%r8
-
-
-
-
-
-
- movq $3,%rax
- movq $1,%r10
- cmpq $4,%rdx
- cmovaeq %r10,%rax
- testq %r8,%r8
- cmovnsq %r10,%rax
-
- andq %rdx,%rax
- jz .Lblocks_vpmadd52_4x
-
- subq %rax,%rdx
- movl $7,%r10d
- movl $1,%r11d
- kmovw %r10d,%k7
- leaq .L2_44_inp_permd(%rip),%r10
- kmovw %r11d,%k1
-
- vmovq %rcx,%xmm21
- vmovdqa64 0(%r10),%ymm19
- vmovdqa64 32(%r10),%ymm20
- vpermq $0xcf,%ymm21,%ymm21
- vmovdqa64 64(%r10),%ymm22
-
- vmovdqu64 0(%rdi),%ymm16{%k7}{z}
- vmovdqu64 40(%rdi),%ymm3{%k7}{z}
- vmovdqu64 32(%rdi),%ymm4{%k7}{z}
- vmovdqu64 24(%rdi),%ymm5{%k7}{z}
-
- vmovdqa64 96(%r10),%ymm23
- vmovdqa64 128(%r10),%ymm24
-
- jmp .Loop_vpmadd52
-
-.align 32
-.Loop_vpmadd52:
- vmovdqu32 0(%rsi),%xmm18
- leaq 16(%rsi),%rsi
-
- vpermd %ymm18,%ymm19,%ymm18
- vpsrlvq %ymm20,%ymm18,%ymm18
- vpandq %ymm22,%ymm18,%ymm18
- vporq %ymm21,%ymm18,%ymm18
-
- vpaddq %ymm18,%ymm16,%ymm16
-
- vpermq $0,%ymm16,%ymm0{%k7}{z}
- vpermq $85,%ymm16,%ymm1{%k7}{z}
- vpermq $170,%ymm16,%ymm2{%k7}{z}
-
- vpxord %ymm16,%ymm16,%ymm16
- vpxord %ymm17,%ymm17,%ymm17
-
- vpmadd52luq %ymm3,%ymm0,%ymm16
- vpmadd52huq %ymm3,%ymm0,%ymm17
-
- vpmadd52luq %ymm4,%ymm1,%ymm16
- vpmadd52huq %ymm4,%ymm1,%ymm17
-
- vpmadd52luq %ymm5,%ymm2,%ymm16
- vpmadd52huq %ymm5,%ymm2,%ymm17
-
- vpsrlvq %ymm23,%ymm16,%ymm18
- vpsllvq %ymm24,%ymm17,%ymm17
- vpandq %ymm22,%ymm16,%ymm16
-
- vpaddq %ymm18,%ymm17,%ymm17
-
- vpermq $147,%ymm17,%ymm17
-
- vpaddq %ymm17,%ymm16,%ymm16
-
- vpsrlvq %ymm23,%ymm16,%ymm18
- vpandq %ymm22,%ymm16,%ymm16
-
- vpermq $147,%ymm18,%ymm18
-
- vpaddq %ymm18,%ymm16,%ymm16
-
- vpermq $147,%ymm16,%ymm18{%k1}{z}
-
- vpaddq %ymm18,%ymm16,%ymm16
- vpsllq $2,%ymm18,%ymm18
-
- vpaddq %ymm18,%ymm16,%ymm16
-
- decq %rax
- jnz .Loop_vpmadd52
-
- vmovdqu64 %ymm16,0(%rdi){%k7}
-
- testq %rdx,%rdx
- jnz .Lblocks_vpmadd52_4x
-
-.Lno_data_vpmadd52:
- .byte 0xf3,0xc3
-.size poly1305_blocks_vpmadd52,.-poly1305_blocks_vpmadd52
-.type poly1305_blocks_vpmadd52_4x,@function
-.align 32
-poly1305_blocks_vpmadd52_4x:
- shrq $4,%rdx
- jz .Lno_data_vpmadd52_4x
-
- shlq $40,%rcx
- movq 64(%rdi),%r8
-
-.Lblocks_vpmadd52_4x:
- vpbroadcastq %rcx,%ymm31
-
- vmovdqa64 .Lx_mask44(%rip),%ymm28
- movl $5,%eax
- vmovdqa64 .Lx_mask42(%rip),%ymm29
- kmovw %eax,%k1
-
- testq %r8,%r8
- js .Linit_vpmadd52
-
- vmovq 0(%rdi),%xmm0
- vmovq 8(%rdi),%xmm1
- vmovq 16(%rdi),%xmm2
-
- testq $3,%rdx
- jnz .Lblocks_vpmadd52_2x_do
-
-.Lblocks_vpmadd52_4x_do:
- vpbroadcastq 64(%rdi),%ymm3
- vpbroadcastq 96(%rdi),%ymm4
- vpbroadcastq 128(%rdi),%ymm5
- vpbroadcastq 160(%rdi),%ymm16
-
-.Lblocks_vpmadd52_4x_key_loaded:
- vpsllq $2,%ymm5,%ymm17
- vpaddq %ymm5,%ymm17,%ymm17
- vpsllq $2,%ymm17,%ymm17
-
- testq $7,%rdx
- jz .Lblocks_vpmadd52_8x
-
- vmovdqu64 0(%rsi),%ymm26
- vmovdqu64 32(%rsi),%ymm27
- leaq 64(%rsi),%rsi
-
- vpunpcklqdq %ymm27,%ymm26,%ymm25
- vpunpckhqdq %ymm27,%ymm26,%ymm27
-
-
-
- vpsrlq $24,%ymm27,%ymm26
- vporq %ymm31,%ymm26,%ymm26
- vpaddq %ymm26,%ymm2,%ymm2
- vpandq %ymm28,%ymm25,%ymm24
- vpsrlq $44,%ymm25,%ymm25
- vpsllq $20,%ymm27,%ymm27
- vporq %ymm27,%ymm25,%ymm25
- vpandq %ymm28,%ymm25,%ymm25
-
- subq $4,%rdx
- jz .Ltail_vpmadd52_4x
- jmp .Loop_vpmadd52_4x
- ud2
-
-.align 32
-.Linit_vpmadd52:
- vmovq 24(%rdi),%xmm16
- vmovq 56(%rdi),%xmm2
- vmovq 32(%rdi),%xmm17
- vmovq 40(%rdi),%xmm3
- vmovq 48(%rdi),%xmm4
-
- vmovdqa %ymm3,%ymm0
- vmovdqa %ymm4,%ymm1
- vmovdqa %ymm2,%ymm5
-
- movl $2,%eax
-
-.Lmul_init_vpmadd52:
- vpxorq %ymm18,%ymm18,%ymm18
- vpmadd52luq %ymm2,%ymm16,%ymm18
- vpxorq %ymm19,%ymm19,%ymm19
- vpmadd52huq %ymm2,%ymm16,%ymm19
- vpxorq %ymm20,%ymm20,%ymm20
- vpmadd52luq %ymm2,%ymm17,%ymm20
- vpxorq %ymm21,%ymm21,%ymm21
- vpmadd52huq %ymm2,%ymm17,%ymm21
- vpxorq %ymm22,%ymm22,%ymm22
- vpmadd52luq %ymm2,%ymm3,%ymm22
- vpxorq %ymm23,%ymm23,%ymm23
- vpmadd52huq %ymm2,%ymm3,%ymm23
-
- vpmadd52luq %ymm0,%ymm3,%ymm18
- vpmadd52huq %ymm0,%ymm3,%ymm19
- vpmadd52luq %ymm0,%ymm4,%ymm20
- vpmadd52huq %ymm0,%ymm4,%ymm21
- vpmadd52luq %ymm0,%ymm5,%ymm22
- vpmadd52huq %ymm0,%ymm5,%ymm23
-
- vpmadd52luq %ymm1,%ymm17,%ymm18
- vpmadd52huq %ymm1,%ymm17,%ymm19
- vpmadd52luq %ymm1,%ymm3,%ymm20
- vpmadd52huq %ymm1,%ymm3,%ymm21
- vpmadd52luq %ymm1,%ymm4,%ymm22
- vpmadd52huq %ymm1,%ymm4,%ymm23
-
-
-
- vpsrlq $44,%ymm18,%ymm30
- vpsllq $8,%ymm19,%ymm19
- vpandq %ymm28,%ymm18,%ymm0
- vpaddq %ymm30,%ymm19,%ymm19
-
- vpaddq %ymm19,%ymm20,%ymm20
-
- vpsrlq $44,%ymm20,%ymm30
- vpsllq $8,%ymm21,%ymm21
- vpandq %ymm28,%ymm20,%ymm1
- vpaddq %ymm30,%ymm21,%ymm21
-
- vpaddq %ymm21,%ymm22,%ymm22
-
- vpsrlq $42,%ymm22,%ymm30
- vpsllq $10,%ymm23,%ymm23
- vpandq %ymm29,%ymm22,%ymm2
- vpaddq %ymm30,%ymm23,%ymm23
-
- vpaddq %ymm23,%ymm0,%ymm0
- vpsllq $2,%ymm23,%ymm23
-
- vpaddq %ymm23,%ymm0,%ymm0
-
- vpsrlq $44,%ymm0,%ymm30
- vpandq %ymm28,%ymm0,%ymm0
-
- vpaddq %ymm30,%ymm1,%ymm1
-
- decl %eax
- jz .Ldone_init_vpmadd52
-
- vpunpcklqdq %ymm4,%ymm1,%ymm4
- vpbroadcastq %xmm1,%xmm1
- vpunpcklqdq %ymm5,%ymm2,%ymm5
- vpbroadcastq %xmm2,%xmm2
- vpunpcklqdq %ymm3,%ymm0,%ymm3
- vpbroadcastq %xmm0,%xmm0
-
- vpsllq $2,%ymm4,%ymm16
- vpsllq $2,%ymm5,%ymm17
- vpaddq %ymm4,%ymm16,%ymm16
- vpaddq %ymm5,%ymm17,%ymm17
- vpsllq $2,%ymm16,%ymm16
- vpsllq $2,%ymm17,%ymm17
-
- jmp .Lmul_init_vpmadd52
- ud2
-
-.align 32
-.Ldone_init_vpmadd52:
- vinserti128 $1,%xmm4,%ymm1,%ymm4
- vinserti128 $1,%xmm5,%ymm2,%ymm5
- vinserti128 $1,%xmm3,%ymm0,%ymm3
-
- vpermq $216,%ymm4,%ymm4
- vpermq $216,%ymm5,%ymm5
- vpermq $216,%ymm3,%ymm3
-
- vpsllq $2,%ymm4,%ymm16
- vpaddq %ymm4,%ymm16,%ymm16
- vpsllq $2,%ymm16,%ymm16
-
- vmovq 0(%rdi),%xmm0
- vmovq 8(%rdi),%xmm1
- vmovq 16(%rdi),%xmm2
-
- testq $3,%rdx
- jnz .Ldone_init_vpmadd52_2x
-
- vmovdqu64 %ymm3,64(%rdi)
- vpbroadcastq %xmm3,%ymm3
- vmovdqu64 %ymm4,96(%rdi)
- vpbroadcastq %xmm4,%ymm4
- vmovdqu64 %ymm5,128(%rdi)
- vpbroadcastq %xmm5,%ymm5
- vmovdqu64 %ymm16,160(%rdi)
- vpbroadcastq %xmm16,%ymm16
-
- jmp .Lblocks_vpmadd52_4x_key_loaded
- ud2
-
-.align 32
-.Ldone_init_vpmadd52_2x:
- vmovdqu64 %ymm3,64(%rdi)
- vpsrldq $8,%ymm3,%ymm3
- vmovdqu64 %ymm4,96(%rdi)
- vpsrldq $8,%ymm4,%ymm4
- vmovdqu64 %ymm5,128(%rdi)
- vpsrldq $8,%ymm5,%ymm5
- vmovdqu64 %ymm16,160(%rdi)
- vpsrldq $8,%ymm16,%ymm16
- jmp .Lblocks_vpmadd52_2x_key_loaded
- ud2
-
-.align 32
-.Lblocks_vpmadd52_2x_do:
- vmovdqu64 128+8(%rdi),%ymm5{%k1}{z}
- vmovdqu64 160+8(%rdi),%ymm16{%k1}{z}
- vmovdqu64 64+8(%rdi),%ymm3{%k1}{z}
- vmovdqu64 96+8(%rdi),%ymm4{%k1}{z}
-
-.Lblocks_vpmadd52_2x_key_loaded:
- vmovdqu64 0(%rsi),%ymm26
- vpxorq %ymm27,%ymm27,%ymm27
- leaq 32(%rsi),%rsi
-
- vpunpcklqdq %ymm27,%ymm26,%ymm25
- vpunpckhqdq %ymm27,%ymm26,%ymm27
-
-
-
- vpsrlq $24,%ymm27,%ymm26
- vporq %ymm31,%ymm26,%ymm26
- vpaddq %ymm26,%ymm2,%ymm2
- vpandq %ymm28,%ymm25,%ymm24
- vpsrlq $44,%ymm25,%ymm25
- vpsllq $20,%ymm27,%ymm27
- vporq %ymm27,%ymm25,%ymm25
- vpandq %ymm28,%ymm25,%ymm25
-
- jmp .Ltail_vpmadd52_2x
- ud2
-
-.align 32
-.Loop_vpmadd52_4x:
-
- vpaddq %ymm24,%ymm0,%ymm0
- vpaddq %ymm25,%ymm1,%ymm1
-
- vpxorq %ymm18,%ymm18,%ymm18
- vpmadd52luq %ymm2,%ymm16,%ymm18
- vpxorq %ymm19,%ymm19,%ymm19
- vpmadd52huq %ymm2,%ymm16,%ymm19
- vpxorq %ymm20,%ymm20,%ymm20
- vpmadd52luq %ymm2,%ymm17,%ymm20
- vpxorq %ymm21,%ymm21,%ymm21
- vpmadd52huq %ymm2,%ymm17,%ymm21
- vpxorq %ymm22,%ymm22,%ymm22
- vpmadd52luq %ymm2,%ymm3,%ymm22
- vpxorq %ymm23,%ymm23,%ymm23
- vpmadd52huq %ymm2,%ymm3,%ymm23
-
- vmovdqu64 0(%rsi),%ymm26
- vmovdqu64 32(%rsi),%ymm27
- leaq 64(%rsi),%rsi
- vpmadd52luq %ymm0,%ymm3,%ymm18
- vpmadd52huq %ymm0,%ymm3,%ymm19
- vpmadd52luq %ymm0,%ymm4,%ymm20
- vpmadd52huq %ymm0,%ymm4,%ymm21
- vpmadd52luq %ymm0,%ymm5,%ymm22
- vpmadd52huq %ymm0,%ymm5,%ymm23
-
- vpunpcklqdq %ymm27,%ymm26,%ymm25
- vpunpckhqdq %ymm27,%ymm26,%ymm27
- vpmadd52luq %ymm1,%ymm17,%ymm18
- vpmadd52huq %ymm1,%ymm17,%ymm19
- vpmadd52luq %ymm1,%ymm3,%ymm20
- vpmadd52huq %ymm1,%ymm3,%ymm21
- vpmadd52luq %ymm1,%ymm4,%ymm22
- vpmadd52huq %ymm1,%ymm4,%ymm23
-
-
-
- vpsrlq $44,%ymm18,%ymm30
- vpsllq $8,%ymm19,%ymm19
- vpandq %ymm28,%ymm18,%ymm0
- vpaddq %ymm30,%ymm19,%ymm19
-
- vpsrlq $24,%ymm27,%ymm26
- vporq %ymm31,%ymm26,%ymm26
- vpaddq %ymm19,%ymm20,%ymm20
-
- vpsrlq $44,%ymm20,%ymm30
- vpsllq $8,%ymm21,%ymm21
- vpandq %ymm28,%ymm20,%ymm1
- vpaddq %ymm30,%ymm21,%ymm21
-
- vpandq %ymm28,%ymm25,%ymm24
- vpsrlq $44,%ymm25,%ymm25
- vpsllq $20,%ymm27,%ymm27
- vpaddq %ymm21,%ymm22,%ymm22
-
- vpsrlq $42,%ymm22,%ymm30
- vpsllq $10,%ymm23,%ymm23
- vpandq %ymm29,%ymm22,%ymm2
- vpaddq %ymm30,%ymm23,%ymm23
-
- vpaddq %ymm26,%ymm2,%ymm2
- vpaddq %ymm23,%ymm0,%ymm0
- vpsllq $2,%ymm23,%ymm23
-
- vpaddq %ymm23,%ymm0,%ymm0
- vporq %ymm27,%ymm25,%ymm25
- vpandq %ymm28,%ymm25,%ymm25
-
- vpsrlq $44,%ymm0,%ymm30
- vpandq %ymm28,%ymm0,%ymm0
-
- vpaddq %ymm30,%ymm1,%ymm1
-
- subq $4,%rdx
- jnz .Loop_vpmadd52_4x
-
-.Ltail_vpmadd52_4x:
- vmovdqu64 128(%rdi),%ymm5
- vmovdqu64 160(%rdi),%ymm16
- vmovdqu64 64(%rdi),%ymm3
- vmovdqu64 96(%rdi),%ymm4
-
-.Ltail_vpmadd52_2x:
- vpsllq $2,%ymm5,%ymm17
- vpaddq %ymm5,%ymm17,%ymm17
- vpsllq $2,%ymm17,%ymm17
-
-
- vpaddq %ymm24,%ymm0,%ymm0
- vpaddq %ymm25,%ymm1,%ymm1
-
- vpxorq %ymm18,%ymm18,%ymm18
- vpmadd52luq %ymm2,%ymm16,%ymm18
- vpxorq %ymm19,%ymm19,%ymm19
- vpmadd52huq %ymm2,%ymm16,%ymm19
- vpxorq %ymm20,%ymm20,%ymm20
- vpmadd52luq %ymm2,%ymm17,%ymm20
- vpxorq %ymm21,%ymm21,%ymm21
- vpmadd52huq %ymm2,%ymm17,%ymm21
- vpxorq %ymm22,%ymm22,%ymm22
- vpmadd52luq %ymm2,%ymm3,%ymm22
- vpxorq %ymm23,%ymm23,%ymm23
- vpmadd52huq %ymm2,%ymm3,%ymm23
-
- vpmadd52luq %ymm0,%ymm3,%ymm18
- vpmadd52huq %ymm0,%ymm3,%ymm19
- vpmadd52luq %ymm0,%ymm4,%ymm20
- vpmadd52huq %ymm0,%ymm4,%ymm21
- vpmadd52luq %ymm0,%ymm5,%ymm22
- vpmadd52huq %ymm0,%ymm5,%ymm23
-
- vpmadd52luq %ymm1,%ymm17,%ymm18
- vpmadd52huq %ymm1,%ymm17,%ymm19
- vpmadd52luq %ymm1,%ymm3,%ymm20
- vpmadd52huq %ymm1,%ymm3,%ymm21
- vpmadd52luq %ymm1,%ymm4,%ymm22
- vpmadd52huq %ymm1,%ymm4,%ymm23
-
-
-
-
- movl $1,%eax
- kmovw %eax,%k1
- vpsrldq $8,%ymm18,%ymm24
- vpsrldq $8,%ymm19,%ymm0
- vpsrldq $8,%ymm20,%ymm25
- vpsrldq $8,%ymm21,%ymm1
- vpaddq %ymm24,%ymm18,%ymm18
- vpaddq %ymm0,%ymm19,%ymm19
- vpsrldq $8,%ymm22,%ymm26
- vpsrldq $8,%ymm23,%ymm2
- vpaddq %ymm25,%ymm20,%ymm20
- vpaddq %ymm1,%ymm21,%ymm21
- vpermq $0x2,%ymm18,%ymm24
- vpermq $0x2,%ymm19,%ymm0
- vpaddq %ymm26,%ymm22,%ymm22
- vpaddq %ymm2,%ymm23,%ymm23
-
- vpermq $0x2,%ymm20,%ymm25
- vpermq $0x2,%ymm21,%ymm1
- vpaddq %ymm24,%ymm18,%ymm18{%k1}{z}
- vpaddq %ymm0,%ymm19,%ymm19{%k1}{z}
- vpermq $0x2,%ymm22,%ymm26
- vpermq $0x2,%ymm23,%ymm2
- vpaddq %ymm25,%ymm20,%ymm20{%k1}{z}
- vpaddq %ymm1,%ymm21,%ymm21{%k1}{z}
- vpaddq %ymm26,%ymm22,%ymm22{%k1}{z}
- vpaddq %ymm2,%ymm23,%ymm23{%k1}{z}
-
-
-
- vpsrlq $44,%ymm18,%ymm30
- vpsllq $8,%ymm19,%ymm19
- vpandq %ymm28,%ymm18,%ymm0
- vpaddq %ymm30,%ymm19,%ymm19
-
- vpaddq %ymm19,%ymm20,%ymm20
-
- vpsrlq $44,%ymm20,%ymm30
- vpsllq $8,%ymm21,%ymm21
- vpandq %ymm28,%ymm20,%ymm1
- vpaddq %ymm30,%ymm21,%ymm21
-
- vpaddq %ymm21,%ymm22,%ymm22
-
- vpsrlq $42,%ymm22,%ymm30
- vpsllq $10,%ymm23,%ymm23
- vpandq %ymm29,%ymm22,%ymm2
- vpaddq %ymm30,%ymm23,%ymm23
-
- vpaddq %ymm23,%ymm0,%ymm0
- vpsllq $2,%ymm23,%ymm23
-
- vpaddq %ymm23,%ymm0,%ymm0
-
- vpsrlq $44,%ymm0,%ymm30
- vpandq %ymm28,%ymm0,%ymm0
-
- vpaddq %ymm30,%ymm1,%ymm1
-
-
- subq $2,%rdx
- ja .Lblocks_vpmadd52_4x_do
-
- vmovq %xmm0,0(%rdi)
- vmovq %xmm1,8(%rdi)
- vmovq %xmm2,16(%rdi)
- vzeroall
-
-.Lno_data_vpmadd52_4x:
- .byte 0xf3,0xc3
-.size poly1305_blocks_vpmadd52_4x,.-poly1305_blocks_vpmadd52_4x
-.type poly1305_blocks_vpmadd52_8x,@function
-.align 32
-poly1305_blocks_vpmadd52_8x:
- shrq $4,%rdx
- jz .Lno_data_vpmadd52_8x
-
- shlq $40,%rcx
- movq 64(%rdi),%r8
-
- vmovdqa64 .Lx_mask44(%rip),%ymm28
- vmovdqa64 .Lx_mask42(%rip),%ymm29
-
- testq %r8,%r8
- js .Linit_vpmadd52
-
- vmovq 0(%rdi),%xmm0
- vmovq 8(%rdi),%xmm1
- vmovq 16(%rdi),%xmm2
-
-.Lblocks_vpmadd52_8x:
-
-
-
- vmovdqu64 128(%rdi),%ymm5
- vmovdqu64 160(%rdi),%ymm16
- vmovdqu64 64(%rdi),%ymm3
- vmovdqu64 96(%rdi),%ymm4
-
- vpsllq $2,%ymm5,%ymm17
- vpaddq %ymm5,%ymm17,%ymm17
- vpsllq $2,%ymm17,%ymm17
-
- vpbroadcastq %xmm5,%ymm8
- vpbroadcastq %xmm3,%ymm6
- vpbroadcastq %xmm4,%ymm7
-
- vpxorq %ymm18,%ymm18,%ymm18
- vpmadd52luq %ymm8,%ymm16,%ymm18
- vpxorq %ymm19,%ymm19,%ymm19
- vpmadd52huq %ymm8,%ymm16,%ymm19
- vpxorq %ymm20,%ymm20,%ymm20
- vpmadd52luq %ymm8,%ymm17,%ymm20
- vpxorq %ymm21,%ymm21,%ymm21
- vpmadd52huq %ymm8,%ymm17,%ymm21
- vpxorq %ymm22,%ymm22,%ymm22
- vpmadd52luq %ymm8,%ymm3,%ymm22
- vpxorq %ymm23,%ymm23,%ymm23
- vpmadd52huq %ymm8,%ymm3,%ymm23
-
- vpmadd52luq %ymm6,%ymm3,%ymm18
- vpmadd52huq %ymm6,%ymm3,%ymm19
- vpmadd52luq %ymm6,%ymm4,%ymm20
- vpmadd52huq %ymm6,%ymm4,%ymm21
- vpmadd52luq %ymm6,%ymm5,%ymm22
- vpmadd52huq %ymm6,%ymm5,%ymm23
-
- vpmadd52luq %ymm7,%ymm17,%ymm18
- vpmadd52huq %ymm7,%ymm17,%ymm19
- vpmadd52luq %ymm7,%ymm3,%ymm20
- vpmadd52huq %ymm7,%ymm3,%ymm21
- vpmadd52luq %ymm7,%ymm4,%ymm22
- vpmadd52huq %ymm7,%ymm4,%ymm23
-
-
-
- vpsrlq $44,%ymm18,%ymm30
- vpsllq $8,%ymm19,%ymm19
- vpandq %ymm28,%ymm18,%ymm6
- vpaddq %ymm30,%ymm19,%ymm19
-
- vpaddq %ymm19,%ymm20,%ymm20
-
- vpsrlq $44,%ymm20,%ymm30
- vpsllq $8,%ymm21,%ymm21
- vpandq %ymm28,%ymm20,%ymm7
- vpaddq %ymm30,%ymm21,%ymm21
-
- vpaddq %ymm21,%ymm22,%ymm22
-
- vpsrlq $42,%ymm22,%ymm30
- vpsllq $10,%ymm23,%ymm23
- vpandq %ymm29,%ymm22,%ymm8
- vpaddq %ymm30,%ymm23,%ymm23
-
- vpaddq %ymm23,%ymm6,%ymm6
- vpsllq $2,%ymm23,%ymm23
-
- vpaddq %ymm23,%ymm6,%ymm6
-
- vpsrlq $44,%ymm6,%ymm30
- vpandq %ymm28,%ymm6,%ymm6
-
- vpaddq %ymm30,%ymm7,%ymm7
-
-
-
-
-
- vpunpcklqdq %ymm5,%ymm8,%ymm26
- vpunpckhqdq %ymm5,%ymm8,%ymm5
- vpunpcklqdq %ymm3,%ymm6,%ymm24
- vpunpckhqdq %ymm3,%ymm6,%ymm3
- vpunpcklqdq %ymm4,%ymm7,%ymm25
- vpunpckhqdq %ymm4,%ymm7,%ymm4
- vshufi64x2 $0x44,%zmm5,%zmm26,%zmm8
- vshufi64x2 $0x44,%zmm3,%zmm24,%zmm6
- vshufi64x2 $0x44,%zmm4,%zmm25,%zmm7
-
- vmovdqu64 0(%rsi),%zmm26
- vmovdqu64 64(%rsi),%zmm27
- leaq 128(%rsi),%rsi
-
- vpsllq $2,%zmm8,%zmm10
- vpsllq $2,%zmm7,%zmm9
- vpaddq %zmm8,%zmm10,%zmm10
- vpaddq %zmm7,%zmm9,%zmm9
- vpsllq $2,%zmm10,%zmm10
- vpsllq $2,%zmm9,%zmm9
-
- vpbroadcastq %rcx,%zmm31
- vpbroadcastq %xmm28,%zmm28
- vpbroadcastq %xmm29,%zmm29
-
- vpbroadcastq %xmm9,%zmm16
- vpbroadcastq %xmm10,%zmm17
- vpbroadcastq %xmm6,%zmm3
- vpbroadcastq %xmm7,%zmm4
- vpbroadcastq %xmm8,%zmm5
-
- vpunpcklqdq %zmm27,%zmm26,%zmm25
- vpunpckhqdq %zmm27,%zmm26,%zmm27
-
-
-
- vpsrlq $24,%zmm27,%zmm26
- vporq %zmm31,%zmm26,%zmm26
- vpaddq %zmm26,%zmm2,%zmm2
- vpandq %zmm28,%zmm25,%zmm24
- vpsrlq $44,%zmm25,%zmm25
- vpsllq $20,%zmm27,%zmm27
- vporq %zmm27,%zmm25,%zmm25
- vpandq %zmm28,%zmm25,%zmm25
-
- subq $8,%rdx
- jz .Ltail_vpmadd52_8x
- jmp .Loop_vpmadd52_8x
-
-.align 32
-.Loop_vpmadd52_8x:
-
- vpaddq %zmm24,%zmm0,%zmm0
- vpaddq %zmm25,%zmm1,%zmm1
-
- vpxorq %zmm18,%zmm18,%zmm18
- vpmadd52luq %zmm2,%zmm16,%zmm18
- vpxorq %zmm19,%zmm19,%zmm19
- vpmadd52huq %zmm2,%zmm16,%zmm19
- vpxorq %zmm20,%zmm20,%zmm20
- vpmadd52luq %zmm2,%zmm17,%zmm20
- vpxorq %zmm21,%zmm21,%zmm21
- vpmadd52huq %zmm2,%zmm17,%zmm21
- vpxorq %zmm22,%zmm22,%zmm22
- vpmadd52luq %zmm2,%zmm3,%zmm22
- vpxorq %zmm23,%zmm23,%zmm23
- vpmadd52huq %zmm2,%zmm3,%zmm23
-
- vmovdqu64 0(%rsi),%zmm26
- vmovdqu64 64(%rsi),%zmm27
- leaq 128(%rsi),%rsi
- vpmadd52luq %zmm0,%zmm3,%zmm18
- vpmadd52huq %zmm0,%zmm3,%zmm19
- vpmadd52luq %zmm0,%zmm4,%zmm20
- vpmadd52huq %zmm0,%zmm4,%zmm21
- vpmadd52luq %zmm0,%zmm5,%zmm22
- vpmadd52huq %zmm0,%zmm5,%zmm23
-
- vpunpcklqdq %zmm27,%zmm26,%zmm25
- vpunpckhqdq %zmm27,%zmm26,%zmm27
- vpmadd52luq %zmm1,%zmm17,%zmm18
- vpmadd52huq %zmm1,%zmm17,%zmm19
- vpmadd52luq %zmm1,%zmm3,%zmm20
- vpmadd52huq %zmm1,%zmm3,%zmm21
- vpmadd52luq %zmm1,%zmm4,%zmm22
- vpmadd52huq %zmm1,%zmm4,%zmm23
-
-
-
- vpsrlq $44,%zmm18,%zmm30
- vpsllq $8,%zmm19,%zmm19
- vpandq %zmm28,%zmm18,%zmm0
- vpaddq %zmm30,%zmm19,%zmm19
-
- vpsrlq $24,%zmm27,%zmm26
- vporq %zmm31,%zmm26,%zmm26
- vpaddq %zmm19,%zmm20,%zmm20
-
- vpsrlq $44,%zmm20,%zmm30
- vpsllq $8,%zmm21,%zmm21
- vpandq %zmm28,%zmm20,%zmm1
- vpaddq %zmm30,%zmm21,%zmm21
-
- vpandq %zmm28,%zmm25,%zmm24
- vpsrlq $44,%zmm25,%zmm25
- vpsllq $20,%zmm27,%zmm27
- vpaddq %zmm21,%zmm22,%zmm22
-
- vpsrlq $42,%zmm22,%zmm30
- vpsllq $10,%zmm23,%zmm23
- vpandq %zmm29,%zmm22,%zmm2
- vpaddq %zmm30,%zmm23,%zmm23
-
- vpaddq %zmm26,%zmm2,%zmm2
- vpaddq %zmm23,%zmm0,%zmm0
- vpsllq $2,%zmm23,%zmm23
-
- vpaddq %zmm23,%zmm0,%zmm0
- vporq %zmm27,%zmm25,%zmm25
- vpandq %zmm28,%zmm25,%zmm25
-
- vpsrlq $44,%zmm0,%zmm30
- vpandq %zmm28,%zmm0,%zmm0
-
- vpaddq %zmm30,%zmm1,%zmm1
-
- subq $8,%rdx
- jnz .Loop_vpmadd52_8x
-
-.Ltail_vpmadd52_8x:
-
- vpaddq %zmm24,%zmm0,%zmm0
- vpaddq %zmm25,%zmm1,%zmm1
-
- vpxorq %zmm18,%zmm18,%zmm18
- vpmadd52luq %zmm2,%zmm9,%zmm18
- vpxorq %zmm19,%zmm19,%zmm19
- vpmadd52huq %zmm2,%zmm9,%zmm19
- vpxorq %zmm20,%zmm20,%zmm20
- vpmadd52luq %zmm2,%zmm10,%zmm20
- vpxorq %zmm21,%zmm21,%zmm21
- vpmadd52huq %zmm2,%zmm10,%zmm21
- vpxorq %zmm22,%zmm22,%zmm22
- vpmadd52luq %zmm2,%zmm6,%zmm22
- vpxorq %zmm23,%zmm23,%zmm23
- vpmadd52huq %zmm2,%zmm6,%zmm23
-
- vpmadd52luq %zmm0,%zmm6,%zmm18
- vpmadd52huq %zmm0,%zmm6,%zmm19
- vpmadd52luq %zmm0,%zmm7,%zmm20
- vpmadd52huq %zmm0,%zmm7,%zmm21
- vpmadd52luq %zmm0,%zmm8,%zmm22
- vpmadd52huq %zmm0,%zmm8,%zmm23
-
- vpmadd52luq %zmm1,%zmm10,%zmm18
- vpmadd52huq %zmm1,%zmm10,%zmm19
- vpmadd52luq %zmm1,%zmm6,%zmm20
- vpmadd52huq %zmm1,%zmm6,%zmm21
- vpmadd52luq %zmm1,%zmm7,%zmm22
- vpmadd52huq %zmm1,%zmm7,%zmm23
-
-
-
-
- movl $1,%eax
- kmovw %eax,%k1
- vpsrldq $8,%zmm18,%zmm24
- vpsrldq $8,%zmm19,%zmm0
- vpsrldq $8,%zmm20,%zmm25
- vpsrldq $8,%zmm21,%zmm1
- vpaddq %zmm24,%zmm18,%zmm18
- vpaddq %zmm0,%zmm19,%zmm19
- vpsrldq $8,%zmm22,%zmm26
- vpsrldq $8,%zmm23,%zmm2
- vpaddq %zmm25,%zmm20,%zmm20
- vpaddq %zmm1,%zmm21,%zmm21
- vpermq $0x2,%zmm18,%zmm24
- vpermq $0x2,%zmm19,%zmm0
- vpaddq %zmm26,%zmm22,%zmm22
- vpaddq %zmm2,%zmm23,%zmm23
-
- vpermq $0x2,%zmm20,%zmm25
- vpermq $0x2,%zmm21,%zmm1
- vpaddq %zmm24,%zmm18,%zmm18
- vpaddq %zmm0,%zmm19,%zmm19
- vpermq $0x2,%zmm22,%zmm26
- vpermq $0x2,%zmm23,%zmm2
- vpaddq %zmm25,%zmm20,%zmm20
- vpaddq %zmm1,%zmm21,%zmm21
- vextracti64x4 $1,%zmm18,%ymm24
- vextracti64x4 $1,%zmm19,%ymm0
- vpaddq %zmm26,%zmm22,%zmm22
- vpaddq %zmm2,%zmm23,%zmm23
-
- vextracti64x4 $1,%zmm20,%ymm25
- vextracti64x4 $1,%zmm21,%ymm1
- vextracti64x4 $1,%zmm22,%ymm26
- vextracti64x4 $1,%zmm23,%ymm2
- vpaddq %ymm24,%ymm18,%ymm18{%k1}{z}
- vpaddq %ymm0,%ymm19,%ymm19{%k1}{z}
- vpaddq %ymm25,%ymm20,%ymm20{%k1}{z}
- vpaddq %ymm1,%ymm21,%ymm21{%k1}{z}
- vpaddq %ymm26,%ymm22,%ymm22{%k1}{z}
- vpaddq %ymm2,%ymm23,%ymm23{%k1}{z}
-
-
-
- vpsrlq $44,%ymm18,%ymm30
- vpsllq $8,%ymm19,%ymm19
- vpandq %ymm28,%ymm18,%ymm0
- vpaddq %ymm30,%ymm19,%ymm19
-
- vpaddq %ymm19,%ymm20,%ymm20
-
- vpsrlq $44,%ymm20,%ymm30
- vpsllq $8,%ymm21,%ymm21
- vpandq %ymm28,%ymm20,%ymm1
- vpaddq %ymm30,%ymm21,%ymm21
-
- vpaddq %ymm21,%ymm22,%ymm22
-
- vpsrlq $42,%ymm22,%ymm30
- vpsllq $10,%ymm23,%ymm23
- vpandq %ymm29,%ymm22,%ymm2
- vpaddq %ymm30,%ymm23,%ymm23
-
- vpaddq %ymm23,%ymm0,%ymm0
- vpsllq $2,%ymm23,%ymm23
-
- vpaddq %ymm23,%ymm0,%ymm0
-
- vpsrlq $44,%ymm0,%ymm30
- vpandq %ymm28,%ymm0,%ymm0
-
- vpaddq %ymm30,%ymm1,%ymm1
-
-
-
- vmovq %xmm0,0(%rdi)
- vmovq %xmm1,8(%rdi)
- vmovq %xmm2,16(%rdi)
- vzeroall
-
-.Lno_data_vpmadd52_8x:
- .byte 0xf3,0xc3
-.size poly1305_blocks_vpmadd52_8x,.-poly1305_blocks_vpmadd52_8x
-.type poly1305_emit_base2_44,@function
-.align 32
-poly1305_emit_base2_44:
- movq 0(%rdi),%r8
- movq 8(%rdi),%r9
- movq 16(%rdi),%r10
-
- movq %r9,%rax
- shrq $20,%r9
- shlq $44,%rax
- movq %r10,%rcx
- shrq $40,%r10
- shlq $24,%rcx
-
- addq %rax,%r8
- adcq %rcx,%r9
- adcq $0,%r10
-
- movq %r8,%rax
- addq $5,%r8
- movq %r9,%rcx
- adcq $0,%r9
- adcq $0,%r10
- shrq $2,%r10
- cmovnzq %r8,%rax
- cmovnzq %r9,%rcx
-
- addq 0(%rdx),%rax
- adcq 8(%rdx),%rcx
- movq %rax,0(%rsi)
- movq %rcx,8(%rsi)
-
- .byte 0xf3,0xc3
-.size poly1305_emit_base2_44,.-poly1305_emit_base2_44
-.align 64
-.Lconst:
-.Lmask24:
-.long 0x0ffffff,0,0x0ffffff,0,0x0ffffff,0,0x0ffffff,0
-.L129:
-.long 16777216,0,16777216,0,16777216,0,16777216,0
-.Lmask26:
-.long 0x3ffffff,0,0x3ffffff,0,0x3ffffff,0,0x3ffffff,0
-.Lpermd_avx2:
-.long 2,2,2,3,2,0,2,1
-.Lpermd_avx512:
-.long 0,0,0,1, 0,2,0,3, 0,4,0,5, 0,6,0,7
+ leaq -8(%r10),%rsp
-.L2_44_inp_permd:
-.long 0,1,1,2,2,3,7,7
-.L2_44_inp_shift:
-.quad 0,12,24,64
-.L2_44_mask:
-.quad 0xfffffffffff,0xfffffffffff,0x3ffffffffff,0xffffffffffffffff
-.L2_44_shift_rgt:
-.quad 44,44,42,64
-.L2_44_shift_lft:
-.quad 8,8,10,64
+ ret
-.align 64
-.Lx_mask44:
-.quad 0xfffffffffff,0xfffffffffff,0xfffffffffff,0xfffffffffff
-.quad 0xfffffffffff,0xfffffffffff,0xfffffffffff,0xfffffffffff
-.Lx_mask42:
-.quad 0x3ffffffffff,0x3ffffffffff,0x3ffffffffff,0x3ffffffffff
-.quad 0x3ffffffffff,0x3ffffffffff,0x3ffffffffff,0x3ffffffffff
-.byte 80,111,108,121,49,51,48,53,32,102,111,114,32,120,56,54,95,54,52,44,32,67,82,89,80,84,79,71,65,77,83,32,98,121,32,60,97,112,112,114,111,64,111,112,101,110,115,115,108,46,111,114,103,62,0
-.align 16
-.globl xor128_encrypt_n_pad
-.type xor128_encrypt_n_pad,@function
-.align 16
-xor128_encrypt_n_pad:
- subq %rdx,%rsi
- subq %rdx,%rdi
- movq %rcx,%r10
- shrq $4,%rcx
- jz .Ltail_enc
- nop
-.Loop_enc_xmm:
- movdqu (%rsi,%rdx,1),%xmm0
- pxor (%rdx),%xmm0
- movdqu %xmm0,(%rdi,%rdx,1)
- movdqa %xmm0,(%rdx)
- leaq 16(%rdx),%rdx
- decq %rcx
- jnz .Loop_enc_xmm
-
- andq $15,%r10
- jz .Ldone_enc
-
-.Ltail_enc:
- movq $16,%rcx
- subq %r10,%rcx
- xorl %eax,%eax
-.Loop_enc_byte:
- movb (%rsi,%rdx,1),%al
- xorb (%rdx),%al
- movb %al,(%rdi,%rdx,1)
- movb %al,(%rdx)
- leaq 1(%rdx),%rdx
- decq %r10
- jnz .Loop_enc_byte
-
- xorl %eax,%eax
-.Loop_enc_pad:
- movb %al,(%rdx)
- leaq 1(%rdx),%rdx
- decq %rcx
- jnz .Loop_enc_pad
-
-.Ldone_enc:
- movq %rdx,%rax
- .byte 0xf3,0xc3
-.size xor128_encrypt_n_pad,.-xor128_encrypt_n_pad
-
-.globl xor128_decrypt_n_pad
-.type xor128_decrypt_n_pad,@function
-.align 16
-xor128_decrypt_n_pad:
- subq %rdx,%rsi
- subq %rdx,%rdi
- movq %rcx,%r10
- shrq $4,%rcx
- jz .Ltail_dec
- nop
-.Loop_dec_xmm:
- movdqu (%rsi,%rdx,1),%xmm0
- movdqa (%rdx),%xmm1
- pxor %xmm0,%xmm1
- movdqu %xmm1,(%rdi,%rdx,1)
- movdqa %xmm0,(%rdx)
- leaq 16(%rdx),%rdx
- decq %rcx
- jnz .Loop_dec_xmm
-
- pxor %xmm1,%xmm1
- andq $15,%r10
- jz .Ldone_dec
-
-.Ltail_dec:
- movq $16,%rcx
- subq %r10,%rcx
- xorl %eax,%eax
- xorq %r11,%r11
-.Loop_dec_byte:
- movb (%rsi,%rdx,1),%r11b
- movb (%rdx),%al
- xorb %r11b,%al
- movb %al,(%rdi,%rdx,1)
- movb %r11b,(%rdx)
- leaq 1(%rdx),%rdx
- decq %r10
- jnz .Loop_dec_byte
-
- xorl %eax,%eax
-.Loop_dec_pad:
- movb %al,(%rdx)
- leaq 1(%rdx),%rdx
- decq %rcx
- jnz .Loop_dec_pad
-
-.Ldone_dec:
- movq %rdx,%rax
- .byte 0xf3,0xc3
-.size xor128_decrypt_n_pad,.-xor128_decrypt_n_pad
+ENDPROC(poly1305_blocks_avx512)
+#endif /* CONFIG_AS_AVX512 */
diff --git a/lib/zinc/poly1305/poly1305.c b/lib/zinc/poly1305/poly1305.c
index 6c6c64035efb..51af7045cac8 100644
--- a/lib/zinc/poly1305/poly1305.c
+++ b/lib/zinc/poly1305/poly1305.c
@@ -16,6 +16,9 @@
#include <linux/module.h>
#include <linux/init.h>
+#if defined(CONFIG_ZINC_ARCH_X86_64)
+#include "poly1305-x86_64-glue.c"
+#else
static inline bool poly1305_init_arch(void *ctx,
const u8 key[POLY1305_KEY_SIZE])
{
@@ -37,6 +40,7 @@ static bool *const poly1305_nobs[] __initconst = { };
static void __init poly1305_fpu_init(void)
{
}
+#endif
#if defined(CONFIG_ARCH_SUPPORTS_INT128) && defined(__SIZEOF_INT128__)
#include "poly1305-donna64.h"
--
2.19.0
* [PATCH net-next v7 14/28] zinc: import Andy Polyakov's Poly1305 ARM and ARM64 implementations
From: Jason A. Donenfeld @ 2018-10-06 2:56 UTC (permalink / raw)
To: linux-kernel, netdev, davem, gregkh
Cc: Jason A. Donenfeld, Andy Polyakov, Russell King, linux-arm-kernel,
Samuel Neves, Jean-Philippe Aumasson, Andy Lutomirski,
Andrew Morton, Linus Torvalds, kernel-hardening, linux-crypto
These NEON and non-NEON implementations come from Andy Polyakov's
CRYPTOGAMS code and are included here in raw form, without modification,
so that the subsequent commits that fix them up for the kernel show
exactly what has changed.
While this is CRYPTOGAMS code, it happens to be identical to the code in
OpenSSL commit 5bb1cd2292b388263a0cc05392bb99141212aa53.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Based-on-code-from: Andy Polyakov <appro@openssl.org>
Cc: Andy Polyakov <appro@openssl.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Samuel Neves <sneves@dei.uc.pt>
Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: kernel-hardening@lists.openwall.com
Cc: linux-crypto@vger.kernel.org
---
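Reviewer note, not part of the imported code: the NEON paths in this
patch repeatedly convert between the scalar representation (32-bit
words, "base 2^32") and the vector representation (26-bit limbs, "base
2^26"). A minimal C sketch of that conversion follows, mirroring the
shifts and masks in poly1305_init_neon and the carry chain in
poly1305_emit_neon. The names words_to_limbs, limbs_to_words and
LIMB_MASK are hypothetical and chosen only for illustration; the 64-bit
accumulator stands in for the add-with-carry chain used by the assembly.

```c
#include <assert.h>
#include <stdint.h>

#define LIMB_MASK UINT32_C(0x03ffffff)	/* low 26 bits */

/*
 * Hypothetical helper: split a value held as five 32-bit words
 * (least significant first) into five limbs of 26 bits each.
 * The top limb may exceed 26 bits if w[4] is 4 or larger.
 */
static void words_to_limbs(const uint32_t w[5], uint32_t l[5])
{
	assert(w[4] < 256);	/* otherwise w[4] << 24 would lose bits */

	l[0] = w[0] & LIMB_MASK;
	l[1] = ((w[0] >> 26) | (w[1] << 6)) & LIMB_MASK;
	l[2] = ((w[1] >> 20) | (w[2] << 12)) & LIMB_MASK;
	l[3] = ((w[2] >> 14) | (w[3] << 18)) & LIMB_MASK;
	l[4] = (w[3] >> 8) | (w[4] << 24);
}

/*
 * Hypothetical helper: the inverse direction. Limb i has weight
 * 2^(26*i), so relative to its destination word it is shifted left
 * by 26, 20, 14 and 8 bits. Carries propagate through acc, which
 * also makes this safe for limbs slightly wider than 26 bits.
 */
static void limbs_to_words(const uint32_t l[5], uint32_t w[5])
{
	uint64_t acc = (uint64_t)l[0] + ((uint64_t)l[1] << 26);

	w[0] = (uint32_t)acc;
	acc = (acc >> 32) + ((uint64_t)l[2] << 20);
	w[1] = (uint32_t)acc;
	acc = (acc >> 32) + ((uint64_t)l[3] << 14);
	w[2] = (uint32_t)acc;
	acc = (acc >> 32) + ((uint64_t)l[4] << 8);
	w[3] = (uint32_t)acc;
	w[4] = (uint32_t)(acc >> 32);
}
```

Five 26-bit limbs cover the 130 bits of a Poly1305 accumulator, and the
6 bits of headroom per 32-bit lane are what the lazy reduction in the
NEON code relies on.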
lib/zinc/poly1305/poly1305-arm-cryptogams.S | 1172 +++++++++++++++++
lib/zinc/poly1305/poly1305-arm64-cryptogams.S | 869 ++++++++++++
2 files changed, 2041 insertions(+)
create mode 100644 lib/zinc/poly1305/poly1305-arm-cryptogams.S
create mode 100644 lib/zinc/poly1305/poly1305-arm64-cryptogams.S
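Reviewer note, not part of the imported code: the scalar
poly1305_blocks loops end each block with a partial reduction modulo
2^130 - 5, marked "*=5" in the ARM version. Since 2^130 is congruent to
5, everything at or above bit 130 can be multiplied by 5 and added back
into the low words. A minimal C sketch of that step follows, assuming
the hash is held as five 32-bit words; fold_2_130 is a hypothetical
name, and the 64-bit accumulator stands in for the adds/adcs chain.

```c
#include <stdint.h>

/*
 * Hypothetical helper: h[0..3] hold the low 128 bits of the
 * accumulator and h[4] holds everything from bit 128 upward.
 * With q = h[4] >> 2 (the part at or above 2^130), add 5*q to the
 * low words and keep only bits 128 and 129 in h[4].
 */
static void fold_2_130(uint32_t h[5])
{
	/* (h[4] & ~3) + (h[4] >> 2) == 4*q + q == 5*q, no multiply needed */
	uint64_t acc = (uint64_t)(h[4] & ~UINT32_C(3)) + (h[4] >> 2);

	h[4] &= 3;

	acc += h[0];
	h[0] = (uint32_t)acc;
	acc >>= 32;
	acc += h[1];
	h[1] = (uint32_t)acc;
	acc >>= 32;
	acc += h[2];
	h[2] = (uint32_t)acc;
	acc >>= 32;
	acc += h[3];
	h[3] = (uint32_t)acc;
	acc >>= 32;

	h[4] += (uint32_t)acc;	/* final carry out of the low 128 bits */
}
```

The result is only partially reduced: it can still be equal to or
slightly larger than 2^130 - 5, which is why poly1305_emit performs the
final "compare to modulus" step before adding the nonce.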
diff --git a/lib/zinc/poly1305/poly1305-arm-cryptogams.S b/lib/zinc/poly1305/poly1305-arm-cryptogams.S
new file mode 100644
index 000000000000..884b465030e4
--- /dev/null
+++ b/lib/zinc/poly1305/poly1305-arm-cryptogams.S
@@ -0,0 +1,1172 @@
+/* SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause */
+/*
+ * Copyright (C) 2006-2017 CRYPTOGAMS by <appro@openssl.org>. All Rights Reserved.
+ */
+
+#include "arm_arch.h"
+
+.text
+#if defined(__thumb2__)
+.syntax unified
+.thumb
+#else
+.code 32
+#endif
+
+.globl poly1305_emit
+.globl poly1305_blocks
+.globl poly1305_init
+.type poly1305_init,%function
+.align 5
+poly1305_init:
+.Lpoly1305_init:
+ stmdb sp!,{r4-r11}
+
+ eor r3,r3,r3
+ cmp r1,#0
+ str r3,[r0,#0] @ zero hash value
+ str r3,[r0,#4]
+ str r3,[r0,#8]
+ str r3,[r0,#12]
+ str r3,[r0,#16]
+ str r3,[r0,#36] @ is_base2_26
+ add r0,r0,#20
+
+#ifdef __thumb2__
+ it eq
+#endif
+ moveq r0,#0
+ beq .Lno_key
+
+#if __ARM_MAX_ARCH__>=7
+ adr r11,.Lpoly1305_init
+ ldr r12,.LOPENSSL_armcap
+#endif
+ ldrb r4,[r1,#0]
+ mov r10,#0x0fffffff
+ ldrb r5,[r1,#1]
+ and r3,r10,#-4 @ 0x0ffffffc
+ ldrb r6,[r1,#2]
+ ldrb r7,[r1,#3]
+ orr r4,r4,r5,lsl#8
+ ldrb r5,[r1,#4]
+ orr r4,r4,r6,lsl#16
+ ldrb r6,[r1,#5]
+ orr r4,r4,r7,lsl#24
+ ldrb r7,[r1,#6]
+ and r4,r4,r10
+
+#if __ARM_MAX_ARCH__>=7
+ ldr r12,[r11,r12] @ OPENSSL_armcap_P
+# ifdef __APPLE__
+ ldr r12,[r12]
+# endif
+#endif
+ ldrb r8,[r1,#7]
+ orr r5,r5,r6,lsl#8
+ ldrb r6,[r1,#8]
+ orr r5,r5,r7,lsl#16
+ ldrb r7,[r1,#9]
+ orr r5,r5,r8,lsl#24
+ ldrb r8,[r1,#10]
+ and r5,r5,r3
+
+#if __ARM_MAX_ARCH__>=7
+ tst r12,#ARMV7_NEON @ check for NEON
+# ifdef __APPLE__
+ adr r9,poly1305_blocks_neon
+ adr r11,poly1305_blocks
+# ifdef __thumb2__
+ it ne
+# endif
+ movne r11,r9
+ adr r12,poly1305_emit
+ adr r10,poly1305_emit_neon
+# ifdef __thumb2__
+ it ne
+# endif
+ movne r12,r10
+# else
+# ifdef __thumb2__
+ itete eq
+# endif
+ addeq r12,r11,#(poly1305_emit-.Lpoly1305_init)
+ addne r12,r11,#(poly1305_emit_neon-.Lpoly1305_init)
+ addeq r11,r11,#(poly1305_blocks-.Lpoly1305_init)
+ addne r11,r11,#(poly1305_blocks_neon-.Lpoly1305_init)
+# endif
+# ifdef __thumb2__
+ orr r12,r12,#1 @ thumb-ify address
+ orr r11,r11,#1
+# endif
+#endif
+ ldrb r9,[r1,#11]
+ orr r6,r6,r7,lsl#8
+ ldrb r7,[r1,#12]
+ orr r6,r6,r8,lsl#16
+ ldrb r8,[r1,#13]
+ orr r6,r6,r9,lsl#24
+ ldrb r9,[r1,#14]
+ and r6,r6,r3
+
+ ldrb r10,[r1,#15]
+ orr r7,r7,r8,lsl#8
+ str r4,[r0,#0]
+ orr r7,r7,r9,lsl#16
+ str r5,[r0,#4]
+ orr r7,r7,r10,lsl#24
+ str r6,[r0,#8]
+ and r7,r7,r3
+ str r7,[r0,#12]
+#if __ARM_MAX_ARCH__>=7
+ stmia r2,{r11,r12} @ fill functions table
+ mov r0,#1
+#else
+ mov r0,#0
+#endif
+.Lno_key:
+ ldmia sp!,{r4-r11}
+#if __ARM_ARCH__>=5
+ bx lr @ bx lr
+#else
+ tst lr,#1
+ moveq pc,lr @ be binary compatible with V4, yet
+ .word 0xe12fff1e @ interoperable with Thumb ISA:-)
+#endif
+.size poly1305_init,.-poly1305_init
+.type poly1305_blocks,%function
+.align 5
+poly1305_blocks:
+.Lpoly1305_blocks:
+ stmdb sp!,{r3-r11,lr}
+
+ ands r2,r2,#-16
+ beq .Lno_data
+
+ cmp r3,#0
+ add r2,r2,r1 @ end pointer
+ sub sp,sp,#32
+
+ ldmia r0,{r4-r12} @ load context
+
+ str r0,[sp,#12] @ offload stuff
+ mov lr,r1
+ str r2,[sp,#16]
+ str r10,[sp,#20]
+ str r11,[sp,#24]
+ str r12,[sp,#28]
+ b .Loop
+
+.Loop:
+#if __ARM_ARCH__<7
+ ldrb r0,[lr],#16 @ load input
+# ifdef __thumb2__
+ it hi
+# endif
+ addhi r8,r8,#1 @ 1<<128
+ ldrb r1,[lr,#-15]
+ ldrb r2,[lr,#-14]
+ ldrb r3,[lr,#-13]
+ orr r1,r0,r1,lsl#8
+ ldrb r0,[lr,#-12]
+ orr r2,r1,r2,lsl#16
+ ldrb r1,[lr,#-11]
+ orr r3,r2,r3,lsl#24
+ ldrb r2,[lr,#-10]
+ adds r4,r4,r3 @ accumulate input
+
+ ldrb r3,[lr,#-9]
+ orr r1,r0,r1,lsl#8
+ ldrb r0,[lr,#-8]
+ orr r2,r1,r2,lsl#16
+ ldrb r1,[lr,#-7]
+ orr r3,r2,r3,lsl#24
+ ldrb r2,[lr,#-6]
+ adcs r5,r5,r3
+
+ ldrb r3,[lr,#-5]
+ orr r1,r0,r1,lsl#8
+ ldrb r0,[lr,#-4]
+ orr r2,r1,r2,lsl#16
+ ldrb r1,[lr,#-3]
+ orr r3,r2,r3,lsl#24
+ ldrb r2,[lr,#-2]
+ adcs r6,r6,r3
+
+ ldrb r3,[lr,#-1]
+ orr r1,r0,r1,lsl#8
+ str lr,[sp,#8] @ offload input pointer
+ orr r2,r1,r2,lsl#16
+ add r10,r10,r10,lsr#2
+ orr r3,r2,r3,lsl#24
+#else
+ ldr r0,[lr],#16 @ load input
+# ifdef __thumb2__
+ it hi
+# endif
+ addhi r8,r8,#1 @ padbit
+ ldr r1,[lr,#-12]
+ ldr r2,[lr,#-8]
+ ldr r3,[lr,#-4]
+# ifdef __ARMEB__
+ rev r0,r0
+ rev r1,r1
+ rev r2,r2
+ rev r3,r3
+# endif
+ adds r4,r4,r0 @ accumulate input
+ str lr,[sp,#8] @ offload input pointer
+ adcs r5,r5,r1
+ add r10,r10,r10,lsr#2
+ adcs r6,r6,r2
+#endif
+ add r11,r11,r11,lsr#2
+ adcs r7,r7,r3
+ add r12,r12,r12,lsr#2
+
+ umull r2,r3,r5,r9
+ adc r8,r8,#0
+ umull r0,r1,r4,r9
+ umlal r2,r3,r8,r10
+ umlal r0,r1,r7,r10
+ ldr r10,[sp,#20] @ reload r10
+ umlal r2,r3,r6,r12
+ umlal r0,r1,r5,r12
+ umlal r2,r3,r7,r11
+ umlal r0,r1,r6,r11
+ umlal r2,r3,r4,r10
+ str r0,[sp,#0] @ future r4
+ mul r0,r11,r8
+ ldr r11,[sp,#24] @ reload r11
+ adds r2,r2,r1 @ d1+=d0>>32
+ eor r1,r1,r1
+ adc lr,r3,#0 @ future r6
+ str r2,[sp,#4] @ future r5
+
+ mul r2,r12,r8
+ eor r3,r3,r3
+ umlal r0,r1,r7,r12
+ ldr r12,[sp,#28] @ reload r12
+ umlal r2,r3,r7,r9
+ umlal r0,r1,r6,r9
+ umlal r2,r3,r6,r10
+ umlal r0,r1,r5,r10
+ umlal r2,r3,r5,r11
+ umlal r0,r1,r4,r11
+ umlal r2,r3,r4,r12
+ ldr r4,[sp,#0]
+ mul r8,r9,r8
+ ldr r5,[sp,#4]
+
+ adds r6,lr,r0 @ d2+=d1>>32
+ ldr lr,[sp,#8] @ reload input pointer
+ adc r1,r1,#0
+ adds r7,r2,r1 @ d3+=d2>>32
+ ldr r0,[sp,#16] @ reload end pointer
+ adc r3,r3,#0
+ add r8,r8,r3 @ h4+=d3>>32
+
+ and r1,r8,#-4
+ and r8,r8,#3
+ add r1,r1,r1,lsr#2 @ *=5
+ adds r4,r4,r1
+ adcs r5,r5,#0
+ adcs r6,r6,#0
+ adcs r7,r7,#0
+ adc r8,r8,#0
+
+ cmp r0,lr @ done yet?
+ bhi .Loop
+
+ ldr r0,[sp,#12]
+ add sp,sp,#32
+ stmia r0,{r4-r8} @ store the result
+
+.Lno_data:
+#if __ARM_ARCH__>=5
+ ldmia sp!,{r3-r11,pc}
+#else
+ ldmia sp!,{r3-r11,lr}
+ tst lr,#1
+ moveq pc,lr @ be binary compatible with V4, yet
+ .word 0xe12fff1e @ interoperable with Thumb ISA:-)
+#endif
+.size poly1305_blocks,.-poly1305_blocks
+.type poly1305_emit,%function
+.align 5
+poly1305_emit:
+ stmdb sp!,{r4-r11}
+.Lpoly1305_emit_enter:
+
+ ldmia r0,{r3-r7}
+ adds r8,r3,#5 @ compare to modulus
+ adcs r9,r4,#0
+ adcs r10,r5,#0
+ adcs r11,r6,#0
+ adc r7,r7,#0
+ tst r7,#4 @ did it carry/borrow?
+
+#ifdef __thumb2__
+ it ne
+#endif
+ movne r3,r8
+ ldr r8,[r2,#0]
+#ifdef __thumb2__
+ it ne
+#endif
+ movne r4,r9
+ ldr r9,[r2,#4]
+#ifdef __thumb2__
+ it ne
+#endif
+ movne r5,r10
+ ldr r10,[r2,#8]
+#ifdef __thumb2__
+ it ne
+#endif
+ movne r6,r11
+ ldr r11,[r2,#12]
+
+ adds r3,r3,r8
+ adcs r4,r4,r9
+ adcs r5,r5,r10
+ adc r6,r6,r11
+
+#if __ARM_ARCH__>=7
+# ifdef __ARMEB__
+ rev r3,r3
+ rev r4,r4
+ rev r5,r5
+ rev r6,r6
+# endif
+ str r3,[r1,#0]
+ str r4,[r1,#4]
+ str r5,[r1,#8]
+ str r6,[r1,#12]
+#else
+ strb r3,[r1,#0]
+ mov r3,r3,lsr#8
+ strb r4,[r1,#4]
+ mov r4,r4,lsr#8
+ strb r5,[r1,#8]
+ mov r5,r5,lsr#8
+ strb r6,[r1,#12]
+ mov r6,r6,lsr#8
+
+ strb r3,[r1,#1]
+ mov r3,r3,lsr#8
+ strb r4,[r1,#5]
+ mov r4,r4,lsr#8
+ strb r5,[r1,#9]
+ mov r5,r5,lsr#8
+ strb r6,[r1,#13]
+ mov r6,r6,lsr#8
+
+ strb r3,[r1,#2]
+ mov r3,r3,lsr#8
+ strb r4,[r1,#6]
+ mov r4,r4,lsr#8
+ strb r5,[r1,#10]
+ mov r5,r5,lsr#8
+ strb r6,[r1,#14]
+ mov r6,r6,lsr#8
+
+ strb r3,[r1,#3]
+ strb r4,[r1,#7]
+ strb r5,[r1,#11]
+ strb r6,[r1,#15]
+#endif
+ ldmia sp!,{r4-r11}
+#if __ARM_ARCH__>=5
+ bx lr @ bx lr
+#else
+ tst lr,#1
+ moveq pc,lr @ be binary compatible with V4, yet
+ .word 0xe12fff1e @ interoperable with Thumb ISA:-)
+#endif
+.size poly1305_emit,.-poly1305_emit
+#if __ARM_MAX_ARCH__>=7
+.fpu neon
+
+.type poly1305_init_neon,%function
+.align 5
+poly1305_init_neon:
+ ldr r4,[r0,#20] @ load key base 2^32
+ ldr r5,[r0,#24]
+ ldr r6,[r0,#28]
+ ldr r7,[r0,#32]
+
+ and r2,r4,#0x03ffffff @ base 2^32 -> base 2^26
+ mov r3,r4,lsr#26
+ mov r4,r5,lsr#20
+ orr r3,r3,r5,lsl#6
+ mov r5,r6,lsr#14
+ orr r4,r4,r6,lsl#12
+ mov r6,r7,lsr#8
+ orr r5,r5,r7,lsl#18
+ and r3,r3,#0x03ffffff
+ and r4,r4,#0x03ffffff
+ and r5,r5,#0x03ffffff
+
+ vdup.32 d0,r2 @ r^1 in both lanes
+ add r2,r3,r3,lsl#2 @ *5
+ vdup.32 d1,r3
+ add r3,r4,r4,lsl#2
+ vdup.32 d2,r2
+ vdup.32 d3,r4
+ add r4,r5,r5,lsl#2
+ vdup.32 d4,r3
+ vdup.32 d5,r5
+ add r5,r6,r6,lsl#2
+ vdup.32 d6,r4
+ vdup.32 d7,r6
+ vdup.32 d8,r5
+
+ mov r5,#2 @ counter
+
+.Lsquare_neon:
+ @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
+ @ d0 = h0*r0 + h4*5*r1 + h3*5*r2 + h2*5*r3 + h1*5*r4
+ @ d1 = h1*r0 + h0*r1 + h4*5*r2 + h3*5*r3 + h2*5*r4
+ @ d2 = h2*r0 + h1*r1 + h0*r2 + h4*5*r3 + h3*5*r4
+ @ d3 = h3*r0 + h2*r1 + h1*r2 + h0*r3 + h4*5*r4
+ @ d4 = h4*r0 + h3*r1 + h2*r2 + h1*r3 + h0*r4
+
+ vmull.u32 q5,d0,d0[1]
+ vmull.u32 q6,d1,d0[1]
+ vmull.u32 q7,d3,d0[1]
+ vmull.u32 q8,d5,d0[1]
+ vmull.u32 q9,d7,d0[1]
+
+ vmlal.u32 q5,d7,d2[1]
+ vmlal.u32 q6,d0,d1[1]
+ vmlal.u32 q7,d1,d1[1]
+ vmlal.u32 q8,d3,d1[1]
+ vmlal.u32 q9,d5,d1[1]
+
+ vmlal.u32 q5,d5,d4[1]
+ vmlal.u32 q6,d7,d4[1]
+ vmlal.u32 q8,d1,d3[1]
+ vmlal.u32 q7,d0,d3[1]
+ vmlal.u32 q9,d3,d3[1]
+
+ vmlal.u32 q5,d3,d6[1]
+ vmlal.u32 q8,d0,d5[1]
+ vmlal.u32 q6,d5,d6[1]
+ vmlal.u32 q7,d7,d6[1]
+ vmlal.u32 q9,d1,d5[1]
+
+ vmlal.u32 q8,d7,d8[1]
+ vmlal.u32 q5,d1,d8[1]
+ vmlal.u32 q6,d3,d8[1]
+ vmlal.u32 q7,d5,d8[1]
+ vmlal.u32 q9,d0,d7[1]
+
+ @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
+ @ lazy reduction as discussed in "NEON crypto" by D.J. Bernstein
+ @ and P. Schwabe
+ @
+ @ H0>>+H1>>+H2>>+H3>>+H4
+ @ H3>>+H4>>*5+H0>>+H1
+ @
+ @ Trivia.
+ @
+ @ Result of multiplication of n-bit number by m-bit number is
+ @ n+m bits wide. However! Even though 2^n is a n+1-bit number,
+ @ m-bit number multiplied by 2^n is still n+m bits wide.
+ @
+ @ Sum of two n-bit numbers is n+1 bits wide, sum of three - n+2,
+ @ and so is sum of four. Sum of 2^m n-m-bit numbers and n-bit
+ @ one is n+1 bits wide.
+ @
+ @ >>+ denotes Hnext += Hn>>26, Hn &= 0x3ffffff. This means that
+ @ H0, H2, H3 are guaranteed to be 26 bits wide, while H1 and H4
+ @ can be 27. However! In cases when their width exceeds 26 bits
+ @ they are limited by 2^26+2^6. This in turn means that *sum*
+ @ of the products with these values can still be viewed as sum
+ @ of 52-bit numbers as long as the amount of addends is not a
+ @ power of 2. For example,
+ @
+ @ H4 = H4*R0 + H3*R1 + H2*R2 + H1*R3 + H0 * R4,
+ @
+ @ which can't be larger than 5 * (2^26 + 2^6) * (2^26 + 2^6), or
+ @ 5 * (2^52 + 2*2^32 + 2^12), which in turn is smaller than
+ @ 8 * (2^52) or 2^55. However, the value is then multiplied by
+ @ 5, so we should be looking at 5 * 5 * (2^52 + 2^33 + 2^12),
+ @ which is less than 32 * (2^52) or 2^57. And when processing
+ @ data we are looking at triple as many addends...
+ @
+ @ In key setup procedure pre-reduced H0 is limited by 5*4+1 and
+ @ 5*H4 - by 5*5 52-bit addends, or 57 bits. But when hashing the
+ @ input H0 is limited by (5*4+1)*3 addends, or 58 bits, while
+ @ 5*H4 by 5*5*3, or 59[!] bits. How is this relevant? vmlal.u32
+ @ instruction accepts 2x32-bit input and writes 2x64-bit result.
+ @ This means that result of reduction have to be compressed upon
+ @ loop wrap-around. This can be done in the process of reduction
+ @ to minimize amount of instructions [as well as amount of
+ @ 128-bit instructions, which benefits low-end processors], but
+ @ one has to watch for H2 (which is narrower than H0) and 5*H4
+ @ not being wider than 58 bits, so that result of right shift
+ @ by 26 bits fits in 32 bits. This is also useful on x86,
+ @ because it allows to use paddd in place for paddq, which
+ @ benefits Atom, where paddq is ridiculously slow.
+
+ vshr.u64 q15,q8,#26
+ vmovn.i64 d16,q8
+ vshr.u64 q4,q5,#26
+ vmovn.i64 d10,q5
+ vadd.i64 q9,q9,q15 @ h3 -> h4
+ vbic.i32 d16,#0xfc000000 @ &=0x03ffffff
+ vadd.i64 q6,q6,q4 @ h0 -> h1
+ vbic.i32 d10,#0xfc000000
+
+ vshrn.u64 d30,q9,#26
+ vmovn.i64 d18,q9
+ vshr.u64 q4,q6,#26
+ vmovn.i64 d12,q6
+ vadd.i64 q7,q7,q4 @ h1 -> h2
+ vbic.i32 d18,#0xfc000000
+ vbic.i32 d12,#0xfc000000
+
+ vadd.i32 d10,d10,d30
+ vshl.u32 d30,d30,#2
+ vshrn.u64 d8,q7,#26
+ vmovn.i64 d14,q7
+ vadd.i32 d10,d10,d30 @ h4 -> h0
+ vadd.i32 d16,d16,d8 @ h2 -> h3
+ vbic.i32 d14,#0xfc000000
+
+ vshr.u32 d30,d10,#26
+ vbic.i32 d10,#0xfc000000
+ vshr.u32 d8,d16,#26
+ vbic.i32 d16,#0xfc000000
+ vadd.i32 d12,d12,d30 @ h0 -> h1
+ vadd.i32 d18,d18,d8 @ h3 -> h4
+
+ subs r5,r5,#1
+ beq .Lsquare_break_neon
+
+ add r6,r0,#(48+0*9*4)
+ add r7,r0,#(48+1*9*4)
+
+ vtrn.32 d0,d10 @ r^2:r^1
+ vtrn.32 d3,d14
+ vtrn.32 d5,d16
+ vtrn.32 d1,d12
+ vtrn.32 d7,d18
+
+ vshl.u32 d4,d3,#2 @ *5
+ vshl.u32 d6,d5,#2
+ vshl.u32 d2,d1,#2
+ vshl.u32 d8,d7,#2
+ vadd.i32 d4,d4,d3
+ vadd.i32 d2,d2,d1
+ vadd.i32 d6,d6,d5
+ vadd.i32 d8,d8,d7
+
+ vst4.32 {d0[0],d1[0],d2[0],d3[0]},[r6]!
+ vst4.32 {d0[1],d1[1],d2[1],d3[1]},[r7]!
+ vst4.32 {d4[0],d5[0],d6[0],d7[0]},[r6]!
+ vst4.32 {d4[1],d5[1],d6[1],d7[1]},[r7]!
+ vst1.32 {d8[0]},[r6,:32]
+ vst1.32 {d8[1]},[r7,:32]
+
+ b .Lsquare_neon
+
+.align 4
+.Lsquare_break_neon:
+ add r6,r0,#(48+2*4*9)
+ add r7,r0,#(48+3*4*9)
+
+ vmov d0,d10 @ r^4:r^3
+ vshl.u32 d2,d12,#2 @ *5
+ vmov d1,d12
+ vshl.u32 d4,d14,#2
+ vmov d3,d14
+ vshl.u32 d6,d16,#2
+ vmov d5,d16
+ vshl.u32 d8,d18,#2
+ vmov d7,d18
+ vadd.i32 d2,d2,d12
+ vadd.i32 d4,d4,d14
+ vadd.i32 d6,d6,d16
+ vadd.i32 d8,d8,d18
+
+ vst4.32 {d0[0],d1[0],d2[0],d3[0]},[r6]!
+ vst4.32 {d0[1],d1[1],d2[1],d3[1]},[r7]!
+ vst4.32 {d4[0],d5[0],d6[0],d7[0]},[r6]!
+ vst4.32 {d4[1],d5[1],d6[1],d7[1]},[r7]!
+ vst1.32 {d8[0]},[r6]
+ vst1.32 {d8[1]},[r7]
+
+ bx lr @ bx lr
+.size poly1305_init_neon,.-poly1305_init_neon
+
+.type poly1305_blocks_neon,%function
+.align 5
+poly1305_blocks_neon:
+ ldr ip,[r0,#36] @ is_base2_26
+ ands r2,r2,#-16
+ beq .Lno_data_neon
+
+ cmp r2,#64
+ bhs .Lenter_neon
+ tst ip,ip @ is_base2_26?
+ beq .Lpoly1305_blocks
+
+.Lenter_neon:
+ stmdb sp!,{r4-r7}
+ vstmdb sp!,{d8-d15} @ ABI specification says so
+
+ tst ip,ip @ is_base2_26?
+ bne .Lbase2_26_neon
+
+ stmdb sp!,{r1-r3,lr}
+ bl poly1305_init_neon
+
+ ldr r4,[r0,#0] @ load hash value base 2^32
+ ldr r5,[r0,#4]
+ ldr r6,[r0,#8]
+ ldr r7,[r0,#12]
+ ldr ip,[r0,#16]
+
+ and r2,r4,#0x03ffffff @ base 2^32 -> base 2^26
+ mov r3,r4,lsr#26
+ veor d10,d10,d10
+ mov r4,r5,lsr#20
+ orr r3,r3,r5,lsl#6
+ veor d12,d12,d12
+ mov r5,r6,lsr#14
+ orr r4,r4,r6,lsl#12
+ veor d14,d14,d14
+ mov r6,r7,lsr#8
+ orr r5,r5,r7,lsl#18
+ veor d16,d16,d16
+ and r3,r3,#0x03ffffff
+ orr r6,r6,ip,lsl#24
+ veor d18,d18,d18
+ and r4,r4,#0x03ffffff
+ mov r1,#1
+ and r5,r5,#0x03ffffff
+ str r1,[r0,#36] @ is_base2_26
+
+ vmov.32 d10[0],r2
+ vmov.32 d12[0],r3
+ vmov.32 d14[0],r4
+ vmov.32 d16[0],r5
+ vmov.32 d18[0],r6
+ adr r5,.Lzeros
+
+ ldmia sp!,{r1-r3,lr}
+ b .Lbase2_32_neon
+
+.align 4
+.Lbase2_26_neon:
+ @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
+ @ load hash value
+
+ veor d10,d10,d10
+ veor d12,d12,d12
+ veor d14,d14,d14
+ veor d16,d16,d16
+ veor d18,d18,d18
+ vld4.32 {d10[0],d12[0],d14[0],d16[0]},[r0]!
+ adr r5,.Lzeros
+ vld1.32 {d18[0]},[r0]
+ sub r0,r0,#16 @ rewind
+
+.Lbase2_32_neon:
+ add r4,r1,#32
+ mov r3,r3,lsl#24
+ tst r2,#31
+ beq .Leven
+
+ vld4.32 {d20[0],d22[0],d24[0],d26[0]},[r1]!
+ vmov.32 d28[0],r3
+ sub r2,r2,#16
+ add r4,r1,#32
+
+# ifdef __ARMEB__
+ vrev32.8 q10,q10
+ vrev32.8 q13,q13
+ vrev32.8 q11,q11
+ vrev32.8 q12,q12
+# endif
+ vsri.u32 d28,d26,#8 @ base 2^32 -> base 2^26
+ vshl.u32 d26,d26,#18
+
+ vsri.u32 d26,d24,#14
+ vshl.u32 d24,d24,#12
+ vadd.i32 d29,d28,d18 @ add hash value and move to #hi
+
+ vbic.i32 d26,#0xfc000000
+ vsri.u32 d24,d22,#20
+ vshl.u32 d22,d22,#6
+
+ vbic.i32 d24,#0xfc000000
+ vsri.u32 d22,d20,#26
+ vadd.i32 d27,d26,d16
+
+ vbic.i32 d20,#0xfc000000
+ vbic.i32 d22,#0xfc000000
+ vadd.i32 d25,d24,d14
+
+ vadd.i32 d21,d20,d10
+ vadd.i32 d23,d22,d12
+
+ mov r7,r5
+ add r6,r0,#48
+
+ cmp r2,r2
+ b .Long_tail
+
+.align 4
+.Leven:
+ subs r2,r2,#64
+ it lo
+ movlo r4,r5
+
+ vmov.i32 q14,#1<<24 @ padbit, yes, always
+ vld4.32 {d20,d22,d24,d26},[r1] @ inp[0:1]
+ add r1,r1,#64
+ vld4.32 {d21,d23,d25,d27},[r4] @ inp[2:3] (or 0)
+ add r4,r4,#64
+ itt hi
+ addhi r7,r0,#(48+1*9*4)
+ addhi r6,r0,#(48+3*9*4)
+
+# ifdef __ARMEB__
+ vrev32.8 q10,q10
+ vrev32.8 q13,q13
+ vrev32.8 q11,q11
+ vrev32.8 q12,q12
+# endif
+ vsri.u32 q14,q13,#8 @ base 2^32 -> base 2^26
+ vshl.u32 q13,q13,#18
+
+ vsri.u32 q13,q12,#14
+ vshl.u32 q12,q12,#12
+
+ vbic.i32 q13,#0xfc000000
+ vsri.u32 q12,q11,#20
+ vshl.u32 q11,q11,#6
+
+ vbic.i32 q12,#0xfc000000
+ vsri.u32 q11,q10,#26
+
+ vbic.i32 q10,#0xfc000000
+ vbic.i32 q11,#0xfc000000
+
+ bls .Lskip_loop
+
+ vld4.32 {d0[1],d1[1],d2[1],d3[1]},[r7]! @ load r^2
+ vld4.32 {d0[0],d1[0],d2[0],d3[0]},[r6]! @ load r^4
+ vld4.32 {d4[1],d5[1],d6[1],d7[1]},[r7]!
+ vld4.32 {d4[0],d5[0],d6[0],d7[0]},[r6]!
+ b .Loop_neon
+
+.align 5
+.Loop_neon:
+ @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
+ @ ((inp[0]*r^4+inp[2]*r^2+inp[4])*r^4+inp[6]*r^2
+ @ ((inp[1]*r^4+inp[3]*r^2+inp[5])*r^3+inp[7]*r
+ @ ___________________/
+ @ ((inp[0]*r^4+inp[2]*r^2+inp[4])*r^4+inp[6]*r^2+inp[8])*r^2
+ @ ((inp[1]*r^4+inp[3]*r^2+inp[5])*r^4+inp[7]*r^2+inp[9])*r
+ @ ___________________/ ____________________/
+ @
+ @ Note that we start with inp[2:3]*r^2. This is because it
+ @ doesn't depend on reduction in previous iteration.
+ @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
+ @ d4 = h4*r0 + h3*r1 + h2*r2 + h1*r3 + h0*r4
+ @ d3 = h3*r0 + h2*r1 + h1*r2 + h0*r3 + h4*5*r4
+ @ d2 = h2*r0 + h1*r1 + h0*r2 + h4*5*r3 + h3*5*r4
+ @ d1 = h1*r0 + h0*r1 + h4*5*r2 + h3*5*r3 + h2*5*r4
+ @ d0 = h0*r0 + h4*5*r1 + h3*5*r2 + h2*5*r3 + h1*5*r4
+
+ @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
+ @ inp[2:3]*r^2
+
+ vadd.i32 d24,d24,d14 @ accumulate inp[0:1]
+ vmull.u32 q7,d25,d0[1]
+ vadd.i32 d20,d20,d10
+ vmull.u32 q5,d21,d0[1]
+ vadd.i32 d26,d26,d16
+ vmull.u32 q8,d27,d0[1]
+ vmlal.u32 q7,d23,d1[1]
+ vadd.i32 d22,d22,d12
+ vmull.u32 q6,d23,d0[1]
+
+ vadd.i32 d28,d28,d18
+ vmull.u32 q9,d29,d0[1]
+ subs r2,r2,#64
+ vmlal.u32 q5,d29,d2[1]
+ it lo
+ movlo r4,r5
+ vmlal.u32 q8,d25,d1[1]
+ vld1.32 d8[1],[r7,:32]
+ vmlal.u32 q6,d21,d1[1]
+ vmlal.u32 q9,d27,d1[1]
+
+ vmlal.u32 q5,d27,d4[1]
+ vmlal.u32 q8,d23,d3[1]
+ vmlal.u32 q9,d25,d3[1]
+ vmlal.u32 q6,d29,d4[1]
+ vmlal.u32 q7,d21,d3[1]
+
+ vmlal.u32 q8,d21,d5[1]
+ vmlal.u32 q5,d25,d6[1]
+ vmlal.u32 q9,d23,d5[1]
+ vmlal.u32 q6,d27,d6[1]
+ vmlal.u32 q7,d29,d6[1]
+
+ vmlal.u32 q8,d29,d8[1]
+ vmlal.u32 q5,d23,d8[1]
+ vmlal.u32 q9,d21,d7[1]
+ vmlal.u32 q6,d25,d8[1]
+ vmlal.u32 q7,d27,d8[1]
+
+ vld4.32 {d21,d23,d25,d27},[r4] @ inp[2:3] (or 0)
+ add r4,r4,#64
+
+ @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
+ @ (hash+inp[0:1])*r^4 and accumulate
+
+ vmlal.u32 q8,d26,d0[0]
+ vmlal.u32 q5,d20,d0[0]
+ vmlal.u32 q9,d28,d0[0]
+ vmlal.u32 q6,d22,d0[0]
+ vmlal.u32 q7,d24,d0[0]
+ vld1.32 d8[0],[r6,:32]
+
+ vmlal.u32 q8,d24,d1[0]
+ vmlal.u32 q5,d28,d2[0]
+ vmlal.u32 q9,d26,d1[0]
+ vmlal.u32 q6,d20,d1[0]
+ vmlal.u32 q7,d22,d1[0]
+
+ vmlal.u32 q8,d22,d3[0]
+ vmlal.u32 q5,d26,d4[0]
+ vmlal.u32 q9,d24,d3[0]
+ vmlal.u32 q6,d28,d4[0]
+ vmlal.u32 q7,d20,d3[0]
+
+ vmlal.u32 q8,d20,d5[0]
+ vmlal.u32 q5,d24,d6[0]
+ vmlal.u32 q9,d22,d5[0]
+ vmlal.u32 q6,d26,d6[0]
+ vmlal.u32 q8,d28,d8[0]
+
+ vmlal.u32 q7,d28,d6[0]
+ vmlal.u32 q5,d22,d8[0]
+ vmlal.u32 q9,d20,d7[0]
+ vmov.i32 q14,#1<<24 @ padbit, yes, always
+ vmlal.u32 q6,d24,d8[0]
+ vmlal.u32 q7,d26,d8[0]
+
+ vld4.32 {d20,d22,d24,d26},[r1] @ inp[0:1]
+ add r1,r1,#64
+# ifdef __ARMEB__
+ vrev32.8 q10,q10
+ vrev32.8 q11,q11
+ vrev32.8 q12,q12
+ vrev32.8 q13,q13
+# endif
+
+ @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
+ @ lazy reduction interleaved with base 2^32 -> base 2^26 of
+ @ inp[0:3] previously loaded to q10-q13 and smashed to q10-q14.
+
+ vshr.u64 q15,q8,#26
+ vmovn.i64 d16,q8
+ vshr.u64 q4,q5,#26
+ vmovn.i64 d10,q5
+ vadd.i64 q9,q9,q15 @ h3 -> h4
+ vbic.i32 d16,#0xfc000000
+ vsri.u32 q14,q13,#8 @ base 2^32 -> base 2^26
+ vadd.i64 q6,q6,q4 @ h0 -> h1
+ vshl.u32 q13,q13,#18
+ vbic.i32 d10,#0xfc000000
+
+ vshrn.u64 d30,q9,#26
+ vmovn.i64 d18,q9
+ vshr.u64 q4,q6,#26
+ vmovn.i64 d12,q6
+ vadd.i64 q7,q7,q4 @ h1 -> h2
+ vsri.u32 q13,q12,#14
+ vbic.i32 d18,#0xfc000000
+ vshl.u32 q12,q12,#12
+ vbic.i32 d12,#0xfc000000
+
+ vadd.i32 d10,d10,d30
+ vshl.u32 d30,d30,#2
+ vbic.i32 q13,#0xfc000000
+ vshrn.u64 d8,q7,#26
+ vmovn.i64 d14,q7
+ vaddl.u32 q5,d10,d30 @ h4 -> h0 [widen for a sec]
+ vsri.u32 q12,q11,#20
+ vadd.i32 d16,d16,d8 @ h2 -> h3
+ vshl.u32 q11,q11,#6
+ vbic.i32 d14,#0xfc000000
+ vbic.i32 q12,#0xfc000000
+
+ vshrn.u64 d30,q5,#26 @ re-narrow
+ vmovn.i64 d10,q5
+ vsri.u32 q11,q10,#26
+ vbic.i32 q10,#0xfc000000
+ vshr.u32 d8,d16,#26
+ vbic.i32 d16,#0xfc000000
+ vbic.i32 d10,#0xfc000000
+ vadd.i32 d12,d12,d30 @ h0 -> h1
+ vadd.i32 d18,d18,d8 @ h3 -> h4
+ vbic.i32 q11,#0xfc000000
+
+ bhi .Loop_neon
+
+.Lskip_loop:
+ @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
+ @ multiply (inp[0:1]+hash) or inp[2:3] by r^2:r^1
+
+ add r7,r0,#(48+0*9*4)
+ add r6,r0,#(48+1*9*4)
+ adds r2,r2,#32
+ it ne
+ movne r2,#0
+ bne .Long_tail
+
+ vadd.i32 d25,d24,d14 @ add hash value and move to #hi
+ vadd.i32 d21,d20,d10
+ vadd.i32 d27,d26,d16
+ vadd.i32 d23,d22,d12
+ vadd.i32 d29,d28,d18
+
+.Long_tail:
+ vld4.32 {d0[1],d1[1],d2[1],d3[1]},[r7]! @ load r^1
+ vld4.32 {d0[0],d1[0],d2[0],d3[0]},[r6]! @ load r^2
+
+ vadd.i32 d24,d24,d14 @ can be redundant
+ vmull.u32 q7,d25,d0
+ vadd.i32 d20,d20,d10
+ vmull.u32 q5,d21,d0
+ vadd.i32 d26,d26,d16
+ vmull.u32 q8,d27,d0
+ vadd.i32 d22,d22,d12
+ vmull.u32 q6,d23,d0
+ vadd.i32 d28,d28,d18
+ vmull.u32 q9,d29,d0
+
+ vmlal.u32 q5,d29,d2
+ vld4.32 {d4[1],d5[1],d6[1],d7[1]},[r7]!
+ vmlal.u32 q8,d25,d1
+ vld4.32 {d4[0],d5[0],d6[0],d7[0]},[r6]!
+ vmlal.u32 q6,d21,d1
+ vmlal.u32 q9,d27,d1
+ vmlal.u32 q7,d23,d1
+
+ vmlal.u32 q8,d23,d3
+ vld1.32 d8[1],[r7,:32]
+ vmlal.u32 q5,d27,d4
+ vld1.32 d8[0],[r6,:32]
+ vmlal.u32 q9,d25,d3
+ vmlal.u32 q6,d29,d4
+ vmlal.u32 q7,d21,d3
+
+ vmlal.u32 q8,d21,d5
+ it ne
+ addne r7,r0,#(48+2*9*4)
+ vmlal.u32 q5,d25,d6
+ it ne
+ addne r6,r0,#(48+3*9*4)
+ vmlal.u32 q9,d23,d5
+ vmlal.u32 q6,d27,d6
+ vmlal.u32 q7,d29,d6
+
+ vmlal.u32 q8,d29,d8
+ vorn q0,q0,q0 @ all-ones, can be redundant
+ vmlal.u32 q5,d23,d8
+ vshr.u64 q0,q0,#38
+ vmlal.u32 q9,d21,d7
+ vmlal.u32 q6,d25,d8
+ vmlal.u32 q7,d27,d8
+
+ beq .Lshort_tail
+
+ @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
+ @ (hash+inp[0:1])*r^4:r^3 and accumulate
+
+ vld4.32 {d0[1],d1[1],d2[1],d3[1]},[r7]! @ load r^3
+ vld4.32 {d0[0],d1[0],d2[0],d3[0]},[r6]! @ load r^4
+
+ vmlal.u32 q7,d24,d0
+ vmlal.u32 q5,d20,d0
+ vmlal.u32 q8,d26,d0
+ vmlal.u32 q6,d22,d0
+ vmlal.u32 q9,d28,d0
+
+ vmlal.u32 q5,d28,d2
+ vld4.32 {d4[1],d5[1],d6[1],d7[1]},[r7]!
+ vmlal.u32 q8,d24,d1
+ vld4.32 {d4[0],d5[0],d6[0],d7[0]},[r6]!
+ vmlal.u32 q6,d20,d1
+ vmlal.u32 q9,d26,d1
+ vmlal.u32 q7,d22,d1
+
+ vmlal.u32 q8,d22,d3
+ vld1.32 d8[1],[r7,:32]
+ vmlal.u32 q5,d26,d4
+ vld1.32 d8[0],[r6,:32]
+ vmlal.u32 q9,d24,d3
+ vmlal.u32 q6,d28,d4
+ vmlal.u32 q7,d20,d3
+
+ vmlal.u32 q8,d20,d5
+ vmlal.u32 q5,d24,d6
+ vmlal.u32 q9,d22,d5
+ vmlal.u32 q6,d26,d6
+ vmlal.u32 q7,d28,d6
+
+ vmlal.u32 q8,d28,d8
+ vorn q0,q0,q0 @ all-ones
+ vmlal.u32 q5,d22,d8
+ vshr.u64 q0,q0,#38
+ vmlal.u32 q9,d20,d7
+ vmlal.u32 q6,d24,d8
+ vmlal.u32 q7,d26,d8
+
+.Lshort_tail:
+ @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
+ @ horizontal addition
+
+ vadd.i64 d16,d16,d17
+ vadd.i64 d10,d10,d11
+ vadd.i64 d18,d18,d19
+ vadd.i64 d12,d12,d13
+ vadd.i64 d14,d14,d15
+
+ @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
+ @ lazy reduction, but without narrowing
+
+ vshr.u64 q15,q8,#26
+ vand.i64 q8,q8,q0
+ vshr.u64 q4,q5,#26
+ vand.i64 q5,q5,q0
+ vadd.i64 q9,q9,q15 @ h3 -> h4
+ vadd.i64 q6,q6,q4 @ h0 -> h1
+
+ vshr.u64 q15,q9,#26
+ vand.i64 q9,q9,q0
+ vshr.u64 q4,q6,#26
+ vand.i64 q6,q6,q0
+ vadd.i64 q7,q7,q4 @ h1 -> h2
+
+ vadd.i64 q5,q5,q15
+ vshl.u64 q15,q15,#2
+ vshr.u64 q4,q7,#26
+ vand.i64 q7,q7,q0
+ vadd.i64 q5,q5,q15 @ h4 -> h0
+ vadd.i64 q8,q8,q4 @ h2 -> h3
+
+ vshr.u64 q15,q5,#26
+ vand.i64 q5,q5,q0
+ vshr.u64 q4,q8,#26
+ vand.i64 q8,q8,q0
+ vadd.i64 q6,q6,q15 @ h0 -> h1
+ vadd.i64 q9,q9,q4 @ h3 -> h4
+
+ cmp r2,#0
+ bne .Leven
+
+ @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
+ @ store hash value
+
+ vst4.32 {d10[0],d12[0],d14[0],d16[0]},[r0]!
+ vst1.32 {d18[0]},[r0]
+
+ vldmia sp!,{d8-d15} @ epilogue
+ ldmia sp!,{r4-r7}
+.Lno_data_neon:
+ bx lr @ bx lr
+.size poly1305_blocks_neon,.-poly1305_blocks_neon
+
+.type poly1305_emit_neon,%function
+.align 5
+poly1305_emit_neon:
+ ldr ip,[r0,#36] @ is_base2_26
+
+ stmdb sp!,{r4-r11}
+
+ tst ip,ip
+ beq .Lpoly1305_emit_enter
+
+ ldmia r0,{r3-r7}
+ eor r8,r8,r8
+
+ adds r3,r3,r4,lsl#26 @ base 2^26 -> base 2^32
+ mov r4,r4,lsr#6
+ adcs r4,r4,r5,lsl#20
+ mov r5,r5,lsr#12
+ adcs r5,r5,r6,lsl#14
+ mov r6,r6,lsr#18
+ adcs r6,r6,r7,lsl#8
+ adc r7,r8,r7,lsr#24 @ can be partially reduced ...
+
+ and r8,r7,#-4 @ ... so reduce
+ and r7,r6,#3
+ add r8,r8,r8,lsr#2 @ *= 5
+ adds r3,r3,r8
+ adcs r4,r4,#0
+ adcs r5,r5,#0
+ adcs r6,r6,#0
+ adc r7,r7,#0
+
+ adds r8,r3,#5 @ compare to modulus
+ adcs r9,r4,#0
+ adcs r10,r5,#0
+ adcs r11,r6,#0
+ adc r7,r7,#0
+ tst r7,#4 @ did it carry/borrow?
+
+ it ne
+ movne r3,r8
+ ldr r8,[r2,#0]
+ it ne
+ movne r4,r9
+ ldr r9,[r2,#4]
+ it ne
+ movne r5,r10
+ ldr r10,[r2,#8]
+ it ne
+ movne r6,r11
+ ldr r11,[r2,#12]
+
+ adds r3,r3,r8 @ accumulate nonce
+ adcs r4,r4,r9
+ adcs r5,r5,r10
+ adc r6,r6,r11
+
+# ifdef __ARMEB__
+ rev r3,r3
+ rev r4,r4
+ rev r5,r5
+ rev r6,r6
+# endif
+ str r3,[r1,#0] @ store the result
+ str r4,[r1,#4]
+ str r5,[r1,#8]
+ str r6,[r1,#12]
+
+ ldmia sp!,{r4-r11}
+ bx lr @ bx lr
+.size poly1305_emit_neon,.-poly1305_emit_neon
+
+.align 5
+.Lzeros:
+.long 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
+.LOPENSSL_armcap:
+.word OPENSSL_armcap_P-.Lpoly1305_init
+#endif
+.asciz "Poly1305 for ARMv4/NEON, CRYPTOGAMS by <appro@openssl.org>"
+.align 2
+#if __ARM_MAX_ARCH__>=7
+.comm OPENSSL_armcap_P,4,4
+#endif
diff --git a/lib/zinc/poly1305/poly1305-arm64-cryptogams.S b/lib/zinc/poly1305/poly1305-arm64-cryptogams.S
new file mode 100644
index 000000000000..0ecb50a83ec0
--- /dev/null
+++ b/lib/zinc/poly1305/poly1305-arm64-cryptogams.S
@@ -0,0 +1,869 @@
+/* SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause */
+/*
+ * Copyright (C) 2006-2017 CRYPTOGAMS by <appro@openssl.org>. All Rights Reserved.
+ */
+
+#include "arm_arch.h"
+
+.text
+
+// forward "declarations" are required for Apple
+
+.globl poly1305_blocks
+.globl poly1305_emit
+
+.globl poly1305_init
+.type poly1305_init,%function
+.align 5
+poly1305_init:
+ cmp x1,xzr
+ stp xzr,xzr,[x0] // zero hash value
+ stp xzr,xzr,[x0,#16] // [along with is_base2_26]
+
+ csel x0,xzr,x0,eq
+ b.eq .Lno_key
+
+#ifdef __ILP32__
+ ldrsw x11,.LOPENSSL_armcap_P
+#else
+ ldr x11,.LOPENSSL_armcap_P
+#endif
+ adr x10,.LOPENSSL_armcap_P
+
+ ldp x7,x8,[x1] // load key
+ mov x9,#0xfffffffc0fffffff
+ movk x9,#0x0fff,lsl#48
+ ldr w17,[x10,x11]
+#ifdef __ARMEB__
+ rev x7,x7 // flip bytes
+ rev x8,x8
+#endif
+ and x7,x7,x9 // &=0ffffffc0fffffff
+ and x9,x9,#-4
+ and x8,x8,x9 // &=0ffffffc0ffffffc
+ stp x7,x8,[x0,#32] // save key value
+
+ tst w17,#ARMV7_NEON
+
+ adr x12,poly1305_blocks
+ adr x7,poly1305_blocks_neon
+ adr x13,poly1305_emit
+ adr x8,poly1305_emit_neon
+
+ csel x12,x12,x7,eq
+ csel x13,x13,x8,eq
+
+#ifdef __ILP32__
+ stp w12,w13,[x2]
+#else
+ stp x12,x13,[x2]
+#endif
+
+ mov x0,#1
+.Lno_key:
+ ret
+.size poly1305_init,.-poly1305_init
+
+.type poly1305_blocks,%function
+.align 5
+poly1305_blocks:
+ ands x2,x2,#-16
+ b.eq .Lno_data
+
+ ldp x4,x5,[x0] // load hash value
+ ldp x7,x8,[x0,#32] // load key value
+ ldr x6,[x0,#16]
+ add x9,x8,x8,lsr#2 // s1 = r1 + (r1 >> 2)
+ b .Loop
+
+.align 5
+.Loop:
+ ldp x10,x11,[x1],#16 // load input
+ sub x2,x2,#16
+#ifdef __ARMEB__
+ rev x10,x10
+ rev x11,x11
+#endif
+ adds x4,x4,x10 // accumulate input
+ adcs x5,x5,x11
+
+ mul x12,x4,x7 // h0*r0
+ adc x6,x6,x3
+ umulh x13,x4,x7
+
+ mul x10,x5,x9 // h1*5*r1
+ umulh x11,x5,x9
+
+ adds x12,x12,x10
+ mul x10,x4,x8 // h0*r1
+ adc x13,x13,x11
+ umulh x14,x4,x8
+
+ adds x13,x13,x10
+ mul x10,x5,x7 // h1*r0
+ adc x14,x14,xzr
+ umulh x11,x5,x7
+
+ adds x13,x13,x10
+ mul x10,x6,x9 // h2*5*r1
+ adc x14,x14,x11
+ mul x11,x6,x7 // h2*r0
+
+ adds x13,x13,x10
+ adc x14,x14,x11
+
+ and x10,x14,#-4 // final reduction
+ and x6,x14,#3
+ add x10,x10,x14,lsr#2
+ adds x4,x12,x10
+ adcs x5,x13,xzr
+ adc x6,x6,xzr
+
+ cbnz x2,.Loop
+
+ stp x4,x5,[x0] // store hash value
+ str x6,[x0,#16]
+
+.Lno_data:
+ ret
+.size poly1305_blocks,.-poly1305_blocks
+
+.type poly1305_emit,%function
+.align 5
+poly1305_emit:
+ ldp x4,x5,[x0] // load hash base 2^64
+ ldr x6,[x0,#16]
+ ldp x10,x11,[x2] // load nonce
+
+ adds x12,x4,#5 // compare to modulus
+ adcs x13,x5,xzr
+ adc x14,x6,xzr
+
+ tst x14,#-4 // see if it's carried/borrowed
+
+ csel x4,x4,x12,eq
+ csel x5,x5,x13,eq
+
+#ifdef __ARMEB__
+ ror x10,x10,#32 // flip nonce words
+ ror x11,x11,#32
+#endif
+ adds x4,x4,x10 // accumulate nonce
+ adc x5,x5,x11
+#ifdef __ARMEB__
+ rev x4,x4 // flip output bytes
+ rev x5,x5
+#endif
+ stp x4,x5,[x1] // write result
+
+ ret
+.size poly1305_emit,.-poly1305_emit
+.type poly1305_mult,%function
+.align 5
+poly1305_mult:
+ mul x12,x4,x7 // h0*r0
+ umulh x13,x4,x7
+
+ mul x10,x5,x9 // h1*5*r1
+ umulh x11,x5,x9
+
+ adds x12,x12,x10
+ mul x10,x4,x8 // h0*r1
+ adc x13,x13,x11
+ umulh x14,x4,x8
+
+ adds x13,x13,x10
+ mul x10,x5,x7 // h1*r0
+ adc x14,x14,xzr
+ umulh x11,x5,x7
+
+ adds x13,x13,x10
+ mul x10,x6,x9 // h2*5*r1
+ adc x14,x14,x11
+ mul x11,x6,x7 // h2*r0
+
+ adds x13,x13,x10
+ adc x14,x14,x11
+
+ and x10,x14,#-4 // final reduction
+ and x6,x14,#3
+ add x10,x10,x14,lsr#2
+ adds x4,x12,x10
+ adcs x5,x13,xzr
+ adc x6,x6,xzr
+
+ ret
+.size poly1305_mult,.-poly1305_mult
+
+.type poly1305_splat,%function
+.align 5
+poly1305_splat:
+ and x12,x4,#0x03ffffff // base 2^64 -> base 2^26
+ ubfx x13,x4,#26,#26
+ extr x14,x5,x4,#52
+ and x14,x14,#0x03ffffff
+ ubfx x15,x5,#14,#26
+ extr x16,x6,x5,#40
+
+ str w12,[x0,#16*0] // r0
+ add w12,w13,w13,lsl#2 // r1*5
+ str w13,[x0,#16*1] // r1
+ add w13,w14,w14,lsl#2 // r2*5
+ str w12,[x0,#16*2] // s1
+ str w14,[x0,#16*3] // r2
+ add w14,w15,w15,lsl#2 // r3*5
+ str w13,[x0,#16*4] // s2
+ str w15,[x0,#16*5] // r3
+ add w15,w16,w16,lsl#2 // r4*5
+ str w14,[x0,#16*6] // s3
+ str w16,[x0,#16*7] // r4
+ str w15,[x0,#16*8] // s4
+
+ ret
+.size poly1305_splat,.-poly1305_splat
+
+.type poly1305_blocks_neon,%function
+.align 5
+poly1305_blocks_neon:
+ ldr x17,[x0,#24]
+ cmp x2,#128
+ b.hs .Lblocks_neon
+ cbz x17,poly1305_blocks
+
+.Lblocks_neon:
+ stp x29,x30,[sp,#-80]!
+ add x29,sp,#0
+
+ ands x2,x2,#-16
+ b.eq .Lno_data_neon
+
+ cbz x17,.Lbase2_64_neon
+
+ ldp w10,w11,[x0] // load hash value base 2^26
+ ldp w12,w13,[x0,#8]
+ ldr w14,[x0,#16]
+
+ tst x2,#31
+ b.eq .Leven_neon
+
+ ldp x7,x8,[x0,#32] // load key value
+
+ add x4,x10,x11,lsl#26 // base 2^26 -> base 2^64
+ lsr x5,x12,#12
+ adds x4,x4,x12,lsl#52
+ add x5,x5,x13,lsl#14
+ adc x5,x5,xzr
+ lsr x6,x14,#24
+ adds x5,x5,x14,lsl#40
+ adc x14,x6,xzr // can be partially reduced...
+
+ ldp x12,x13,[x1],#16 // load input
+ sub x2,x2,#16
+ add x9,x8,x8,lsr#2 // s1 = r1 + (r1 >> 2)
+
+ and x10,x14,#-4 // ... so reduce
+ and x6,x14,#3
+ add x10,x10,x14,lsr#2
+ adds x4,x4,x10
+ adcs x5,x5,xzr
+ adc x6,x6,xzr
+
+#ifdef __ARMEB__
+ rev x12,x12
+ rev x13,x13
+#endif
+ adds x4,x4,x12 // accumulate input
+ adcs x5,x5,x13
+ adc x6,x6,x3
+
+ bl poly1305_mult
+ ldr x30,[sp,#8]
+
+ cbz x3,.Lstore_base2_64_neon
+
+ and x10,x4,#0x03ffffff // base 2^64 -> base 2^26
+ ubfx x11,x4,#26,#26
+ extr x12,x5,x4,#52
+ and x12,x12,#0x03ffffff
+ ubfx x13,x5,#14,#26
+ extr x14,x6,x5,#40
+
+ cbnz x2,.Leven_neon
+
+ stp w10,w11,[x0] // store hash value base 2^26
+ stp w12,w13,[x0,#8]
+ str w14,[x0,#16]
+ b .Lno_data_neon
+
+.align 4
+.Lstore_base2_64_neon:
+ stp x4,x5,[x0] // store hash value base 2^64
+ stp x6,xzr,[x0,#16] // note that is_base2_26 is zeroed
+ b .Lno_data_neon
+
+.align 4
+.Lbase2_64_neon:
+ ldp x7,x8,[x0,#32] // load key value
+
+ ldp x4,x5,[x0] // load hash value base 2^64
+ ldr x6,[x0,#16]
+
+ tst x2,#31
+ b.eq .Linit_neon
+
+ ldp x12,x13,[x1],#16 // load input
+ sub x2,x2,#16
+ add x9,x8,x8,lsr#2 // s1 = r1 + (r1 >> 2)
+#ifdef __ARMEB__
+ rev x12,x12
+ rev x13,x13
+#endif
+ adds x4,x4,x12 // accumulate input
+ adcs x5,x5,x13
+ adc x6,x6,x3
+
+ bl poly1305_mult
+
+.Linit_neon:
+ and x10,x4,#0x03ffffff // base 2^64 -> base 2^26
+ ubfx x11,x4,#26,#26
+ extr x12,x5,x4,#52
+ and x12,x12,#0x03ffffff
+ ubfx x13,x5,#14,#26
+ extr x14,x6,x5,#40
+
+ stp d8,d9,[sp,#16] // meet ABI requirements
+ stp d10,d11,[sp,#32]
+ stp d12,d13,[sp,#48]
+ stp d14,d15,[sp,#64]
+
+ fmov d24,x10
+ fmov d25,x11
+ fmov d26,x12
+ fmov d27,x13
+ fmov d28,x14
+
+ ////////////////////////////////// initialize r^n table
+ mov x4,x7 // r^1
+ add x9,x8,x8,lsr#2 // s1 = r1 + (r1 >> 2)
+ mov x5,x8
+ mov x6,xzr
+ add x0,x0,#48+12
+ bl poly1305_splat
+
+ bl poly1305_mult // r^2
+ sub x0,x0,#4
+ bl poly1305_splat
+
+ bl poly1305_mult // r^3
+ sub x0,x0,#4
+ bl poly1305_splat
+
+ bl poly1305_mult // r^4
+ sub x0,x0,#4
+ bl poly1305_splat
+ ldr x30,[sp,#8]
+
+ add x16,x1,#32
+ adr x17,.Lzeros
+ subs x2,x2,#64
+ csel x16,x17,x16,lo
+
+ mov x4,#1
+ str x4,[x0,#-24] // set is_base2_26
+ sub x0,x0,#48 // restore original x0
+ b .Ldo_neon
+
+.align 4
+.Leven_neon:
+ add x16,x1,#32
+ adr x17,.Lzeros
+ subs x2,x2,#64
+ csel x16,x17,x16,lo
+
+ stp d8,d9,[sp,#16] // meet ABI requirements
+ stp d10,d11,[sp,#32]
+ stp d12,d13,[sp,#48]
+ stp d14,d15,[sp,#64]
+
+ fmov d24,x10
+ fmov d25,x11
+ fmov d26,x12
+ fmov d27,x13
+ fmov d28,x14
+
+.Ldo_neon:
+ ldp x8,x12,[x16],#16 // inp[2:3] (or zero)
+ ldp x9,x13,[x16],#48
+
+ lsl x3,x3,#24
+ add x15,x0,#48
+
+#ifdef __ARMEB__
+ rev x8,x8
+ rev x12,x12
+ rev x9,x9
+ rev x13,x13
+#endif
+ and x4,x8,#0x03ffffff // base 2^64 -> base 2^26
+ and x5,x9,#0x03ffffff
+ ubfx x6,x8,#26,#26
+ ubfx x7,x9,#26,#26
+ add x4,x4,x5,lsl#32 // bfi x4,x5,#32,#32
+ extr x8,x12,x8,#52
+ extr x9,x13,x9,#52
+ add x6,x6,x7,lsl#32 // bfi x6,x7,#32,#32
+ fmov d14,x4
+ and x8,x8,#0x03ffffff
+ and x9,x9,#0x03ffffff
+ ubfx x10,x12,#14,#26
+ ubfx x11,x13,#14,#26
+ add x12,x3,x12,lsr#40
+ add x13,x3,x13,lsr#40
+ add x8,x8,x9,lsl#32 // bfi x8,x9,#32,#32
+ fmov d15,x6
+ add x10,x10,x11,lsl#32 // bfi x10,x11,#32,#32
+ add x12,x12,x13,lsl#32 // bfi x12,x13,#32,#32
+ fmov d16,x8
+ fmov d17,x10
+ fmov d18,x12
+
+ ldp x8,x12,[x1],#16 // inp[0:1]
+ ldp x9,x13,[x1],#48
+
+ ld1 {v0.4s,v1.4s,v2.4s,v3.4s},[x15],#64
+ ld1 {v4.4s,v5.4s,v6.4s,v7.4s},[x15],#64
+ ld1 {v8.4s},[x15]
+
+#ifdef __ARMEB__
+ rev x8,x8
+ rev x12,x12
+ rev x9,x9
+ rev x13,x13
+#endif
+ and x4,x8,#0x03ffffff // base 2^64 -> base 2^26
+ and x5,x9,#0x03ffffff
+ ubfx x6,x8,#26,#26
+ ubfx x7,x9,#26,#26
+ add x4,x4,x5,lsl#32 // bfi x4,x5,#32,#32
+ extr x8,x12,x8,#52
+ extr x9,x13,x9,#52
+ add x6,x6,x7,lsl#32 // bfi x6,x7,#32,#32
+ fmov d9,x4
+ and x8,x8,#0x03ffffff
+ and x9,x9,#0x03ffffff
+ ubfx x10,x12,#14,#26
+ ubfx x11,x13,#14,#26
+ add x12,x3,x12,lsr#40
+ add x13,x3,x13,lsr#40
+ add x8,x8,x9,lsl#32 // bfi x8,x9,#32,#32
+ fmov d10,x6
+ add x10,x10,x11,lsl#32 // bfi x10,x11,#32,#32
+ add x12,x12,x13,lsl#32 // bfi x12,x13,#32,#32
+ movi v31.2d,#-1
+ fmov d11,x8
+ fmov d12,x10
+ fmov d13,x12
+ ushr v31.2d,v31.2d,#38
+
+ b.ls .Lskip_loop
+
+.align 4
+.Loop_neon:
+ ////////////////////////////////////////////////////////////////
+ // ((inp[0]*r^4+inp[2]*r^2+inp[4])*r^4+inp[6]*r^2
+ // ((inp[1]*r^4+inp[3]*r^2+inp[5])*r^3+inp[7]*r
+ // ___________________/
+ // ((inp[0]*r^4+inp[2]*r^2+inp[4])*r^4+inp[6]*r^2+inp[8])*r^2
+ // ((inp[1]*r^4+inp[3]*r^2+inp[5])*r^4+inp[7]*r^2+inp[9])*r
+ // ___________________/ ____________________/
+ //
+ // Note that we start with inp[2:3]*r^2. This is because it
+ // doesn't depend on reduction in previous iteration.
+ ////////////////////////////////////////////////////////////////
+ // d4 = h0*r4 + h1*r3 + h2*r2 + h3*r1 + h4*r0
+ // d3 = h0*r3 + h1*r2 + h2*r1 + h3*r0 + h4*5*r4
+ // d2 = h0*r2 + h1*r1 + h2*r0 + h3*5*r4 + h4*5*r3
+ // d1 = h0*r1 + h1*r0 + h2*5*r4 + h3*5*r3 + h4*5*r2
+ // d0 = h0*r0 + h1*5*r4 + h2*5*r3 + h3*5*r2 + h4*5*r1
+
+ subs x2,x2,#64
+ umull v23.2d,v14.2s,v7.s[2]
+ csel x16,x17,x16,lo
+ umull v22.2d,v14.2s,v5.s[2]
+ umull v21.2d,v14.2s,v3.s[2]
+ ldp x8,x12,[x16],#16 // inp[2:3] (or zero)
+ umull v20.2d,v14.2s,v1.s[2]
+ ldp x9,x13,[x16],#48
+ umull v19.2d,v14.2s,v0.s[2]
+#ifdef __ARMEB__
+ rev x8,x8
+ rev x12,x12
+ rev x9,x9
+ rev x13,x13
+#endif
+
+ umlal v23.2d,v15.2s,v5.s[2]
+ and x4,x8,#0x03ffffff // base 2^64 -> base 2^26
+ umlal v22.2d,v15.2s,v3.s[2]
+ and x5,x9,#0x03ffffff
+ umlal v21.2d,v15.2s,v1.s[2]
+ ubfx x6,x8,#26,#26
+ umlal v20.2d,v15.2s,v0.s[2]
+ ubfx x7,x9,#26,#26
+ umlal v19.2d,v15.2s,v8.s[2]
+ add x4,x4,x5,lsl#32 // bfi x4,x5,#32,#32
+
+ umlal v23.2d,v16.2s,v3.s[2]
+ extr x8,x12,x8,#52
+ umlal v22.2d,v16.2s,v1.s[2]
+ extr x9,x13,x9,#52
+ umlal v21.2d,v16.2s,v0.s[2]
+ add x6,x6,x7,lsl#32 // bfi x6,x7,#32,#32
+ umlal v20.2d,v16.2s,v8.s[2]
+ fmov d14,x4
+ umlal v19.2d,v16.2s,v6.s[2]
+ and x8,x8,#0x03ffffff
+
+ umlal v23.2d,v17.2s,v1.s[2]
+ and x9,x9,#0x03ffffff
+ umlal v22.2d,v17.2s,v0.s[2]
+ ubfx x10,x12,#14,#26
+ umlal v21.2d,v17.2s,v8.s[2]
+ ubfx x11,x13,#14,#26
+ umlal v20.2d,v17.2s,v6.s[2]
+ add x8,x8,x9,lsl#32 // bfi x8,x9,#32,#32
+ umlal v19.2d,v17.2s,v4.s[2]
+ fmov d15,x6
+
+ add v11.2s,v11.2s,v26.2s
+ add x12,x3,x12,lsr#40
+ umlal v23.2d,v18.2s,v0.s[2]
+ add x13,x3,x13,lsr#40
+ umlal v22.2d,v18.2s,v8.s[2]
+ add x10,x10,x11,lsl#32 // bfi x10,x11,#32,#32
+ umlal v21.2d,v18.2s,v6.s[2]
+ add x12,x12,x13,lsl#32 // bfi x12,x13,#32,#32
+ umlal v20.2d,v18.2s,v4.s[2]
+ fmov d16,x8
+ umlal v19.2d,v18.2s,v2.s[2]
+ fmov d17,x10
+
+ ////////////////////////////////////////////////////////////////
+ // (hash+inp[0:1])*r^4 and accumulate
+
+ add v9.2s,v9.2s,v24.2s
+ fmov d18,x12
+ umlal v22.2d,v11.2s,v1.s[0]
+ ldp x8,x12,[x1],#16 // inp[0:1]
+ umlal v19.2d,v11.2s,v6.s[0]
+ ldp x9,x13,[x1],#48
+ umlal v23.2d,v11.2s,v3.s[0]
+ umlal v20.2d,v11.2s,v8.s[0]
+ umlal v21.2d,v11.2s,v0.s[0]
+#ifdef __ARMEB__
+ rev x8,x8
+ rev x12,x12
+ rev x9,x9
+ rev x13,x13
+#endif
+
+ add v10.2s,v10.2s,v25.2s
+ umlal v22.2d,v9.2s,v5.s[0]
+ umlal v23.2d,v9.2s,v7.s[0]
+ and x4,x8,#0x03ffffff // base 2^64 -> base 2^26
+ umlal v21.2d,v9.2s,v3.s[0]
+ and x5,x9,#0x03ffffff
+ umlal v19.2d,v9.2s,v0.s[0]
+ ubfx x6,x8,#26,#26
+ umlal v20.2d,v9.2s,v1.s[0]
+ ubfx x7,x9,#26,#26
+
+ add v12.2s,v12.2s,v27.2s
+ add x4,x4,x5,lsl#32 // bfi x4,x5,#32,#32
+ umlal v22.2d,v10.2s,v3.s[0]
+ extr x8,x12,x8,#52
+ umlal v23.2d,v10.2s,v5.s[0]
+ extr x9,x13,x9,#52
+ umlal v19.2d,v10.2s,v8.s[0]
+ add x6,x6,x7,lsl#32 // bfi x6,x7,#32,#32
+ umlal v21.2d,v10.2s,v1.s[0]
+ fmov d9,x4
+ umlal v20.2d,v10.2s,v0.s[0]
+ and x8,x8,#0x03ffffff
+
+ add v13.2s,v13.2s,v28.2s
+ and x9,x9,#0x03ffffff
+ umlal v22.2d,v12.2s,v0.s[0]
+ ubfx x10,x12,#14,#26
+ umlal v19.2d,v12.2s,v4.s[0]
+ ubfx x11,x13,#14,#26
+ umlal v23.2d,v12.2s,v1.s[0]
+ add x8,x8,x9,lsl#32 // bfi x8,x9,#32,#32
+ umlal v20.2d,v12.2s,v6.s[0]
+ fmov d10,x6
+ umlal v21.2d,v12.2s,v8.s[0]
+ add x12,x3,x12,lsr#40
+
+ umlal v22.2d,v13.2s,v8.s[0]
+ add x13,x3,x13,lsr#40
+ umlal v19.2d,v13.2s,v2.s[0]
+ add x10,x10,x11,lsl#32 // bfi x10,x11,#32,#32
+ umlal v23.2d,v13.2s,v0.s[0]
+ add x12,x12,x13,lsl#32 // bfi x12,x13,#32,#32
+ umlal v20.2d,v13.2s,v4.s[0]
+ fmov d11,x8
+ umlal v21.2d,v13.2s,v6.s[0]
+ fmov d12,x10
+ fmov d13,x12
+
+ /////////////////////////////////////////////////////////////////
+ // lazy reduction as discussed in "NEON crypto" by D.J. Bernstein
+ // and P. Schwabe
+ //
+ // [see discussion in poly1305-armv4 module]
+
+ ushr v29.2d,v22.2d,#26
+ xtn v27.2s,v22.2d
+ ushr v30.2d,v19.2d,#26
+ and v19.16b,v19.16b,v31.16b
+ add v23.2d,v23.2d,v29.2d // h3 -> h4
+ bic v27.2s,#0xfc,lsl#24 // &=0x03ffffff
+ add v20.2d,v20.2d,v30.2d // h0 -> h1
+
+ ushr v29.2d,v23.2d,#26
+ xtn v28.2s,v23.2d
+ ushr v30.2d,v20.2d,#26
+ xtn v25.2s,v20.2d
+ bic v28.2s,#0xfc,lsl#24
+ add v21.2d,v21.2d,v30.2d // h1 -> h2
+
+ add v19.2d,v19.2d,v29.2d
+ shl v29.2d,v29.2d,#2
+ shrn v30.2s,v21.2d,#26
+ xtn v26.2s,v21.2d
+ add v19.2d,v19.2d,v29.2d // h4 -> h0
+ bic v25.2s,#0xfc,lsl#24
+ add v27.2s,v27.2s,v30.2s // h2 -> h3
+ bic v26.2s,#0xfc,lsl#24
+
+ shrn v29.2s,v19.2d,#26
+ xtn v24.2s,v19.2d
+ ushr v30.2s,v27.2s,#26
+ bic v27.2s,#0xfc,lsl#24
+ bic v24.2s,#0xfc,lsl#24
+ add v25.2s,v25.2s,v29.2s // h0 -> h1
+ add v28.2s,v28.2s,v30.2s // h3 -> h4
+
+ b.hi .Loop_neon
+
+.Lskip_loop:
+ dup v16.2d,v16.d[0]
+ add v11.2s,v11.2s,v26.2s
+
+ ////////////////////////////////////////////////////////////////
+ // multiply (inp[0:1]+hash) or inp[2:3] by r^2:r^1
+
+ adds x2,x2,#32
+ b.ne .Long_tail
+
+ dup v16.2d,v11.d[0]
+ add v14.2s,v9.2s,v24.2s
+ add v17.2s,v12.2s,v27.2s
+ add v15.2s,v10.2s,v25.2s
+ add v18.2s,v13.2s,v28.2s
+
+.Long_tail:
+ dup v14.2d,v14.d[0]
+ umull2 v19.2d,v16.4s,v6.4s
+ umull2 v22.2d,v16.4s,v1.4s
+ umull2 v23.2d,v16.4s,v3.4s
+ umull2 v21.2d,v16.4s,v0.4s
+ umull2 v20.2d,v16.4s,v8.4s
+
+ dup v15.2d,v15.d[0]
+ umlal2 v19.2d,v14.4s,v0.4s
+ umlal2 v21.2d,v14.4s,v3.4s
+ umlal2 v22.2d,v14.4s,v5.4s
+ umlal2 v23.2d,v14.4s,v7.4s
+ umlal2 v20.2d,v14.4s,v1.4s
+
+ dup v17.2d,v17.d[0]
+ umlal2 v19.2d,v15.4s,v8.4s
+ umlal2 v22.2d,v15.4s,v3.4s
+ umlal2 v21.2d,v15.4s,v1.4s
+ umlal2 v23.2d,v15.4s,v5.4s
+ umlal2 v20.2d,v15.4s,v0.4s
+
+ dup v18.2d,v18.d[0]
+ umlal2 v22.2d,v17.4s,v0.4s
+ umlal2 v23.2d,v17.4s,v1.4s
+ umlal2 v19.2d,v17.4s,v4.4s
+ umlal2 v20.2d,v17.4s,v6.4s
+ umlal2 v21.2d,v17.4s,v8.4s
+
+ umlal2 v22.2d,v18.4s,v8.4s
+ umlal2 v19.2d,v18.4s,v2.4s
+ umlal2 v23.2d,v18.4s,v0.4s
+ umlal2 v20.2d,v18.4s,v4.4s
+ umlal2 v21.2d,v18.4s,v6.4s
+
+ b.eq .Lshort_tail
+
+ ////////////////////////////////////////////////////////////////
+ // (hash+inp[0:1])*r^4:r^3 and accumulate
+
+ add v9.2s,v9.2s,v24.2s
+ umlal v22.2d,v11.2s,v1.2s
+ umlal v19.2d,v11.2s,v6.2s
+ umlal v23.2d,v11.2s,v3.2s
+ umlal v20.2d,v11.2s,v8.2s
+ umlal v21.2d,v11.2s,v0.2s
+
+ add v10.2s,v10.2s,v25.2s
+ umlal v22.2d,v9.2s,v5.2s
+ umlal v19.2d,v9.2s,v0.2s
+ umlal v23.2d,v9.2s,v7.2s
+ umlal v20.2d,v9.2s,v1.2s
+ umlal v21.2d,v9.2s,v3.2s
+
+ add v12.2s,v12.2s,v27.2s
+ umlal v22.2d,v10.2s,v3.2s
+ umlal v19.2d,v10.2s,v8.2s
+ umlal v23.2d,v10.2s,v5.2s
+ umlal v20.2d,v10.2s,v0.2s
+ umlal v21.2d,v10.2s,v1.2s
+
+ add v13.2s,v13.2s,v28.2s
+ umlal v22.2d,v12.2s,v0.2s
+ umlal v19.2d,v12.2s,v4.2s
+ umlal v23.2d,v12.2s,v1.2s
+ umlal v20.2d,v12.2s,v6.2s
+ umlal v21.2d,v12.2s,v8.2s
+
+ umlal v22.2d,v13.2s,v8.2s
+ umlal v19.2d,v13.2s,v2.2s
+ umlal v23.2d,v13.2s,v0.2s
+ umlal v20.2d,v13.2s,v4.2s
+ umlal v21.2d,v13.2s,v6.2s
+
+.Lshort_tail:
+ ////////////////////////////////////////////////////////////////
+ // horizontal add
+
+ addp v22.2d,v22.2d,v22.2d
+ ldp d8,d9,[sp,#16] // meet ABI requirements
+ addp v19.2d,v19.2d,v19.2d
+ ldp d10,d11,[sp,#32]
+ addp v23.2d,v23.2d,v23.2d
+ ldp d12,d13,[sp,#48]
+ addp v20.2d,v20.2d,v20.2d
+ ldp d14,d15,[sp,#64]
+ addp v21.2d,v21.2d,v21.2d
+
+ ////////////////////////////////////////////////////////////////
+ // lazy reduction, but without narrowing
+
+ ushr v29.2d,v22.2d,#26
+ and v22.16b,v22.16b,v31.16b
+ ushr v30.2d,v19.2d,#26
+ and v19.16b,v19.16b,v31.16b
+
+ add v23.2d,v23.2d,v29.2d // h3 -> h4
+ add v20.2d,v20.2d,v30.2d // h0 -> h1
+
+ ushr v29.2d,v23.2d,#26
+ and v23.16b,v23.16b,v31.16b
+ ushr v30.2d,v20.2d,#26
+ and v20.16b,v20.16b,v31.16b
+ add v21.2d,v21.2d,v30.2d // h1 -> h2
+
+ add v19.2d,v19.2d,v29.2d
+ shl v29.2d,v29.2d,#2
+ ushr v30.2d,v21.2d,#26
+ and v21.16b,v21.16b,v31.16b
+ add v19.2d,v19.2d,v29.2d // h4 -> h0
+ add v22.2d,v22.2d,v30.2d // h2 -> h3
+
+ ushr v29.2d,v19.2d,#26
+ and v19.16b,v19.16b,v31.16b
+ ushr v30.2d,v22.2d,#26
+ and v22.16b,v22.16b,v31.16b
+ add v20.2d,v20.2d,v29.2d // h0 -> h1
+ add v23.2d,v23.2d,v30.2d // h3 -> h4
+
+ ////////////////////////////////////////////////////////////////
+ // write the result, can be partially reduced
+
+ st4 {v19.s,v20.s,v21.s,v22.s}[0],[x0],#16
+ st1 {v23.s}[0],[x0]
+
+.Lno_data_neon:
+ ldr x29,[sp],#80
+ ret
+.size poly1305_blocks_neon,.-poly1305_blocks_neon
+
+.type poly1305_emit_neon,%function
+.align 5
+poly1305_emit_neon:
+ ldr x17,[x0,#24]
+ cbz x17,poly1305_emit
+
+ ldp w10,w11,[x0] // load hash value base 2^26
+ ldp w12,w13,[x0,#8]
+ ldr w14,[x0,#16]
+
+ add x4,x10,x11,lsl#26 // base 2^26 -> base 2^64
+ lsr x5,x12,#12
+ adds x4,x4,x12,lsl#52
+ add x5,x5,x13,lsl#14
+ adc x5,x5,xzr
+ lsr x6,x14,#24
+ adds x5,x5,x14,lsl#40
+ adc x6,x6,xzr // can be partially reduced...
+
+ ldp x10,x11,[x2] // load nonce
+
+ and x12,x6,#-4 // ... so reduce
+ add x12,x12,x6,lsr#2
+ and x6,x6,#3
+ adds x4,x4,x12
+ adcs x5,x5,xzr
+ adc x6,x6,xzr
+
+ adds x12,x4,#5 // compare to modulus
+ adcs x13,x5,xzr
+ adc x14,x6,xzr
+
+ tst x14,#-4 // see if it's carried/borrowed
+
+ csel x4,x4,x12,eq
+ csel x5,x5,x13,eq
+
+#ifdef __ARMEB__
+ ror x10,x10,#32 // flip nonce words
+ ror x11,x11,#32
+#endif
+ adds x4,x4,x10 // accumulate nonce
+ adc x5,x5,x11
+#ifdef __ARMEB__
+ rev x4,x4 // flip output bytes
+ rev x5,x5
+#endif
+ stp x4,x5,[x1] // write result
+
+ ret
+.size poly1305_emit_neon,.-poly1305_emit_neon
+
+.align 5
+.Lzeros:
+.long 0,0,0,0,0,0,0,0
+.LOPENSSL_armcap_P:
+#ifdef __ILP32__
+.long OPENSSL_armcap_P-.
+#else
+.quad OPENSSL_armcap_P-.
+#endif
+.byte 80,111,108,121,49,51,48,53,32,102,111,114,32,65,82,77,118,56,44,32,67,82,89,80,84,79,71,65,77,83,32,98,121,32,60,97,112,112,114,111,64,111,112,101,110,115,115,108,46,111,114,103,62,0
+.align 2
+.align 2
--
2.19.0
* [PATCH net-next v7 15/28] zinc: Poly1305 ARM and ARM64 implementations
From: Jason A. Donenfeld @ 2018-10-06 2:56 UTC (permalink / raw)
To: linux-kernel, netdev, davem, gregkh
Cc: Jason A. Donenfeld, Russell King, linux-arm-kernel, Samuel Neves,
Jean-Philippe Aumasson, Andy Lutomirski, Andrew Morton,
Linus Torvalds, kernel-hardening, linux-crypto
These wire Andy Polyakov's implementations up to the kernel. We make a
few small changes to the assembly:
- Entries and exits use the proper kernel convention macro.
- CPU feature checking is done in C by the glue code, so that has been
removed from the assembly.
- The functions have been renamed to fit kernel conventions.
- Labels have been renamed to fit kernel conventions.
- The NEON code can jump to the scalar code when it makes sense to do
so.
The NEON code uses base 2^26, while the scalar code uses base 2^64 on 64-bit
and base 2^32 on 32-bit. If we hit the unfortunate situation of using NEON
and then having to go back to scalar -- because the user is silly and has
called the update function from two separate contexts -- then we need to
convert back to the original base before proceeding. It is possible to
reason that the initial reduction below is sufficient given the
implementation invariants. However, for the avoidance of doubt and because
this is not performance critical, we do the full reduction anyway. This
conversion is found in the glue code, and a proof of correctness may be
easily obtained from Z3: <https://xn--4db.cc/ltPtHCKN/py>.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Samuel Neves <sneves@dei.uc.pt>
Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: kernel-hardening@lists.openwall.com
Cc: linux-crypto@vger.kernel.org
---
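For readers following the base conversion described in the commit message,
here is a minimal userspace sketch of the 64-bit, little-endian case. It is
modelled on convert_to_base2_64() in the glue code below, but it is not that
function: the names poly_state64 and base2_26_to_base2_64 are hypothetical,
the is_base2_26 flag and the 32-bit big-endian word swap are left out, the
limb bound checked by assert() is an assumption of this sketch, and the
carries are detected with a plain comparison instead of the ULT macro.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the 64-bit scalar hash layout. */
struct poly_state64 {
	uint64_t h0, h1, h2;
};

/*
 * Convert five base 2^26 limbs (each possibly carrying a few excess bits)
 * into three base 2^64 words, then fold everything at or above 2^130 back
 * into the bottom using 2^130 = 5 (mod 2^130 - 5).
 */
static void base2_26_to_base2_64(const uint32_t in[5], struct poly_state64 *out)
{
	uint32_t h[5], cy;
	uint64_t c;
	int i;

	for (i = 0; i < 5; ++i) {
		/* Assumed bound: adding a carry must not overflow 32 bits. */
		assert(in[i] < (UINT32_C(1) << 31));
		h[i] = in[i];
	}

	/* Carry chain: bring limbs 0..3 back to exactly 26 bits. */
	for (i = 0; i < 4; ++i) {
		cy = h[i] >> 26;
		h[i] &= 0x3ffffff;
		h[i + 1] += cy;
	}

	/* Repack: limb k contributes h[k] * 2^(26 * k). */
	out->h0 = ((uint64_t)h[2] << 52) | ((uint64_t)h[1] << 26) | h[0];
	out->h1 = ((uint64_t)h[4] << 40) | ((uint64_t)h[3] << 14) | (h[2] >> 12);
	out->h2 = h[4] >> 24;

	/* Fold: 5 * (h2 >> 2), written as x + 4x. */
	c = (out->h2 >> 2) + (out->h2 & ~UINT64_C(3));
	out->h2 &= 3;
	out->h0 += c;
	c = out->h0 < c;	/* carry out of h0 */
	out->h1 += c;
	c = out->h1 < c;	/* carry out of h1 */
	out->h2 += c;
}
```

As a sanity check, the limbs {0, 0, 0, 0, 1 << 26} encode 2^130, and the
sketch folds them to h0 = 5, h1 = 0, h2 = 0. The result may still be only
partially reduced (h2 can end up as 4 after a carry), which the scalar code
tolerates.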
lib/zinc/Makefile | 2 +
lib/zinc/poly1305/poly1305-arm-glue.c | 140 +++++++++++++++++
...ly1305-arm-cryptogams.S => poly1305-arm.S} | 147 ++++++------------
...05-arm64-cryptogams.S => poly1305-arm64.S} | 127 +++++----------
lib/zinc/poly1305/poly1305.c | 2 +
5 files changed, 231 insertions(+), 187 deletions(-)
create mode 100644 lib/zinc/poly1305/poly1305-arm-glue.c
rename lib/zinc/poly1305/{poly1305-arm-cryptogams.S => poly1305-arm.S} (91%)
rename lib/zinc/poly1305/{poly1305-arm64-cryptogams.S => poly1305-arm64.S} (89%)
diff --git a/lib/zinc/Makefile b/lib/zinc/Makefile
index a8943d960b6a..c09fd3de60f9 100644
--- a/lib/zinc/Makefile
+++ b/lib/zinc/Makefile
@@ -12,4 +12,6 @@ obj-$(CONFIG_ZINC_CHACHA20) += zinc_chacha20.o
zinc_poly1305-y := poly1305/poly1305.o
zinc_poly1305-$(CONFIG_ZINC_ARCH_X86_64) += poly1305/poly1305-x86_64.o
+zinc_poly1305-$(CONFIG_ZINC_ARCH_ARM) += poly1305/poly1305-arm.o
+zinc_poly1305-$(CONFIG_ZINC_ARCH_ARM64) += poly1305/poly1305-arm64.o
obj-$(CONFIG_ZINC_POLY1305) += zinc_poly1305.o
diff --git a/lib/zinc/poly1305/poly1305-arm-glue.c b/lib/zinc/poly1305/poly1305-arm-glue.c
new file mode 100644
index 000000000000..f4f08ecffbf6
--- /dev/null
+++ b/lib/zinc/poly1305/poly1305-arm-glue.c
@@ -0,0 +1,140 @@
+// SPDX-License-Identifier: GPL-2.0 OR MIT
+/*
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#include <asm/hwcap.h>
+#include <asm/neon.h>
+
+asmlinkage void poly1305_init_arm(void *ctx, const u8 key[16]);
+asmlinkage void poly1305_blocks_arm(void *ctx, const u8 *inp, const size_t len,
+ const u32 padbit);
+asmlinkage void poly1305_emit_arm(void *ctx, u8 mac[16], const u32 nonce[4]);
+asmlinkage void poly1305_blocks_neon(void *ctx, const u8 *inp, const size_t len,
+ const u32 padbit);
+asmlinkage void poly1305_emit_neon(void *ctx, u8 mac[16], const u32 nonce[4]);
+
+static bool poly1305_use_neon __ro_after_init;
+static bool *const poly1305_nobs[] __initconst = { &poly1305_use_neon };
+
+static void __init poly1305_fpu_init(void)
+{
+#if defined(CONFIG_ZINC_ARCH_ARM64)
+ poly1305_use_neon = elf_hwcap & HWCAP_ASIMD;
+#elif defined(CONFIG_ZINC_ARCH_ARM)
+ poly1305_use_neon = elf_hwcap & HWCAP_NEON;
+#endif
+}
+
+#if defined(CONFIG_ZINC_ARCH_ARM64)
+struct poly1305_arch_internal {
+ union {
+ u32 h[5];
+ struct {
+ u64 h0, h1, h2;
+ };
+ };
+ u64 is_base2_26;
+ u64 r[2];
+};
+#elif defined(CONFIG_ZINC_ARCH_ARM)
+struct poly1305_arch_internal {
+ union {
+ u32 h[5];
+ struct {
+ u64 h0, h1;
+ u32 h2;
+ } __packed;
+ };
+ u32 r[4];
+ u32 is_base2_26;
+};
+#endif
+
+/* The NEON code uses base 2^26, while the scalar code uses base 2^64 on 64-bit
+ * and base 2^32 on 32-bit. If we hit the unfortunate situation of using NEON
+ * and then having to go back to scalar -- because the user is silly and has
+ * called the update function from two separate contexts -- then we need to
+ * convert back to the original base before proceeding. The below function is
+ * written for 64-bit integers, and so we have to swap words at the end on
+ * big-endian 32-bit. It is possible to reason that the initial reduction below
+ * is sufficient given the implementation invariants. However, for the avoidance
+ * of doubt and because this is not performance critical, we do the full
+ * reduction anyway.
+ */
+static void convert_to_base2_64(void *ctx)
+{
+ struct poly1305_arch_internal *state = ctx;
+ u32 cy;
+
+ if (!IS_ENABLED(CONFIG_KERNEL_MODE_NEON) || !state->is_base2_26)
+ return;
+
+ cy = state->h[0] >> 26; state->h[0] &= 0x3ffffff; state->h[1] += cy;
+ cy = state->h[1] >> 26; state->h[1] &= 0x3ffffff; state->h[2] += cy;
+ cy = state->h[2] >> 26; state->h[2] &= 0x3ffffff; state->h[3] += cy;
+ cy = state->h[3] >> 26; state->h[3] &= 0x3ffffff; state->h[4] += cy;
+ state->h0 = ((u64)state->h[2] << 52) | ((u64)state->h[1] << 26) | state->h[0];
+ state->h1 = ((u64)state->h[4] << 40) | ((u64)state->h[3] << 14) | (state->h[2] >> 12);
+ state->h2 = state->h[4] >> 24;
+ if (IS_ENABLED(CONFIG_ZINC_ARCH_ARM) && IS_ENABLED(CONFIG_CPU_BIG_ENDIAN)) {
+ state->h0 = rol64(state->h0, 32);
+ state->h1 = rol64(state->h1, 32);
+ }
+#define ULT(a, b) ((a ^ ((a ^ b) | ((a - b) ^ b))) >> (sizeof(a) * 8 - 1))
+ cy = (state->h2 >> 2) + (state->h2 & ~3ULL);
+ state->h2 &= 3;
+ state->h0 += cy;
+ state->h1 += (cy = ULT(state->h0, cy));
+ state->h2 += ULT(state->h1, cy);
+#undef ULT
+ state->is_base2_26 = 0;
+}
+
+static inline bool poly1305_init_arch(void *ctx,
+ const u8 key[POLY1305_KEY_SIZE])
+{
+ poly1305_init_arm(ctx, key);
+ return true;
+}
+
+static inline bool poly1305_blocks_arch(void *ctx, const u8 *inp,
+ size_t len, const u32 padbit,
+ simd_context_t *simd_context)
+{
+ /* SIMD disables preemption, so relax after processing each page. */
+ BUILD_BUG_ON(PAGE_SIZE < POLY1305_BLOCK_SIZE ||
+ PAGE_SIZE % POLY1305_BLOCK_SIZE);
+
+ if (!IS_ENABLED(CONFIG_KERNEL_MODE_NEON) || !poly1305_use_neon ||
+ !simd_use(simd_context)) {
+ convert_to_base2_64(ctx);
+ poly1305_blocks_arm(ctx, inp, len, padbit);
+ return true;
+ }
+
+ for (;;) {
+ const size_t bytes = min_t(size_t, len, PAGE_SIZE);
+
+ poly1305_blocks_neon(ctx, inp, bytes, padbit);
+ len -= bytes;
+ if (!len)
+ break;
+ inp += bytes;
+ simd_relax(simd_context);
+ }
+ return true;
+}
+
+static inline bool poly1305_emit_arch(void *ctx, u8 mac[POLY1305_MAC_SIZE],
+ const u32 nonce[4],
+ simd_context_t *simd_context)
+{
+ if (!IS_ENABLED(CONFIG_KERNEL_MODE_NEON) || !poly1305_use_neon ||
+ !simd_use(simd_context)) {
+ convert_to_base2_64(ctx);
+ poly1305_emit_arm(ctx, mac, nonce);
+ } else
+ poly1305_emit_neon(ctx, mac, nonce);
+ return true;
+}
diff --git a/lib/zinc/poly1305/poly1305-arm-cryptogams.S b/lib/zinc/poly1305/poly1305-arm.S
similarity index 91%
rename from lib/zinc/poly1305/poly1305-arm-cryptogams.S
rename to lib/zinc/poly1305/poly1305-arm.S
index 884b465030e4..4a0e9d451119 100644
--- a/lib/zinc/poly1305/poly1305-arm-cryptogams.S
+++ b/lib/zinc/poly1305/poly1305-arm.S
@@ -1,9 +1,12 @@
/* SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause */
/*
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
* Copyright (C) 2006-2017 CRYPTOGAMS by <appro@openssl.org>. All Rights Reserved.
+ *
+ * This is based in part on Andy Polyakov's implementation from CRYPTOGAMS.
*/
-#include "arm_arch.h"
+#include <linux/linkage.h>
.text
#if defined(__thumb2__)
@@ -13,13 +16,8 @@
.code 32
#endif
-.globl poly1305_emit
-.globl poly1305_blocks
-.globl poly1305_init
-.type poly1305_init,%function
.align 5
-poly1305_init:
-.Lpoly1305_init:
+ENTRY(poly1305_init_arm)
stmdb sp!,{r4-r11}
eor r3,r3,r3
@@ -38,10 +36,6 @@ poly1305_init:
moveq r0,#0
beq .Lno_key
-#if __ARM_MAX_ARCH__>=7
- adr r11,.Lpoly1305_init
- ldr r12,.LOPENSSL_armcap
-#endif
ldrb r4,[r1,#0]
mov r10,#0x0fffffff
ldrb r5,[r1,#1]
@@ -56,12 +50,6 @@ poly1305_init:
ldrb r7,[r1,#6]
and r4,r4,r10
-#if __ARM_MAX_ARCH__>=7
- ldr r12,[r11,r12] @ OPENSSL_armcap_P
-# ifdef __APPLE__
- ldr r12,[r12]
-# endif
-#endif
ldrb r8,[r1,#7]
orr r5,r5,r6,lsl#8
ldrb r6,[r1,#8]
@@ -71,35 +59,6 @@ poly1305_init:
ldrb r8,[r1,#10]
and r5,r5,r3
-#if __ARM_MAX_ARCH__>=7
- tst r12,#ARMV7_NEON @ check for NEON
-# ifdef __APPLE__
- adr r9,poly1305_blocks_neon
- adr r11,poly1305_blocks
-# ifdef __thumb2__
- it ne
-# endif
- movne r11,r9
- adr r12,poly1305_emit
- adr r10,poly1305_emit_neon
-# ifdef __thumb2__
- it ne
-# endif
- movne r12,r10
-# else
-# ifdef __thumb2__
- itete eq
-# endif
- addeq r12,r11,#(poly1305_emit-.Lpoly1305_init)
- addne r12,r11,#(poly1305_emit_neon-.Lpoly1305_init)
- addeq r11,r11,#(poly1305_blocks-.Lpoly1305_init)
- addne r11,r11,#(poly1305_blocks_neon-.Lpoly1305_init)
-# endif
-# ifdef __thumb2__
- orr r12,r12,#1 @ thumb-ify address
- orr r11,r11,#1
-# endif
-#endif
ldrb r9,[r1,#11]
orr r6,r6,r7,lsl#8
ldrb r7,[r1,#12]
@@ -118,26 +77,20 @@ poly1305_init:
str r6,[r0,#8]
and r7,r7,r3
str r7,[r0,#12]
-#if __ARM_MAX_ARCH__>=7
- stmia r2,{r11,r12} @ fill functions table
- mov r0,#1
-#else
- mov r0,#0
-#endif
.Lno_key:
ldmia sp!,{r4-r11}
-#if __ARM_ARCH__>=5
+#if __LINUX_ARM_ARCH__ >= 5
bx lr @ bx lr
#else
tst lr,#1
moveq pc,lr @ be binary compatible with V4, yet
.word 0xe12fff1e @ interoperable with Thumb ISA:-)
#endif
-.size poly1305_init,.-poly1305_init
-.type poly1305_blocks,%function
+ENDPROC(poly1305_init_arm)
+
.align 5
-poly1305_blocks:
-.Lpoly1305_blocks:
+ENTRY(poly1305_blocks_arm)
+.Lpoly1305_blocks_arm:
stmdb sp!,{r3-r11,lr}
ands r2,r2,#-16
@@ -158,11 +111,11 @@ poly1305_blocks:
b .Loop
.Loop:
-#if __ARM_ARCH__<7
+#if __LINUX_ARM_ARCH__ < 7
ldrb r0,[lr],#16 @ load input
-# ifdef __thumb2__
+#ifdef __thumb2__
it hi
-# endif
+#endif
addhi r8,r8,#1 @ 1<<128
ldrb r1,[lr,#-15]
ldrb r2,[lr,#-14]
@@ -201,19 +154,19 @@ poly1305_blocks:
orr r3,r2,r3,lsl#24
#else
ldr r0,[lr],#16 @ load input
-# ifdef __thumb2__
+#ifdef __thumb2__
it hi
-# endif
+#endif
addhi r8,r8,#1 @ padbit
ldr r1,[lr,#-12]
ldr r2,[lr,#-8]
ldr r3,[lr,#-4]
-# ifdef __ARMEB__
+#ifdef __ARMEB__
rev r0,r0
rev r1,r1
rev r2,r2
rev r3,r3
-# endif
+#endif
adds r4,r4,r0 @ accumulate input
str lr,[sp,#8] @ offload input pointer
adcs r5,r5,r1
@@ -283,7 +236,7 @@ poly1305_blocks:
stmia r0,{r4-r8} @ store the result
.Lno_data:
-#if __ARM_ARCH__>=5
+#if __LINUX_ARM_ARCH__ >= 5
ldmia sp!,{r3-r11,pc}
#else
ldmia sp!,{r3-r11,lr}
@@ -291,13 +244,12 @@ poly1305_blocks:
moveq pc,lr @ be binary compatible with V4, yet
.word 0xe12fff1e @ interoperable with Thumb ISA:-)
#endif
-.size poly1305_blocks,.-poly1305_blocks
-.type poly1305_emit,%function
+ENDPROC(poly1305_blocks_arm)
+
.align 5
-poly1305_emit:
+ENTRY(poly1305_emit_arm)
stmdb sp!,{r4-r11}
.Lpoly1305_emit_enter:
-
ldmia r0,{r3-r7}
adds r8,r3,#5 @ compare to modulus
adcs r9,r4,#0
@@ -332,13 +284,13 @@ poly1305_emit:
adcs r5,r5,r10
adc r6,r6,r11
-#if __ARM_ARCH__>=7
-# ifdef __ARMEB__
+#if __LINUX_ARM_ARCH__ >= 7
+#ifdef __ARMEB__
rev r3,r3
rev r4,r4
rev r5,r5
rev r6,r6
-# endif
+#endif
str r3,[r1,#0]
str r4,[r1,#4]
str r5,[r1,#8]
@@ -377,20 +329,22 @@ poly1305_emit:
strb r6,[r1,#15]
#endif
ldmia sp!,{r4-r11}
-#if __ARM_ARCH__>=5
+#if __LINUX_ARM_ARCH__ >= 5
bx lr @ bx lr
#else
tst lr,#1
moveq pc,lr @ be binary compatible with V4, yet
.word 0xe12fff1e @ interoperable with Thumb ISA:-)
#endif
-.size poly1305_emit,.-poly1305_emit
-#if __ARM_MAX_ARCH__>=7
+ENDPROC(poly1305_emit_arm)
+
+
+#ifdef CONFIG_KERNEL_MODE_NEON
.fpu neon
-.type poly1305_init_neon,%function
.align 5
-poly1305_init_neon:
+ENTRY(poly1305_init_neon)
+.Lpoly1305_init_neon:
ldr r4,[r0,#20] @ load key base 2^32
ldr r5,[r0,#24]
ldr r6,[r0,#28]
@@ -600,11 +554,10 @@ poly1305_init_neon:
vst1.32 {d8[1]},[r7]
bx lr @ bx lr
-.size poly1305_init_neon,.-poly1305_init_neon
+ENDPROC(poly1305_init_neon)
-.type poly1305_blocks_neon,%function
.align 5
-poly1305_blocks_neon:
+ENTRY(poly1305_blocks_neon)
ldr ip,[r0,#36] @ is_base2_26
ands r2,r2,#-16
beq .Lno_data_neon
@@ -612,7 +565,7 @@ poly1305_blocks_neon:
cmp r2,#64
bhs .Lenter_neon
tst ip,ip @ is_base2_26?
- beq .Lpoly1305_blocks
+ beq .Lpoly1305_blocks_arm
.Lenter_neon:
stmdb sp!,{r4-r7}
@@ -622,7 +575,7 @@ poly1305_blocks_neon:
bne .Lbase2_26_neon
stmdb sp!,{r1-r3,lr}
- bl poly1305_init_neon
+ bl .Lpoly1305_init_neon
ldr r4,[r0,#0] @ load hash value base 2^32
ldr r5,[r0,#4]
@@ -686,12 +639,12 @@ poly1305_blocks_neon:
sub r2,r2,#16
add r4,r1,#32
-# ifdef __ARMEB__
+#ifdef __ARMEB__
vrev32.8 q10,q10
vrev32.8 q13,q13
vrev32.8 q11,q11
vrev32.8 q12,q12
-# endif
+#endif
vsri.u32 d28,d26,#8 @ base 2^32 -> base 2^26
vshl.u32 d26,d26,#18
@@ -735,12 +688,12 @@ poly1305_blocks_neon:
addhi r7,r0,#(48+1*9*4)
addhi r6,r0,#(48+3*9*4)
-# ifdef __ARMEB__
+#ifdef __ARMEB__
vrev32.8 q10,q10
vrev32.8 q13,q13
vrev32.8 q11,q11
vrev32.8 q12,q12
-# endif
+#endif
vsri.u32 q14,q13,#8 @ base 2^32 -> base 2^26
vshl.u32 q13,q13,#18
@@ -866,12 +819,12 @@ poly1305_blocks_neon:
vld4.32 {d20,d22,d24,d26},[r1] @ inp[0:1]
add r1,r1,#64
-# ifdef __ARMEB__
+#ifdef __ARMEB__
vrev32.8 q10,q10
vrev32.8 q11,q11
vrev32.8 q12,q12
vrev32.8 q13,q13
-# endif
+#endif
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@ lazy reduction interleaved with base 2^32 -> base 2^26 of
@@ -1086,11 +1039,10 @@ poly1305_blocks_neon:
ldmia sp!,{r4-r7}
.Lno_data_neon:
bx lr @ bx lr
-.size poly1305_blocks_neon,.-poly1305_blocks_neon
+ENDPROC(poly1305_blocks_neon)
-.type poly1305_emit_neon,%function
.align 5
-poly1305_emit_neon:
+ENTRY(poly1305_emit_neon)
ldr ip,[r0,#36] @ is_base2_26
stmdb sp!,{r4-r11}
@@ -1144,12 +1096,12 @@ poly1305_emit_neon:
adcs r5,r5,r10
adc r6,r6,r11
-# ifdef __ARMEB__
+#ifdef __ARMEB__
rev r3,r3
rev r4,r4
rev r5,r5
rev r6,r6
-# endif
+#endif
str r3,[r1,#0] @ store the result
str r4,[r1,#4]
str r5,[r1,#8]
@@ -1157,16 +1109,9 @@ poly1305_emit_neon:
ldmia sp!,{r4-r11}
bx lr @ bx lr
-.size poly1305_emit_neon,.-poly1305_emit_neon
+ENDPROC(poly1305_emit_neon)
.align 5
.Lzeros:
.long 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
-.LOPENSSL_armcap:
-.word OPENSSL_armcap_P-.Lpoly1305_init
-#endif
-.asciz "Poly1305 for ARMv4/NEON, CRYPTOGAMS by <appro@openssl.org>"
-.align 2
-#if __ARM_MAX_ARCH__>=7
-.comm OPENSSL_armcap_P,4,4
#endif
diff --git a/lib/zinc/poly1305/poly1305-arm64-cryptogams.S b/lib/zinc/poly1305/poly1305-arm64.S
similarity index 89%
rename from lib/zinc/poly1305/poly1305-arm64-cryptogams.S
rename to lib/zinc/poly1305/poly1305-arm64.S
index 0ecb50a83ec0..5f4e7fb0a836 100644
--- a/lib/zinc/poly1305/poly1305-arm64-cryptogams.S
+++ b/lib/zinc/poly1305/poly1305-arm64.S
@@ -1,21 +1,16 @@
/* SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause */
/*
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
* Copyright (C) 2006-2017 CRYPTOGAMS by <appro@openssl.org>. All Rights Reserved.
+ *
+ * This is based in part on Andy Polyakov's implementation from CRYPTOGAMS.
*/
-#include "arm_arch.h"
-
+#include <linux/linkage.h>
.text
-// forward "declarations" are required for Apple
-
-.globl poly1305_blocks
-.globl poly1305_emit
-
-.globl poly1305_init
-.type poly1305_init,%function
.align 5
-poly1305_init:
+ENTRY(poly1305_init_arm)
cmp x1,xzr
stp xzr,xzr,[x0] // zero hash value
stp xzr,xzr,[x0,#16] // [along with is_base2_26]
@@ -23,18 +18,10 @@ poly1305_init:
csel x0,xzr,x0,eq
b.eq .Lno_key
-#ifdef __ILP32__
- ldrsw x11,.LOPENSSL_armcap_P
-#else
- ldr x11,.LOPENSSL_armcap_P
-#endif
- adr x10,.LOPENSSL_armcap_P
-
ldp x7,x8,[x1] // load key
mov x9,#0xfffffffc0fffffff
movk x9,#0x0fff,lsl#48
- ldr w17,[x10,x11]
-#ifdef __ARMEB__
+#ifdef __AARCH64EB__
rev x7,x7 // flip bytes
rev x8,x8
#endif
@@ -43,30 +30,12 @@ poly1305_init:
and x8,x8,x9 // &=0ffffffc0ffffffc
stp x7,x8,[x0,#32] // save key value
- tst w17,#ARMV7_NEON
-
- adr x12,poly1305_blocks
- adr x7,poly1305_blocks_neon
- adr x13,poly1305_emit
- adr x8,poly1305_emit_neon
-
- csel x12,x12,x7,eq
- csel x13,x13,x8,eq
-
-#ifdef __ILP32__
- stp w12,w13,[x2]
-#else
- stp x12,x13,[x2]
-#endif
-
- mov x0,#1
.Lno_key:
ret
-.size poly1305_init,.-poly1305_init
+ENDPROC(poly1305_init_arm)
-.type poly1305_blocks,%function
.align 5
-poly1305_blocks:
+ENTRY(poly1305_blocks_arm)
ands x2,x2,#-16
b.eq .Lno_data
@@ -80,7 +49,7 @@ poly1305_blocks:
.Loop:
ldp x10,x11,[x1],#16 // load input
sub x2,x2,#16
-#ifdef __ARMEB__
+#ifdef __AARCH64EB__
rev x10,x10
rev x11,x11
#endif
@@ -126,11 +95,10 @@ poly1305_blocks:
.Lno_data:
ret
-.size poly1305_blocks,.-poly1305_blocks
+ENDPROC(poly1305_blocks_arm)
-.type poly1305_emit,%function
.align 5
-poly1305_emit:
+ENTRY(poly1305_emit_arm)
ldp x4,x5,[x0] // load hash base 2^64
ldr x6,[x0,#16]
ldp x10,x11,[x2] // load nonce
@@ -144,23 +112,23 @@ poly1305_emit:
csel x4,x4,x12,eq
csel x5,x5,x13,eq
-#ifdef __ARMEB__
+#ifdef __AARCH64EB__
ror x10,x10,#32 // flip nonce words
ror x11,x11,#32
#endif
adds x4,x4,x10 // accumulate nonce
adc x5,x5,x11
-#ifdef __ARMEB__
+#ifdef __AARCH64EB__
rev x4,x4 // flip output bytes
rev x5,x5
#endif
stp x4,x5,[x1] // write result
ret
-.size poly1305_emit,.-poly1305_emit
-.type poly1305_mult,%function
+ENDPROC(poly1305_emit_arm)
+
.align 5
-poly1305_mult:
+__poly1305_mult:
mul x12,x4,x7 // h0*r0
umulh x13,x4,x7
@@ -193,11 +161,8 @@ poly1305_mult:
adc x6,x6,xzr
ret
-.size poly1305_mult,.-poly1305_mult
-.type poly1305_splat,%function
-.align 5
-poly1305_splat:
+__poly1305_splat:
and x12,x4,#0x03ffffff // base 2^64 -> base 2^26
ubfx x13,x4,#26,#26
extr x14,x5,x4,#52
@@ -220,15 +185,14 @@ poly1305_splat:
str w15,[x0,#16*8] // s4
ret
-.size poly1305_splat,.-poly1305_splat
-.type poly1305_blocks_neon,%function
+#ifdef CONFIG_KERNEL_MODE_NEON
.align 5
-poly1305_blocks_neon:
+ENTRY(poly1305_blocks_neon)
ldr x17,[x0,#24]
cmp x2,#128
b.hs .Lblocks_neon
- cbz x17,poly1305_blocks
+ cbz x17,poly1305_blocks_arm
.Lblocks_neon:
stp x29,x30,[sp,#-80]!
@@ -268,7 +232,7 @@ poly1305_blocks_neon:
adcs x5,x5,xzr
adc x6,x6,xzr
-#ifdef __ARMEB__
+#ifdef __AARCH64EB__
rev x12,x12
rev x13,x13
#endif
@@ -276,7 +240,7 @@ poly1305_blocks_neon:
adcs x5,x5,x13
adc x6,x6,x3
- bl poly1305_mult
+ bl __poly1305_mult
ldr x30,[sp,#8]
cbz x3,.Lstore_base2_64_neon
@@ -314,7 +278,7 @@ poly1305_blocks_neon:
ldp x12,x13,[x1],#16 // load input
sub x2,x2,#16
add x9,x8,x8,lsr#2 // s1 = r1 + (r1 >> 2)
-#ifdef __ARMEB__
+#ifdef __AARCH64EB__
rev x12,x12
rev x13,x13
#endif
@@ -322,7 +286,7 @@ poly1305_blocks_neon:
adcs x5,x5,x13
adc x6,x6,x3
- bl poly1305_mult
+ bl __poly1305_mult
.Linit_neon:
and x10,x4,#0x03ffffff // base 2^64 -> base 2^26
@@ -349,19 +313,19 @@ poly1305_blocks_neon:
mov x5,x8
mov x6,xzr
add x0,x0,#48+12
- bl poly1305_splat
+ bl __poly1305_splat
- bl poly1305_mult // r^2
+ bl __poly1305_mult // r^2
sub x0,x0,#4
- bl poly1305_splat
+ bl __poly1305_splat
- bl poly1305_mult // r^3
+ bl __poly1305_mult // r^3
sub x0,x0,#4
- bl poly1305_splat
+ bl __poly1305_splat
- bl poly1305_mult // r^4
+ bl __poly1305_mult // r^4
sub x0,x0,#4
- bl poly1305_splat
+ bl __poly1305_splat
ldr x30,[sp,#8]
add x16,x1,#32
@@ -399,7 +363,7 @@ poly1305_blocks_neon:
lsl x3,x3,#24
add x15,x0,#48
-#ifdef __ARMEB__
+#ifdef __AARCH64EB__
rev x8,x8
rev x12,x12
rev x9,x9
@@ -435,7 +399,7 @@ poly1305_blocks_neon:
ld1 {v4.4s,v5.4s,v6.4s,v7.4s},[x15],#64
ld1 {v8.4s},[x15]
-#ifdef __ARMEB__
+#ifdef __AARCH64EB__
rev x8,x8
rev x12,x12
rev x9,x9
@@ -496,7 +460,7 @@ poly1305_blocks_neon:
umull v20.2d,v14.2s,v1.s[2]
ldp x9,x13,[x16],#48
umull v19.2d,v14.2s,v0.s[2]
-#ifdef __ARMEB__
+#ifdef __AARCH64EB__
rev x8,x8
rev x12,x12
rev x9,x9
@@ -561,7 +525,7 @@ poly1305_blocks_neon:
umlal v23.2d,v11.2s,v3.s[0]
umlal v20.2d,v11.2s,v8.s[0]
umlal v21.2d,v11.2s,v0.s[0]
-#ifdef __ARMEB__
+#ifdef __AARCH64EB__
rev x8,x8
rev x12,x12
rev x9,x9
@@ -801,13 +765,12 @@ poly1305_blocks_neon:
.Lno_data_neon:
ldr x29,[sp],#80
ret
-.size poly1305_blocks_neon,.-poly1305_blocks_neon
+ENDPROC(poly1305_blocks_neon)
-.type poly1305_emit_neon,%function
.align 5
-poly1305_emit_neon:
+ENTRY(poly1305_emit_neon)
ldr x17,[x0,#24]
- cbz x17,poly1305_emit
+ cbz x17,poly1305_emit_arm
ldp w10,w11,[x0] // load hash value base 2^26
ldp w12,w13,[x0,#8]
@@ -840,30 +803,22 @@ poly1305_emit_neon:
csel x4,x4,x12,eq
csel x5,x5,x13,eq
-#ifdef __ARMEB__
+#ifdef __AARCH64EB__
ror x10,x10,#32 // flip nonce words
ror x11,x11,#32
#endif
adds x4,x4,x10 // accumulate nonce
adc x5,x5,x11
-#ifdef __ARMEB__
+#ifdef __AARCH64EB__
rev x4,x4 // flip output bytes
rev x5,x5
#endif
stp x4,x5,[x1] // write result
ret
-.size poly1305_emit_neon,.-poly1305_emit_neon
+ENDPROC(poly1305_emit_neon)
.align 5
.Lzeros:
.long 0,0,0,0,0,0,0,0
-.LOPENSSL_armcap_P:
-#ifdef __ILP32__
-.long OPENSSL_armcap_P-.
-#else
-.quad OPENSSL_armcap_P-.
#endif
-.byte 80,111,108,121,49,51,48,53,32,102,111,114,32,65,82,77,118,56,44,32,67,82,89,80,84,79,71,65,77,83,32,98,121,32,60,97,112,112,114,111,64,111,112,101,110,115,115,108,46,111,114,103,62,0
-.align 2
-.align 2
diff --git a/lib/zinc/poly1305/poly1305.c b/lib/zinc/poly1305/poly1305.c
index 51af7045cac8..9dc85f62e806 100644
--- a/lib/zinc/poly1305/poly1305.c
+++ b/lib/zinc/poly1305/poly1305.c
@@ -18,6 +18,8 @@
#if defined(CONFIG_ZINC_ARCH_X86_64)
#include "poly1305-x86_64-glue.c"
+#elif defined(CONFIG_ZINC_ARCH_ARM) || defined(CONFIG_ZINC_ARCH_ARM64)
+#include "poly1305-arm-glue.c"
#else
static inline bool poly1305_init_arch(void *ctx,
const u8 key[POLY1305_KEY_SIZE])
--
2.19.0
* [PATCH net-next v7 16/28] zinc: import Andy Polyakov's Poly1305 MIPS64 implementation
[not found] <20181006025709.4019-1-Jason@zx2c4.com>
` (12 preceding siblings ...)
2018-10-06 2:56 ` [PATCH net-next v7 15/28] zinc: " Jason A. Donenfeld
@ 2018-10-06 2:56 ` Jason A. Donenfeld
2018-10-06 2:56 ` [PATCH net-next v7 17/28] zinc: Poly1305 MIPS32r2 and MIPS64 implementations Jason A. Donenfeld
` (9 subsequent siblings)
23 siblings, 0 replies; 28+ messages in thread
From: Jason A. Donenfeld @ 2018-10-06 2:56 UTC (permalink / raw)
To: linux-kernel, netdev, davem, gregkh
Cc: Jason A. Donenfeld, Andy Polyakov, Ralf Baechle, Paul Burton,
James Hogan, linux-mips, Samuel Neves, Jean-Philippe Aumasson,
Andy Lutomirski, Andrew Morton, Linus Torvalds, kernel-hardening,
linux-crypto
This MIPS64 accelerated implementation comes from Andy Polyakov's
CRYPTOGAMS code and is included here in raw form, without modification,
so that subsequent commits that adapt it for the kernel show exactly how
it has changed.
While this is CRYPTOGAMS code, the originating code happens to be the
same as OpenSSL's commit 947716c1872d210828122212d076d503ae68b928.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Based-on-code-from: Andy Polyakov <appro@openssl.org>
Cc: Andy Polyakov <appro@openssl.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@mips.com>
Cc: James Hogan <jhogan@kernel.org>
Cc: linux-mips@linux-mips.org
Cc: Samuel Neves <sneves@dei.uc.pt>
Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: kernel-hardening@lists.openwall.com
Cc: linux-crypto@vger.kernel.org
---
.../poly1305/poly1305-mips64-cryptogams.S | 338 ++++++++++++++++++
1 file changed, 338 insertions(+)
create mode 100644 lib/zinc/poly1305/poly1305-mips64-cryptogams.S
diff --git a/lib/zinc/poly1305/poly1305-mips64-cryptogams.S b/lib/zinc/poly1305/poly1305-mips64-cryptogams.S
new file mode 100644
index 000000000000..24a6005884c3
--- /dev/null
+++ b/lib/zinc/poly1305/poly1305-mips64-cryptogams.S
@@ -0,0 +1,338 @@
+/* SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause */
+/*
+ * Copyright (C) 2006-2017 CRYPTOGAMS by <appro@openssl.org>. All Rights Reserved.
+ */
+
+#include "mips_arch.h"
+
+#ifdef MIPSEB
+# define MSB 0
+# define LSB 7
+#else
+# define MSB 7
+# define LSB 0
+#endif
+
+.text
+.set noat
+.set noreorder
+
+.align 5
+.globl poly1305_init
+.ent poly1305_init
+poly1305_init:
+ .frame $29,0,$31
+ .set reorder
+
+ sd $0,0($4)
+ sd $0,8($4)
+ sd $0,16($4)
+
+ beqz $5,.Lno_key
+
+#if defined(_MIPS_ARCH_MIPS64R6)
+ ld $8,0($5)
+ ld $9,8($5)
+#else
+ ldl $8,0+MSB($5)
+ ldl $9,8+MSB($5)
+ ldr $8,0+LSB($5)
+ ldr $9,8+LSB($5)
+#endif
+#ifdef MIPSEB
+# if defined(_MIPS_ARCH_MIPS64R2)
+ dsbh $8,$8 # byte swap
+ dsbh $9,$9
+ dshd $8,$8
+ dshd $9,$9
+# else
+ ori $10,$0,0xFF
+ dsll $1,$10,32
+ or $10,$1 # 0x000000FF000000FF
+
+ and $11,$8,$10 # byte swap
+ and $2,$9,$10
+ dsrl $1,$8,24
+ dsrl $24,$9,24
+ dsll $11,24
+ dsll $2,24
+ and $1,$10
+ and $24,$10
+ dsll $10,8 # 0x0000FF000000FF00
+ or $11,$1
+ or $2,$24
+ and $1,$8,$10
+ and $24,$9,$10
+ dsrl $8,8
+ dsrl $9,8
+ dsll $1,8
+ dsll $24,8
+ and $8,$10
+ and $9,$10
+ or $11,$1
+ or $2,$24
+ or $8,$11
+ or $9,$2
+ dsrl $11,$8,32
+ dsrl $2,$9,32
+ dsll $8,32
+ dsll $9,32
+ or $8,$11
+ or $9,$2
+# endif
+#endif
+ li $10,1
+ dsll $10,32
+ daddiu $10,-63
+ dsll $10,28
+ daddiu $10,-1 # 0ffffffc0fffffff
+
+ and $8,$10
+ daddiu $10,-3 # 0ffffffc0ffffffc
+ and $9,$10
+
+ sd $8,24($4)
+ dsrl $10,$9,2
+ sd $9,32($4)
+ daddu $10,$9 # s1 = r1 + (r1 >> 2)
+ sd $10,40($4)
+
+.Lno_key:
+ li $2,0 # return 0
+ jr $31
+.end poly1305_init
+.align 5
+.globl poly1305_blocks
+.ent poly1305_blocks
+poly1305_blocks:
+ .set noreorder
+ dsrl $6,4 # number of complete blocks
+ bnez $6,poly1305_blocks_internal
+ nop
+ jr $31
+ nop
+.end poly1305_blocks
+
+.align 5
+.ent poly1305_blocks_internal
+poly1305_blocks_internal:
+ .frame $29,6*8,$31
+ .mask 0x00030000,-8
+ .set noreorder
+ dsubu $29,6*8
+ sd $17,40($29)
+ sd $16,32($29)
+ .set reorder
+
+ ld $12,0($4) # load hash value
+ ld $13,8($4)
+ ld $14,16($4)
+
+ ld $15,24($4) # load key
+ ld $16,32($4)
+ ld $17,40($4)
+
+.Loop:
+#if defined(_MIPS_ARCH_MIPS64R6)
+ ld $8,0($5) # load input
+ ld $9,8($5)
+#else
+ ldl $8,0+MSB($5) # load input
+ ldl $9,8+MSB($5)
+ ldr $8,0+LSB($5)
+ ldr $9,8+LSB($5)
+#endif
+ daddiu $6,-1
+ daddiu $5,16
+#ifdef MIPSEB
+# if defined(_MIPS_ARCH_MIPS64R2)
+ dsbh $8,$8 # byte swap
+ dsbh $9,$9
+ dshd $8,$8
+ dshd $9,$9
+# else
+ ori $10,$0,0xFF
+ dsll $1,$10,32
+ or $10,$1 # 0x000000FF000000FF
+
+ and $11,$8,$10 # byte swap
+ and $2,$9,$10
+ dsrl $1,$8,24
+ dsrl $24,$9,24
+ dsll $11,24
+ dsll $2,24
+ and $1,$10
+ and $24,$10
+ dsll $10,8 # 0x0000FF000000FF00
+ or $11,$1
+ or $2,$24
+ and $1,$8,$10
+ and $24,$9,$10
+ dsrl $8,8
+ dsrl $9,8
+ dsll $1,8
+ dsll $24,8
+ and $8,$10
+ and $9,$10
+ or $11,$1
+ or $2,$24
+ or $8,$11
+ or $9,$2
+ dsrl $11,$8,32
+ dsrl $2,$9,32
+ dsll $8,32
+ dsll $9,32
+ or $8,$11
+ or $9,$2
+# endif
+#endif
+ daddu $12,$8 # accumulate input
+ daddu $13,$9
+ sltu $10,$12,$8
+ sltu $11,$13,$9
+ daddu $13,$10
+
+ dmultu ($15,$12) # h0*r0
+ daddu $14,$7
+ sltu $10,$13,$10
+ mflo ($8,$15,$12)
+ mfhi ($9,$15,$12)
+
+ dmultu ($17,$13) # h1*5*r1
+ daddu $10,$11
+ daddu $14,$10
+ mflo ($10,$17,$13)
+ mfhi ($11,$17,$13)
+
+ dmultu ($16,$12) # h0*r1
+ daddu $8,$10
+ daddu $9,$11
+ mflo ($1,$16,$12)
+ mfhi ($25,$16,$12)
+ sltu $10,$8,$10
+ daddu $9,$10
+
+ dmultu ($15,$13) # h1*r0
+ daddu $9,$1
+ sltu $1,$9,$1
+ mflo ($10,$15,$13)
+ mfhi ($11,$15,$13)
+ daddu $25,$1
+
+ dmultu ($17,$14) # h2*5*r1
+ daddu $9,$10
+ daddu $25,$11
+ mflo ($1,$17,$14)
+
+ dmultu ($15,$14) # h2*r0
+ sltu $10,$9,$10
+ daddu $25,$10
+ mflo ($2,$15,$14)
+
+ daddu $9,$1
+ daddu $25,$2
+ sltu $1,$9,$1
+ daddu $25,$1
+
+ li $10,-4 # final reduction
+ and $10,$25
+ dsrl $11,$25,2
+ andi $14,$25,3
+ daddu $10,$11
+ daddu $12,$8,$10
+ sltu $10,$12,$10
+ daddu $13,$9,$10
+ sltu $10,$13,$10
+ daddu $14,$14,$10
+
+ bnez $6,.Loop
+
+ sd $12,0($4) # store hash value
+ sd $13,8($4)
+ sd $14,16($4)
+
+ .set noreorder
+ ld $17,40($29) # epilogue
+ ld $16,32($29)
+ jr $31
+ daddu $29,6*8
+.end poly1305_blocks_internal
+.align 5
+.globl poly1305_emit
+.ent poly1305_emit
+poly1305_emit:
+ .frame $29,0,$31
+ .set reorder
+
+ ld $10,0($4)
+ ld $11,8($4)
+ ld $1,16($4)
+
+ daddiu $8,$10,5 # compare to modulus
+ sltiu $2,$8,5
+ daddu $9,$11,$2
+ sltu $2,$9,$2
+ daddu $1,$1,$2
+
+ dsrl $1,2 # see if it carried/borrowed
+ dsubu $1,$0,$1
+ nor $2,$0,$1
+
+ and $8,$1
+ and $10,$2
+ and $9,$1
+ and $11,$2
+ or $8,$10
+ or $9,$11
+
+ lwu $10,0($6) # load nonce
+ lwu $11,4($6)
+ lwu $1,8($6)
+ lwu $2,12($6)
+ dsll $11,32
+ dsll $2,32
+ or $10,$11
+ or $1,$2
+
+ daddu $8,$10 # accumulate nonce
+ daddu $9,$1
+ sltu $10,$8,$10
+ daddu $9,$10
+
+ dsrl $10,$8,8 # write mac value
+ dsrl $11,$8,16
+ dsrl $1,$8,24
+ sb $8,0($5)
+ dsrl $2,$8,32
+ sb $10,1($5)
+ dsrl $10,$8,40
+ sb $11,2($5)
+ dsrl $11,$8,48
+ sb $1,3($5)
+ dsrl $1,$8,56
+ sb $2,4($5)
+ dsrl $2,$9,8
+ sb $10,5($5)
+ dsrl $10,$9,16
+ sb $11,6($5)
+ dsrl $11,$9,24
+ sb $1,7($5)
+
+ sb $9,8($5)
+ dsrl $1,$9,32
+ sb $2,9($5)
+ dsrl $2,$9,40
+ sb $10,10($5)
+ dsrl $10,$9,48
+ sb $11,11($5)
+ dsrl $11,$9,56
+ sb $1,12($5)
+ sb $2,13($5)
+ sb $10,14($5)
+ sb $11,15($5)
+
+ jr $31
+.end poly1305_emit
+.rdata
+.asciiz "Poly1305 for MIPS64, CRYPTOGAMS by <appro@openssl.org>"
+.align 2
--
2.19.0
* [PATCH net-next v7 17/28] zinc: Poly1305 MIPS32r2 and MIPS64 implementations
[not found] <20181006025709.4019-1-Jason@zx2c4.com>
` (13 preceding siblings ...)
2018-10-06 2:56 ` [PATCH net-next v7 16/28] zinc: import Andy Polyakov's Poly1305 MIPS64 implementation Jason A. Donenfeld
@ 2018-10-06 2:56 ` Jason A. Donenfeld
2018-10-06 2:56 ` [PATCH net-next v7 18/28] zinc: ChaCha20Poly1305 construction and selftest Jason A. Donenfeld
` (8 subsequent siblings)
23 siblings, 0 replies; 28+ messages in thread
From: Jason A. Donenfeld @ 2018-10-06 2:56 UTC (permalink / raw)
To: linux-kernel, netdev, davem, gregkh
Cc: Jason A. Donenfeld, René van Dorst, Ralf Baechle,
Paul Burton, James Hogan, linux-mips, Samuel Neves,
Jean-Philippe Aumasson, Andy Lutomirski, Andrew Morton,
Linus Torvalds, kernel-hardening, linux-crypto
This MIPS32r2 implementation comes from René van Dorst and me and
results in a nice speedup on the usual OpenWRT targets. The MIPS64
implementation from Andy Polyakov, ported here, results in a nice
speedup on commodity Octeon hardware and has been modified slightly
from the original:
- The functions have been renamed to fit kernel conventions.
- A comment has been added explaining why ENTRY() and ENDPROC() are
  not used.
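For readers following the MIPS32r2 assembly, the key setup it performs
can be sketched in C. This is only an illustrative sketch, not code from
this patch: the names poly1305_clamp_sketch, struct poly1305_r_sketch
and load_le32 are hypothetical. It assumes the first 16 key bytes are
read as four little-endian 32-bit words and masked as in
poly1305_init_mips (0x0fffffff for the first word, 0x0ffffffc for the
others), with s[i] = r[i] + (r[i] >> 2) as computed at the top of
poly1305_blocks_mips.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the clamped key; not the kernel's layout. */
struct poly1305_r_sketch {
	uint32_t r[4];	/* clamped key words r0..r3 */
	uint32_t s[4];	/* s[i] = r[i] + (r[i] >> 2); s[0] is unused */
};

/* Read a 32-bit little-endian word from a possibly unaligned pointer. */
static uint32_t load_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static void poly1305_clamp_sketch(struct poly1305_r_sketch *out,
				  const uint8_t key[16])
{
	size_t i;

	assert(out != NULL && key != NULL);

	/* Same masks as the "AND 0x0fffffff" / "AND 0x0ffffffc" steps. */
	out->r[0] = load_le32(key) & 0x0fffffffu;
	for (i = 1; i < 4; i++)
		out->r[i] = load_le32(key + 4 * i) & 0x0ffffffcu;

	/* Precomputed multipliers, matching "Sx = Rx + (Rx >> 2)". */
	out->s[0] = 0;
	for (i = 1; i < 4; i++)
		out->s[i] = out->r[i] + (out->r[i] >> 2);
}
```

Because the low two bits of r1..r3 are cleared by the mask,
r + (r >> 2) equals 5 * r / 4 exactly, so no precision is lost when the
assembly multiplies by S1..S3 instead of R1..R3.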
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: René van Dorst <opensource@vdorst.com>
Co-developed-by: René van Dorst <opensource@vdorst.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@mips.com>
Cc: James Hogan <jhogan@kernel.org>
Cc: linux-mips@linux-mips.org
Cc: Samuel Neves <sneves@dei.uc.pt>
Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: kernel-hardening@lists.openwall.com
Cc: linux-crypto@vger.kernel.org
---
lib/zinc/Makefile | 3 +
lib/zinc/poly1305/poly1305-mips-glue.c | 37 ++
lib/zinc/poly1305/poly1305-mips.S | 407 ++++++++++++++++++
...-mips64-cryptogams.S => poly1305-mips64.S} | 80 ++--
lib/zinc/poly1305/poly1305.c | 2 +
5 files changed, 500 insertions(+), 29 deletions(-)
create mode 100644 lib/zinc/poly1305/poly1305-mips-glue.c
create mode 100644 lib/zinc/poly1305/poly1305-mips.S
rename lib/zinc/poly1305/{poly1305-mips64-cryptogams.S => poly1305-mips64.S} (75%)
diff --git a/lib/zinc/Makefile b/lib/zinc/Makefile
index c09fd3de60f9..5c4b1d51cb03 100644
--- a/lib/zinc/Makefile
+++ b/lib/zinc/Makefile
@@ -14,4 +14,7 @@ zinc_poly1305-y := poly1305/poly1305.o
zinc_poly1305-$(CONFIG_ZINC_ARCH_X86_64) += poly1305/poly1305-x86_64.o
zinc_poly1305-$(CONFIG_ZINC_ARCH_ARM) += poly1305/poly1305-arm.o
zinc_poly1305-$(CONFIG_ZINC_ARCH_ARM64) += poly1305/poly1305-arm64.o
+zinc_poly1305-$(CONFIG_ZINC_ARCH_MIPS) += poly1305/poly1305-mips.o
+AFLAGS_poly1305-mips.o += -O2 # This is required to fill the branch delay slots
+zinc_poly1305-$(CONFIG_ZINC_ARCH_MIPS64) += poly1305/poly1305-mips64.o
obj-$(CONFIG_ZINC_POLY1305) += zinc_poly1305.o
diff --git a/lib/zinc/poly1305/poly1305-mips-glue.c b/lib/zinc/poly1305/poly1305-mips-glue.c
new file mode 100644
index 000000000000..1eba9512a05c
--- /dev/null
+++ b/lib/zinc/poly1305/poly1305-mips-glue.c
@@ -0,0 +1,37 @@
+// SPDX-License-Identifier: GPL-2.0 OR MIT
+/*
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+asmlinkage void poly1305_init_mips(void *ctx, const u8 key[16]);
+asmlinkage void poly1305_blocks_mips(void *ctx, const u8 *inp, const size_t len,
+ const u32 padbit);
+asmlinkage void poly1305_emit_mips(void *ctx, u8 mac[16], const u32 nonce[4]);
+
+static bool *const poly1305_nobs[] __initconst = { };
+static void __init poly1305_fpu_init(void)
+{
+}
+
+static inline bool poly1305_init_arch(void *ctx,
+ const u8 key[POLY1305_KEY_SIZE])
+{
+ poly1305_init_mips(ctx, key);
+ return true;
+}
+
+static inline bool poly1305_blocks_arch(void *ctx, const u8 *inp,
+ size_t len, const u32 padbit,
+ simd_context_t *simd_context)
+{
+ poly1305_blocks_mips(ctx, inp, len, padbit);
+ return true;
+}
+
+static inline bool poly1305_emit_arch(void *ctx, u8 mac[POLY1305_MAC_SIZE],
+ const u32 nonce[4],
+ simd_context_t *simd_context)
+{
+ poly1305_emit_mips(ctx, mac, nonce);
+ return true;
+}
diff --git a/lib/zinc/poly1305/poly1305-mips.S b/lib/zinc/poly1305/poly1305-mips.S
new file mode 100644
index 000000000000..4d695eef1091
--- /dev/null
+++ b/lib/zinc/poly1305/poly1305-mips.S
@@ -0,0 +1,407 @@
+/* SPDX-License-Identifier: GPL-2.0 OR MIT */
+/*
+ * Copyright (C) 2016-2018 René van Dorst <opensource@vdorst.com> All Rights Reserved.
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+#define MSB 0
+#define LSB 3
+#else
+#define MSB 3
+#define LSB 0
+#endif
+
+#define POLY1305_BLOCK_SIZE 16
+.text
+#define H0 $t0
+#define H1 $t1
+#define H2 $t2
+#define H3 $t3
+#define H4 $t4
+
+#define R0 $t5
+#define R1 $t6
+#define R2 $t7
+#define R3 $t8
+
+#define O0 $s0
+#define O1 $s4
+#define O2 $v1
+#define O3 $t9
+#define O4 $s5
+
+#define S1 $s1
+#define S2 $s2
+#define S3 $s3
+
+#define SC $at
+#define CA $v0
+
+/* Input arguments */
+#define poly $a0
+#define src $a1
+#define srclen $a2
+#define hibit $a3
+
+/* Location in the opaque buffer
+ * R[0..3], CA, H[0..4]
+ */
+#define PTR_POLY1305_R(n) ( 0 + (n*4)) ## ($a0)
+#define PTR_POLY1305_CA (16 ) ## ($a0)
+#define PTR_POLY1305_H(n) (20 + (n*4)) ## ($a0)
+
+#define POLY1305_BLOCK_SIZE 16
+#define POLY1305_STACK_SIZE 32
+
+.set noat
+.align 4
+.globl poly1305_blocks_mips
+.ent poly1305_blocks_mips
+poly1305_blocks_mips:
+ .frame $sp, POLY1305_STACK_SIZE, $ra
+ /* srclen &= 0xFFFFFFF0 */
+ ins srclen, $zero, 0, 4
+
+ addiu $sp, -(POLY1305_STACK_SIZE)
+
+ /* check srclen >= 16 bytes */
+ beqz srclen, .Lpoly1305_blocks_mips_end
+
+ /* Calculate last round based on src address pointer.
+ * last round src ptr (srclen) = src + (srclen & 0xFFFFFFF0)
+ */
+ addu srclen, src
+
+ lw R0, PTR_POLY1305_R(0)
+ lw R1, PTR_POLY1305_R(1)
+ lw R2, PTR_POLY1305_R(2)
+ lw R3, PTR_POLY1305_R(3)
+
+ /* store the used save registers. */
+ sw $s0, 0($sp)
+ sw $s1, 4($sp)
+ sw $s2, 8($sp)
+ sw $s3, 12($sp)
+ sw $s4, 16($sp)
+ sw $s5, 20($sp)
+
+ /* load Hx and Carry */
+ lw CA, PTR_POLY1305_CA
+ lw H0, PTR_POLY1305_H(0)
+ lw H1, PTR_POLY1305_H(1)
+ lw H2, PTR_POLY1305_H(2)
+ lw H3, PTR_POLY1305_H(3)
+ lw H4, PTR_POLY1305_H(4)
+
+ /* Sx = Rx + (Rx >> 2) */
+ srl S1, R1, 2
+ srl S2, R2, 2
+ srl S3, R3, 2
+ addu S1, R1
+ addu S2, R2
+ addu S3, R3
+
+ addiu SC, $zero, 1
+
+.Lpoly1305_loop:
+ lwl O0, 0+MSB(src)
+ lwl O1, 4+MSB(src)
+ lwl O2, 8+MSB(src)
+ lwl O3,12+MSB(src)
+ lwr O0, 0+LSB(src)
+ lwr O1, 4+LSB(src)
+ lwr O2, 8+LSB(src)
+ lwr O3,12+LSB(src)
+
+#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+ wsbh O0
+ wsbh O1
+ wsbh O2
+ wsbh O3
+ rotr O0, 16
+ rotr O1, 16
+ rotr O2, 16
+ rotr O3, 16
+#endif
+
+ /* h0 = (u32)(d0 = (u64)h0 + inp[0] + c 'Carry_previous cycle'); */
+ addu H0, CA
+ sltu CA, H0, CA
+ addu O0, H0
+ sltu H0, O0, H0
+ addu CA, H0
+
+ /* h1 = (u32)(d1 = (u64)h1 + (d0 >> 32) + inp[4]); */
+ addu H1, CA
+ sltu CA, H1, CA
+ addu O1, H1
+ sltu H1, O1, H1
+ addu CA, H1
+
+ /* h2 = (u32)(d2 = (u64)h2 + (d1 >> 32) + inp[8]); */
+ addu H2, CA
+ sltu CA, H2, CA
+ addu O2, H2
+ sltu H2, O2, H2
+ addu CA, H2
+
+ /* h3 = (u32)(d3 = (u64)h3 + (d2 >> 32) + inp[12]); */
+ addu H3, CA
+ sltu CA, H3, CA
+ addu O3, H3
+ sltu H3, O3, H3
+ addu CA, H3
+
+ /* h4 += (u32)(d3 >> 32) + padbit; */
+ addu H4, hibit
+ addu O4, H4, CA
+
+ /* D0 */
+ multu O0, R0
+ maddu O1, S3
+ maddu O2, S2
+ maddu O3, S1
+ mfhi CA
+ mflo H0
+
+ /* D1 */
+ multu O0, R1
+ maddu O1, R0
+ maddu O2, S3
+ maddu O3, S2
+ maddu O4, S1
+ maddu CA, SC
+ mfhi CA
+ mflo H1
+
+ /* D2 */
+ multu O0, R2
+ maddu O1, R1
+ maddu O2, R0
+ maddu O3, S3
+ maddu O4, S2
+ maddu CA, SC
+ mfhi CA
+ mflo H2
+
+ /* D4 */
+ mul H4, O4, R0
+
+ /* D3 */
+ multu O0, R3
+ maddu O1, R2
+ maddu O2, R1
+ maddu O3, R0
+ maddu O4, S3
+ maddu CA, SC
+ mfhi CA
+ mflo H3
+
+ addiu src, POLY1305_BLOCK_SIZE
+
+ /* h4 += (u32)(d3 >> 32); */
+ addu O4, H4, CA
+ /* h4 &= 3 */
+ andi H4, O4, 3
+ /* c = (h4 >> 2) + (h4 & ~3U); */
+ srl CA, O4, 2
+ ins O4, $zero, 0, 2
+
+ addu CA, O4
+
+ /* able to do a 16 byte block. */
+ bne src, srclen, .Lpoly1305_loop
+
+ /* restore the used save registers. */
+ lw $s0, 0($sp)
+ lw $s1, 4($sp)
+ lw $s2, 8($sp)
+ lw $s3, 12($sp)
+ lw $s4, 16($sp)
+ lw $s5, 20($sp)
+
+ /* store Hx and Carry */
+ sw CA, PTR_POLY1305_CA
+ sw H0, PTR_POLY1305_H(0)
+ sw H1, PTR_POLY1305_H(1)
+ sw H2, PTR_POLY1305_H(2)
+ sw H3, PTR_POLY1305_H(3)
+ sw H4, PTR_POLY1305_H(4)
+
+.Lpoly1305_blocks_mips_end:
+ addiu $sp, POLY1305_STACK_SIZE
+
+ /* Jump Back */
+ jr $ra
+.end poly1305_blocks_mips
+.set at
+
+/* Input arguments CTX=$a0, MAC=$a1, NONCE=$a2 */
+#define MAC $a1
+#define NONCE $a2
+
+#define G0 $t5
+#define G1 $t6
+#define G2 $t7
+#define G3 $t8
+#define G4 $t9
+
+.set noat
+.align 4
+.globl poly1305_emit_mips
+.ent poly1305_emit_mips
+poly1305_emit_mips:
+ /* load Hx and Carry */
+ lw CA, PTR_POLY1305_CA
+ lw H0, PTR_POLY1305_H(0)
+ lw H1, PTR_POLY1305_H(1)
+ lw H2, PTR_POLY1305_H(2)
+ lw H3, PTR_POLY1305_H(3)
+ lw H4, PTR_POLY1305_H(4)
+
+ /* Add left over carry */
+ addu H0, CA
+ sltu CA, H0, CA
+ addu H1, CA
+ sltu CA, H1, CA
+ addu H2, CA
+ sltu CA, H2, CA
+ addu H3, CA
+ sltu CA, H3, CA
+ addu H4, CA
+
+ /* compare to modulus by computing h + -p */
+ addiu G0, H0, 5
+ sltu CA, G0, H0
+ addu G1, H1, CA
+ sltu CA, G1, H1
+ addu G2, H2, CA
+ sltu CA, G2, H2
+ addu G3, H3, CA
+ sltu CA, G3, H3
+ addu G4, H4, CA
+
+ srl SC, G4, 2
+
+ /* if there was carry into 131st bit, h3:h0 = g3:g0 */
+ movn H0, G0, SC
+ movn H1, G1, SC
+ movn H2, G2, SC
+ movn H3, G3, SC
+
+ lwl G0, 0+MSB(NONCE)
+ lwl G1, 4+MSB(NONCE)
+ lwl G2, 8+MSB(NONCE)
+ lwl G3,12+MSB(NONCE)
+ lwr G0, 0+LSB(NONCE)
+ lwr G1, 4+LSB(NONCE)
+ lwr G2, 8+LSB(NONCE)
+ lwr G3,12+LSB(NONCE)
+
+ /* mac = (h + nonce) % (2^128) */
+ addu H0, G0
+ sltu CA, H0, G0
+
+ /* H1 */
+ addu H1, CA
+ sltu CA, H1, CA
+ addu H1, G1
+ sltu G1, H1, G1
+ addu CA, G1
+
+ /* H2 */
+ addu H2, CA
+ sltu CA, H2, CA
+ addu H2, G2
+ sltu G2, H2, G2
+ addu CA, G2
+
+ /* H3 */
+ addu H3, CA
+ addu H3, G3
+
+#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+ wsbh H0
+ wsbh H1
+ wsbh H2
+ wsbh H3
+ rotr H0, 16
+ rotr H1, 16
+ rotr H2, 16
+ rotr H3, 16
+#endif
+
+ /* store MAC */
+ swl H0, 0+MSB(MAC)
+ swl H1, 4+MSB(MAC)
+ swl H2, 8+MSB(MAC)
+ swl H3,12+MSB(MAC)
+ swr H0, 0+LSB(MAC)
+ swr H1, 4+LSB(MAC)
+ swr H2, 8+LSB(MAC)
+ swr H3,12+LSB(MAC)
+
+ jr $ra
+.end poly1305_emit_mips
+
+#define PR0 $t0
+#define PR1 $t1
+#define PR2 $t2
+#define PR3 $t3
+#define PT0 $t4
+
+/* Input arguments CTX=$a0, KEY=$a1 */
+
+.align 4
+.globl poly1305_init_mips
+.ent poly1305_init_mips
+poly1305_init_mips:
+ lwl PR0, 0+MSB($a1)
+ lwl PR1, 4+MSB($a1)
+ lwl PR2, 8+MSB($a1)
+ lwl PR3,12+MSB($a1)
+ lwr PR0, 0+LSB($a1)
+ lwr PR1, 4+LSB($a1)
+ lwr PR2, 8+LSB($a1)
+ lwr PR3,12+LSB($a1)
+
+ /* store Hx and Carry */
+ sw $zero, PTR_POLY1305_CA
+ sw $zero, PTR_POLY1305_H(0)
+ sw $zero, PTR_POLY1305_H(1)
+ sw $zero, PTR_POLY1305_H(2)
+ sw $zero, PTR_POLY1305_H(3)
+ sw $zero, PTR_POLY1305_H(4)
+
+#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+ wsbh PR0
+ wsbh PR1
+ wsbh PR2
+ wsbh PR3
+ rotr PR0, 16
+ rotr PR1, 16
+ rotr PR2, 16
+ rotr PR3, 16
+#endif
+
+ lui PT0, 0x0FFF
+ ori PT0, 0xFFFC
+
+ /* AND 0x0fffffff; */
+ ext PR0, PR0, 0, (32-4)
+
+ /* AND 0x0ffffffc; */
+ and PR1, PT0
+ and PR2, PT0
+ and PR3, PT0
+
+ /* store Rx */
+ sw PR0, PTR_POLY1305_R(0)
+ sw PR1, PTR_POLY1305_R(1)
+ sw PR2, PTR_POLY1305_R(2)
+ sw PR3, PTR_POLY1305_R(3)
+
+ /* Jump Back */
+ jr $ra
+.end poly1305_init_mips
diff --git a/lib/zinc/poly1305/poly1305-mips64-cryptogams.S b/lib/zinc/poly1305/poly1305-mips64.S
similarity index 75%
rename from lib/zinc/poly1305/poly1305-mips64-cryptogams.S
rename to lib/zinc/poly1305/poly1305-mips64.S
index 24a6005884c3..272a86c47bcb 100644
--- a/lib/zinc/poly1305/poly1305-mips64-cryptogams.S
+++ b/lib/zinc/poly1305/poly1305-mips64.S
@@ -1,26 +1,49 @@
/* SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause */
/*
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
* Copyright (C) 2006-2017 CRYPTOGAMS by <appro@openssl.org>. All Rights Reserved.
+ *
+ * This is based in part on Andy Polyakov's implementation from CRYPTOGAMS.
*/
-#include "mips_arch.h"
+#if (defined(_MIPS_ARCH_MIPS64R3) || defined(_MIPS_ARCH_MIPS64R5) || \
+ defined(_MIPS_ARCH_MIPS64R6)) && !defined(_MIPS_ARCH_MIPS64R2)
+#define _MIPS_ARCH_MIPS64R2
+#endif
+
+#ifdef __MIPSEB__
+#define MSB 0
+#define LSB 7
+#else
+#define MSB 7
+#define LSB 0
+#endif
-#ifdef MIPSEB
-# define MSB 0
-# define LSB 7
+#if defined(_MIPS_ARCH_MIPS64R6)
+#define dmultu(rs,rt)
+#define mflo(rd,rs,rt) dmulu rd,rs,rt
+#define mfhi(rd,rs,rt) dmuhu rd,rs,rt
#else
-# define MSB 7
-# define LSB 0
+#define dmultu(rs,rt) dmultu rs,rt
+#define multu(rs,rt) multu rs,rt
+#define mflo(rd,rs,rt) mflo rd
+#define mfhi(rd,rs,rt) mfhi rd
#endif
.text
.set noat
.set noreorder
+/* While most of the assembly in the kernel prefers ENTRY() and ENDPROC(),
+ * there is no existing MIPS assembly that uses it, and MIPS assembler seems
+ * to like its own .ent/.end notation, which the MIPS include files don't
+ * provide in a MIPS-specific ENTRY/ENDPROC definition. So, we skip these
+ * for now, until somebody complains. */
+
.align 5
-.globl poly1305_init
-.ent poly1305_init
-poly1305_init:
+.globl poly1305_init_mips
+.ent poly1305_init_mips
+poly1305_init_mips:
.frame $29,0,$31
.set reorder
@@ -39,13 +62,13 @@ poly1305_init:
ldr $8,0+LSB($5)
ldr $9,8+LSB($5)
#endif
-#ifdef MIPSEB
-# if defined(_MIPS_ARCH_MIPS64R2)
+#ifdef __MIPSEB__
+#if defined(_MIPS_ARCH_MIPS64R2)
dsbh $8,$8 # byte swap
dsbh $9,$9
dshd $8,$8
dshd $9,$9
-# else
+#else
ori $10,$0,0xFF
dsll $1,$10,32
or $10,$1 # 0x000000FF000000FF
@@ -79,7 +102,7 @@ poly1305_init:
dsll $9,32
or $8,$11
or $9,$2
-# endif
+#endif
#endif
li $10,1
dsll $10,32
@@ -100,18 +123,19 @@ poly1305_init:
.Lno_key:
li $2,0 # return 0
jr $31
-.end poly1305_init
+.end poly1305_init_mips
+
.align 5
-.globl poly1305_blocks
-.ent poly1305_blocks
-poly1305_blocks:
+.globl poly1305_blocks_mips
+.ent poly1305_blocks_mips
+poly1305_blocks_mips:
.set noreorder
dsrl $6,4 # number of complete blocks
bnez $6,poly1305_blocks_internal
nop
jr $31
nop
-.end poly1305_blocks
+.end poly1305_blocks_mips
.align 5
.ent poly1305_blocks_internal
@@ -144,13 +168,13 @@ poly1305_blocks_internal:
#endif
daddiu $6,-1
daddiu $5,16
-#ifdef MIPSEB
-# if defined(_MIPS_ARCH_MIPS64R2)
+#ifdef __MIPSEB__
+#if defined(_MIPS_ARCH_MIPS64R2)
dsbh $8,$8 # byte swap
dsbh $9,$9
dshd $8,$8
dshd $9,$9
-# else
+#else
ori $10,$0,0xFF
dsll $1,$10,32
or $10,$1 # 0x000000FF000000FF
@@ -184,7 +208,7 @@ poly1305_blocks_internal:
dsll $9,32
or $8,$11
or $9,$2
-# endif
+#endif
#endif
daddu $12,$8 # accumulate input
daddu $13,$9
@@ -257,10 +281,11 @@ poly1305_blocks_internal:
jr $31
daddu $29,6*8
.end poly1305_blocks_internal
+
.align 5
-.globl poly1305_emit
-.ent poly1305_emit
-poly1305_emit:
+.globl poly1305_emit_mips
+.ent poly1305_emit_mips
+poly1305_emit_mips:
.frame $29,0,$31
.set reorder
@@ -332,7 +357,4 @@ poly1305_emit:
sb $11,15($5)
jr $31
-.end poly1305_emit
-.rdata
-.asciiz "Poly1305 for MIPS64, CRYPTOGAMS by <appro@openssl.org>"
-.align 2
+.end poly1305_emit_mips
diff --git a/lib/zinc/poly1305/poly1305.c b/lib/zinc/poly1305/poly1305.c
index 9dc85f62e806..e3386a0e1554 100644
--- a/lib/zinc/poly1305/poly1305.c
+++ b/lib/zinc/poly1305/poly1305.c
@@ -20,6 +20,8 @@
#include "poly1305-x86_64-glue.c"
#elif defined(CONFIG_ZINC_ARCH_ARM) || defined(CONFIG_ZINC_ARCH_ARM64)
#include "poly1305-arm-glue.c"
+#elif defined(CONFIG_ZINC_ARCH_MIPS) || defined(CONFIG_ZINC_ARCH_MIPS64)
+#include "poly1305-mips-glue.c"
#else
static inline bool poly1305_init_arch(void *ctx,
const u8 key[POLY1305_KEY_SIZE])
--
2.19.0
* [PATCH net-next v7 18/28] zinc: ChaCha20Poly1305 construction and selftest
[not found] <20181006025709.4019-1-Jason@zx2c4.com>
` (14 preceding siblings ...)
2018-10-06 2:56 ` [PATCH net-next v7 17/28] zinc: Poly1305 MIPS32r2 and MIPS64 implementations Jason A. Donenfeld
@ 2018-10-06 2:56 ` Jason A. Donenfeld
2018-10-06 2:57 ` [PATCH net-next v7 19/28] zinc: BLAKE2s generic C implementation " Jason A. Donenfeld
` (7 subsequent siblings)
23 siblings, 0 replies; 28+ messages in thread
From: Jason A. Donenfeld @ 2018-10-06 2:56 UTC (permalink / raw)
To: linux-kernel, netdev, davem, gregkh
Cc: Jason A. Donenfeld, Samuel Neves, Jean-Philippe Aumasson,
Andy Lutomirski, Andrew Morton, Linus Torvalds, kernel-hardening,
linux-crypto
This is an implementation of the ChaCha20Poly1305 AEAD, with an easy API
for encrypting either contiguous buffers or scatter-gather lists (such
as those created from skb_to_sgvec).
Information: https://tools.ietf.org/html/rfc8439
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Samuel Neves <sneves@dei.uc.pt>
Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: kernel-hardening@lists.openwall.com
Cc: linux-crypto@vger.kernel.org
---
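Note on the MAC input layout: both the encrypt and decrypt paths below
feed Poly1305 the associated data, zero padding up to a 16-byte
boundary, the ciphertext, zero padding again, and finally a 16-byte
block holding the two lengths as little-endian 64-bit values. The
standalone userspace sketch here only mirrors that arithmetic. The
helper names (pad16_len, put_le64, aead_lens_block, aead_mac_input_len)
are illustrative assumptions and are not part of this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Zero bytes needed to pad len up to a multiple of 16; same expression
 * as the (0x10 - len) & 0xf used in the patch. */
static size_t pad16_len(size_t len)
{
	return (0x10 - len) & 0xf;
}

/* Store v as 8 little-endian bytes, independent of host endianness. */
static void put_le64(uint8_t out[8], uint64_t v)
{
	for (int i = 0; i < 8; ++i)
		out[i] = (uint8_t)(v >> (8 * i));
}

/* Final 16-byte MAC block: le64(ad_len) followed by le64(ct_len). */
static void aead_lens_block(uint8_t out[16], uint64_t ad_len, uint64_t ct_len)
{
	assert(out != NULL);
	put_le64(out, ad_len);
	put_le64(out + 8, ct_len);
}

/* Total number of bytes absorbed by Poly1305 for one message. */
static size_t aead_mac_input_len(size_t ad_len, size_t ct_len)
{
	return ad_len + pad16_len(ad_len) + ct_len + pad16_len(ct_len) + 16;
}
```

For instance, 12 bytes of associated data need 4 bytes of padding, and
a length that is already a multiple of 16 needs none.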
include/zinc/chacha20poly1305.h | 50 +
lib/zinc/Kconfig | 6 +
lib/zinc/Makefile | 3 +
lib/zinc/chacha20poly1305.c | 362 ++
lib/zinc/selftest/chacha20poly1305.c | 9034 ++++++++++++++++++++++++++
5 files changed, 9455 insertions(+)
create mode 100644 include/zinc/chacha20poly1305.h
create mode 100644 lib/zinc/chacha20poly1305.c
create mode 100644 lib/zinc/selftest/chacha20poly1305.c
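
Note on the XChaCha20Poly1305 nonce: the xchacha20poly1305_* wrappers
take a 24-byte nonce, pass it to hchacha20() to derive a subkey, and
read the trailing 8 bytes (nonce + 16) as a little-endian u64 for the
inner ChaCha20Poly1305 call. The userspace sketch below shows only that
split, assuming (as in the XChaCha design) that HChaCha20 consumes the
first 16 bytes. The struct and function names are hypothetical and do
not appear in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { XNONCE_SIZE = 24, HCHACHA_NONCE_SIZE = 16 };

struct xnonce_parts {
	uint8_t hchacha_nonce[HCHACHA_NONCE_SIZE]; /* subkey derivation input */
	uint64_t inner_nonce;                      /* nonce for the inner AEAD */
};

/* Read 8 bytes as a little-endian integer, with no alignment requirement. */
static uint64_t load_le64(const uint8_t *p)
{
	uint64_t v = 0;

	for (int i = 7; i >= 0; --i)
		v = (v << 8) | p[i];
	return v;
}

/* Split a 24-byte extended nonce into its two parts. */
static struct xnonce_parts split_xnonce(const uint8_t nonce[XNONCE_SIZE])
{
	struct xnonce_parts parts;

	assert(nonce != NULL);
	memcpy(parts.hchacha_nonce, nonce, HCHACHA_NONCE_SIZE);
	parts.inner_nonce = load_le64(nonce + HCHACHA_NONCE_SIZE);
	return parts;
}
```

Reading the tail byte by byte plays the role that get_unaligned_le64()
plays in the kernel code: it works for any alignment and any host
endianness.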
diff --git a/include/zinc/chacha20poly1305.h b/include/zinc/chacha20poly1305.h
new file mode 100644
index 000000000000..d2753c5bada5
--- /dev/null
+++ b/include/zinc/chacha20poly1305.h
@@ -0,0 +1,50 @@
+/* SPDX-License-Identifier: GPL-2.0 OR MIT */
+/*
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#ifndef _ZINC_CHACHA20POLY1305_H
+#define _ZINC_CHACHA20POLY1305_H
+
+#include <linux/simd.h>
+#include <linux/types.h>
+
+struct scatterlist;
+
+enum chacha20poly1305_lengths {
+ XCHACHA20POLY1305_NONCE_SIZE = 24,
+ CHACHA20POLY1305_KEY_SIZE = 32,
+ CHACHA20POLY1305_AUTHTAG_SIZE = 16
+};
+
+void chacha20poly1305_encrypt(u8 *dst, const u8 *src, const size_t src_len,
+ const u8 *ad, const size_t ad_len,
+ const u64 nonce,
+ const u8 key[CHACHA20POLY1305_KEY_SIZE]);
+
+bool __must_check chacha20poly1305_encrypt_sg(
+ struct scatterlist *dst, struct scatterlist *src, const size_t src_len,
+ const u8 *ad, const size_t ad_len, const u64 nonce,
+ const u8 key[CHACHA20POLY1305_KEY_SIZE], simd_context_t *simd_context);
+
+bool __must_check
+chacha20poly1305_decrypt(u8 *dst, const u8 *src, const size_t src_len,
+ const u8 *ad, const size_t ad_len, const u64 nonce,
+ const u8 key[CHACHA20POLY1305_KEY_SIZE]);
+
+bool __must_check chacha20poly1305_decrypt_sg(
+ struct scatterlist *dst, struct scatterlist *src, const size_t src_len,
+ const u8 *ad, const size_t ad_len, const u64 nonce,
+ const u8 key[CHACHA20POLY1305_KEY_SIZE], simd_context_t *simd_context);
+
+void xchacha20poly1305_encrypt(u8 *dst, const u8 *src, const size_t src_len,
+ const u8 *ad, const size_t ad_len,
+ const u8 nonce[XCHACHA20POLY1305_NONCE_SIZE],
+ const u8 key[CHACHA20POLY1305_KEY_SIZE]);
+
+bool __must_check xchacha20poly1305_decrypt(
+ u8 *dst, const u8 *src, const size_t src_len, const u8 *ad,
+ const size_t ad_len, const u8 nonce[XCHACHA20POLY1305_NONCE_SIZE],
+ const u8 key[CHACHA20POLY1305_KEY_SIZE]);
+
+#endif /* _ZINC_CHACHA20POLY1305_H */
diff --git a/lib/zinc/Kconfig b/lib/zinc/Kconfig
index f08bf1eaa2a0..765eba3267c9 100644
--- a/lib/zinc/Kconfig
+++ b/lib/zinc/Kconfig
@@ -5,6 +5,12 @@ config ZINC_CHACHA20
config ZINC_POLY1305
tristate
+config ZINC_CHACHA20POLY1305
+ tristate
+ select ZINC_CHACHA20
+ select ZINC_POLY1305
+ select CRYPTO_BLKCIPHER
+
config ZINC_SELFTEST
bool "Zinc cryptography library self-tests"
help
diff --git a/lib/zinc/Makefile b/lib/zinc/Makefile
index 5c4b1d51cb03..c31186b491e8 100644
--- a/lib/zinc/Makefile
+++ b/lib/zinc/Makefile
@@ -18,3 +18,6 @@ zinc_poly1305-$(CONFIG_ZINC_ARCH_MIPS) += poly1305/poly1305-mips.o
AFLAGS_poly1305-mips.o += -O2 # This is required to fill the branch delay slots
zinc_poly1305-$(CONFIG_ZINC_ARCH_MIPS64) += poly1305/poly1305-mips64.o
obj-$(CONFIG_ZINC_POLY1305) += zinc_poly1305.o
+
+zinc_chacha20poly1305-y := chacha20poly1305.o
+obj-$(CONFIG_ZINC_CHACHA20POLY1305) += zinc_chacha20poly1305.o
diff --git a/lib/zinc/chacha20poly1305.c b/lib/zinc/chacha20poly1305.c
new file mode 100644
index 000000000000..65b88f413dae
--- /dev/null
+++ b/lib/zinc/chacha20poly1305.c
@@ -0,0 +1,362 @@
+// SPDX-License-Identifier: GPL-2.0 OR MIT
+/*
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ *
+ * This is an implementation of the ChaCha20Poly1305 AEAD construction.
+ *
+ * Information: https://tools.ietf.org/html/rfc8439
+ */
+
+#include <zinc/chacha20poly1305.h>
+#include <zinc/chacha20.h>
+#include <zinc/poly1305.h>
+#include "selftest/run.h"
+
+#include <asm/unaligned.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/init.h>
+#include <crypto/scatterwalk.h> // For blkcipher_walk.
+
+static const u8 pad0[16] = { 0 };
+
+static struct crypto_alg chacha20_alg = {
+ .cra_blocksize = 1,
+ .cra_alignmask = sizeof(u32) - 1
+};
+static struct crypto_blkcipher chacha20_cipher = {
+ .base = {
+ .__crt_alg = &chacha20_alg
+ }
+};
+static struct blkcipher_desc chacha20_desc = {
+ .tfm = &chacha20_cipher
+};
+
+static inline void
+__chacha20poly1305_encrypt(u8 *dst, const u8 *src, const size_t src_len,
+ const u8 *ad, const size_t ad_len, const u64 nonce,
+ const u8 key[CHACHA20POLY1305_KEY_SIZE],
+ simd_context_t *simd_context)
+{
+ struct poly1305_ctx poly1305_state;
+ struct chacha20_ctx chacha20_state;
+ union {
+ u8 block0[POLY1305_KEY_SIZE];
+ __le64 lens[2];
+ } b = { { 0 } };
+
+ chacha20_init(&chacha20_state, key, nonce);
+ chacha20(&chacha20_state, b.block0, b.block0, sizeof(b.block0),
+ simd_context);
+ poly1305_init(&poly1305_state, b.block0);
+
+ poly1305_update(&poly1305_state, ad, ad_len, simd_context);
+ poly1305_update(&poly1305_state, pad0, (0x10 - ad_len) & 0xf,
+ simd_context);
+
+ chacha20(&chacha20_state, dst, src, src_len, simd_context);
+
+ poly1305_update(&poly1305_state, dst, src_len, simd_context);
+ poly1305_update(&poly1305_state, pad0, (0x10 - src_len) & 0xf,
+ simd_context);
+
+ b.lens[0] = cpu_to_le64(ad_len);
+ b.lens[1] = cpu_to_le64(src_len);
+ poly1305_update(&poly1305_state, (u8 *)b.lens, sizeof(b.lens),
+ simd_context);
+
+ poly1305_final(&poly1305_state, dst + src_len, simd_context);
+
+ memzero_explicit(&chacha20_state, sizeof(chacha20_state));
+ memzero_explicit(&b, sizeof(b));
+}
+
+void chacha20poly1305_encrypt(u8 *dst, const u8 *src, const size_t src_len,
+ const u8 *ad, const size_t ad_len,
+ const u64 nonce,
+ const u8 key[CHACHA20POLY1305_KEY_SIZE])
+{
+ simd_context_t simd_context;
+
+ simd_get(&simd_context);
+ __chacha20poly1305_encrypt(dst, src, src_len, ad, ad_len, nonce, key,
+ &simd_context);
+ simd_put(&simd_context);
+}
+EXPORT_SYMBOL(chacha20poly1305_encrypt);
+
+bool chacha20poly1305_encrypt_sg(struct scatterlist *dst,
+ struct scatterlist *src, const size_t src_len,
+ const u8 *ad, const size_t ad_len,
+ const u64 nonce,
+ const u8 key[CHACHA20POLY1305_KEY_SIZE],
+ simd_context_t *simd_context)
+{
+ struct poly1305_ctx poly1305_state;
+ struct chacha20_ctx chacha20_state;
+ int ret = 0;
+ struct blkcipher_walk walk;
+ union {
+ u8 block0[POLY1305_KEY_SIZE];
+ u8 mac[POLY1305_MAC_SIZE];
+ __le64 lens[2];
+ } b = { { 0 } };
+
+ chacha20_init(&chacha20_state, key, nonce);
+ chacha20(&chacha20_state, b.block0, b.block0, sizeof(b.block0),
+ simd_context);
+ poly1305_init(&poly1305_state, b.block0);
+
+ poly1305_update(&poly1305_state, ad, ad_len, simd_context);
+ poly1305_update(&poly1305_state, pad0, (0x10 - ad_len) & 0xf,
+ simd_context);
+
+ if (likely(src_len)) {
+ blkcipher_walk_init(&walk, dst, src, src_len);
+ ret = blkcipher_walk_virt_block(&chacha20_desc, &walk,
+ CHACHA20_BLOCK_SIZE);
+ while (walk.nbytes >= CHACHA20_BLOCK_SIZE) {
+ size_t chunk_len =
+ rounddown(walk.nbytes, CHACHA20_BLOCK_SIZE);
+
+ chacha20(&chacha20_state, walk.dst.virt.addr,
+ walk.src.virt.addr, chunk_len, simd_context);
+ poly1305_update(&poly1305_state, walk.dst.virt.addr,
+ chunk_len, simd_context);
+ simd_relax(simd_context);
+ ret = blkcipher_walk_done(&chacha20_desc, &walk,
+ walk.nbytes % CHACHA20_BLOCK_SIZE);
+ }
+ if (walk.nbytes) {
+ chacha20(&chacha20_state, walk.dst.virt.addr,
+ walk.src.virt.addr, walk.nbytes, simd_context);
+ poly1305_update(&poly1305_state, walk.dst.virt.addr,
+ walk.nbytes, simd_context);
+ ret = blkcipher_walk_done(&chacha20_desc, &walk, 0);
+ }
+ }
+ if (unlikely(ret))
+ goto err;
+
+ poly1305_update(&poly1305_state, pad0, (0x10 - src_len) & 0xf,
+ simd_context);
+
+ b.lens[0] = cpu_to_le64(ad_len);
+ b.lens[1] = cpu_to_le64(src_len);
+ poly1305_update(&poly1305_state, (u8 *)b.lens, sizeof(b.lens),
+ simd_context);
+
+ poly1305_final(&poly1305_state, b.mac, simd_context);
+ scatterwalk_map_and_copy(b.mac, dst, src_len, sizeof(b.mac), 1);
+err:
+ memzero_explicit(&chacha20_state, sizeof(chacha20_state));
+ memzero_explicit(&b, sizeof(b));
+ return !ret;
+}
+EXPORT_SYMBOL(chacha20poly1305_encrypt_sg);
+
+static inline bool
+__chacha20poly1305_decrypt(u8 *dst, const u8 *src, const size_t src_len,
+ const u8 *ad, const size_t ad_len, const u64 nonce,
+ const u8 key[CHACHA20POLY1305_KEY_SIZE],
+ simd_context_t *simd_context)
+{
+ struct poly1305_ctx poly1305_state;
+ struct chacha20_ctx chacha20_state;
+ int ret;
+ size_t dst_len;
+ union {
+ u8 block0[POLY1305_KEY_SIZE];
+ u8 mac[POLY1305_MAC_SIZE];
+ __le64 lens[2];
+ } b = { { 0 } };
+
+ if (unlikely(src_len < POLY1305_MAC_SIZE))
+ return false;
+
+ chacha20_init(&chacha20_state, key, nonce);
+ chacha20(&chacha20_state, b.block0, b.block0, sizeof(b.block0),
+ simd_context);
+ poly1305_init(&poly1305_state, b.block0);
+
+ poly1305_update(&poly1305_state, ad, ad_len, simd_context);
+ poly1305_update(&poly1305_state, pad0, (0x10 - ad_len) & 0xf,
+ simd_context);
+
+ dst_len = src_len - POLY1305_MAC_SIZE;
+ poly1305_update(&poly1305_state, src, dst_len, simd_context);
+ poly1305_update(&poly1305_state, pad0, (0x10 - dst_len) & 0xf,
+ simd_context);
+
+ b.lens[0] = cpu_to_le64(ad_len);
+ b.lens[1] = cpu_to_le64(dst_len);
+ poly1305_update(&poly1305_state, (u8 *)b.lens, sizeof(b.lens),
+ simd_context);
+
+ poly1305_final(&poly1305_state, b.mac, simd_context);
+
+ ret = crypto_memneq(b.mac, src + dst_len, POLY1305_MAC_SIZE);
+ if (likely(!ret))
+ chacha20(&chacha20_state, dst, src, dst_len, simd_context);
+
+ memzero_explicit(&chacha20_state, sizeof(chacha20_state));
+ memzero_explicit(&b, sizeof(b));
+
+ return !ret;
+}
+
+bool chacha20poly1305_decrypt(u8 *dst, const u8 *src, const size_t src_len,
+ const u8 *ad, const size_t ad_len,
+ const u64 nonce,
+ const u8 key[CHACHA20POLY1305_KEY_SIZE])
+{
+ simd_context_t simd_context;
+ bool ret;
+
+ simd_get(&simd_context);
+ ret = __chacha20poly1305_decrypt(dst, src, src_len, ad, ad_len, nonce,
+ key, &simd_context);
+ simd_put(&simd_context);
+ return ret;
+}
+EXPORT_SYMBOL(chacha20poly1305_decrypt);
+
+bool chacha20poly1305_decrypt_sg(struct scatterlist *dst,
+ struct scatterlist *src, const size_t src_len,
+ const u8 *ad, const size_t ad_len,
+ const u64 nonce,
+ const u8 key[CHACHA20POLY1305_KEY_SIZE],
+ simd_context_t *simd_context)
+{
+ struct poly1305_ctx poly1305_state;
+ struct chacha20_ctx chacha20_state;
+ struct blkcipher_walk walk;
+ int ret = 0;
+ size_t dst_len;
+ union {
+ u8 block0[POLY1305_KEY_SIZE];
+ struct {
+ u8 read_mac[POLY1305_MAC_SIZE];
+ u8 computed_mac[POLY1305_MAC_SIZE];
+ };
+ __le64 lens[2];
+ } b = { { 0 } };
+
+ if (unlikely(src_len < POLY1305_MAC_SIZE))
+ return false;
+
+ chacha20_init(&chacha20_state, key, nonce);
+ chacha20(&chacha20_state, b.block0, b.block0, sizeof(b.block0),
+ simd_context);
+ poly1305_init(&poly1305_state, b.block0);
+
+ poly1305_update(&poly1305_state, ad, ad_len, simd_context);
+ poly1305_update(&poly1305_state, pad0, (0x10 - ad_len) & 0xf,
+ simd_context);
+
+ dst_len = src_len - POLY1305_MAC_SIZE;
+ if (likely(dst_len)) {
+ blkcipher_walk_init(&walk, dst, src, dst_len);
+ ret = blkcipher_walk_virt_block(&chacha20_desc, &walk,
+ CHACHA20_BLOCK_SIZE);
+ while (walk.nbytes >= CHACHA20_BLOCK_SIZE) {
+ size_t chunk_len =
+ rounddown(walk.nbytes, CHACHA20_BLOCK_SIZE);
+
+ poly1305_update(&poly1305_state, walk.src.virt.addr,
+ chunk_len, simd_context);
+ chacha20(&chacha20_state, walk.dst.virt.addr,
+ walk.src.virt.addr, chunk_len, simd_context);
+ simd_relax(simd_context);
+ ret = blkcipher_walk_done(&chacha20_desc, &walk,
+ walk.nbytes % CHACHA20_BLOCK_SIZE);
+ }
+ if (walk.nbytes) {
+ poly1305_update(&poly1305_state, walk.src.virt.addr,
+ walk.nbytes, simd_context);
+ chacha20(&chacha20_state, walk.dst.virt.addr,
+ walk.src.virt.addr, walk.nbytes, simd_context);
+ ret = blkcipher_walk_done(&chacha20_desc, &walk, 0);
+ }
+ }
+ if (unlikely(ret))
+ goto err;
+
+ poly1305_update(&poly1305_state, pad0, (0x10 - dst_len) & 0xf,
+ simd_context);
+
+ b.lens[0] = cpu_to_le64(ad_len);
+ b.lens[1] = cpu_to_le64(dst_len);
+ poly1305_update(&poly1305_state, (u8 *)b.lens, sizeof(b.lens),
+ simd_context);
+
+ poly1305_final(&poly1305_state, b.computed_mac, simd_context);
+
+ scatterwalk_map_and_copy(b.read_mac, src, dst_len, POLY1305_MAC_SIZE, 0);
+ ret = crypto_memneq(b.read_mac, b.computed_mac, POLY1305_MAC_SIZE);
+err:
+ memzero_explicit(&chacha20_state, sizeof(chacha20_state));
+ memzero_explicit(&b, sizeof(b));
+ return !ret;
+}
+EXPORT_SYMBOL(chacha20poly1305_decrypt_sg);
+
+void xchacha20poly1305_encrypt(u8 *dst, const u8 *src, const size_t src_len,
+ const u8 *ad, const size_t ad_len,
+ const u8 nonce[XCHACHA20POLY1305_NONCE_SIZE],
+ const u8 key[CHACHA20POLY1305_KEY_SIZE])
+{
+ simd_context_t simd_context;
+ u32 derived_key[CHACHA20_KEY_WORDS] __aligned(16);
+
+ simd_get(&simd_context);
+ hchacha20(derived_key, nonce, key, &simd_context);
+ cpu_to_le32_array(derived_key, ARRAY_SIZE(derived_key));
+ __chacha20poly1305_encrypt(dst, src, src_len, ad, ad_len,
+ get_unaligned_le64(nonce + 16),
+ (u8 *)derived_key, &simd_context);
+ memzero_explicit(derived_key, CHACHA20POLY1305_KEY_SIZE);
+ simd_put(&simd_context);
+}
+EXPORT_SYMBOL(xchacha20poly1305_encrypt);
+
+bool xchacha20poly1305_decrypt(u8 *dst, const u8 *src, const size_t src_len,
+ const u8 *ad, const size_t ad_len,
+ const u8 nonce[XCHACHA20POLY1305_NONCE_SIZE],
+ const u8 key[CHACHA20POLY1305_KEY_SIZE])
+{
+ bool ret;
+ simd_context_t simd_context;
+ u32 derived_key[CHACHA20_KEY_WORDS] __aligned(16);
+
+ simd_get(&simd_context);
+ hchacha20(derived_key, nonce, key, &simd_context);
+ cpu_to_le32_array(derived_key, ARRAY_SIZE(derived_key));
+ ret = __chacha20poly1305_decrypt(dst, src, src_len, ad, ad_len,
+ get_unaligned_le64(nonce + 16),
+ (u8 *)derived_key, &simd_context);
+ memzero_explicit(derived_key, CHACHA20POLY1305_KEY_SIZE);
+ simd_put(&simd_context);
+ return ret;
+}
+EXPORT_SYMBOL(xchacha20poly1305_decrypt);
+
+#include "selftest/chacha20poly1305.c"
+
+static int __init mod_init(void)
+{
+ if (!selftest_run("chacha20poly1305", chacha20poly1305_selftest,
+ NULL, 0))
+ return -ENOTRECOVERABLE;
+ return 0;
+}
+
+static void __exit mod_exit(void)
+{
+}
+
+module_init(mod_init);
+module_exit(mod_exit);
+MODULE_LICENSE("GPL v2");
+MODULE_DESCRIPTION("ChaCha20Poly1305 AEAD construction");
+MODULE_AUTHOR("Jason A. Donenfeld <Jason@zx2c4.com>");
diff --git a/lib/zinc/selftest/chacha20poly1305.c b/lib/zinc/selftest/chacha20poly1305.c
new file mode 100644
index 000000000000..571befe60466
--- /dev/null
+++ b/lib/zinc/selftest/chacha20poly1305.c
@@ -0,0 +1,9034 @@
+// SPDX-License-Identifier: GPL-2.0 OR MIT
+/*
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+struct chacha20poly1305_testvec {
+ const u8 *input, *output, *assoc, *nonce, *key;
+ size_t ilen, alen, nlen;
+ bool failure;
+};
+
+/* The first of these are the ChaCha20-Poly1305 AEAD test vectors from RFC 7539
+ * section 2.8.2. The ones after that were generated by reference
+ * implementations. The final, marked ones are taken from Wycheproof, but we
+ * only run these on the encrypt side, because they mostly stress the
+ * primitives rather than the actual chapoly construction. They also require
+ * a 96-bit nonce construction, which is added just for the purpose of the tests.
+ */
+
+static const u8 enc_input001[] __initconst = {
+ 0x49, 0x6e, 0x74, 0x65, 0x72, 0x6e, 0x65, 0x74,
+ 0x2d, 0x44, 0x72, 0x61, 0x66, 0x74, 0x73, 0x20,
+ 0x61, 0x72, 0x65, 0x20, 0x64, 0x72, 0x61, 0x66,
+ 0x74, 0x20, 0x64, 0x6f, 0x63, 0x75, 0x6d, 0x65,
+ 0x6e, 0x74, 0x73, 0x20, 0x76, 0x61, 0x6c, 0x69,
+ 0x64, 0x20, 0x66, 0x6f, 0x72, 0x20, 0x61, 0x20,
+ 0x6d, 0x61, 0x78, 0x69, 0x6d, 0x75, 0x6d, 0x20,
+ 0x6f, 0x66, 0x20, 0x73, 0x69, 0x78, 0x20, 0x6d,
+ 0x6f, 0x6e, 0x74, 0x68, 0x73, 0x20, 0x61, 0x6e,
+ 0x64, 0x20, 0x6d, 0x61, 0x79, 0x20, 0x62, 0x65,
+ 0x20, 0x75, 0x70, 0x64, 0x61, 0x74, 0x65, 0x64,
+ 0x2c, 0x20, 0x72, 0x65, 0x70, 0x6c, 0x61, 0x63,
+ 0x65, 0x64, 0x2c, 0x20, 0x6f, 0x72, 0x20, 0x6f,
+ 0x62, 0x73, 0x6f, 0x6c, 0x65, 0x74, 0x65, 0x64,
+ 0x20, 0x62, 0x79, 0x20, 0x6f, 0x74, 0x68, 0x65,
+ 0x72, 0x20, 0x64, 0x6f, 0x63, 0x75, 0x6d, 0x65,
+ 0x6e, 0x74, 0x73, 0x20, 0x61, 0x74, 0x20, 0x61,
+ 0x6e, 0x79, 0x20, 0x74, 0x69, 0x6d, 0x65, 0x2e,
+ 0x20, 0x49, 0x74, 0x20, 0x69, 0x73, 0x20, 0x69,
+ 0x6e, 0x61, 0x70, 0x70, 0x72, 0x6f, 0x70, 0x72,
+ 0x69, 0x61, 0x74, 0x65, 0x20, 0x74, 0x6f, 0x20,
+ 0x75, 0x73, 0x65, 0x20, 0x49, 0x6e, 0x74, 0x65,
+ 0x72, 0x6e, 0x65, 0x74, 0x2d, 0x44, 0x72, 0x61,
+ 0x66, 0x74, 0x73, 0x20, 0x61, 0x73, 0x20, 0x72,
+ 0x65, 0x66, 0x65, 0x72, 0x65, 0x6e, 0x63, 0x65,
+ 0x20, 0x6d, 0x61, 0x74, 0x65, 0x72, 0x69, 0x61,
+ 0x6c, 0x20, 0x6f, 0x72, 0x20, 0x74, 0x6f, 0x20,
+ 0x63, 0x69, 0x74, 0x65, 0x20, 0x74, 0x68, 0x65,
+ 0x6d, 0x20, 0x6f, 0x74, 0x68, 0x65, 0x72, 0x20,
+ 0x74, 0x68, 0x61, 0x6e, 0x20, 0x61, 0x73, 0x20,
+ 0x2f, 0xe2, 0x80, 0x9c, 0x77, 0x6f, 0x72, 0x6b,
+ 0x20, 0x69, 0x6e, 0x20, 0x70, 0x72, 0x6f, 0x67,
+ 0x72, 0x65, 0x73, 0x73, 0x2e, 0x2f, 0xe2, 0x80,
+ 0x9d
+};
+static const u8 enc_output001[] __initconst = {
+ 0x64, 0xa0, 0x86, 0x15, 0x75, 0x86, 0x1a, 0xf4,
+ 0x60, 0xf0, 0x62, 0xc7, 0x9b, 0xe6, 0x43, 0xbd,
+ 0x5e, 0x80, 0x5c, 0xfd, 0x34, 0x5c, 0xf3, 0x89,
+ 0xf1, 0x08, 0x67, 0x0a, 0xc7, 0x6c, 0x8c, 0xb2,
+ 0x4c, 0x6c, 0xfc, 0x18, 0x75, 0x5d, 0x43, 0xee,
+ 0xa0, 0x9e, 0xe9, 0x4e, 0x38, 0x2d, 0x26, 0xb0,
+ 0xbd, 0xb7, 0xb7, 0x3c, 0x32, 0x1b, 0x01, 0x00,
+ 0xd4, 0xf0, 0x3b, 0x7f, 0x35, 0x58, 0x94, 0xcf,
+ 0x33, 0x2f, 0x83, 0x0e, 0x71, 0x0b, 0x97, 0xce,
+ 0x98, 0xc8, 0xa8, 0x4a, 0xbd, 0x0b, 0x94, 0x81,
+ 0x14, 0xad, 0x17, 0x6e, 0x00, 0x8d, 0x33, 0xbd,
+ 0x60, 0xf9, 0x82, 0xb1, 0xff, 0x37, 0xc8, 0x55,
+ 0x97, 0x97, 0xa0, 0x6e, 0xf4, 0xf0, 0xef, 0x61,
+ 0xc1, 0x86, 0x32, 0x4e, 0x2b, 0x35, 0x06, 0x38,
+ 0x36, 0x06, 0x90, 0x7b, 0x6a, 0x7c, 0x02, 0xb0,
+ 0xf9, 0xf6, 0x15, 0x7b, 0x53, 0xc8, 0x67, 0xe4,
+ 0xb9, 0x16, 0x6c, 0x76, 0x7b, 0x80, 0x4d, 0x46,
+ 0xa5, 0x9b, 0x52, 0x16, 0xcd, 0xe7, 0xa4, 0xe9,
+ 0x90, 0x40, 0xc5, 0xa4, 0x04, 0x33, 0x22, 0x5e,
+ 0xe2, 0x82, 0xa1, 0xb0, 0xa0, 0x6c, 0x52, 0x3e,
+ 0xaf, 0x45, 0x34, 0xd7, 0xf8, 0x3f, 0xa1, 0x15,
+ 0x5b, 0x00, 0x47, 0x71, 0x8c, 0xbc, 0x54, 0x6a,
+ 0x0d, 0x07, 0x2b, 0x04, 0xb3, 0x56, 0x4e, 0xea,
+ 0x1b, 0x42, 0x22, 0x73, 0xf5, 0x48, 0x27, 0x1a,
+ 0x0b, 0xb2, 0x31, 0x60, 0x53, 0xfa, 0x76, 0x99,
+ 0x19, 0x55, 0xeb, 0xd6, 0x31, 0x59, 0x43, 0x4e,
+ 0xce, 0xbb, 0x4e, 0x46, 0x6d, 0xae, 0x5a, 0x10,
+ 0x73, 0xa6, 0x72, 0x76, 0x27, 0x09, 0x7a, 0x10,
+ 0x49, 0xe6, 0x17, 0xd9, 0x1d, 0x36, 0x10, 0x94,
+ 0xfa, 0x68, 0xf0, 0xff, 0x77, 0x98, 0x71, 0x30,
+ 0x30, 0x5b, 0xea, 0xba, 0x2e, 0xda, 0x04, 0xdf,
+ 0x99, 0x7b, 0x71, 0x4d, 0x6c, 0x6f, 0x2c, 0x29,
+ 0xa6, 0xad, 0x5c, 0xb4, 0x02, 0x2b, 0x02, 0x70,
+ 0x9b, 0xee, 0xad, 0x9d, 0x67, 0x89, 0x0c, 0xbb,
+ 0x22, 0x39, 0x23, 0x36, 0xfe, 0xa1, 0x85, 0x1f,
+ 0x38
+};
+static const u8 enc_assoc001[] __initconst = {
+ 0xf3, 0x33, 0x88, 0x86, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x4e, 0x91
+};
+static const u8 enc_nonce001[] __initconst = {
+ 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08
+};
+static const u8 enc_key001[] __initconst = {
+ 0x1c, 0x92, 0x40, 0xa5, 0xeb, 0x55, 0xd3, 0x8a,
+ 0xf3, 0x33, 0x88, 0x86, 0x04, 0xf6, 0xb5, 0xf0,
+ 0x47, 0x39, 0x17, 0xc1, 0x40, 0x2b, 0x80, 0x09,
+ 0x9d, 0xca, 0x5c, 0xbc, 0x20, 0x70, 0x75, 0xc0
+};
+
+static const u8 enc_input002[] __initconst = { };
+static const u8 enc_output002[] __initconst = {
+ 0xea, 0xe0, 0x1e, 0x9e, 0x2c, 0x91, 0xaa, 0xe1,
+ 0xdb, 0x5d, 0x99, 0x3f, 0x8a, 0xf7, 0x69, 0x92
+};
+static const u8 enc_assoc002[] __initconst = { };
+static const u8 enc_nonce002[] __initconst = {
+ 0xca, 0xbf, 0x33, 0x71, 0x32, 0x45, 0x77, 0x8e
+};
+static const u8 enc_key002[] __initconst = {
+ 0x4c, 0xf5, 0x96, 0x83, 0x38, 0xe6, 0xae, 0x7f,
+ 0x2d, 0x29, 0x25, 0x76, 0xd5, 0x75, 0x27, 0x86,
+ 0x91, 0x9a, 0x27, 0x7a, 0xfb, 0x46, 0xc5, 0xef,
+ 0x94, 0x81, 0x79, 0x57, 0x14, 0x59, 0x40, 0x68
+};
+
+static const u8 enc_input003[] __initconst = { };
+static const u8 enc_output003[] __initconst = {
+ 0xdd, 0x6b, 0x3b, 0x82, 0xce, 0x5a, 0xbd, 0xd6,
+ 0xa9, 0x35, 0x83, 0xd8, 0x8c, 0x3d, 0x85, 0x77
+};
+static const u8 enc_assoc003[] __initconst = {
+ 0x33, 0x10, 0x41, 0x12, 0x1f, 0xf3, 0xd2, 0x6b
+};
+static const u8 enc_nonce003[] __initconst = {
+ 0x3d, 0x86, 0xb5, 0x6b, 0xc8, 0xa3, 0x1f, 0x1d
+};
+static const u8 enc_key003[] __initconst = {
+ 0x2d, 0xb0, 0x5d, 0x40, 0xc8, 0xed, 0x44, 0x88,
+ 0x34, 0xd1, 0x13, 0xaf, 0x57, 0xa1, 0xeb, 0x3a,
+ 0x2a, 0x80, 0x51, 0x36, 0xec, 0x5b, 0xbc, 0x08,
+ 0x93, 0x84, 0x21, 0xb5, 0x13, 0x88, 0x3c, 0x0d
+};
+
+static const u8 enc_input004[] __initconst = {
+ 0xa4
+};
+static const u8 enc_output004[] __initconst = {
+ 0xb7, 0x1b, 0xb0, 0x73, 0x59, 0xb0, 0x84, 0xb2,
+ 0x6d, 0x8e, 0xab, 0x94, 0x31, 0xa1, 0xae, 0xac,
+ 0x89
+};
+static const u8 enc_assoc004[] __initconst = {
+ 0x6a, 0xe2, 0xad, 0x3f, 0x88, 0x39, 0x5a, 0x40
+};
+static const u8 enc_nonce004[] __initconst = {
+ 0xd2, 0x32, 0x1f, 0x29, 0x28, 0xc6, 0xc4, 0xc4
+};
+static const u8 enc_key004[] __initconst = {
+ 0x4b, 0x28, 0x4b, 0xa3, 0x7b, 0xbe, 0xe9, 0xf8,
+ 0x31, 0x80, 0x82, 0xd7, 0xd8, 0xe8, 0xb5, 0xa1,
+ 0xe2, 0x18, 0x18, 0x8a, 0x9c, 0xfa, 0xa3, 0x3d,
+ 0x25, 0x71, 0x3e, 0x40, 0xbc, 0x54, 0x7a, 0x3e
+};
+
+static const u8 enc_input005[] __initconst = {
+ 0x2d
+};
+static const u8 enc_output005[] __initconst = {
+ 0xbf, 0xe1, 0x5b, 0x0b, 0xdb, 0x6b, 0xf5, 0x5e,
+ 0x6c, 0x5d, 0x84, 0x44, 0x39, 0x81, 0xc1, 0x9c,
+ 0xac
+};
+static const u8 enc_assoc005[] __initconst = { };
+static const u8 enc_nonce005[] __initconst = {
+ 0x20, 0x1c, 0xaa, 0x5f, 0x9c, 0xbf, 0x92, 0x30
+};
+static const u8 enc_key005[] __initconst = {
+ 0x66, 0xca, 0x9c, 0x23, 0x2a, 0x4b, 0x4b, 0x31,
+ 0x0e, 0x92, 0x89, 0x8b, 0xf4, 0x93, 0xc7, 0x87,
+ 0x98, 0xa3, 0xd8, 0x39, 0xf8, 0xf4, 0xa7, 0x01,
+ 0xc0, 0x2e, 0x0a, 0xa6, 0x7e, 0x5a, 0x78, 0x87
+};
+
+static const u8 enc_input006[] __initconst = {
+ 0x33, 0x2f, 0x94, 0xc1, 0xa4, 0xef, 0xcc, 0x2a,
+ 0x5b, 0xa6, 0xe5, 0x8f, 0x1d, 0x40, 0xf0, 0x92,
+ 0x3c, 0xd9, 0x24, 0x11, 0xa9, 0x71, 0xf9, 0x37,
+ 0x14, 0x99, 0xfa, 0xbe, 0xe6, 0x80, 0xde, 0x50,
+ 0xc9, 0x96, 0xd4, 0xb0, 0xec, 0x9e, 0x17, 0xec,
+ 0xd2, 0x5e, 0x72, 0x99, 0xfc, 0x0a, 0xe1, 0xcb,
+ 0x48, 0xd2, 0x85, 0xdd, 0x2f, 0x90, 0xe0, 0x66,
+ 0x3b, 0xe6, 0x20, 0x74, 0xbe, 0x23, 0x8f, 0xcb,
+ 0xb4, 0xe4, 0xda, 0x48, 0x40, 0xa6, 0xd1, 0x1b,
+ 0xc7, 0x42, 0xce, 0x2f, 0x0c, 0xa6, 0x85, 0x6e,
+ 0x87, 0x37, 0x03, 0xb1, 0x7c, 0x25, 0x96, 0xa3,
+ 0x05, 0xd8, 0xb0, 0xf4, 0xed, 0xea, 0xc2, 0xf0,
+ 0x31, 0x98, 0x6c, 0xd1, 0x14, 0x25, 0xc0, 0xcb,
+ 0x01, 0x74, 0xd0, 0x82, 0xf4, 0x36, 0xf5, 0x41,
+ 0xd5, 0xdc, 0xca, 0xc5, 0xbb, 0x98, 0xfe, 0xfc,
+ 0x69, 0x21, 0x70, 0xd8, 0xa4, 0x4b, 0xc8, 0xde,
+ 0x8f
+};
+static const u8 enc_output006[] __initconst = {
+ 0x8b, 0x06, 0xd3, 0x31, 0xb0, 0x93, 0x45, 0xb1,
+ 0x75, 0x6e, 0x26, 0xf9, 0x67, 0xbc, 0x90, 0x15,
+ 0x81, 0x2c, 0xb5, 0xf0, 0xc6, 0x2b, 0xc7, 0x8c,
+ 0x56, 0xd1, 0xbf, 0x69, 0x6c, 0x07, 0xa0, 0xda,
+ 0x65, 0x27, 0xc9, 0x90, 0x3d, 0xef, 0x4b, 0x11,
+ 0x0f, 0x19, 0x07, 0xfd, 0x29, 0x92, 0xd9, 0xc8,
+ 0xf7, 0x99, 0x2e, 0x4a, 0xd0, 0xb8, 0x2c, 0xdc,
+ 0x93, 0xf5, 0x9e, 0x33, 0x78, 0xd1, 0x37, 0xc3,
+ 0x66, 0xd7, 0x5e, 0xbc, 0x44, 0xbf, 0x53, 0xa5,
+ 0xbc, 0xc4, 0xcb, 0x7b, 0x3a, 0x8e, 0x7f, 0x02,
+ 0xbd, 0xbb, 0xe7, 0xca, 0xa6, 0x6c, 0x6b, 0x93,
+ 0x21, 0x93, 0x10, 0x61, 0xe7, 0x69, 0xd0, 0x78,
+ 0xf3, 0x07, 0x5a, 0x1a, 0x8f, 0x73, 0xaa, 0xb1,
+ 0x4e, 0xd3, 0xda, 0x4f, 0xf3, 0x32, 0xe1, 0x66,
+ 0x3e, 0x6c, 0xc6, 0x13, 0xba, 0x06, 0x5b, 0xfc,
+ 0x6a, 0xe5, 0x6f, 0x60, 0xfb, 0x07, 0x40, 0xb0,
+ 0x8c, 0x9d, 0x84, 0x43, 0x6b, 0xc1, 0xf7, 0x8d,
+ 0x8d, 0x31, 0xf7, 0x7a, 0x39, 0x4d, 0x8f, 0x9a,
+ 0xeb
+};
+static const u8 enc_assoc006[] __initconst = {
+ 0x70, 0xd3, 0x33, 0xf3, 0x8b, 0x18, 0x0b
+};
+static const u8 enc_nonce006[] __initconst = {
+ 0xdf, 0x51, 0x84, 0x82, 0x42, 0x0c, 0x75, 0x9c
+};
+static const u8 enc_key006[] __initconst = {
+ 0x68, 0x7b, 0x8d, 0x8e, 0xe3, 0xc4, 0xdd, 0xae,
+ 0xdf, 0x72, 0x7f, 0x53, 0x72, 0x25, 0x1e, 0x78,
+ 0x91, 0xcb, 0x69, 0x76, 0x1f, 0x49, 0x93, 0xf9,
+ 0x6f, 0x21, 0xcc, 0x39, 0x9c, 0xad, 0xb1, 0x01
+};
+
+static const u8 enc_input007[] __initconst = {
+ 0x9b, 0x18, 0xdb, 0xdd, 0x9a, 0x0f, 0x3e, 0xa5,
+ 0x15, 0x17, 0xde, 0xdf, 0x08, 0x9d, 0x65, 0x0a,
+ 0x67, 0x30, 0x12, 0xe2, 0x34, 0x77, 0x4b, 0xc1,
+ 0xd9, 0xc6, 0x1f, 0xab, 0xc6, 0x18, 0x50, 0x17,
+ 0xa7, 0x9d, 0x3c, 0xa6, 0xc5, 0x35, 0x8c, 0x1c,
+ 0xc0, 0xa1, 0x7c, 0x9f, 0x03, 0x89, 0xca, 0xe1,
+ 0xe6, 0xe9, 0xd4, 0xd3, 0x88, 0xdb, 0xb4, 0x51,
+ 0x9d, 0xec, 0xb4, 0xfc, 0x52, 0xee, 0x6d, 0xf1,
+ 0x75, 0x42, 0xc6, 0xfd, 0xbd, 0x7a, 0x8e, 0x86,
+ 0xfc, 0x44, 0xb3, 0x4f, 0xf3, 0xea, 0x67, 0x5a,
+ 0x41, 0x13, 0xba, 0xb0, 0xdc, 0xe1, 0xd3, 0x2a,
+ 0x7c, 0x22, 0xb3, 0xca, 0xac, 0x6a, 0x37, 0x98,
+ 0x3e, 0x1d, 0x40, 0x97, 0xf7, 0x9b, 0x1d, 0x36,
+ 0x6b, 0xb3, 0x28, 0xbd, 0x60, 0x82, 0x47, 0x34,
+ 0xaa, 0x2f, 0x7d, 0xe9, 0xa8, 0x70, 0x81, 0x57,
+ 0xd4, 0xb9, 0x77, 0x0a, 0x9d, 0x29, 0xa7, 0x84,
+ 0x52, 0x4f, 0xc2, 0x4a, 0x40, 0x3b, 0x3c, 0xd4,
+ 0xc9, 0x2a, 0xdb, 0x4a, 0x53, 0xc4, 0xbe, 0x80,
+ 0xe9, 0x51, 0x7f, 0x8f, 0xc7, 0xa2, 0xce, 0x82,
+ 0x5c, 0x91, 0x1e, 0x74, 0xd9, 0xd0, 0xbd, 0xd5,
+ 0xf3, 0xfd, 0xda, 0x4d, 0x25, 0xb4, 0xbb, 0x2d,
+ 0xac, 0x2f, 0x3d, 0x71, 0x85, 0x7b, 0xcf, 0x3c,
+ 0x7b, 0x3e, 0x0e, 0x22, 0x78, 0x0c, 0x29, 0xbf,
+ 0xe4, 0xf4, 0x57, 0xb3, 0xcb, 0x49, 0xa0, 0xfc,
+ 0x1e, 0x05, 0x4e, 0x16, 0xbc, 0xd5, 0xa8, 0xa3,
+ 0xee, 0x05, 0x35, 0xc6, 0x7c, 0xab, 0x60, 0x14,
+ 0x55, 0x1a, 0x8e, 0xc5, 0x88, 0x5d, 0xd5, 0x81,
+ 0xc2, 0x81, 0xa5, 0xc4, 0x60, 0xdb, 0xaf, 0x77,
+ 0x91, 0xe1, 0xce, 0xa2, 0x7e, 0x7f, 0x42, 0xe3,
+ 0xb0, 0x13, 0x1c, 0x1f, 0x25, 0x60, 0x21, 0xe2,
+ 0x40, 0x5f, 0x99, 0xb7, 0x73, 0xec, 0x9b, 0x2b,
+ 0xf0, 0x65, 0x11, 0xc8, 0xd0, 0x0a, 0x9f, 0xd3
+};
+static const u8 enc_output007[] __initconst = {
+ 0x85, 0x04, 0xc2, 0xed, 0x8d, 0xfd, 0x97, 0x5c,
+ 0xd2, 0xb7, 0xe2, 0xc1, 0x6b, 0xa3, 0xba, 0xf8,
+ 0xc9, 0x50, 0xc3, 0xc6, 0xa5, 0xe3, 0xa4, 0x7c,
+ 0xc3, 0x23, 0x49, 0x5e, 0xa9, 0xb9, 0x32, 0xeb,
+ 0x8a, 0x7c, 0xca, 0xe5, 0xec, 0xfb, 0x7c, 0xc0,
+ 0xcb, 0x7d, 0xdc, 0x2c, 0x9d, 0x92, 0x55, 0x21,
+ 0x0a, 0xc8, 0x43, 0x63, 0x59, 0x0a, 0x31, 0x70,
+ 0x82, 0x67, 0x41, 0x03, 0xf8, 0xdf, 0xf2, 0xac,
+ 0xa7, 0x02, 0xd4, 0xd5, 0x8a, 0x2d, 0xc8, 0x99,
+ 0x19, 0x66, 0xd0, 0xf6, 0x88, 0x2c, 0x77, 0xd9,
+ 0xd4, 0x0d, 0x6c, 0xbd, 0x98, 0xde, 0xe7, 0x7f,
+ 0xad, 0x7e, 0x8a, 0xfb, 0xe9, 0x4b, 0xe5, 0xf7,
+ 0xe5, 0x50, 0xa0, 0x90, 0x3f, 0xd6, 0x22, 0x53,
+ 0xe3, 0xfe, 0x1b, 0xcc, 0x79, 0x3b, 0xec, 0x12,
+ 0x47, 0x52, 0xa7, 0xd6, 0x04, 0xe3, 0x52, 0xe6,
+ 0x93, 0x90, 0x91, 0x32, 0x73, 0x79, 0xb8, 0xd0,
+ 0x31, 0xde, 0x1f, 0x9f, 0x2f, 0x05, 0x38, 0x54,
+ 0x2f, 0x35, 0x04, 0x39, 0xe0, 0xa7, 0xba, 0xc6,
+ 0x52, 0xf6, 0x37, 0x65, 0x4c, 0x07, 0xa9, 0x7e,
+ 0xb3, 0x21, 0x6f, 0x74, 0x8c, 0xc9, 0xde, 0xdb,
+ 0x65, 0x1b, 0x9b, 0xaa, 0x60, 0xb1, 0x03, 0x30,
+ 0x6b, 0xb2, 0x03, 0xc4, 0x1c, 0x04, 0xf8, 0x0f,
+ 0x64, 0xaf, 0x46, 0xe4, 0x65, 0x99, 0x49, 0xe2,
+ 0xea, 0xce, 0x78, 0x00, 0xd8, 0x8b, 0xd5, 0x2e,
+ 0xcf, 0xfc, 0x40, 0x49, 0xe8, 0x58, 0xdc, 0x34,
+ 0x9c, 0x8c, 0x61, 0xbf, 0x0a, 0x8e, 0xec, 0x39,
+ 0xa9, 0x30, 0x05, 0x5a, 0xd2, 0x56, 0x01, 0xc7,
+ 0xda, 0x8f, 0x4e, 0xbb, 0x43, 0xa3, 0x3a, 0xf9,
+ 0x15, 0x2a, 0xd0, 0xa0, 0x7a, 0x87, 0x34, 0x82,
+ 0xfe, 0x8a, 0xd1, 0x2d, 0x5e, 0xc7, 0xbf, 0x04,
+ 0x53, 0x5f, 0x3b, 0x36, 0xd4, 0x25, 0x5c, 0x34,
+ 0x7a, 0x8d, 0xd5, 0x05, 0xce, 0x72, 0xca, 0xef,
+ 0x7a, 0x4b, 0xbc, 0xb0, 0x10, 0x5c, 0x96, 0x42,
+ 0x3a, 0x00, 0x98, 0xcd, 0x15, 0xe8, 0xb7, 0x53
+};
+static const u8 enc_assoc007[] __initconst = { };
+static const u8 enc_nonce007[] __initconst = {
+ 0xde, 0x7b, 0xef, 0xc3, 0x65, 0x1b, 0x68, 0xb0
+};
+static const u8 enc_key007[] __initconst = {
+ 0x8d, 0xb8, 0x91, 0x48, 0xf0, 0xe7, 0x0a, 0xbd,
+ 0xf9, 0x3f, 0xcd, 0xd9, 0xa0, 0x1e, 0x42, 0x4c,
+ 0xe7, 0xde, 0x25, 0x3d, 0xa3, 0xd7, 0x05, 0x80,
+ 0x8d, 0xf2, 0x82, 0xac, 0x44, 0x16, 0x51, 0x01
+};
+
+static const u8 enc_input008[] __initconst = {
+ 0xc3, 0x09, 0x94, 0x62, 0xe6, 0x46, 0x2e, 0x10,
+ 0xbe, 0x00, 0xe4, 0xfc, 0xf3, 0x40, 0xa3, 0xe2,
+ 0x0f, 0xc2, 0x8b, 0x28, 0xdc, 0xba, 0xb4, 0x3c,
+ 0xe4, 0x21, 0x58, 0x61, 0xcd, 0x8b, 0xcd, 0xfb,
+ 0xac, 0x94, 0xa1, 0x45, 0xf5, 0x1c, 0xe1, 0x12,
+ 0xe0, 0x3b, 0x67, 0x21, 0x54, 0x5e, 0x8c, 0xaa,
+ 0xcf, 0xdb, 0xb4, 0x51, 0xd4, 0x13, 0xda, 0xe6,
+ 0x83, 0x89, 0xb6, 0x92, 0xe9, 0x21, 0x76, 0xa4,
+ 0x93, 0x7d, 0x0e, 0xfd, 0x96, 0x36, 0x03, 0x91,
+ 0x43, 0x5c, 0x92, 0x49, 0x62, 0x61, 0x7b, 0xeb,
+ 0x43, 0x89, 0xb8, 0x12, 0x20, 0x43, 0xd4, 0x47,
+ 0x06, 0x84, 0xee, 0x47, 0xe9, 0x8a, 0x73, 0x15,
+ 0x0f, 0x72, 0xcf, 0xed, 0xce, 0x96, 0xb2, 0x7f,
+ 0x21, 0x45, 0x76, 0xeb, 0x26, 0x28, 0x83, 0x6a,
+ 0xad, 0xaa, 0xa6, 0x81, 0xd8, 0x55, 0xb1, 0xa3,
+ 0x85, 0xb3, 0x0c, 0xdf, 0xf1, 0x69, 0x2d, 0x97,
+ 0x05, 0x2a, 0xbc, 0x7c, 0x7b, 0x25, 0xf8, 0x80,
+ 0x9d, 0x39, 0x25, 0xf3, 0x62, 0xf0, 0x66, 0x5e,
+ 0xf4, 0xa0, 0xcf, 0xd8, 0xfd, 0x4f, 0xb1, 0x1f,
+ 0x60, 0x3a, 0x08, 0x47, 0xaf, 0xe1, 0xf6, 0x10,
+ 0x77, 0x09, 0xa7, 0x27, 0x8f, 0x9a, 0x97, 0x5a,
+ 0x26, 0xfa, 0xfe, 0x41, 0x32, 0x83, 0x10, 0xe0,
+ 0x1d, 0xbf, 0x64, 0x0d, 0xf4, 0x1c, 0x32, 0x35,
+ 0xe5, 0x1b, 0x36, 0xef, 0xd4, 0x4a, 0x93, 0x4d,
+ 0x00, 0x7c, 0xec, 0x02, 0x07, 0x8b, 0x5d, 0x7d,
+ 0x1b, 0x0e, 0xd1, 0xa6, 0xa5, 0x5d, 0x7d, 0x57,
+ 0x88, 0xa8, 0xcc, 0x81, 0xb4, 0x86, 0x4e, 0xb4,
+ 0x40, 0xe9, 0x1d, 0xc3, 0xb1, 0x24, 0x3e, 0x7f,
+ 0xcc, 0x8a, 0x24, 0x9b, 0xdf, 0x6d, 0xf0, 0x39,
+ 0x69, 0x3e, 0x4c, 0xc0, 0x96, 0xe4, 0x13, 0xda,
+ 0x90, 0xda, 0xf4, 0x95, 0x66, 0x8b, 0x17, 0x17,
+ 0xfe, 0x39, 0x43, 0x25, 0xaa, 0xda, 0xa0, 0x43,
+ 0x3c, 0xb1, 0x41, 0x02, 0xa3, 0xf0, 0xa7, 0x19,
+ 0x59, 0xbc, 0x1d, 0x7d, 0x6c, 0x6d, 0x91, 0x09,
+ 0x5c, 0xb7, 0x5b, 0x01, 0xd1, 0x6f, 0x17, 0x21,
+ 0x97, 0xbf, 0x89, 0x71, 0xa5, 0xb0, 0x6e, 0x07,
+ 0x45, 0xfd, 0x9d, 0xea, 0x07, 0xf6, 0x7a, 0x9f,
+ 0x10, 0x18, 0x22, 0x30, 0x73, 0xac, 0xd4, 0x6b,
+ 0x72, 0x44, 0xed, 0xd9, 0x19, 0x9b, 0x2d, 0x4a,
+ 0x41, 0xdd, 0xd1, 0x85, 0x5e, 0x37, 0x19, 0xed,
+ 0xd2, 0x15, 0x8f, 0x5e, 0x91, 0xdb, 0x33, 0xf2,
+ 0xe4, 0xdb, 0xff, 0x98, 0xfb, 0xa3, 0xb5, 0xca,
+ 0x21, 0x69, 0x08, 0xe7, 0x8a, 0xdf, 0x90, 0xff,
+ 0x3e, 0xe9, 0x20, 0x86, 0x3c, 0xe9, 0xfc, 0x0b,
+ 0xfe, 0x5c, 0x61, 0xaa, 0x13, 0x92, 0x7f, 0x7b,
+ 0xec, 0xe0, 0x6d, 0xa8, 0x23, 0x22, 0xf6, 0x6b,
+ 0x77, 0xc4, 0xfe, 0x40, 0x07, 0x3b, 0xb6, 0xf6,
+ 0x8e, 0x5f, 0xd4, 0xb9, 0xb7, 0x0f, 0x21, 0x04,
+ 0xef, 0x83, 0x63, 0x91, 0x69, 0x40, 0xa3, 0x48,
+ 0x5c, 0xd2, 0x60, 0xf9, 0x4f, 0x6c, 0x47, 0x8b,
+ 0x3b, 0xb1, 0x9f, 0x8e, 0xee, 0x16, 0x8a, 0x13,
+ 0xfc, 0x46, 0x17, 0xc3, 0xc3, 0x32, 0x56, 0xf8,
+ 0x3c, 0x85, 0x3a, 0xb6, 0x3e, 0xaa, 0x89, 0x4f,
+ 0xb3, 0xdf, 0x38, 0xfd, 0xf1, 0xe4, 0x3a, 0xc0,
+ 0xe6, 0x58, 0xb5, 0x8f, 0xc5, 0x29, 0xa2, 0x92,
+ 0x4a, 0xb6, 0xa0, 0x34, 0x7f, 0xab, 0xb5, 0x8a,
+ 0x90, 0xa1, 0xdb, 0x4d, 0xca, 0xb6, 0x2c, 0x41,
+ 0x3c, 0xf7, 0x2b, 0x21, 0xc3, 0xfd, 0xf4, 0x17,
+ 0x5c, 0xb5, 0x33, 0x17, 0x68, 0x2b, 0x08, 0x30,
+ 0xf3, 0xf7, 0x30, 0x3c, 0x96, 0xe6, 0x6a, 0x20,
+ 0x97, 0xe7, 0x4d, 0x10, 0x5f, 0x47, 0x5f, 0x49,
+ 0x96, 0x09, 0xf0, 0x27, 0x91, 0xc8, 0xf8, 0x5a,
+ 0x2e, 0x79, 0xb5, 0xe2, 0xb8, 0xe8, 0xb9, 0x7b,
+ 0xd5, 0x10, 0xcb, 0xff, 0x5d, 0x14, 0x73, 0xf3
+};
+static const u8 enc_output008[] __initconst = {
+ 0x14, 0xf6, 0x41, 0x37, 0xa6, 0xd4, 0x27, 0xcd,
+ 0xdb, 0x06, 0x3e, 0x9a, 0x4e, 0xab, 0xd5, 0xb1,
+ 0x1e, 0x6b, 0xd2, 0xbc, 0x11, 0xf4, 0x28, 0x93,
+ 0x63, 0x54, 0xef, 0xbb, 0x5e, 0x1d, 0x3a, 0x1d,
+ 0x37, 0x3c, 0x0a, 0x6c, 0x1e, 0xc2, 0xd1, 0x2c,
+ 0xb5, 0xa3, 0xb5, 0x7b, 0xb8, 0x8f, 0x25, 0xa6,
+ 0x1b, 0x61, 0x1c, 0xec, 0x28, 0x58, 0x26, 0xa4,
+ 0xa8, 0x33, 0x28, 0x25, 0x5c, 0x45, 0x05, 0xe5,
+ 0x6c, 0x99, 0xe5, 0x45, 0xc4, 0xa2, 0x03, 0x84,
+ 0x03, 0x73, 0x1e, 0x8c, 0x49, 0xac, 0x20, 0xdd,
+ 0x8d, 0xb3, 0xc4, 0xf5, 0xe7, 0x4f, 0xf1, 0xed,
+ 0xa1, 0x98, 0xde, 0xa4, 0x96, 0xdd, 0x2f, 0xab,
+ 0xab, 0x97, 0xcf, 0x3e, 0xd2, 0x9e, 0xb8, 0x13,
+ 0x07, 0x28, 0x29, 0x19, 0xaf, 0xfd, 0xf2, 0x49,
+ 0x43, 0xea, 0x49, 0x26, 0x91, 0xc1, 0x07, 0xd6,
+ 0xbb, 0x81, 0x75, 0x35, 0x0d, 0x24, 0x7f, 0xc8,
+ 0xda, 0xd4, 0xb7, 0xeb, 0xe8, 0x5c, 0x09, 0xa2,
+ 0x2f, 0xdc, 0x28, 0x7d, 0x3a, 0x03, 0xfa, 0x94,
+ 0xb5, 0x1d, 0x17, 0x99, 0x36, 0xc3, 0x1c, 0x18,
+ 0x34, 0xe3, 0x9f, 0xf5, 0x55, 0x7c, 0xb0, 0x60,
+ 0x9d, 0xff, 0xac, 0xd4, 0x61, 0xf2, 0xad, 0xf8,
+ 0xce, 0xc7, 0xbe, 0x5c, 0xd2, 0x95, 0xa8, 0x4b,
+ 0x77, 0x13, 0x19, 0x59, 0x26, 0xc9, 0xb7, 0x8f,
+ 0x6a, 0xcb, 0x2d, 0x37, 0x91, 0xea, 0x92, 0x9c,
+ 0x94, 0x5b, 0xda, 0x0b, 0xce, 0xfe, 0x30, 0x20,
+ 0xf8, 0x51, 0xad, 0xf2, 0xbe, 0xe7, 0xc7, 0xff,
+ 0xb3, 0x33, 0x91, 0x6a, 0xc9, 0x1a, 0x41, 0xc9,
+ 0x0f, 0xf3, 0x10, 0x0e, 0xfd, 0x53, 0xff, 0x6c,
+ 0x16, 0x52, 0xd9, 0xf3, 0xf7, 0x98, 0x2e, 0xc9,
+ 0x07, 0x31, 0x2c, 0x0c, 0x72, 0xd7, 0xc5, 0xc6,
+ 0x08, 0x2a, 0x7b, 0xda, 0xbd, 0x7e, 0x02, 0xea,
+ 0x1a, 0xbb, 0xf2, 0x04, 0x27, 0x61, 0x28, 0x8e,
+ 0xf5, 0x04, 0x03, 0x1f, 0x4c, 0x07, 0x55, 0x82,
+ 0xec, 0x1e, 0xd7, 0x8b, 0x2f, 0x65, 0x56, 0xd1,
+ 0xd9, 0x1e, 0x3c, 0xe9, 0x1f, 0x5e, 0x98, 0x70,
+ 0x38, 0x4a, 0x8c, 0x49, 0xc5, 0x43, 0xa0, 0xa1,
+ 0x8b, 0x74, 0x9d, 0x4c, 0x62, 0x0d, 0x10, 0x0c,
+ 0xf4, 0x6c, 0x8f, 0xe0, 0xaa, 0x9a, 0x8d, 0xb7,
+ 0xe0, 0xbe, 0x4c, 0x87, 0xf1, 0x98, 0x2f, 0xcc,
+ 0xed, 0xc0, 0x52, 0x29, 0xdc, 0x83, 0xf8, 0xfc,
+ 0x2c, 0x0e, 0xa8, 0x51, 0x4d, 0x80, 0x0d, 0xa3,
+ 0xfe, 0xd8, 0x37, 0xe7, 0x41, 0x24, 0xfc, 0xfb,
+ 0x75, 0xe3, 0x71, 0x7b, 0x57, 0x45, 0xf5, 0x97,
+ 0x73, 0x65, 0x63, 0x14, 0x74, 0xb8, 0x82, 0x9f,
+ 0xf8, 0x60, 0x2f, 0x8a, 0xf2, 0x4e, 0xf1, 0x39,
+ 0xda, 0x33, 0x91, 0xf8, 0x36, 0xe0, 0x8d, 0x3f,
+ 0x1f, 0x3b, 0x56, 0xdc, 0xa0, 0x8f, 0x3c, 0x9d,
+ 0x71, 0x52, 0xa7, 0xb8, 0xc0, 0xa5, 0xc6, 0xa2,
+ 0x73, 0xda, 0xf4, 0x4b, 0x74, 0x5b, 0x00, 0x3d,
+ 0x99, 0xd7, 0x96, 0xba, 0xe6, 0xe1, 0xa6, 0x96,
+ 0x38, 0xad, 0xb3, 0xc0, 0xd2, 0xba, 0x91, 0x6b,
+ 0xf9, 0x19, 0xdd, 0x3b, 0xbe, 0xbe, 0x9c, 0x20,
+ 0x50, 0xba, 0xa1, 0xd0, 0xce, 0x11, 0xbd, 0x95,
+ 0xd8, 0xd1, 0xdd, 0x33, 0x85, 0x74, 0xdc, 0xdb,
+ 0x66, 0x76, 0x44, 0xdc, 0x03, 0x74, 0x48, 0x35,
+ 0x98, 0xb1, 0x18, 0x47, 0x94, 0x7d, 0xff, 0x62,
+ 0xe4, 0x58, 0x78, 0xab, 0xed, 0x95, 0x36, 0xd9,
+ 0x84, 0x91, 0x82, 0x64, 0x41, 0xbb, 0x58, 0xe6,
+ 0x1c, 0x20, 0x6d, 0x15, 0x6b, 0x13, 0x96, 0xe8,
+ 0x35, 0x7f, 0xdc, 0x40, 0x2c, 0xe9, 0xbc, 0x8a,
+ 0x4f, 0x92, 0xec, 0x06, 0x2d, 0x50, 0xdf, 0x93,
+ 0x5d, 0x65, 0x5a, 0xa8, 0xfc, 0x20, 0x50, 0x14,
+ 0xa9, 0x8a, 0x7e, 0x1d, 0x08, 0x1f, 0xe2, 0x99,
+ 0xd0, 0xbe, 0xfb, 0x3a, 0x21, 0x9d, 0xad, 0x86,
+ 0x54, 0xfd, 0x0d, 0x98, 0x1c, 0x5a, 0x6f, 0x1f,
+ 0x9a, 0x40, 0xcd, 0xa2, 0xff, 0x6a, 0xf1, 0x54
+};
+static const u8 enc_assoc008[] __initconst = { };
+static const u8 enc_nonce008[] __initconst = {
+ 0x0e, 0x0d, 0x57, 0xbb, 0x7b, 0x40, 0x54, 0x02
+};
+static const u8 enc_key008[] __initconst = {
+ 0xf2, 0xaa, 0x4f, 0x99, 0xfd, 0x3e, 0xa8, 0x53,
+ 0xc1, 0x44, 0xe9, 0x81, 0x18, 0xdc, 0xf5, 0xf0,
+ 0x3e, 0x44, 0x15, 0x59, 0xe0, 0xc5, 0x44, 0x86,
+ 0xc3, 0x91, 0xa8, 0x75, 0xc0, 0x12, 0x46, 0xba
+};
+
+static const u8 enc_input009[] __initconst = {
+ 0xe6, 0xc3, 0xdb, 0x63, 0x55, 0x15, 0xe3, 0x5b,
+ 0xb7, 0x4b, 0x27, 0x8b, 0x5a, 0xdd, 0xc2, 0xe8,
+ 0x3a, 0x6b, 0xd7, 0x81, 0x96, 0x35, 0x97, 0xca,
+ 0xd7, 0x68, 0xe8, 0xef, 0xce, 0xab, 0xda, 0x09,
+ 0x6e, 0xd6, 0x8e, 0xcb, 0x55, 0xb5, 0xe1, 0xe5,
+ 0x57, 0xfd, 0xc4, 0xe3, 0xe0, 0x18, 0x4f, 0x85,
+ 0xf5, 0x3f, 0x7e, 0x4b, 0x88, 0xc9, 0x52, 0x44,
+ 0x0f, 0xea, 0xaf, 0x1f, 0x71, 0x48, 0x9f, 0x97,
+ 0x6d, 0xb9, 0x6f, 0x00, 0xa6, 0xde, 0x2b, 0x77,
+ 0x8b, 0x15, 0xad, 0x10, 0xa0, 0x2b, 0x7b, 0x41,
+ 0x90, 0x03, 0x2d, 0x69, 0xae, 0xcc, 0x77, 0x7c,
+ 0xa5, 0x9d, 0x29, 0x22, 0xc2, 0xea, 0xb4, 0x00,
+ 0x1a, 0xd2, 0x7a, 0x98, 0x8a, 0xf9, 0xf7, 0x82,
+ 0xb0, 0xab, 0xd8, 0xa6, 0x94, 0x8d, 0x58, 0x2f,
+ 0x01, 0x9e, 0x00, 0x20, 0xfc, 0x49, 0xdc, 0x0e,
+ 0x03, 0xe8, 0x45, 0x10, 0xd6, 0xa8, 0xda, 0x55,
+ 0x10, 0x9a, 0xdf, 0x67, 0x22, 0x8b, 0x43, 0xab,
+ 0x00, 0xbb, 0x02, 0xc8, 0xdd, 0x7b, 0x97, 0x17,
+ 0xd7, 0x1d, 0x9e, 0x02, 0x5e, 0x48, 0xde, 0x8e,
+ 0xcf, 0x99, 0x07, 0x95, 0x92, 0x3c, 0x5f, 0x9f,
+ 0xc5, 0x8a, 0xc0, 0x23, 0xaa, 0xd5, 0x8c, 0x82,
+ 0x6e, 0x16, 0x92, 0xb1, 0x12, 0x17, 0x07, 0xc3,
+ 0xfb, 0x36, 0xf5, 0x6c, 0x35, 0xd6, 0x06, 0x1f,
+ 0x9f, 0xa7, 0x94, 0xa2, 0x38, 0x63, 0x9c, 0xb0,
+ 0x71, 0xb3, 0xa5, 0xd2, 0xd8, 0xba, 0x9f, 0x08,
+ 0x01, 0xb3, 0xff, 0x04, 0x97, 0x73, 0x45, 0x1b,
+ 0xd5, 0xa9, 0x9c, 0x80, 0xaf, 0x04, 0x9a, 0x85,
+ 0xdb, 0x32, 0x5b, 0x5d, 0x1a, 0xc1, 0x36, 0x28,
+ 0x10, 0x79, 0xf1, 0x3c, 0xbf, 0x1a, 0x41, 0x5c,
+ 0x4e, 0xdf, 0xb2, 0x7c, 0x79, 0x3b, 0x7a, 0x62,
+ 0x3d, 0x4b, 0xc9, 0x9b, 0x2a, 0x2e, 0x7c, 0xa2,
+ 0xb1, 0x11, 0x98, 0xa7, 0x34, 0x1a, 0x00, 0xf3,
+ 0xd1, 0xbc, 0x18, 0x22, 0xba, 0x02, 0x56, 0x62,
+ 0x31, 0x10, 0x11, 0x6d, 0xe0, 0x54, 0x9d, 0x40,
+ 0x1f, 0x26, 0x80, 0x41, 0xca, 0x3f, 0x68, 0x0f,
+ 0x32, 0x1d, 0x0a, 0x8e, 0x79, 0xd8, 0xa4, 0x1b,
+ 0x29, 0x1c, 0x90, 0x8e, 0xc5, 0xe3, 0xb4, 0x91,
+ 0x37, 0x9a, 0x97, 0x86, 0x99, 0xd5, 0x09, 0xc5,
+ 0xbb, 0xa3, 0x3f, 0x21, 0x29, 0x82, 0x14, 0x5c,
+ 0xab, 0x25, 0xfb, 0xf2, 0x4f, 0x58, 0x26, 0xd4,
+ 0x83, 0xaa, 0x66, 0x89, 0x67, 0x7e, 0xc0, 0x49,
+ 0xe1, 0x11, 0x10, 0x7f, 0x7a, 0xda, 0x29, 0x04,
+ 0xff, 0xf0, 0xcb, 0x09, 0x7c, 0x9d, 0xfa, 0x03,
+ 0x6f, 0x81, 0x09, 0x31, 0x60, 0xfb, 0x08, 0xfa,
+ 0x74, 0xd3, 0x64, 0x44, 0x7c, 0x55, 0x85, 0xec,
+ 0x9c, 0x6e, 0x25, 0xb7, 0x6c, 0xc5, 0x37, 0xb6,
+ 0x83, 0x87, 0x72, 0x95, 0x8b, 0x9d, 0xe1, 0x69,
+ 0x5c, 0x31, 0x95, 0x42, 0xa6, 0x2c, 0xd1, 0x36,
+ 0x47, 0x1f, 0xec, 0x54, 0xab, 0xa2, 0x1c, 0xd8,
+ 0x00, 0xcc, 0xbc, 0x0d, 0x65, 0xe2, 0x67, 0xbf,
+ 0xbc, 0xea, 0xee, 0x9e, 0xe4, 0x36, 0x95, 0xbe,
+ 0x73, 0xd9, 0xa6, 0xd9, 0x0f, 0xa0, 0xcc, 0x82,
+ 0x76, 0x26, 0xad, 0x5b, 0x58, 0x6c, 0x4e, 0xab,
+ 0x29, 0x64, 0xd3, 0xd9, 0xa9, 0x08, 0x8c, 0x1d,
+ 0xa1, 0x4f, 0x80, 0xd8, 0x3f, 0x94, 0xfb, 0xd3,
+ 0x7b, 0xfc, 0xd1, 0x2b, 0xc3, 0x21, 0xeb, 0xe5,
+ 0x1c, 0x84, 0x23, 0x7f, 0x4b, 0xfa, 0xdb, 0x34,
+ 0x18, 0xa2, 0xc2, 0xe5, 0x13, 0xfe, 0x6c, 0x49,
+ 0x81, 0xd2, 0x73, 0xe7, 0xe2, 0xd7, 0xe4, 0x4f,
+ 0x4b, 0x08, 0x6e, 0xb1, 0x12, 0x22, 0x10, 0x9d,
+ 0xac, 0x51, 0x1e, 0x17, 0xd9, 0x8a, 0x0b, 0x42,
+ 0x88, 0x16, 0x81, 0x37, 0x7c, 0x6a, 0xf7, 0xef,
+ 0x2d, 0xe3, 0xd9, 0xf8, 0x5f, 0xe0, 0x53, 0x27,
+ 0x74, 0xb9, 0xe2, 0xd6, 0x1c, 0x80, 0x2c, 0x52,
+ 0x65
+};
+static const u8 enc_output009[] __initconst = {
+ 0xfd, 0x81, 0x8d, 0xd0, 0x3d, 0xb4, 0xd5, 0xdf,
+ 0xd3, 0x42, 0x47, 0x5a, 0x6d, 0x19, 0x27, 0x66,
+ 0x4b, 0x2e, 0x0c, 0x27, 0x9c, 0x96, 0x4c, 0x72,
+ 0x02, 0xa3, 0x65, 0xc3, 0xb3, 0x6f, 0x2e, 0xbd,
+ 0x63, 0x8a, 0x4a, 0x5d, 0x29, 0xa2, 0xd0, 0x28,
+ 0x48, 0xc5, 0x3d, 0x98, 0xa3, 0xbc, 0xe0, 0xbe,
+ 0x3b, 0x3f, 0xe6, 0x8a, 0xa4, 0x7f, 0x53, 0x06,
+ 0xfa, 0x7f, 0x27, 0x76, 0x72, 0x31, 0xa1, 0xf5,
+ 0xd6, 0x0c, 0x52, 0x47, 0xba, 0xcd, 0x4f, 0xd7,
+ 0xeb, 0x05, 0x48, 0x0d, 0x7c, 0x35, 0x4a, 0x09,
+ 0xc9, 0x76, 0x71, 0x02, 0xa3, 0xfb, 0xb7, 0x1a,
+ 0x65, 0xb7, 0xed, 0x98, 0xc6, 0x30, 0x8a, 0x00,
+ 0xae, 0xa1, 0x31, 0xe5, 0xb5, 0x9e, 0x6d, 0x62,
+ 0xda, 0xda, 0x07, 0x0f, 0x38, 0x38, 0xd3, 0xcb,
+ 0xc1, 0xb0, 0xad, 0xec, 0x72, 0xec, 0xb1, 0xa2,
+ 0x7b, 0x59, 0xf3, 0x3d, 0x2b, 0xef, 0xcd, 0x28,
+ 0x5b, 0x83, 0xcc, 0x18, 0x91, 0x88, 0xb0, 0x2e,
+ 0xf9, 0x29, 0x31, 0x18, 0xf9, 0x4e, 0xe9, 0x0a,
+ 0x91, 0x92, 0x9f, 0xae, 0x2d, 0xad, 0xf4, 0xe6,
+ 0x1a, 0xe2, 0xa4, 0xee, 0x47, 0x15, 0xbf, 0x83,
+ 0x6e, 0xd7, 0x72, 0x12, 0x3b, 0x2d, 0x24, 0xe9,
+ 0xb2, 0x55, 0xcb, 0x3c, 0x10, 0xf0, 0x24, 0x8a,
+ 0x4a, 0x02, 0xea, 0x90, 0x25, 0xf0, 0xb4, 0x79,
+ 0x3a, 0xef, 0x6e, 0xf5, 0x52, 0xdf, 0xb0, 0x0a,
+ 0xcd, 0x24, 0x1c, 0xd3, 0x2e, 0x22, 0x74, 0xea,
+ 0x21, 0x6f, 0xe9, 0xbd, 0xc8, 0x3e, 0x36, 0x5b,
+ 0x19, 0xf1, 0xca, 0x99, 0x0a, 0xb4, 0xa7, 0x52,
+ 0x1a, 0x4e, 0xf2, 0xad, 0x8d, 0x56, 0x85, 0xbb,
+ 0x64, 0x89, 0xba, 0x26, 0xf9, 0xc7, 0xe1, 0x89,
+ 0x19, 0x22, 0x77, 0xc3, 0xa8, 0xfc, 0xff, 0xad,
+ 0xfe, 0xb9, 0x48, 0xae, 0x12, 0x30, 0x9f, 0x19,
+ 0xfb, 0x1b, 0xef, 0x14, 0x87, 0x8a, 0x78, 0x71,
+ 0xf3, 0xf4, 0xb7, 0x00, 0x9c, 0x1d, 0xb5, 0x3d,
+ 0x49, 0x00, 0x0c, 0x06, 0xd4, 0x50, 0xf9, 0x54,
+ 0x45, 0xb2, 0x5b, 0x43, 0xdb, 0x6d, 0xcf, 0x1a,
+ 0xe9, 0x7a, 0x7a, 0xcf, 0xfc, 0x8a, 0x4e, 0x4d,
+ 0x0b, 0x07, 0x63, 0x28, 0xd8, 0xe7, 0x08, 0x95,
+ 0xdf, 0xa6, 0x72, 0x93, 0x2e, 0xbb, 0xa0, 0x42,
+ 0x89, 0x16, 0xf1, 0xd9, 0x0c, 0xf9, 0xa1, 0x16,
+ 0xfd, 0xd9, 0x03, 0xb4, 0x3b, 0x8a, 0xf5, 0xf6,
+ 0xe7, 0x6b, 0x2e, 0x8e, 0x4c, 0x3d, 0xe2, 0xaf,
+ 0x08, 0x45, 0x03, 0xff, 0x09, 0xb6, 0xeb, 0x2d,
+ 0xc6, 0x1b, 0x88, 0x94, 0xac, 0x3e, 0xf1, 0x9f,
+ 0x0e, 0x0e, 0x2b, 0xd5, 0x00, 0x4d, 0x3f, 0x3b,
+ 0x53, 0xae, 0xaf, 0x1c, 0x33, 0x5f, 0x55, 0x6e,
+ 0x8d, 0xaf, 0x05, 0x7a, 0x10, 0x34, 0xc9, 0xf4,
+ 0x66, 0xcb, 0x62, 0x12, 0xa6, 0xee, 0xe8, 0x1c,
+ 0x5d, 0x12, 0x86, 0xdb, 0x6f, 0x1c, 0x33, 0xc4,
+ 0x1c, 0xda, 0x82, 0x2d, 0x3b, 0x59, 0xfe, 0xb1,
+ 0xa4, 0x59, 0x41, 0x86, 0xd0, 0xef, 0xae, 0xfb,
+ 0xda, 0x6d, 0x11, 0xb8, 0xca, 0xe9, 0x6e, 0xff,
+ 0xf7, 0xa9, 0xd9, 0x70, 0x30, 0xfc, 0x53, 0xe2,
+ 0xd7, 0xa2, 0x4e, 0xc7, 0x91, 0xd9, 0x07, 0x06,
+ 0xaa, 0xdd, 0xb0, 0x59, 0x28, 0x1d, 0x00, 0x66,
+ 0xc5, 0x54, 0xc2, 0xfc, 0x06, 0xda, 0x05, 0x90,
+ 0x52, 0x1d, 0x37, 0x66, 0xee, 0xf0, 0xb2, 0x55,
+ 0x8a, 0x5d, 0xd2, 0x38, 0x86, 0x94, 0x9b, 0xfc,
+ 0x10, 0x4c, 0xa1, 0xb9, 0x64, 0x3e, 0x44, 0xb8,
+ 0x5f, 0xb0, 0x0c, 0xec, 0xe0, 0xc9, 0xe5, 0x62,
+ 0x75, 0x3f, 0x09, 0xd5, 0xf5, 0xd9, 0x26, 0xba,
+ 0x9e, 0xd2, 0xf4, 0xb9, 0x48, 0x0a, 0xbc, 0xa2,
+ 0xd6, 0x7c, 0x36, 0x11, 0x7d, 0x26, 0x81, 0x89,
+ 0xcf, 0xa4, 0xad, 0x73, 0x0e, 0xee, 0xcc, 0x06,
+ 0xa9, 0xdb, 0xb1, 0xfd, 0xfb, 0x09, 0x7f, 0x90,
+ 0x42, 0x37, 0x2f, 0xe1, 0x9c, 0x0f, 0x6f, 0xcf,
+ 0x43, 0xb5, 0xd9, 0x90, 0xe1, 0x85, 0xf5, 0xa8,
+ 0xae
+};
+static const u8 enc_assoc009[] __initconst = {
+ 0x5a, 0x27, 0xff, 0xeb, 0xdf, 0x84, 0xb2, 0x9e,
+ 0xef
+};
+static const u8 enc_nonce009[] __initconst = {
+ 0xef, 0x2d, 0x63, 0xee, 0x6b, 0x80, 0x8b, 0x78
+};
+static const u8 enc_key009[] __initconst = {
+ 0xea, 0xbc, 0x56, 0x99, 0xe3, 0x50, 0xff, 0xc5,
+ 0xcc, 0x1a, 0xd7, 0xc1, 0x57, 0x72, 0xea, 0x86,
+ 0x5b, 0x89, 0x88, 0x61, 0x3d, 0x2f, 0x9b, 0xb2,
+ 0xe7, 0x9c, 0xec, 0x74, 0x6e, 0x3e, 0xf4, 0x3b
+};
+
+static const u8 enc_input010[] __initconst = {
+ 0x42, 0x93, 0xe4, 0xeb, 0x97, 0xb0, 0x57, 0xbf,
+ 0x1a, 0x8b, 0x1f, 0xe4, 0x5f, 0x36, 0x20, 0x3c,
+ 0xef, 0x0a, 0xa9, 0x48, 0x5f, 0x5f, 0x37, 0x22,
+ 0x3a, 0xde, 0xe3, 0xae, 0xbe, 0xad, 0x07, 0xcc,
+ 0xb1, 0xf6, 0xf5, 0xf9, 0x56, 0xdd, 0xe7, 0x16,
+ 0x1e, 0x7f, 0xdf, 0x7a, 0x9e, 0x75, 0xb7, 0xc7,
+ 0xbe, 0xbe, 0x8a, 0x36, 0x04, 0xc0, 0x10, 0xf4,
+ 0x95, 0x20, 0x03, 0xec, 0xdc, 0x05, 0xa1, 0x7d,
+ 0xc4, 0xa9, 0x2c, 0x82, 0xd0, 0xbc, 0x8b, 0xc5,
+ 0xc7, 0x45, 0x50, 0xf6, 0xa2, 0x1a, 0xb5, 0x46,
+ 0x3b, 0x73, 0x02, 0xa6, 0x83, 0x4b, 0x73, 0x82,
+ 0x58, 0x5e, 0x3b, 0x65, 0x2f, 0x0e, 0xfd, 0x2b,
+ 0x59, 0x16, 0xce, 0xa1, 0x60, 0x9c, 0xe8, 0x3a,
+ 0x99, 0xed, 0x8d, 0x5a, 0xcf, 0xf6, 0x83, 0xaf,
+ 0xba, 0xd7, 0x73, 0x73, 0x40, 0x97, 0x3d, 0xca,
+ 0xef, 0x07, 0x57, 0xe6, 0xd9, 0x70, 0x0e, 0x95,
+ 0xae, 0xa6, 0x8d, 0x04, 0xcc, 0xee, 0xf7, 0x09,
+ 0x31, 0x77, 0x12, 0xa3, 0x23, 0x97, 0x62, 0xb3,
+ 0x7b, 0x32, 0xfb, 0x80, 0x14, 0x48, 0x81, 0xc3,
+ 0xe5, 0xea, 0x91, 0x39, 0x52, 0x81, 0xa2, 0x4f,
+ 0xe4, 0xb3, 0x09, 0xff, 0xde, 0x5e, 0xe9, 0x58,
+ 0x84, 0x6e, 0xf9, 0x3d, 0xdf, 0x25, 0xea, 0xad,
+ 0xae, 0xe6, 0x9a, 0xd1, 0x89, 0x55, 0xd3, 0xde,
+ 0x6c, 0x52, 0xdb, 0x70, 0xfe, 0x37, 0xce, 0x44,
+ 0x0a, 0xa8, 0x25, 0x5f, 0x92, 0xc1, 0x33, 0x4a,
+ 0x4f, 0x9b, 0x62, 0x35, 0xff, 0xce, 0xc0, 0xa9,
+ 0x60, 0xce, 0x52, 0x00, 0x97, 0x51, 0x35, 0x26,
+ 0x2e, 0xb9, 0x36, 0xa9, 0x87, 0x6e, 0x1e, 0xcc,
+ 0x91, 0x78, 0x53, 0x98, 0x86, 0x5b, 0x9c, 0x74,
+ 0x7d, 0x88, 0x33, 0xe1, 0xdf, 0x37, 0x69, 0x2b,
+ 0xbb, 0xf1, 0x4d, 0xf4, 0xd1, 0xf1, 0x39, 0x93,
+ 0x17, 0x51, 0x19, 0xe3, 0x19, 0x1e, 0x76, 0x37,
+ 0x25, 0xfb, 0x09, 0x27, 0x6a, 0xab, 0x67, 0x6f,
+ 0x14, 0x12, 0x64, 0xe7, 0xc4, 0x07, 0xdf, 0x4d,
+ 0x17, 0xbb, 0x6d, 0xe0, 0xe9, 0xb9, 0xab, 0xca,
+ 0x10, 0x68, 0xaf, 0x7e, 0xb7, 0x33, 0x54, 0x73,
+ 0x07, 0x6e, 0xf7, 0x81, 0x97, 0x9c, 0x05, 0x6f,
+ 0x84, 0x5f, 0xd2, 0x42, 0xfb, 0x38, 0xcf, 0xd1,
+ 0x2f, 0x14, 0x30, 0x88, 0x98, 0x4d, 0x5a, 0xa9,
+ 0x76, 0xd5, 0x4f, 0x3e, 0x70, 0x6c, 0x85, 0x76,
+ 0xd7, 0x01, 0xa0, 0x1a, 0xc8, 0x4e, 0xaa, 0xac,
+ 0x78, 0xfe, 0x46, 0xde, 0x6a, 0x05, 0x46, 0xa7,
+ 0x43, 0x0c, 0xb9, 0xde, 0xb9, 0x68, 0xfb, 0xce,
+ 0x42, 0x99, 0x07, 0x4d, 0x0b, 0x3b, 0x5a, 0x30,
+ 0x35, 0xa8, 0xf9, 0x3a, 0x73, 0xef, 0x0f, 0xdb,
+ 0x1e, 0x16, 0x42, 0xc4, 0xba, 0xae, 0x58, 0xaa,
+ 0xf8, 0xe5, 0x75, 0x2f, 0x1b, 0x15, 0x5c, 0xfd,
+ 0x0a, 0x97, 0xd0, 0xe4, 0x37, 0x83, 0x61, 0x5f,
+ 0x43, 0xa6, 0xc7, 0x3f, 0x38, 0x59, 0xe6, 0xeb,
+ 0xa3, 0x90, 0xc3, 0xaa, 0xaa, 0x5a, 0xd3, 0x34,
+ 0xd4, 0x17, 0xc8, 0x65, 0x3e, 0x57, 0xbc, 0x5e,
+ 0xdd, 0x9e, 0xb7, 0xf0, 0x2e, 0x5b, 0xb2, 0x1f,
+ 0x8a, 0x08, 0x0d, 0x45, 0x91, 0x0b, 0x29, 0x53,
+ 0x4f, 0x4c, 0x5a, 0x73, 0x56, 0xfe, 0xaf, 0x41,
+ 0x01, 0x39, 0x0a, 0x24, 0x3c, 0x7e, 0xbe, 0x4e,
+ 0x53, 0xf3, 0xeb, 0x06, 0x66, 0x51, 0x28, 0x1d,
+ 0xbd, 0x41, 0x0a, 0x01, 0xab, 0x16, 0x47, 0x27,
+ 0x47, 0x47, 0xf7, 0xcb, 0x46, 0x0a, 0x70, 0x9e,
+ 0x01, 0x9c, 0x09, 0xe1, 0x2a, 0x00, 0x1a, 0xd8,
+ 0xd4, 0x79, 0x9d, 0x80, 0x15, 0x8e, 0x53, 0x2a,
+ 0x65, 0x83, 0x78, 0x3e, 0x03, 0x00, 0x07, 0x12,
+ 0x1f, 0x33, 0x3e, 0x7b, 0x13, 0x37, 0xf1, 0xc3,
+ 0xef, 0xb7, 0xc1, 0x20, 0x3c, 0x3e, 0x67, 0x66,
+ 0x5d, 0x88, 0xa7, 0x7d, 0x33, 0x50, 0x77, 0xb0,
+ 0x28, 0x8e, 0xe7, 0x2c, 0x2e, 0x7a, 0xf4, 0x3c,
+ 0x8d, 0x74, 0x83, 0xaf, 0x8e, 0x87, 0x0f, 0xe4,
+ 0x50, 0xff, 0x84, 0x5c, 0x47, 0x0c, 0x6a, 0x49,
+ 0xbf, 0x42, 0x86, 0x77, 0x15, 0x48, 0xa5, 0x90,
+ 0x5d, 0x93, 0xd6, 0x2a, 0x11, 0xd5, 0xd5, 0x11,
+ 0xaa, 0xce, 0xe7, 0x6f, 0xa5, 0xb0, 0x09, 0x2c,
+ 0x8d, 0xd3, 0x92, 0xf0, 0x5a, 0x2a, 0xda, 0x5b,
+ 0x1e, 0xd5, 0x9a, 0xc4, 0xc4, 0xf3, 0x49, 0x74,
+ 0x41, 0xca, 0xe8, 0xc1, 0xf8, 0x44, 0xd6, 0x3c,
+ 0xae, 0x6c, 0x1d, 0x9a, 0x30, 0x04, 0x4d, 0x27,
+ 0x0e, 0xb1, 0x5f, 0x59, 0xa2, 0x24, 0xe8, 0xe1,
+ 0x98, 0xc5, 0x6a, 0x4c, 0xfe, 0x41, 0xd2, 0x27,
+ 0x42, 0x52, 0xe1, 0xe9, 0x7d, 0x62, 0xe4, 0x88,
+ 0x0f, 0xad, 0xb2, 0x70, 0xcb, 0x9d, 0x4c, 0x27,
+ 0x2e, 0x76, 0x1e, 0x1a, 0x63, 0x65, 0xf5, 0x3b,
+ 0xf8, 0x57, 0x69, 0xeb, 0x5b, 0x38, 0x26, 0x39,
+ 0x33, 0x25, 0x45, 0x3e, 0x91, 0xb8, 0xd8, 0xc7,
+ 0xd5, 0x42, 0xc0, 0x22, 0x31, 0x74, 0xf4, 0xbc,
+ 0x0c, 0x23, 0xf1, 0xca, 0xc1, 0x8d, 0xd7, 0xbe,
+ 0xc9, 0x62, 0xe4, 0x08, 0x1a, 0xcf, 0x36, 0xd5,
+ 0xfe, 0x55, 0x21, 0x59, 0x91, 0x87, 0x87, 0xdf,
+ 0x06, 0xdb, 0xdf, 0x96, 0x45, 0x58, 0xda, 0x05,
+ 0xcd, 0x50, 0x4d, 0xd2, 0x7d, 0x05, 0x18, 0x73,
+ 0x6a, 0x8d, 0x11, 0x85, 0xa6, 0x88, 0xe8, 0xda,
+ 0xe6, 0x30, 0x33, 0xa4, 0x89, 0x31, 0x75, 0xbe,
+ 0x69, 0x43, 0x84, 0x43, 0x50, 0x87, 0xdd, 0x71,
+ 0x36, 0x83, 0xc3, 0x78, 0x74, 0x24, 0x0a, 0xed,
+ 0x7b, 0xdb, 0xa4, 0x24, 0x0b, 0xb9, 0x7e, 0x5d,
+ 0xff, 0xde, 0xb1, 0xef, 0x61, 0x5a, 0x45, 0x33,
+ 0xf6, 0x17, 0x07, 0x08, 0x98, 0x83, 0x92, 0x0f,
+ 0x23, 0x6d, 0xe6, 0xaa, 0x17, 0x54, 0xad, 0x6a,
+ 0xc8, 0xdb, 0x26, 0xbe, 0xb8, 0xb6, 0x08, 0xfa,
+ 0x68, 0xf1, 0xd7, 0x79, 0x6f, 0x18, 0xb4, 0x9e,
+ 0x2d, 0x3f, 0x1b, 0x64, 0xaf, 0x8d, 0x06, 0x0e,
+ 0x49, 0x28, 0xe0, 0x5d, 0x45, 0x68, 0x13, 0x87,
+ 0xfa, 0xde, 0x40, 0x7b, 0xd2, 0xc3, 0x94, 0xd5,
+ 0xe1, 0xd9, 0xc2, 0xaf, 0x55, 0x89, 0xeb, 0xb4,
+ 0x12, 0x59, 0xa8, 0xd4, 0xc5, 0x29, 0x66, 0x38,
+ 0xe6, 0xac, 0x22, 0x22, 0xd9, 0x64, 0x9b, 0x34,
+ 0x0a, 0x32, 0x9f, 0xc2, 0xbf, 0x17, 0x6c, 0x3f,
+ 0x71, 0x7a, 0x38, 0x6b, 0x98, 0xfb, 0x49, 0x36,
+ 0x89, 0xc9, 0xe2, 0xd6, 0xc7, 0x5d, 0xd0, 0x69,
+ 0x5f, 0x23, 0x35, 0xc9, 0x30, 0xe2, 0xfd, 0x44,
+ 0x58, 0x39, 0xd7, 0x97, 0xfb, 0x5c, 0x00, 0xd5,
+ 0x4f, 0x7a, 0x1a, 0x95, 0x8b, 0x62, 0x4b, 0xce,
+ 0xe5, 0x91, 0x21, 0x7b, 0x30, 0x00, 0xd6, 0xdd,
+ 0x6d, 0x02, 0x86, 0x49, 0x0f, 0x3c, 0x1a, 0x27,
+ 0x3c, 0xd3, 0x0e, 0x71, 0xf2, 0xff, 0xf5, 0x2f,
+ 0x87, 0xac, 0x67, 0x59, 0x81, 0xa3, 0xf7, 0xf8,
+ 0xd6, 0x11, 0x0c, 0x84, 0xa9, 0x03, 0xee, 0x2a,
+ 0xc4, 0xf3, 0x22, 0xab, 0x7c, 0xe2, 0x25, 0xf5,
+ 0x67, 0xa3, 0xe4, 0x11, 0xe0, 0x59, 0xb3, 0xca,
+ 0x87, 0xa0, 0xae, 0xc9, 0xa6, 0x62, 0x1b, 0x6e,
+ 0x4d, 0x02, 0x6b, 0x07, 0x9d, 0xfd, 0xd0, 0x92,
+ 0x06, 0xe1, 0xb2, 0x9a, 0x4a, 0x1f, 0x1f, 0x13,
+ 0x49, 0x99, 0x97, 0x08, 0xde, 0x7f, 0x98, 0xaf,
+ 0x51, 0x98, 0xee, 0x2c, 0xcb, 0xf0, 0x0b, 0xc6,
+ 0xb6, 0xb7, 0x2d, 0x9a, 0xb1, 0xac, 0xa6, 0xe3,
+ 0x15, 0x77, 0x9d, 0x6b, 0x1a, 0xe4, 0xfc, 0x8b,
+ 0xf2, 0x17, 0x59, 0x08, 0x04, 0x58, 0x81, 0x9d,
+ 0x1b, 0x1b, 0x69, 0x55, 0xc2, 0xb4, 0x3c, 0x1f,
+ 0x50, 0xf1, 0x7f, 0x77, 0x90, 0x4c, 0x66, 0x40,
+ 0x5a, 0xc0, 0x33, 0x1f, 0xcb, 0x05, 0x6d, 0x5c,
+ 0x06, 0x87, 0x52, 0xa2, 0x8f, 0x26, 0xd5, 0x4f
+};
+static const u8 enc_output010[] __initconst = {
+ 0xe5, 0x26, 0xa4, 0x3d, 0xbd, 0x33, 0xd0, 0x4b,
+ 0x6f, 0x05, 0xa7, 0x6e, 0x12, 0x7a, 0xd2, 0x74,
+ 0xa6, 0xdd, 0xbd, 0x95, 0xeb, 0xf9, 0xa4, 0xf1,
+ 0x59, 0x93, 0x91, 0x70, 0xd9, 0xfe, 0x9a, 0xcd,
+ 0x53, 0x1f, 0x3a, 0xab, 0xa6, 0x7c, 0x9f, 0xa6,
+ 0x9e, 0xbd, 0x99, 0xd9, 0xb5, 0x97, 0x44, 0xd5,
+ 0x14, 0x48, 0x4d, 0x9d, 0xc0, 0xd0, 0x05, 0x96,
+ 0xeb, 0x4c, 0x78, 0x55, 0x09, 0x08, 0x01, 0x02,
+ 0x30, 0x90, 0x7b, 0x96, 0x7a, 0x7b, 0x5f, 0x30,
+ 0x41, 0x24, 0xce, 0x68, 0x61, 0x49, 0x86, 0x57,
+ 0x82, 0xdd, 0x53, 0x1c, 0x51, 0x28, 0x2b, 0x53,
+ 0x6e, 0x2d, 0xc2, 0x20, 0x4c, 0xdd, 0x8f, 0x65,
+ 0x10, 0x20, 0x50, 0xdd, 0x9d, 0x50, 0xe5, 0x71,
+ 0x40, 0x53, 0x69, 0xfc, 0x77, 0x48, 0x11, 0xb9,
+ 0xde, 0xa4, 0x8d, 0x58, 0xe4, 0xa6, 0x1a, 0x18,
+ 0x47, 0x81, 0x7e, 0xfc, 0xdd, 0xf6, 0xef, 0xce,
+ 0x2f, 0x43, 0x68, 0xd6, 0x06, 0xe2, 0x74, 0x6a,
+ 0xad, 0x90, 0xf5, 0x37, 0xf3, 0x3d, 0x82, 0x69,
+ 0x40, 0xe9, 0x6b, 0xa7, 0x3d, 0xa8, 0x1e, 0xd2,
+ 0x02, 0x7c, 0xb7, 0x9b, 0xe4, 0xda, 0x8f, 0x95,
+ 0x06, 0xc5, 0xdf, 0x73, 0xa3, 0x20, 0x9a, 0x49,
+ 0xde, 0x9c, 0xbc, 0xee, 0x14, 0x3f, 0x81, 0x5e,
+ 0xf8, 0x3b, 0x59, 0x3c, 0xe1, 0x68, 0x12, 0x5a,
+ 0x3a, 0x76, 0x3a, 0x3f, 0xf7, 0x87, 0x33, 0x0a,
+ 0x01, 0xb8, 0xd4, 0xed, 0xb6, 0xbe, 0x94, 0x5e,
+ 0x70, 0x40, 0x56, 0x67, 0x1f, 0x50, 0x44, 0x19,
+ 0xce, 0x82, 0x70, 0x10, 0x87, 0x13, 0x20, 0x0b,
+ 0x4c, 0x5a, 0xb6, 0xf6, 0xa7, 0xae, 0x81, 0x75,
+ 0x01, 0x81, 0xe6, 0x4b, 0x57, 0x7c, 0xdd, 0x6d,
+ 0xf8, 0x1c, 0x29, 0x32, 0xf7, 0xda, 0x3c, 0x2d,
+ 0xf8, 0x9b, 0x25, 0x6e, 0x00, 0xb4, 0xf7, 0x2f,
+ 0xf7, 0x04, 0xf7, 0xa1, 0x56, 0xac, 0x4f, 0x1a,
+ 0x64, 0xb8, 0x47, 0x55, 0x18, 0x7b, 0x07, 0x4d,
+ 0xbd, 0x47, 0x24, 0x80, 0x5d, 0xa2, 0x70, 0xc5,
+ 0xdd, 0x8e, 0x82, 0xd4, 0xeb, 0xec, 0xb2, 0x0c,
+ 0x39, 0xd2, 0x97, 0xc1, 0xcb, 0xeb, 0xf4, 0x77,
+ 0x59, 0xb4, 0x87, 0xef, 0xcb, 0x43, 0x2d, 0x46,
+ 0x54, 0xd1, 0xa7, 0xd7, 0x15, 0x99, 0x0a, 0x43,
+ 0xa1, 0xe0, 0x99, 0x33, 0x71, 0xc1, 0xed, 0xfe,
+ 0x72, 0x46, 0x33, 0x8e, 0x91, 0x08, 0x9f, 0xc8,
+ 0x2e, 0xca, 0xfa, 0xdc, 0x59, 0xd5, 0xc3, 0x76,
+ 0x84, 0x9f, 0xa3, 0x37, 0x68, 0xc3, 0xf0, 0x47,
+ 0x2c, 0x68, 0xdb, 0x5e, 0xc3, 0x49, 0x4c, 0xe8,
+ 0x92, 0x85, 0xe2, 0x23, 0xd3, 0x3f, 0xad, 0x32,
+ 0xe5, 0x2b, 0x82, 0xd7, 0x8f, 0x99, 0x0a, 0x59,
+ 0x5c, 0x45, 0xd9, 0xb4, 0x51, 0x52, 0xc2, 0xae,
+ 0xbf, 0x80, 0xcf, 0xc9, 0xc9, 0x51, 0x24, 0x2a,
+ 0x3b, 0x3a, 0x4d, 0xae, 0xeb, 0xbd, 0x22, 0xc3,
+ 0x0e, 0x0f, 0x59, 0x25, 0x92, 0x17, 0xe9, 0x74,
+ 0xc7, 0x8b, 0x70, 0x70, 0x36, 0x55, 0x95, 0x75,
+ 0x4b, 0xad, 0x61, 0x2b, 0x09, 0xbc, 0x82, 0xf2,
+ 0x6e, 0x94, 0x43, 0xae, 0xc3, 0xd5, 0xcd, 0x8e,
+ 0xfe, 0x5b, 0x9a, 0x88, 0x43, 0x01, 0x75, 0xb2,
+ 0x23, 0x09, 0xf7, 0x89, 0x83, 0xe7, 0xfa, 0xf9,
+ 0xb4, 0x9b, 0xf8, 0xef, 0xbd, 0x1c, 0x92, 0xc1,
+ 0xda, 0x7e, 0xfe, 0x05, 0xba, 0x5a, 0xcd, 0x07,
+ 0x6a, 0x78, 0x9e, 0x5d, 0xfb, 0x11, 0x2f, 0x79,
+ 0x38, 0xb6, 0xc2, 0x5b, 0x6b, 0x51, 0xb4, 0x71,
+ 0xdd, 0xf7, 0x2a, 0xe4, 0xf4, 0x72, 0x76, 0xad,
+ 0xc2, 0xdd, 0x64, 0x5d, 0x79, 0xb6, 0xf5, 0x7a,
+ 0x77, 0x20, 0x05, 0x3d, 0x30, 0x06, 0xd4, 0x4c,
+ 0x0a, 0x2c, 0x98, 0x5a, 0xb9, 0xd4, 0x98, 0xa9,
+ 0x3f, 0xc6, 0x12, 0xea, 0x3b, 0x4b, 0xc5, 0x79,
+ 0x64, 0x63, 0x6b, 0x09, 0x54, 0x3b, 0x14, 0x27,
+ 0xba, 0x99, 0x80, 0xc8, 0x72, 0xa8, 0x12, 0x90,
+ 0x29, 0xba, 0x40, 0x54, 0x97, 0x2b, 0x7b, 0xfe,
+ 0xeb, 0xcd, 0x01, 0x05, 0x44, 0x72, 0xdb, 0x99,
+ 0xe4, 0x61, 0xc9, 0x69, 0xd6, 0xb9, 0x28, 0xd1,
+ 0x05, 0x3e, 0xf9, 0x0b, 0x49, 0x0a, 0x49, 0xe9,
+ 0x8d, 0x0e, 0xa7, 0x4a, 0x0f, 0xaf, 0x32, 0xd0,
+ 0xe0, 0xb2, 0x3a, 0x55, 0x58, 0xfe, 0x5c, 0x28,
+ 0x70, 0x51, 0x23, 0xb0, 0x7b, 0x6a, 0x5f, 0x1e,
+ 0xb8, 0x17, 0xd7, 0x94, 0x15, 0x8f, 0xee, 0x20,
+ 0xc7, 0x42, 0x25, 0x3e, 0x9a, 0x14, 0xd7, 0x60,
+ 0x72, 0x39, 0x47, 0x48, 0xa9, 0xfe, 0xdd, 0x47,
+ 0x0a, 0xb1, 0xe6, 0x60, 0x28, 0x8c, 0x11, 0x68,
+ 0xe1, 0xff, 0xd7, 0xce, 0xc8, 0xbe, 0xb3, 0xfe,
+ 0x27, 0x30, 0x09, 0x70, 0xd7, 0xfa, 0x02, 0x33,
+ 0x3a, 0x61, 0x2e, 0xc7, 0xff, 0xa4, 0x2a, 0xa8,
+ 0x6e, 0xb4, 0x79, 0x35, 0x6d, 0x4c, 0x1e, 0x38,
+ 0xf8, 0xee, 0xd4, 0x84, 0x4e, 0x6e, 0x28, 0xa7,
+ 0xce, 0xc8, 0xc1, 0xcf, 0x80, 0x05, 0xf3, 0x04,
+ 0xef, 0xc8, 0x18, 0x28, 0x2e, 0x8d, 0x5e, 0x0c,
+ 0xdf, 0xb8, 0x5f, 0x96, 0xe8, 0xc6, 0x9c, 0x2f,
+ 0xe5, 0xa6, 0x44, 0xd7, 0xe7, 0x99, 0x44, 0x0c,
+ 0xec, 0xd7, 0x05, 0x60, 0x97, 0xbb, 0x74, 0x77,
+ 0x58, 0xd5, 0xbb, 0x48, 0xde, 0x5a, 0xb2, 0x54,
+ 0x7f, 0x0e, 0x46, 0x70, 0x6a, 0x6f, 0x78, 0xa5,
+ 0x08, 0x89, 0x05, 0x4e, 0x7e, 0xa0, 0x69, 0xb4,
+ 0x40, 0x60, 0x55, 0x77, 0x75, 0x9b, 0x19, 0xf2,
+ 0xd5, 0x13, 0x80, 0x77, 0xf9, 0x4b, 0x3f, 0x1e,
+ 0xee, 0xe6, 0x76, 0x84, 0x7b, 0x8c, 0xe5, 0x27,
+ 0xa8, 0x0a, 0x91, 0x01, 0x68, 0x71, 0x8a, 0x3f,
+ 0x06, 0xab, 0xf6, 0xa9, 0xa5, 0xe6, 0x72, 0x92,
+ 0xe4, 0x67, 0xe2, 0xa2, 0x46, 0x35, 0x84, 0x55,
+ 0x7d, 0xca, 0xa8, 0x85, 0xd0, 0xf1, 0x3f, 0xbe,
+ 0xd7, 0x34, 0x64, 0xfc, 0xae, 0xe3, 0xe4, 0x04,
+ 0x9f, 0x66, 0x02, 0xb9, 0x88, 0x10, 0xd9, 0xc4,
+ 0x4c, 0x31, 0x43, 0x7a, 0x93, 0xe2, 0x9b, 0x56,
+ 0x43, 0x84, 0xdc, 0xdc, 0xde, 0x1d, 0xa4, 0x02,
+ 0x0e, 0xc2, 0xef, 0xc3, 0xf8, 0x78, 0xd1, 0xb2,
+ 0x6b, 0x63, 0x18, 0xc9, 0xa9, 0xe5, 0x72, 0xd8,
+ 0xf3, 0xb9, 0xd1, 0x8a, 0xc7, 0x1a, 0x02, 0x27,
+ 0x20, 0x77, 0x10, 0xe5, 0xc8, 0xd4, 0x4a, 0x47,
+ 0xe5, 0xdf, 0x5f, 0x01, 0xaa, 0xb0, 0xd4, 0x10,
+ 0xbb, 0x69, 0xe3, 0x36, 0xc8, 0xe1, 0x3d, 0x43,
+ 0xfb, 0x86, 0xcd, 0xcc, 0xbf, 0xf4, 0x88, 0xe0,
+ 0x20, 0xca, 0xb7, 0x1b, 0xf1, 0x2f, 0x5c, 0xee,
+ 0xd4, 0xd3, 0xa3, 0xcc, 0xa4, 0x1e, 0x1c, 0x47,
+ 0xfb, 0xbf, 0xfc, 0xa2, 0x41, 0x55, 0x9d, 0xf6,
+ 0x5a, 0x5e, 0x65, 0x32, 0x34, 0x7b, 0x52, 0x8d,
+ 0xd5, 0xd0, 0x20, 0x60, 0x03, 0xab, 0x3f, 0x8c,
+ 0xd4, 0x21, 0xea, 0x2a, 0xd9, 0xc4, 0xd0, 0xd3,
+ 0x65, 0xd8, 0x7a, 0x13, 0x28, 0x62, 0x32, 0x4b,
+ 0x2c, 0x87, 0x93, 0xa8, 0xb4, 0x52, 0x45, 0x09,
+ 0x44, 0xec, 0xec, 0xc3, 0x17, 0xdb, 0x9a, 0x4d,
+ 0x5c, 0xa9, 0x11, 0xd4, 0x7d, 0xaf, 0x9e, 0xf1,
+ 0x2d, 0xb2, 0x66, 0xc5, 0x1d, 0xed, 0xb7, 0xcd,
+ 0x0b, 0x25, 0x5e, 0x30, 0x47, 0x3f, 0x40, 0xf4,
+ 0xa1, 0xa0, 0x00, 0x94, 0x10, 0xc5, 0x6a, 0x63,
+ 0x1a, 0xd5, 0x88, 0x92, 0x8e, 0x82, 0x39, 0x87,
+ 0x3c, 0x78, 0x65, 0x58, 0x42, 0x75, 0x5b, 0xdd,
+ 0x77, 0x3e, 0x09, 0x4e, 0x76, 0x5b, 0xe6, 0x0e,
+ 0x4d, 0x38, 0xb2, 0xc0, 0xb8, 0x95, 0x01, 0x7a,
+ 0x10, 0xe0, 0xfb, 0x07, 0xf2, 0xab, 0x2d, 0x8c,
+ 0x32, 0xed, 0x2b, 0xc0, 0x46, 0xc2, 0xf5, 0x38,
+ 0x83, 0xf0, 0x17, 0xec, 0xc1, 0x20, 0x6a, 0x9a,
+ 0x0b, 0x00, 0xa0, 0x98, 0x22, 0x50, 0x23, 0xd5,
+ 0x80, 0x6b, 0xf6, 0x1f, 0xc3, 0xcc, 0x97, 0xc9,
+ 0x24, 0x9f, 0xf3, 0xaf, 0x43, 0x14, 0xd5, 0xa0
+};
+static const u8 enc_assoc010[] __initconst = {
+ 0xd2, 0xa1, 0x70, 0xdb, 0x7a, 0xf8, 0xfa, 0x27,
+ 0xba, 0x73, 0x0f, 0xbf, 0x3d, 0x1e, 0x82, 0xb2
+};
+static const u8 enc_nonce010[] __initconst = {
+ 0xdb, 0x92, 0x0f, 0x7f, 0x17, 0x54, 0x0c, 0x30
+};
+static const u8 enc_key010[] __initconst = {
+ 0x47, 0x11, 0xeb, 0x86, 0x2b, 0x2c, 0xab, 0x44,
+ 0x34, 0xda, 0x7f, 0x57, 0x03, 0x39, 0x0c, 0xaf,
+ 0x2c, 0x14, 0xfd, 0x65, 0x23, 0xe9, 0x8e, 0x74,
+ 0xd5, 0x08, 0x68, 0x08, 0xe7, 0xb4, 0x72, 0xd7
+};
+
+static const u8 enc_input011[] __initconst = {
+ 0x7a, 0x57, 0xf2, 0xc7, 0x06, 0x3f, 0x50, 0x7b,
+ 0x36, 0x1a, 0x66, 0x5c, 0xb9, 0x0e, 0x5e, 0x3b,
+ 0x45, 0x60, 0xbe, 0x9a, 0x31, 0x9f, 0xff, 0x5d,
+ 0x66, 0x34, 0xb4, 0xdc, 0xfb, 0x9d, 0x8e, 0xee,
+ 0x6a, 0x33, 0xa4, 0x07, 0x3c, 0xf9, 0x4c, 0x30,
+ 0xa1, 0x24, 0x52, 0xf9, 0x50, 0x46, 0x88, 0x20,
+ 0x02, 0x32, 0x3a, 0x0e, 0x99, 0x63, 0xaf, 0x1f,
+ 0x15, 0x28, 0x2a, 0x05, 0xff, 0x57, 0x59, 0x5e,
+ 0x18, 0xa1, 0x1f, 0xd0, 0x92, 0x5c, 0x88, 0x66,
+ 0x1b, 0x00, 0x64, 0xa5, 0x93, 0x8d, 0x06, 0x46,
+ 0xb0, 0x64, 0x8b, 0x8b, 0xef, 0x99, 0x05, 0x35,
+ 0x85, 0xb3, 0xf3, 0x33, 0xbb, 0xec, 0x66, 0xb6,
+ 0x3d, 0x57, 0x42, 0xe3, 0xb4, 0xc6, 0xaa, 0xb0,
+ 0x41, 0x2a, 0xb9, 0x59, 0xa9, 0xf6, 0x3e, 0x15,
+ 0x26, 0x12, 0x03, 0x21, 0x4c, 0x74, 0x43, 0x13,
+ 0x2a, 0x03, 0x27, 0x09, 0xb4, 0xfb, 0xe7, 0xb7,
+ 0x40, 0xff, 0x5e, 0xce, 0x48, 0x9a, 0x60, 0xe3,
+ 0x8b, 0x80, 0x8c, 0x38, 0x2d, 0xcb, 0x93, 0x37,
+ 0x74, 0x05, 0x52, 0x6f, 0x73, 0x3e, 0xc3, 0xbc,
+ 0xca, 0x72, 0x0a, 0xeb, 0xf1, 0x3b, 0xa0, 0x95,
+ 0xdc, 0x8a, 0xc4, 0xa9, 0xdc, 0xca, 0x44, 0xd8,
+ 0x08, 0x63, 0x6a, 0x36, 0xd3, 0x3c, 0xb8, 0xac,
+ 0x46, 0x7d, 0xfd, 0xaa, 0xeb, 0x3e, 0x0f, 0x45,
+ 0x8f, 0x49, 0xda, 0x2b, 0xf2, 0x12, 0xbd, 0xaf,
+ 0x67, 0x8a, 0x63, 0x48, 0x4b, 0x55, 0x5f, 0x6d,
+ 0x8c, 0xb9, 0x76, 0x34, 0x84, 0xae, 0xc2, 0xfc,
+ 0x52, 0x64, 0x82, 0xf7, 0xb0, 0x06, 0xf0, 0x45,
+ 0x73, 0x12, 0x50, 0x30, 0x72, 0xea, 0x78, 0x9a,
+ 0xa8, 0xaf, 0xb5, 0xe3, 0xbb, 0x77, 0x52, 0xec,
+ 0x59, 0x84, 0xbf, 0x6b, 0x8f, 0xce, 0x86, 0x5e,
+ 0x1f, 0x23, 0xe9, 0xfb, 0x08, 0x86, 0xf7, 0x10,
+ 0xb9, 0xf2, 0x44, 0x96, 0x44, 0x63, 0xa9, 0xa8,
+ 0x78, 0x00, 0x23, 0xd6, 0xc7, 0xe7, 0x6e, 0x66,
+ 0x4f, 0xcc, 0xee, 0x15, 0xb3, 0xbd, 0x1d, 0xa0,
+ 0xe5, 0x9c, 0x1b, 0x24, 0x2c, 0x4d, 0x3c, 0x62,
+ 0x35, 0x9c, 0x88, 0x59, 0x09, 0xdd, 0x82, 0x1b,
+ 0xcf, 0x0a, 0x83, 0x6b, 0x3f, 0xae, 0x03, 0xc4,
+ 0xb4, 0xdd, 0x7e, 0x5b, 0x28, 0x76, 0x25, 0x96,
+ 0xd9, 0xc9, 0x9d, 0x5f, 0x86, 0xfa, 0xf6, 0xd7,
+ 0xd2, 0xe6, 0x76, 0x1d, 0x0f, 0xa1, 0xdc, 0x74,
+ 0x05, 0x1b, 0x1d, 0xe0, 0xcd, 0x16, 0xb0, 0xa8,
+ 0x8a, 0x34, 0x7b, 0x15, 0x11, 0x77, 0xe5, 0x7b,
+ 0x7e, 0x20, 0xf7, 0xda, 0x38, 0xda, 0xce, 0x70,
+ 0xe9, 0xf5, 0x6c, 0xd9, 0xbe, 0x0c, 0x4c, 0x95,
+ 0x4c, 0xc2, 0x9b, 0x34, 0x55, 0x55, 0xe1, 0xf3,
+ 0x46, 0x8e, 0x48, 0x74, 0x14, 0x4f, 0x9d, 0xc9,
+ 0xf5, 0xe8, 0x1a, 0xf0, 0x11, 0x4a, 0xc1, 0x8d,
+ 0xe0, 0x93, 0xa0, 0xbe, 0x09, 0x1c, 0x2b, 0x4e,
+ 0x0f, 0xb2, 0x87, 0x8b, 0x84, 0xfe, 0x92, 0x32,
+ 0x14, 0xd7, 0x93, 0xdf, 0xe7, 0x44, 0xbc, 0xc5,
+ 0xae, 0x53, 0x69, 0xd8, 0xb3, 0x79, 0x37, 0x80,
+ 0xe3, 0x17, 0x5c, 0xec, 0x53, 0x00, 0x9a, 0xe3,
+ 0x8e, 0xdc, 0x38, 0xb8, 0x66, 0xf0, 0xd3, 0xad,
+ 0x1d, 0x02, 0x96, 0x86, 0x3e, 0x9d, 0x3b, 0x5d,
+ 0xa5, 0x7f, 0x21, 0x10, 0xf1, 0x1f, 0x13, 0x20,
+ 0xf9, 0x57, 0x87, 0x20, 0xf5, 0x5f, 0xf1, 0x17,
+ 0x48, 0x0a, 0x51, 0x5a, 0xcd, 0x19, 0x03, 0xa6,
+ 0x5a, 0xd1, 0x12, 0x97, 0xe9, 0x48, 0xe2, 0x1d,
+ 0x83, 0x75, 0x50, 0xd9, 0x75, 0x7d, 0x6a, 0x82,
+ 0xa1, 0xf9, 0x4e, 0x54, 0x87, 0x89, 0xc9, 0x0c,
+ 0xb7, 0x5b, 0x6a, 0x91, 0xc1, 0x9c, 0xb2, 0xa9,
+ 0xdc, 0x9a, 0xa4, 0x49, 0x0a, 0x6d, 0x0d, 0xbb,
+ 0xde, 0x86, 0x44, 0xdd, 0x5d, 0x89, 0x2b, 0x96,
+ 0x0f, 0x23, 0x95, 0xad, 0xcc, 0xa2, 0xb3, 0xb9,
+ 0x7e, 0x74, 0x38, 0xba, 0x9f, 0x73, 0xae, 0x5f,
+ 0xf8, 0x68, 0xa2, 0xe0, 0xa9, 0xce, 0xbd, 0x40,
+ 0xd4, 0x4c, 0x6b, 0xd2, 0x56, 0x62, 0xb0, 0xcc,
+ 0x63, 0x7e, 0x5b, 0xd3, 0xae, 0xd1, 0x75, 0xce,
+ 0xbb, 0xb4, 0x5b, 0xa8, 0xf8, 0xb4, 0xac, 0x71,
+ 0x75, 0xaa, 0xc9, 0x9f, 0xbb, 0x6c, 0xad, 0x0f,
+ 0x55, 0x5d, 0xe8, 0x85, 0x7d, 0xf9, 0x21, 0x35,
+ 0xea, 0x92, 0x85, 0x2b, 0x00, 0xec, 0x84, 0x90,
+ 0x0a, 0x63, 0x96, 0xe4, 0x6b, 0xa9, 0x77, 0xb8,
+ 0x91, 0xf8, 0x46, 0x15, 0x72, 0x63, 0x70, 0x01,
+ 0x40, 0xa3, 0xa5, 0x76, 0x62, 0x2b, 0xbf, 0xf1,
+ 0xe5, 0x8d, 0x9f, 0xa3, 0xfa, 0x9b, 0x03, 0xbe,
+ 0xfe, 0x65, 0x6f, 0xa2, 0x29, 0x0d, 0x54, 0xb4,
+ 0x71, 0xce, 0xa9, 0xd6, 0x3d, 0x88, 0xf9, 0xaf,
+ 0x6b, 0xa8, 0x9e, 0xf4, 0x16, 0x96, 0x36, 0xb9,
+ 0x00, 0xdc, 0x10, 0xab, 0xb5, 0x08, 0x31, 0x1f,
+ 0x00, 0xb1, 0x3c, 0xd9, 0x38, 0x3e, 0xc6, 0x04,
+ 0xa7, 0x4e, 0xe8, 0xae, 0xed, 0x98, 0xc2, 0xf7,
+ 0xb9, 0x00, 0x5f, 0x8c, 0x60, 0xd1, 0xe5, 0x15,
+ 0xf7, 0xae, 0x1e, 0x84, 0x88, 0xd1, 0xf6, 0xbc,
+ 0x3a, 0x89, 0x35, 0x22, 0x83, 0x7c, 0xca, 0xf0,
+ 0x33, 0x82, 0x4c, 0x79, 0x3c, 0xfd, 0xb1, 0xae,
+ 0x52, 0x62, 0x55, 0xd2, 0x41, 0x60, 0xc6, 0xbb,
+ 0xfa, 0x0e, 0x59, 0xd6, 0xa8, 0xfe, 0x5d, 0xed,
+ 0x47, 0x3d, 0xe0, 0xea, 0x1f, 0x6e, 0x43, 0x51,
+ 0xec, 0x10, 0x52, 0x56, 0x77, 0x42, 0x6b, 0x52,
+ 0x87, 0xd8, 0xec, 0xe0, 0xaa, 0x76, 0xa5, 0x84,
+ 0x2a, 0x22, 0x24, 0xfd, 0x92, 0x40, 0x88, 0xd5,
+ 0x85, 0x1c, 0x1f, 0x6b, 0x47, 0xa0, 0xc4, 0xe4,
+ 0xef, 0xf4, 0xea, 0xd7, 0x59, 0xac, 0x2a, 0x9e,
+ 0x8c, 0xfa, 0x1f, 0x42, 0x08, 0xfe, 0x4f, 0x74,
+ 0xa0, 0x26, 0xf5, 0xb3, 0x84, 0xf6, 0x58, 0x5f,
+ 0x26, 0x66, 0x3e, 0xd7, 0xe4, 0x22, 0x91, 0x13,
+ 0xc8, 0xac, 0x25, 0x96, 0x23, 0xd8, 0x09, 0xea,
+ 0x45, 0x75, 0x23, 0xb8, 0x5f, 0xc2, 0x90, 0x8b,
+ 0x09, 0xc4, 0xfc, 0x47, 0x6c, 0x6d, 0x0a, 0xef,
+ 0x69, 0xa4, 0x38, 0x19, 0xcf, 0x7d, 0xf9, 0x09,
+ 0x73, 0x9b, 0x60, 0x5a, 0xf7, 0x37, 0xb5, 0xfe,
+ 0x9f, 0xe3, 0x2b, 0x4c, 0x0d, 0x6e, 0x19, 0xf1,
+ 0xd6, 0xc0, 0x70, 0xf3, 0x9d, 0x22, 0x3c, 0xf9,
+ 0x49, 0xce, 0x30, 0x8e, 0x44, 0xb5, 0x76, 0x15,
+ 0x8f, 0x52, 0xfd, 0xa5, 0x04, 0xb8, 0x55, 0x6a,
+ 0x36, 0x59, 0x7c, 0xc4, 0x48, 0xb8, 0xd7, 0xab,
+ 0x05, 0x66, 0xe9, 0x5e, 0x21, 0x6f, 0x6b, 0x36,
+ 0x29, 0xbb, 0xe9, 0xe3, 0xa2, 0x9a, 0xa8, 0xcd,
+ 0x55, 0x25, 0x11, 0xba, 0x5a, 0x58, 0xa0, 0xde,
+ 0xae, 0x19, 0x2a, 0x48, 0x5a, 0xff, 0x36, 0xcd,
+ 0x6d, 0x16, 0x7a, 0x73, 0x38, 0x46, 0xe5, 0x47,
+ 0x59, 0xc8, 0xa2, 0xf6, 0xe2, 0x6c, 0x83, 0xc5,
+ 0x36, 0x2c, 0x83, 0x7d, 0xb4, 0x01, 0x05, 0x69,
+ 0xe7, 0xaf, 0x5c, 0xc4, 0x64, 0x82, 0x12, 0x21,
+ 0xef, 0xf7, 0xd1, 0x7d, 0xb8, 0x8d, 0x8c, 0x98,
+ 0x7c, 0x5f, 0x7d, 0x92, 0x88, 0xb9, 0x94, 0x07,
+ 0x9c, 0xd8, 0xe9, 0x9c, 0x17, 0x38, 0xe3, 0x57,
+ 0x6c, 0xe0, 0xdc, 0xa5, 0x92, 0x42, 0xb3, 0xbd,
+ 0x50, 0xa2, 0x7e, 0xb5, 0xb1, 0x52, 0x72, 0x03,
+ 0x97, 0xd8, 0xaa, 0x9a, 0x1e, 0x75, 0x41, 0x11,
+ 0xa3, 0x4f, 0xcc, 0xd4, 0xe3, 0x73, 0xad, 0x96,
+ 0xdc, 0x47, 0x41, 0x9f, 0xb0, 0xbe, 0x79, 0x91,
+ 0xf5, 0xb6, 0x18, 0xfe, 0xc2, 0x83, 0x18, 0x7d,
+ 0x73, 0xd9, 0x4f, 0x83, 0x84, 0x03, 0xb3, 0xf0,
+ 0x77, 0x66, 0x3d, 0x83, 0x63, 0x2e, 0x2c, 0xf9,
+ 0xdd, 0xa6, 0x1f, 0x89, 0x82, 0xb8, 0x23, 0x42,
+ 0xeb, 0xe2, 0xca, 0x70, 0x82, 0x61, 0x41, 0x0a,
+ 0x6d, 0x5f, 0x75, 0xc5, 0xe2, 0xc4, 0x91, 0x18,
+ 0x44, 0x22, 0xfa, 0x34, 0x10, 0xf5, 0x20, 0xdc,
+ 0xb7, 0xdd, 0x2a, 0x20, 0x77, 0xf5, 0xf9, 0xce,
+ 0xdb, 0xa0, 0x0a, 0x52, 0x2a, 0x4e, 0xdd, 0xcc,
+ 0x97, 0xdf, 0x05, 0xe4, 0x5e, 0xb7, 0xaa, 0xf0,
+ 0xe2, 0x80, 0xff, 0xba, 0x1a, 0x0f, 0xac, 0xdf,
+ 0x02, 0x32, 0xe6, 0xf7, 0xc7, 0x17, 0x13, 0xb7,
+ 0xfc, 0x98, 0x48, 0x8c, 0x0d, 0x82, 0xc9, 0x80,
+ 0x7a, 0xe2, 0x0a, 0xc5, 0xb4, 0xde, 0x7c, 0x3c,
+ 0x79, 0x81, 0x0e, 0x28, 0x65, 0x79, 0x67, 0x82,
+ 0x69, 0x44, 0x66, 0x09, 0xf7, 0x16, 0x1a, 0xf9,
+ 0x7d, 0x80, 0xa1, 0x79, 0x14, 0xa9, 0xc8, 0x20,
+ 0xfb, 0xa2, 0x46, 0xbe, 0x08, 0x35, 0x17, 0x58,
+ 0xc1, 0x1a, 0xda, 0x2a, 0x6b, 0x2e, 0x1e, 0xe6,
+ 0x27, 0x55, 0x7b, 0x19, 0xe2, 0xfb, 0x64, 0xfc,
+ 0x5e, 0x15, 0x54, 0x3c, 0xe7, 0xc2, 0x11, 0x50,
+ 0x30, 0xb8, 0x72, 0x03, 0x0b, 0x1a, 0x9f, 0x86,
+ 0x27, 0x11, 0x5c, 0x06, 0x2b, 0xbd, 0x75, 0x1a,
+ 0x0a, 0xda, 0x01, 0xfa, 0x5c, 0x4a, 0xc1, 0x80,
+ 0x3a, 0x6e, 0x30, 0xc8, 0x2c, 0xeb, 0x56, 0xec,
+ 0x89, 0xfa, 0x35, 0x7b, 0xb2, 0xf0, 0x97, 0x08,
+ 0x86, 0x53, 0xbe, 0xbd, 0x40, 0x41, 0x38, 0x1c,
+ 0xb4, 0x8b, 0x79, 0x2e, 0x18, 0x96, 0x94, 0xde,
+ 0xe8, 0xca, 0xe5, 0x9f, 0x92, 0x9f, 0x15, 0x5d,
+ 0x56, 0x60, 0x5c, 0x09, 0xf9, 0x16, 0xf4, 0x17,
+ 0x0f, 0xf6, 0x4c, 0xda, 0xe6, 0x67, 0x89, 0x9f,
+ 0xca, 0x6c, 0xe7, 0x9b, 0x04, 0x62, 0x0e, 0x26,
+ 0xa6, 0x52, 0xbd, 0x29, 0xff, 0xc7, 0xa4, 0x96,
+ 0xe6, 0x6a, 0x02, 0xa5, 0x2e, 0x7b, 0xfe, 0x97,
+ 0x68, 0x3e, 0x2e, 0x5f, 0x3b, 0x0f, 0x36, 0xd6,
+ 0x98, 0x19, 0x59, 0x48, 0xd2, 0xc6, 0xe1, 0x55,
+ 0x1a, 0x6e, 0xd6, 0xed, 0x2c, 0xba, 0xc3, 0x9e,
+ 0x64, 0xc9, 0x95, 0x86, 0x35, 0x5e, 0x3e, 0x88,
+ 0x69, 0x99, 0x4b, 0xee, 0xbe, 0x9a, 0x99, 0xb5,
+ 0x6e, 0x58, 0xae, 0xdd, 0x22, 0xdb, 0xdd, 0x6b,
+ 0xfc, 0xaf, 0x90, 0xa3, 0x3d, 0xa4, 0xc1, 0x15,
+ 0x92, 0x18, 0x8d, 0xd2, 0x4b, 0x7b, 0x06, 0xd1,
+ 0x37, 0xb5, 0xe2, 0x7c, 0x2c, 0xf0, 0x25, 0xe4,
+ 0x94, 0x2a, 0xbd, 0xe3, 0x82, 0x70, 0x78, 0xa3,
+ 0x82, 0x10, 0x5a, 0x90, 0xd7, 0xa4, 0xfa, 0xaf,
+ 0x1a, 0x88, 0x59, 0xdc, 0x74, 0x12, 0xb4, 0x8e,
+ 0xd7, 0x19, 0x46, 0xf4, 0x84, 0x69, 0x9f, 0xbb,
+ 0x70, 0xa8, 0x4c, 0x52, 0x81, 0xa9, 0xff, 0x76,
+ 0x1c, 0xae, 0xd8, 0x11, 0x3d, 0x7f, 0x7d, 0xc5,
+ 0x12, 0x59, 0x28, 0x18, 0xc2, 0xa2, 0xb7, 0x1c,
+ 0x88, 0xf8, 0xd6, 0x1b, 0xa6, 0x7d, 0x9e, 0xde,
+ 0x29, 0xf8, 0xed, 0xff, 0xeb, 0x92, 0x24, 0x4f,
+ 0x05, 0xaa, 0xd9, 0x49, 0xba, 0x87, 0x59, 0x51,
+ 0xc9, 0x20, 0x5c, 0x9b, 0x74, 0xcf, 0x03, 0xd9,
+ 0x2d, 0x34, 0xc7, 0x5b, 0xa5, 0x40, 0xb2, 0x99,
+ 0xf5, 0xcb, 0xb4, 0xf6, 0xb7, 0x72, 0x4a, 0xd6,
+ 0xbd, 0xb0, 0xf3, 0x93, 0xe0, 0x1b, 0xa8, 0x04,
+ 0x1e, 0x35, 0xd4, 0x80, 0x20, 0xf4, 0x9c, 0x31,
+ 0x6b, 0x45, 0xb9, 0x15, 0xb0, 0x5e, 0xdd, 0x0a,
+ 0x33, 0x9c, 0x83, 0xcd, 0x58, 0x89, 0x50, 0x56,
+ 0xbb, 0x81, 0x00, 0x91, 0x32, 0xf3, 0x1b, 0x3e,
+ 0xcf, 0x45, 0xe1, 0xf9, 0xe1, 0x2c, 0x26, 0x78,
+ 0x93, 0x9a, 0x60, 0x46, 0xc9, 0xb5, 0x5e, 0x6a,
+ 0x28, 0x92, 0x87, 0x3f, 0x63, 0x7b, 0xdb, 0xf7,
+ 0xd0, 0x13, 0x9d, 0x32, 0x40, 0x5e, 0xcf, 0xfb,
+ 0x79, 0x68, 0x47, 0x4c, 0xfd, 0x01, 0x17, 0xe6,
+ 0x97, 0x93, 0x78, 0xbb, 0xa6, 0x27, 0xa3, 0xe8,
+ 0x1a, 0xe8, 0x94, 0x55, 0x7d, 0x08, 0xe5, 0xdc,
+ 0x66, 0xa3, 0x69, 0xc8, 0xca, 0xc5, 0xa1, 0x84,
+ 0x55, 0xde, 0x08, 0x91, 0x16, 0x3a, 0x0c, 0x86,
+ 0xab, 0x27, 0x2b, 0x64, 0x34, 0x02, 0x6c, 0x76,
+ 0x8b, 0xc6, 0xaf, 0xcc, 0xe1, 0xd6, 0x8c, 0x2a,
+ 0x18, 0x3d, 0xa6, 0x1b, 0x37, 0x75, 0x45, 0x73,
+ 0xc2, 0x75, 0xd7, 0x53, 0x78, 0x3a, 0xd6, 0xe8,
+ 0x29, 0xd2, 0x4a, 0xa8, 0x1e, 0x82, 0xf6, 0xb6,
+ 0x81, 0xde, 0x21, 0xed, 0x2b, 0x56, 0xbb, 0xf2,
+ 0xd0, 0x57, 0xc1, 0x7c, 0xd2, 0x6a, 0xd2, 0x56,
+ 0xf5, 0x13, 0x5f, 0x1c, 0x6a, 0x0b, 0x74, 0xfb,
+ 0xe9, 0xfe, 0x9e, 0xea, 0x95, 0xb2, 0x46, 0xab,
+ 0x0a, 0xfc, 0xfd, 0xf3, 0xbb, 0x04, 0x2b, 0x76,
+ 0x1b, 0xa4, 0x74, 0xb0, 0xc1, 0x78, 0xc3, 0x69,
+ 0xe2, 0xb0, 0x01, 0xe1, 0xde, 0x32, 0x4c, 0x8d,
+ 0x1a, 0xb3, 0x38, 0x08, 0xd5, 0xfc, 0x1f, 0xdc,
+ 0x0e, 0x2c, 0x9c, 0xb1, 0xa1, 0x63, 0x17, 0x22,
+ 0xf5, 0x6c, 0x93, 0x70, 0x74, 0x00, 0xf8, 0x39,
+ 0x01, 0x94, 0xd1, 0x32, 0x23, 0x56, 0x5d, 0xa6,
+ 0x02, 0x76, 0x76, 0x93, 0xce, 0x2f, 0x19, 0xe9,
+ 0x17, 0x52, 0xae, 0x6e, 0x2c, 0x6d, 0x61, 0x7f,
+ 0x3b, 0xaa, 0xe0, 0x52, 0x85, 0xc5, 0x65, 0xc1,
+ 0xbb, 0x8e, 0x5b, 0x21, 0xd5, 0xc9, 0x78, 0x83,
+ 0x07, 0x97, 0x4c, 0x62, 0x61, 0x41, 0xd4, 0xfc,
+ 0xc9, 0x39, 0xe3, 0x9b, 0xd0, 0xcc, 0x75, 0xc4,
+ 0x97, 0xe6, 0xdd, 0x2a, 0x5f, 0xa6, 0xe8, 0x59,
+ 0x6c, 0x98, 0xb9, 0x02, 0xe2, 0xa2, 0xd6, 0x68,
+ 0xee, 0x3b, 0x1d, 0xe3, 0x4d, 0x5b, 0x30, 0xef,
+ 0x03, 0xf2, 0xeb, 0x18, 0x57, 0x36, 0xe8, 0xa1,
+ 0xf4, 0x47, 0xfb, 0xcb, 0x8f, 0xcb, 0xc8, 0xf3,
+ 0x4f, 0x74, 0x9d, 0x9d, 0xb1, 0x8d, 0x14, 0x44,
+ 0xd9, 0x19, 0xb4, 0x54, 0x4f, 0x75, 0x19, 0x09,
+ 0xa0, 0x75, 0xbc, 0x3b, 0x82, 0xc6, 0x3f, 0xb8,
+ 0x83, 0x19, 0x6e, 0xd6, 0x37, 0xfe, 0x6e, 0x8a,
+ 0x4e, 0xe0, 0x4a, 0xab, 0x7b, 0xc8, 0xb4, 0x1d,
+ 0xf4, 0xed, 0x27, 0x03, 0x65, 0xa2, 0xa1, 0xae,
+ 0x11, 0xe7, 0x98, 0x78, 0x48, 0x91, 0xd2, 0xd2,
+ 0xd4, 0x23, 0x78, 0x50, 0xb1, 0x5b, 0x85, 0x10,
+ 0x8d, 0xca, 0x5f, 0x0f, 0x71, 0xae, 0x72, 0x9a,
+ 0xf6, 0x25, 0x19, 0x60, 0x06, 0xf7, 0x10, 0x34,
+ 0x18, 0x0d, 0xc9, 0x9f, 0x7b, 0x0c, 0x9b, 0x8f,
+ 0x91, 0x1b, 0x9f, 0xcd, 0x10, 0xee, 0x75, 0xf9,
+ 0x97, 0x66, 0xfc, 0x4d, 0x33, 0x6e, 0x28, 0x2b,
+ 0x92, 0x85, 0x4f, 0xab, 0x43, 0x8d, 0x8f, 0x7d,
+ 0x86, 0xa7, 0xc7, 0xd8, 0xd3, 0x0b, 0x8b, 0x57,
+ 0xb6, 0x1d, 0x95, 0x0d, 0xe9, 0xbc, 0xd9, 0x03,
+ 0xd9, 0x10, 0x19, 0xc3, 0x46, 0x63, 0x55, 0x87,
+ 0x61, 0x79, 0x6c, 0x95, 0x0e, 0x9c, 0xdd, 0xca,
+ 0xc3, 0xf3, 0x64, 0xf0, 0x7d, 0x76, 0xb7, 0x53,
+ 0x67, 0x2b, 0x1e, 0x44, 0x56, 0x81, 0xea, 0x8f,
+ 0x5c, 0x42, 0x16, 0xb8, 0x28, 0xeb, 0x1b, 0x61,
+ 0x10, 0x1e, 0xbf, 0xec, 0xa8
+};
+static const u8 enc_output011[] __initconst = {
+ 0x6a, 0xfc, 0x4b, 0x25, 0xdf, 0xc0, 0xe4, 0xe8,
+ 0x17, 0x4d, 0x4c, 0xc9, 0x7e, 0xde, 0x3a, 0xcc,
+ 0x3c, 0xba, 0x6a, 0x77, 0x47, 0xdb, 0xe3, 0x74,
+ 0x7a, 0x4d, 0x5f, 0x8d, 0x37, 0x55, 0x80, 0x73,
+ 0x90, 0x66, 0x5d, 0x3a, 0x7d, 0x5d, 0x86, 0x5e,
+ 0x8d, 0xfd, 0x83, 0xff, 0x4e, 0x74, 0x6f, 0xf9,
+ 0xe6, 0x70, 0x17, 0x70, 0x3e, 0x96, 0xa7, 0x7e,
+ 0xcb, 0xab, 0x8f, 0x58, 0x24, 0x9b, 0x01, 0xfd,
+ 0xcb, 0xe6, 0x4d, 0x9b, 0xf0, 0x88, 0x94, 0x57,
+ 0x66, 0xef, 0x72, 0x4c, 0x42, 0x6e, 0x16, 0x19,
+ 0x15, 0xea, 0x70, 0x5b, 0xac, 0x13, 0xdb, 0x9f,
+ 0x18, 0xe2, 0x3c, 0x26, 0x97, 0xbc, 0xdc, 0x45,
+ 0x8c, 0x6c, 0x24, 0x69, 0x9c, 0xf7, 0x65, 0x1e,
+ 0x18, 0x59, 0x31, 0x7c, 0xe4, 0x73, 0xbc, 0x39,
+ 0x62, 0xc6, 0x5c, 0x9f, 0xbf, 0xfa, 0x90, 0x03,
+ 0xc9, 0x72, 0x26, 0xb6, 0x1b, 0xc2, 0xb7, 0x3f,
+ 0xf2, 0x13, 0x77, 0xf2, 0x8d, 0xb9, 0x47, 0xd0,
+ 0x53, 0xdd, 0xc8, 0x91, 0x83, 0x8b, 0xb1, 0xce,
+ 0xa3, 0xfe, 0xcd, 0xd9, 0xdd, 0x92, 0x7b, 0xdb,
+ 0xb8, 0xfb, 0xc9, 0x2d, 0x01, 0x59, 0x39, 0x52,
+ 0xad, 0x1b, 0xec, 0xcf, 0xd7, 0x70, 0x13, 0x21,
+ 0xf5, 0x47, 0xaa, 0x18, 0x21, 0x5c, 0xc9, 0x9a,
+ 0xd2, 0x6b, 0x05, 0x9c, 0x01, 0xa1, 0xda, 0x35,
+ 0x5d, 0xb3, 0x70, 0xe6, 0xa9, 0x80, 0x8b, 0x91,
+ 0xb7, 0xb3, 0x5f, 0x24, 0x9a, 0xb7, 0xd1, 0x6b,
+ 0xa1, 0x1c, 0x50, 0xba, 0x49, 0xe0, 0xee, 0x2e,
+ 0x75, 0xac, 0x69, 0xc0, 0xeb, 0x03, 0xdd, 0x19,
+ 0xe5, 0xf6, 0x06, 0xdd, 0xc3, 0xd7, 0x2b, 0x07,
+ 0x07, 0x30, 0xa7, 0x19, 0x0c, 0xbf, 0xe6, 0x18,
+ 0xcc, 0xb1, 0x01, 0x11, 0x85, 0x77, 0x1d, 0x96,
+ 0xa7, 0xa3, 0x00, 0x84, 0x02, 0xa2, 0x83, 0x68,
+ 0xda, 0x17, 0x27, 0xc8, 0x7f, 0x23, 0xb7, 0xf4,
+ 0x13, 0x85, 0xcf, 0xdd, 0x7a, 0x7d, 0x24, 0x57,
+ 0xfe, 0x05, 0x93, 0xf5, 0x74, 0xce, 0xed, 0x0c,
+ 0x20, 0x98, 0x8d, 0x92, 0x30, 0xa1, 0x29, 0x23,
+ 0x1a, 0xa0, 0x4f, 0x69, 0x56, 0x4c, 0xe1, 0xc8,
+ 0xce, 0xf6, 0x9a, 0x0c, 0xa4, 0xfa, 0x04, 0xf6,
+ 0x62, 0x95, 0xf2, 0xfa, 0xc7, 0x40, 0x68, 0x40,
+ 0x8f, 0x41, 0xda, 0xb4, 0x26, 0x6f, 0x70, 0xab,
+ 0x40, 0x61, 0xa4, 0x0e, 0x75, 0xfb, 0x86, 0xeb,
+ 0x9d, 0x9a, 0x1f, 0xec, 0x76, 0x99, 0xe7, 0xea,
+ 0xaa, 0x1e, 0x2d, 0xb5, 0xd4, 0xa6, 0x1a, 0xb8,
+ 0x61, 0x0a, 0x1d, 0x16, 0x5b, 0x98, 0xc2, 0x31,
+ 0x40, 0xe7, 0x23, 0x1d, 0x66, 0x99, 0xc8, 0xc0,
+ 0xd7, 0xce, 0xf3, 0x57, 0x40, 0x04, 0x3f, 0xfc,
+ 0xea, 0xb3, 0xfc, 0xd2, 0xd3, 0x99, 0xa4, 0x94,
+ 0x69, 0xa0, 0xef, 0xd1, 0x85, 0xb3, 0xa6, 0xb1,
+ 0x28, 0xbf, 0x94, 0x67, 0x22, 0xc3, 0x36, 0x46,
+ 0xf8, 0xd2, 0x0f, 0x5f, 0xf4, 0x59, 0x80, 0xe6,
+ 0x2d, 0x43, 0x08, 0x7d, 0x19, 0x09, 0x97, 0xa7,
+ 0x4c, 0x3d, 0x8d, 0xba, 0x65, 0x62, 0xa3, 0x71,
+ 0x33, 0x29, 0x62, 0xdb, 0xc1, 0x33, 0x34, 0x1a,
+ 0x63, 0x33, 0x16, 0xb6, 0x64, 0x7e, 0xab, 0x33,
+ 0xf0, 0xe6, 0x26, 0x68, 0xba, 0x1d, 0x2e, 0x38,
+ 0x08, 0xe6, 0x02, 0xd3, 0x25, 0x2c, 0x47, 0x23,
+ 0x58, 0x34, 0x0f, 0x9d, 0x63, 0x4f, 0x63, 0xbb,
+ 0x7f, 0x3b, 0x34, 0x38, 0xa7, 0xb5, 0x8d, 0x65,
+ 0xd9, 0x9f, 0x79, 0x55, 0x3e, 0x4d, 0xe7, 0x73,
+ 0xd8, 0xf6, 0x98, 0x97, 0x84, 0x60, 0x9c, 0xc8,
+ 0xa9, 0x3c, 0xf6, 0xdc, 0x12, 0x5c, 0xe1, 0xbb,
+ 0x0b, 0x8b, 0x98, 0x9c, 0x9d, 0x26, 0x7c, 0x4a,
+ 0xe6, 0x46, 0x36, 0x58, 0x21, 0x4a, 0xee, 0xca,
+ 0xd7, 0x3b, 0xc2, 0x6c, 0x49, 0x2f, 0xe5, 0xd5,
+ 0x03, 0x59, 0x84, 0x53, 0xcb, 0xfe, 0x92, 0x71,
+ 0x2e, 0x7c, 0x21, 0xcc, 0x99, 0x85, 0x7f, 0xb8,
+ 0x74, 0x90, 0x13, 0x42, 0x3f, 0xe0, 0x6b, 0x1d,
+ 0xf2, 0x4d, 0x54, 0xd4, 0xfc, 0x3a, 0x05, 0xe6,
+ 0x74, 0xaf, 0xa6, 0xa0, 0x2a, 0x20, 0x23, 0x5d,
+ 0x34, 0x5c, 0xd9, 0x3e, 0x4e, 0xfa, 0x93, 0xe7,
+ 0xaa, 0xe9, 0x6f, 0x08, 0x43, 0x67, 0x41, 0xc5,
+ 0xad, 0xfb, 0x31, 0x95, 0x82, 0x73, 0x32, 0xd8,
+ 0xa6, 0xa3, 0xed, 0x0e, 0x2d, 0xf6, 0x5f, 0xfd,
+ 0x80, 0xa6, 0x7a, 0xe0, 0xdf, 0x78, 0x15, 0x29,
+ 0x74, 0x33, 0xd0, 0x9e, 0x83, 0x86, 0x72, 0x22,
+ 0x57, 0x29, 0xb9, 0x9e, 0x5d, 0xd3, 0x1a, 0xb5,
+ 0x96, 0x72, 0x41, 0x3d, 0xf1, 0x64, 0x43, 0x67,
+ 0xee, 0xaa, 0x5c, 0xd3, 0x9a, 0x96, 0x13, 0x11,
+ 0x5d, 0xf3, 0x0c, 0x87, 0x82, 0x1e, 0x41, 0x9e,
+ 0xd0, 0x27, 0xd7, 0x54, 0x3b, 0x67, 0x73, 0x09,
+ 0x91, 0xe9, 0xd5, 0x36, 0xa7, 0xb5, 0x55, 0xe4,
+ 0xf3, 0x21, 0x51, 0x49, 0x22, 0x07, 0x55, 0x4f,
+ 0x44, 0x4b, 0xd2, 0x15, 0x93, 0x17, 0x2a, 0xfa,
+ 0x4d, 0x4a, 0x57, 0xdb, 0x4c, 0xa6, 0xeb, 0xec,
+ 0x53, 0x25, 0x6c, 0x21, 0xed, 0x00, 0x4c, 0x3b,
+ 0xca, 0x14, 0x57, 0xa9, 0xd6, 0x6a, 0xcd, 0x8d,
+ 0x5e, 0x74, 0xac, 0x72, 0xc1, 0x97, 0xe5, 0x1b,
+ 0x45, 0x4e, 0xda, 0xfc, 0xcc, 0x40, 0xe8, 0x48,
+ 0x88, 0x0b, 0xa3, 0xe3, 0x8d, 0x83, 0x42, 0xc3,
+ 0x23, 0xfd, 0x68, 0xb5, 0x8e, 0xf1, 0x9d, 0x63,
+ 0x77, 0xe9, 0xa3, 0x8e, 0x8c, 0x26, 0x6b, 0xbd,
+ 0x72, 0x73, 0x35, 0x0c, 0x03, 0xf8, 0x43, 0x78,
+ 0x52, 0x71, 0x15, 0x1f, 0x71, 0x5d, 0x6e, 0xed,
+ 0xb9, 0xcc, 0x86, 0x30, 0xdb, 0x2b, 0xd3, 0x82,
+ 0x88, 0x23, 0x71, 0x90, 0x53, 0x5c, 0xa9, 0x2f,
+ 0x76, 0x01, 0xb7, 0x9a, 0xfe, 0x43, 0x55, 0xa3,
+ 0x04, 0x9b, 0x0e, 0xe4, 0x59, 0xdf, 0xc9, 0xe9,
+ 0xb1, 0xea, 0x29, 0x28, 0x3c, 0x5c, 0xae, 0x72,
+ 0x84, 0xb6, 0xc6, 0xeb, 0x0c, 0x27, 0x07, 0x74,
+ 0x90, 0x0d, 0x31, 0xb0, 0x00, 0x77, 0xe9, 0x40,
+ 0x70, 0x6f, 0x68, 0xa7, 0xfd, 0x06, 0xec, 0x4b,
+ 0xc0, 0xb7, 0xac, 0xbc, 0x33, 0xb7, 0x6d, 0x0a,
+ 0xbd, 0x12, 0x1b, 0x59, 0xcb, 0xdd, 0x32, 0xf5,
+ 0x1d, 0x94, 0x57, 0x76, 0x9e, 0x0c, 0x18, 0x98,
+ 0x71, 0xd7, 0x2a, 0xdb, 0x0b, 0x7b, 0xa7, 0x71,
+ 0xb7, 0x67, 0x81, 0x23, 0x96, 0xae, 0xb9, 0x7e,
+ 0x32, 0x43, 0x92, 0x8a, 0x19, 0xa0, 0xc4, 0xd4,
+ 0x3b, 0x57, 0xf9, 0x4a, 0x2c, 0xfb, 0x51, 0x46,
+ 0xbb, 0xcb, 0x5d, 0xb3, 0xef, 0x13, 0x93, 0x6e,
+ 0x68, 0x42, 0x54, 0x57, 0xd3, 0x6a, 0x3a, 0x8f,
+ 0x9d, 0x66, 0xbf, 0xbd, 0x36, 0x23, 0xf5, 0x93,
+ 0x83, 0x7b, 0x9c, 0xc0, 0xdd, 0xc5, 0x49, 0xc0,
+ 0x64, 0xed, 0x07, 0x12, 0xb3, 0xe6, 0xe4, 0xe5,
+ 0x38, 0x95, 0x23, 0xb1, 0xa0, 0x3b, 0x1a, 0x61,
+ 0xda, 0x17, 0xac, 0xc3, 0x58, 0xdd, 0x74, 0x64,
+ 0x22, 0x11, 0xe8, 0x32, 0x1d, 0x16, 0x93, 0x85,
+ 0x99, 0xa5, 0x9c, 0x34, 0x55, 0xb1, 0xe9, 0x20,
+ 0x72, 0xc9, 0x28, 0x7b, 0x79, 0x00, 0xa1, 0xa6,
+ 0xa3, 0x27, 0x40, 0x18, 0x8a, 0x54, 0xe0, 0xcc,
+ 0xe8, 0x4e, 0x8e, 0x43, 0x96, 0xe7, 0x3f, 0xc8,
+ 0xe9, 0xb2, 0xf9, 0xc9, 0xda, 0x04, 0x71, 0x50,
+ 0x47, 0xe4, 0xaa, 0xce, 0xa2, 0x30, 0xc8, 0xe4,
+ 0xac, 0xc7, 0x0d, 0x06, 0x2e, 0xe6, 0xe8, 0x80,
+ 0x36, 0x29, 0x9e, 0x01, 0xb8, 0xc3, 0xf0, 0xa0,
+ 0x5d, 0x7a, 0xca, 0x4d, 0xa0, 0x57, 0xbd, 0x2a,
+ 0x45, 0xa7, 0x7f, 0x9c, 0x93, 0x07, 0x8f, 0x35,
+ 0x67, 0x92, 0xe3, 0xe9, 0x7f, 0xa8, 0x61, 0x43,
+ 0x9e, 0x25, 0x4f, 0x33, 0x76, 0x13, 0x6e, 0x12,
+ 0xb9, 0xdd, 0xa4, 0x7c, 0x08, 0x9f, 0x7c, 0xe7,
+ 0x0a, 0x8d, 0x84, 0x06, 0xa4, 0x33, 0x17, 0x34,
+ 0x5e, 0x10, 0x7c, 0xc0, 0xa8, 0x3d, 0x1f, 0x42,
+ 0x20, 0x51, 0x65, 0x5d, 0x09, 0xc3, 0xaa, 0xc0,
+ 0xc8, 0x0d, 0xf0, 0x79, 0xbc, 0x20, 0x1b, 0x95,
+ 0xe7, 0x06, 0x7d, 0x47, 0x20, 0x03, 0x1a, 0x74,
+ 0xdd, 0xe2, 0xd4, 0xae, 0x38, 0x71, 0x9b, 0xf5,
+ 0x80, 0xec, 0x08, 0x4e, 0x56, 0xba, 0x76, 0x12,
+ 0x1a, 0xdf, 0x48, 0xf3, 0xae, 0xb3, 0xe6, 0xe6,
+ 0xbe, 0xc0, 0x91, 0x2e, 0x01, 0xb3, 0x01, 0x86,
+ 0xa2, 0xb9, 0x52, 0xd1, 0x21, 0xae, 0xd4, 0x97,
+ 0x1d, 0xef, 0x41, 0x12, 0x95, 0x3d, 0x48, 0x45,
+ 0x1c, 0x56, 0x32, 0x8f, 0xb8, 0x43, 0xbb, 0x19,
+ 0xf3, 0xca, 0xe9, 0xeb, 0x6d, 0x84, 0xbe, 0x86,
+ 0x06, 0xe2, 0x36, 0xb2, 0x62, 0x9d, 0xd3, 0x4c,
+ 0x48, 0x18, 0x54, 0x13, 0x4e, 0xcf, 0xfd, 0xba,
+ 0x84, 0xb9, 0x30, 0x53, 0xcf, 0xfb, 0xb9, 0x29,
+ 0x8f, 0xdc, 0x9f, 0xef, 0x60, 0x0b, 0x64, 0xf6,
+ 0x8b, 0xee, 0xa6, 0x91, 0xc2, 0x41, 0x6c, 0xf6,
+ 0xfa, 0x79, 0x67, 0x4b, 0xc1, 0x3f, 0xaf, 0x09,
+ 0x81, 0xd4, 0x5d, 0xcb, 0x09, 0xdf, 0x36, 0x31,
+ 0xc0, 0x14, 0x3c, 0x7c, 0x0e, 0x65, 0x95, 0x99,
+ 0x6d, 0xa3, 0xf4, 0xd7, 0x38, 0xee, 0x1a, 0x2b,
+ 0x37, 0xe2, 0xa4, 0x3b, 0x4b, 0xd0, 0x65, 0xca,
+ 0xf8, 0xc3, 0xe8, 0x15, 0x20, 0xef, 0xf2, 0x00,
+ 0xfd, 0x01, 0x09, 0xc5, 0xc8, 0x17, 0x04, 0x93,
+ 0xd0, 0x93, 0x03, 0x55, 0xc5, 0xfe, 0x32, 0xa3,
+ 0x3e, 0x28, 0x2d, 0x3b, 0x93, 0x8a, 0xcc, 0x07,
+ 0x72, 0x80, 0x8b, 0x74, 0x16, 0x24, 0xbb, 0xda,
+ 0x94, 0x39, 0x30, 0x8f, 0xb1, 0xcd, 0x4a, 0x90,
+ 0x92, 0x7c, 0x14, 0x8f, 0x95, 0x4e, 0xac, 0x9b,
+ 0xd8, 0x8f, 0x1a, 0x87, 0xa4, 0x32, 0x27, 0x8a,
+ 0xba, 0xf7, 0x41, 0xcf, 0x84, 0x37, 0x19, 0xe6,
+ 0x06, 0xf5, 0x0e, 0xcf, 0x36, 0xf5, 0x9e, 0x6c,
+ 0xde, 0xbc, 0xff, 0x64, 0x7e, 0x4e, 0x59, 0x57,
+ 0x48, 0xfe, 0x14, 0xf7, 0x9c, 0x93, 0x5d, 0x15,
+ 0xad, 0xcc, 0x11, 0xb1, 0x17, 0x18, 0xb2, 0x7e,
+ 0xcc, 0xab, 0xe9, 0xce, 0x7d, 0x77, 0x5b, 0x51,
+ 0x1b, 0x1e, 0x20, 0xa8, 0x32, 0x06, 0x0e, 0x75,
+ 0x93, 0xac, 0xdb, 0x35, 0x37, 0x1f, 0xe9, 0x19,
+ 0x1d, 0xb4, 0x71, 0x97, 0xd6, 0x4e, 0x2c, 0x08,
+ 0xa5, 0x13, 0xf9, 0x0e, 0x7e, 0x78, 0x6e, 0x14,
+ 0xe0, 0xa9, 0xb9, 0x96, 0x4c, 0x80, 0x82, 0xba,
+ 0x17, 0xb3, 0x9d, 0x69, 0xb0, 0x84, 0x46, 0xff,
+ 0xf9, 0x52, 0x79, 0x94, 0x58, 0x3a, 0x62, 0x90,
+ 0x15, 0x35, 0x71, 0x10, 0x37, 0xed, 0xa1, 0x8e,
+ 0x53, 0x6e, 0xf4, 0x26, 0x57, 0x93, 0x15, 0x93,
+ 0xf6, 0x81, 0x2c, 0x5a, 0x10, 0xda, 0x92, 0xad,
+ 0x2f, 0xdb, 0x28, 0x31, 0x2d, 0x55, 0x04, 0xd2,
+ 0x06, 0x28, 0x8c, 0x1e, 0xdc, 0xea, 0x54, 0xac,
+ 0xff, 0xb7, 0x6c, 0x30, 0x15, 0xd4, 0xb4, 0x0d,
+ 0x00, 0x93, 0x57, 0xdd, 0xd2, 0x07, 0x07, 0x06,
+ 0xd9, 0x43, 0x9b, 0xcd, 0x3a, 0xf4, 0x7d, 0x4c,
+ 0x36, 0x5d, 0x23, 0xa2, 0xcc, 0x57, 0x40, 0x91,
+ 0xe9, 0x2c, 0x2f, 0x2c, 0xd5, 0x30, 0x9b, 0x17,
+ 0xb0, 0xc9, 0xf7, 0xa7, 0x2f, 0xd1, 0x93, 0x20,
+ 0x6b, 0xc6, 0xc1, 0xe4, 0x6f, 0xcb, 0xd1, 0xe7,
+ 0x09, 0x0f, 0x9e, 0xdc, 0xaa, 0x9f, 0x2f, 0xdf,
+ 0x56, 0x9f, 0xd4, 0x33, 0x04, 0xaf, 0xd3, 0x6c,
+ 0x58, 0x61, 0xf0, 0x30, 0xec, 0xf2, 0x7f, 0xf2,
+ 0x9c, 0xdf, 0x39, 0xbb, 0x6f, 0xa2, 0x8c, 0x7e,
+ 0xc4, 0x22, 0x51, 0x71, 0xc0, 0x4d, 0x14, 0x1a,
+ 0xc4, 0xcd, 0x04, 0xd9, 0x87, 0x08, 0x50, 0x05,
+ 0xcc, 0xaf, 0xf6, 0xf0, 0x8f, 0x92, 0x54, 0x58,
+ 0xc2, 0xc7, 0x09, 0x7a, 0x59, 0x02, 0x05, 0xe8,
+ 0xb0, 0x86, 0xd9, 0xbf, 0x7b, 0x35, 0x51, 0x4d,
+ 0xaf, 0x08, 0x97, 0x2c, 0x65, 0xda, 0x2a, 0x71,
+ 0x3a, 0xa8, 0x51, 0xcc, 0xf2, 0x73, 0x27, 0xc3,
+ 0xfd, 0x62, 0xcf, 0xe3, 0xb2, 0xca, 0xcb, 0xbe,
+ 0x1a, 0x0a, 0xa1, 0x34, 0x7b, 0x77, 0xc4, 0x62,
+ 0x68, 0x78, 0x5f, 0x94, 0x07, 0x04, 0x65, 0x16,
+ 0x4b, 0x61, 0xcb, 0xff, 0x75, 0x26, 0x50, 0x66,
+ 0x1f, 0x6e, 0x93, 0xf8, 0xc5, 0x51, 0xeb, 0xa4,
+ 0x4a, 0x48, 0x68, 0x6b, 0xe2, 0x5e, 0x44, 0xb2,
+ 0x50, 0x2c, 0x6c, 0xae, 0x79, 0x4e, 0x66, 0x35,
+ 0x81, 0x50, 0xac, 0xbc, 0x3f, 0xb1, 0x0c, 0xf3,
+ 0x05, 0x3c, 0x4a, 0xa3, 0x6c, 0x2a, 0x79, 0xb4,
+ 0xb7, 0xab, 0xca, 0xc7, 0x9b, 0x8e, 0xcd, 0x5f,
+ 0x11, 0x03, 0xcb, 0x30, 0xa3, 0xab, 0xda, 0xfe,
+ 0x64, 0xb9, 0xbb, 0xd8, 0x5e, 0x3a, 0x1a, 0x56,
+ 0xe5, 0x05, 0x48, 0x90, 0x1e, 0x61, 0x69, 0x1b,
+ 0x22, 0xe6, 0x1a, 0x3c, 0x75, 0xad, 0x1f, 0x37,
+ 0x28, 0xdc, 0xe4, 0x6d, 0xbd, 0x42, 0xdc, 0xd3,
+ 0xc8, 0xb6, 0x1c, 0x48, 0xfe, 0x94, 0x77, 0x7f,
+ 0xbd, 0x62, 0xac, 0xa3, 0x47, 0x27, 0xcf, 0x5f,
+ 0xd9, 0xdb, 0xaf, 0xec, 0xf7, 0x5e, 0xc1, 0xb0,
+ 0x9d, 0x01, 0x26, 0x99, 0x7e, 0x8f, 0x03, 0x70,
+ 0xb5, 0x42, 0xbe, 0x67, 0x28, 0x1b, 0x7c, 0xbd,
+ 0x61, 0x21, 0x97, 0xcc, 0x5c, 0xe1, 0x97, 0x8f,
+ 0x8d, 0xde, 0x2b, 0xaa, 0xa7, 0x71, 0x1d, 0x1e,
+ 0x02, 0x73, 0x70, 0x58, 0x32, 0x5b, 0x1d, 0x67,
+ 0x3d, 0xe0, 0x74, 0x4f, 0x03, 0xf2, 0x70, 0x51,
+ 0x79, 0xf1, 0x61, 0x70, 0x15, 0x74, 0x9d, 0x23,
+ 0x89, 0xde, 0xac, 0xfd, 0xde, 0xd0, 0x1f, 0xc3,
+ 0x87, 0x44, 0x35, 0x4b, 0xe5, 0xb0, 0x60, 0xc5,
+ 0x22, 0xe4, 0x9e, 0xca, 0xeb, 0xd5, 0x3a, 0x09,
+ 0x45, 0xa4, 0xdb, 0xfa, 0x3f, 0xeb, 0x1b, 0xc7,
+ 0xc8, 0x14, 0x99, 0x51, 0x92, 0x10, 0xed, 0xed,
+ 0x28, 0xe0, 0xa1, 0xf8, 0x26, 0xcf, 0xcd, 0xcb,
+ 0x63, 0xa1, 0x3b, 0xe3, 0xdf, 0x7e, 0xfe, 0xa6,
+ 0xf0, 0x81, 0x9a, 0xbf, 0x55, 0xde, 0x54, 0xd5,
+ 0x56, 0x60, 0x98, 0x10, 0x68, 0xf4, 0x38, 0x96,
+ 0x8e, 0x6f, 0x1d, 0x44, 0x7f, 0xd6, 0x2f, 0xfe,
+ 0x55, 0xfb, 0x0c, 0x7e, 0x67, 0xe2, 0x61, 0x44,
+ 0xed, 0xf2, 0x35, 0x30, 0x5d, 0xe9, 0xc7, 0xd6,
+ 0x6d, 0xe0, 0xa0, 0xed, 0xf3, 0xfc, 0xd8, 0x3e,
+ 0x0a, 0x7b, 0xcd, 0xaf, 0x65, 0x68, 0x18, 0xc0,
+ 0xec, 0x04, 0x1c, 0x74, 0x6d, 0xe2, 0x6e, 0x79,
+ 0xd4, 0x11, 0x2b, 0x62, 0xd5, 0x27, 0xad, 0x4f,
+ 0x01, 0x59, 0x73, 0xcc, 0x6a, 0x53, 0xfb, 0x2d,
+ 0xd5, 0x4e, 0x99, 0x21, 0x65, 0x4d, 0xf5, 0x82,
+ 0xf7, 0xd8, 0x42, 0xce, 0x6f, 0x3d, 0x36, 0x47,
+ 0xf1, 0x05, 0x16, 0xe8, 0x1b, 0x6a, 0x8f, 0x93,
+ 0xf2, 0x8f, 0x37, 0x40, 0x12, 0x28, 0xa3, 0xe6,
+ 0xb9, 0x17, 0x4a, 0x1f, 0xb1, 0xd1, 0x66, 0x69,
+ 0x86, 0xc4, 0xfc, 0x97, 0xae, 0x3f, 0x8f, 0x1e,
+ 0x2b, 0xdf, 0xcd, 0xf9, 0x3c
+};
+static const u8 enc_assoc011[] __initconst = {
+ 0xd6, 0x31, 0xda, 0x5d, 0x42, 0x5e, 0xd7
+};
+static const u8 enc_nonce011[] __initconst = {
+ 0xfd, 0x87, 0xd4, 0xd8, 0x62, 0xfd, 0xec, 0xaa
+};
+static const u8 enc_key011[] __initconst = {
+ 0x35, 0x4e, 0xb5, 0x70, 0x50, 0x42, 0x8a, 0x85,
+ 0xf2, 0xfb, 0xed, 0x7b, 0xd0, 0x9e, 0x97, 0xca,
+ 0xfa, 0x98, 0x66, 0x63, 0xee, 0x37, 0xcc, 0x52,
+ 0xfe, 0xd1, 0xdf, 0x95, 0x15, 0x34, 0x29, 0x38
+};
+
+static const u8 enc_input012[] __initconst = {
+ 0x74, 0xa6, 0x3e, 0xe4, 0xb1, 0xcb, 0xaf, 0xb0,
+ 0x40, 0xe5, 0x0f, 0x9e, 0xf1, 0xf2, 0x89, 0xb5,
+ 0x42, 0x34, 0x8a, 0xa1, 0x03, 0xb7, 0xe9, 0x57,
+ 0x46, 0xbe, 0x20, 0xe4, 0x6e, 0xb0, 0xeb, 0xff,
+ 0xea, 0x07, 0x7e, 0xef, 0xe2, 0x55, 0x9f, 0xe5,
+ 0x78, 0x3a, 0xb7, 0x83, 0xc2, 0x18, 0x40, 0x7b,
+ 0xeb, 0xcd, 0x81, 0xfb, 0x90, 0x12, 0x9e, 0x46,
+ 0xa9, 0xd6, 0x4a, 0xba, 0xb0, 0x62, 0xdb, 0x6b,
+ 0x99, 0xc4, 0xdb, 0x54, 0x4b, 0xb8, 0xa5, 0x71,
+ 0xcb, 0xcd, 0x63, 0x32, 0x55, 0xfb, 0x31, 0xf0,
+ 0x38, 0xf5, 0xbe, 0x78, 0xe4, 0x45, 0xce, 0x1b,
+ 0x6a, 0x5b, 0x0e, 0xf4, 0x16, 0xe4, 0xb1, 0x3d,
+ 0xf6, 0x63, 0x7b, 0xa7, 0x0c, 0xde, 0x6f, 0x8f,
+ 0x74, 0xdf, 0xe0, 0x1e, 0x9d, 0xce, 0x8f, 0x24,
+ 0xef, 0x23, 0x35, 0x33, 0x7b, 0x83, 0x34, 0x23,
+ 0x58, 0x74, 0x14, 0x77, 0x1f, 0xc2, 0x4f, 0x4e,
+ 0xc6, 0x89, 0xf9, 0x52, 0x09, 0x37, 0x64, 0x14,
+ 0xc4, 0x01, 0x6b, 0x9d, 0x77, 0xe8, 0x90, 0x5d,
+ 0xa8, 0x4a, 0x2a, 0xef, 0x5c, 0x7f, 0xeb, 0xbb,
+ 0xb2, 0xc6, 0x93, 0x99, 0x66, 0xdc, 0x7f, 0xd4,
+ 0x9e, 0x2a, 0xca, 0x8d, 0xdb, 0xe7, 0x20, 0xcf,
+ 0xe4, 0x73, 0xae, 0x49, 0x7d, 0x64, 0x0f, 0x0e,
+ 0x28, 0x46, 0xa9, 0xa8, 0x32, 0xe4, 0x0e, 0xf6,
+ 0x51, 0x53, 0xb8, 0x3c, 0xb1, 0xff, 0xa3, 0x33,
+ 0x41, 0x75, 0xff, 0xf1, 0x6f, 0xf1, 0xfb, 0xbb,
+ 0x83, 0x7f, 0x06, 0x9b, 0xe7, 0x1b, 0x0a, 0xe0,
+ 0x5c, 0x33, 0x60, 0x5b, 0xdb, 0x5b, 0xed, 0xfe,
+ 0xa5, 0x16, 0x19, 0x72, 0xa3, 0x64, 0x23, 0x00,
+ 0x02, 0xc7, 0xf3, 0x6a, 0x81, 0x3e, 0x44, 0x1d,
+ 0x79, 0x15, 0x5f, 0x9a, 0xde, 0xe2, 0xfd, 0x1b,
+ 0x73, 0xc1, 0xbc, 0x23, 0xba, 0x31, 0xd2, 0x50,
+ 0xd5, 0xad, 0x7f, 0x74, 0xa7, 0xc9, 0xf8, 0x3e,
+ 0x2b, 0x26, 0x10, 0xf6, 0x03, 0x36, 0x74, 0xe4,
+ 0x0e, 0x6a, 0x72, 0xb7, 0x73, 0x0a, 0x42, 0x28,
+ 0xc2, 0xad, 0x5e, 0x03, 0xbe, 0xb8, 0x0b, 0xa8,
+ 0x5b, 0xd4, 0xb8, 0xba, 0x52, 0x89, 0xb1, 0x9b,
+ 0xc1, 0xc3, 0x65, 0x87, 0xed, 0xa5, 0xf4, 0x86,
+ 0xfd, 0x41, 0x80, 0x91, 0x27, 0x59, 0x53, 0x67,
+ 0x15, 0x78, 0x54, 0x8b, 0x2d, 0x3d, 0xc7, 0xff,
+ 0x02, 0x92, 0x07, 0x5f, 0x7a, 0x4b, 0x60, 0x59,
+ 0x3c, 0x6f, 0x5c, 0xd8, 0xec, 0x95, 0xd2, 0xfe,
+ 0xa0, 0x3b, 0xd8, 0x3f, 0xd1, 0x69, 0xa6, 0xd6,
+ 0x41, 0xb2, 0xf4, 0x4d, 0x12, 0xf4, 0x58, 0x3e,
+ 0x66, 0x64, 0x80, 0x31, 0x9b, 0xa8, 0x4c, 0x8b,
+ 0x07, 0xb2, 0xec, 0x66, 0x94, 0x66, 0x47, 0x50,
+ 0x50, 0x5f, 0x18, 0x0b, 0x0e, 0xd6, 0xc0, 0x39,
+ 0x21, 0x13, 0x9e, 0x33, 0xbc, 0x79, 0x36, 0x02,
+ 0x96, 0x70, 0xf0, 0x48, 0x67, 0x2f, 0x26, 0xe9,
+ 0x6d, 0x10, 0xbb, 0xd6, 0x3f, 0xd1, 0x64, 0x7a,
+ 0x2e, 0xbe, 0x0c, 0x61, 0xf0, 0x75, 0x42, 0x38,
+ 0x23, 0xb1, 0x9e, 0x9f, 0x7c, 0x67, 0x66, 0xd9,
+ 0x58, 0x9a, 0xf1, 0xbb, 0x41, 0x2a, 0x8d, 0x65,
+ 0x84, 0x94, 0xfc, 0xdc, 0x6a, 0x50, 0x64, 0xdb,
+ 0x56, 0x33, 0x76, 0x00, 0x10, 0xed, 0xbe, 0xd2,
+ 0x12, 0xf6, 0xf6, 0x1b, 0xa2, 0x16, 0xde, 0xae,
+ 0x31, 0x95, 0xdd, 0xb1, 0x08, 0x7e, 0x4e, 0xee,
+ 0xe7, 0xf9, 0xa5, 0xfb, 0x5b, 0x61, 0x43, 0x00,
+ 0x40, 0xf6, 0x7e, 0x02, 0x04, 0x32, 0x4e, 0x0c,
+ 0xe2, 0x66, 0x0d, 0xd7, 0x07, 0x98, 0x0e, 0xf8,
+ 0x72, 0x34, 0x6d, 0x95, 0x86, 0xd7, 0xcb, 0x31,
+ 0x54, 0x47, 0xd0, 0x38, 0x29, 0x9c, 0x5a, 0x68,
+ 0xd4, 0x87, 0x76, 0xc9, 0xe7, 0x7e, 0xe3, 0xf4,
+ 0x81, 0x6d, 0x18, 0xcb, 0xc9, 0x05, 0xaf, 0xa0,
+ 0xfb, 0x66, 0xf7, 0xf1, 0x1c, 0xc6, 0x14, 0x11,
+ 0x4f, 0x2b, 0x79, 0x42, 0x8b, 0xbc, 0xac, 0xe7,
+ 0x6c, 0xfe, 0x0f, 0x58, 0xe7, 0x7c, 0x78, 0x39,
+ 0x30, 0xb0, 0x66, 0x2c, 0x9b, 0x6d, 0x3a, 0xe1,
+ 0xcf, 0xc9, 0xa4, 0x0e, 0x6d, 0x6d, 0x8a, 0xa1,
+ 0x3a, 0xe7, 0x28, 0xd4, 0x78, 0x4c, 0xa6, 0xa2,
+ 0x2a, 0xa6, 0x03, 0x30, 0xd7, 0xa8, 0x25, 0x66,
+ 0x87, 0x2f, 0x69, 0x5c, 0x4e, 0xdd, 0xa5, 0x49,
+ 0x5d, 0x37, 0x4a, 0x59, 0xc4, 0xaf, 0x1f, 0xa2,
+ 0xe4, 0xf8, 0xa6, 0x12, 0x97, 0xd5, 0x79, 0xf5,
+ 0xe2, 0x4a, 0x2b, 0x5f, 0x61, 0xe4, 0x9e, 0xe3,
+ 0xee, 0xb8, 0xa7, 0x5b, 0x2f, 0xf4, 0x9e, 0x6c,
+ 0xfb, 0xd1, 0xc6, 0x56, 0x77, 0xba, 0x75, 0xaa,
+ 0x3d, 0x1a, 0xa8, 0x0b, 0xb3, 0x68, 0x24, 0x00,
+ 0x10, 0x7f, 0xfd, 0xd7, 0xa1, 0x8d, 0x83, 0x54,
+ 0x4f, 0x1f, 0xd8, 0x2a, 0xbe, 0x8a, 0x0c, 0x87,
+ 0xab, 0xa2, 0xde, 0xc3, 0x39, 0xbf, 0x09, 0x03,
+ 0xa5, 0xf3, 0x05, 0x28, 0xe1, 0xe1, 0xee, 0x39,
+ 0x70, 0x9c, 0xd8, 0x81, 0x12, 0x1e, 0x02, 0x40,
+ 0xd2, 0x6e, 0xf0, 0xeb, 0x1b, 0x3d, 0x22, 0xc6,
+ 0xe5, 0xe3, 0xb4, 0x5a, 0x98, 0xbb, 0xf0, 0x22,
+ 0x28, 0x8d, 0xe5, 0xd3, 0x16, 0x48, 0x24, 0xa5,
+ 0xe6, 0x66, 0x0c, 0xf9, 0x08, 0xf9, 0x7e, 0x1e,
+ 0xe1, 0x28, 0x26, 0x22, 0xc7, 0xc7, 0x0a, 0x32,
+ 0x47, 0xfa, 0xa3, 0xbe, 0x3c, 0xc4, 0xc5, 0x53,
+ 0x0a, 0xd5, 0x94, 0x4a, 0xd7, 0x93, 0xd8, 0x42,
+ 0x99, 0xb9, 0x0a, 0xdb, 0x56, 0xf7, 0xb9, 0x1c,
+ 0x53, 0x4f, 0xfa, 0xd3, 0x74, 0xad, 0xd9, 0x68,
+ 0xf1, 0x1b, 0xdf, 0x61, 0xc6, 0x5e, 0xa8, 0x48,
+ 0xfc, 0xd4, 0x4a, 0x4c, 0x3c, 0x32, 0xf7, 0x1c,
+ 0x96, 0x21, 0x9b, 0xf9, 0xa3, 0xcc, 0x5a, 0xce,
+ 0xd5, 0xd7, 0x08, 0x24, 0xf6, 0x1c, 0xfd, 0xdd,
+ 0x38, 0xc2, 0x32, 0xe9, 0xb8, 0xe7, 0xb6, 0xfa,
+ 0x9d, 0x45, 0x13, 0x2c, 0x83, 0xfd, 0x4a, 0x69,
+ 0x82, 0xcd, 0xdc, 0xb3, 0x76, 0x0c, 0x9e, 0xd8,
+ 0xf4, 0x1b, 0x45, 0x15, 0xb4, 0x97, 0xe7, 0x58,
+ 0x34, 0xe2, 0x03, 0x29, 0x5a, 0xbf, 0xb6, 0xe0,
+ 0x5d, 0x13, 0xd9, 0x2b, 0xb4, 0x80, 0xb2, 0x45,
+ 0x81, 0x6a, 0x2e, 0x6c, 0x89, 0x7d, 0xee, 0xbb,
+ 0x52, 0xdd, 0x1f, 0x18, 0xe7, 0x13, 0x6b, 0x33,
+ 0x0e, 0xea, 0x36, 0x92, 0x77, 0x7b, 0x6d, 0x9c,
+ 0x5a, 0x5f, 0x45, 0x7b, 0x7b, 0x35, 0x62, 0x23,
+ 0xd1, 0xbf, 0x0f, 0xd0, 0x08, 0x1b, 0x2b, 0x80,
+ 0x6b, 0x7e, 0xf1, 0x21, 0x47, 0xb0, 0x57, 0xd1,
+ 0x98, 0x72, 0x90, 0x34, 0x1c, 0x20, 0x04, 0xff,
+ 0x3d, 0x5c, 0xee, 0x0e, 0x57, 0x5f, 0x6f, 0x24,
+ 0x4e, 0x3c, 0xea, 0xfc, 0xa5, 0xa9, 0x83, 0xc9,
+ 0x61, 0xb4, 0x51, 0x24, 0xf8, 0x27, 0x5e, 0x46,
+ 0x8c, 0xb1, 0x53, 0x02, 0x96, 0x35, 0xba, 0xb8,
+ 0x4c, 0x71, 0xd3, 0x15, 0x59, 0x35, 0x22, 0x20,
+ 0xad, 0x03, 0x9f, 0x66, 0x44, 0x3b, 0x9c, 0x35,
+ 0x37, 0x1f, 0x9b, 0xbb, 0xf3, 0xdb, 0x35, 0x63,
+ 0x30, 0x64, 0xaa, 0xa2, 0x06, 0xa8, 0x5d, 0xbb,
+ 0xe1, 0x9f, 0x70, 0xec, 0x82, 0x11, 0x06, 0x36,
+ 0xec, 0x8b, 0x69, 0x66, 0x24, 0x44, 0xc9, 0x4a,
+ 0x57, 0xbb, 0x9b, 0x78, 0x13, 0xce, 0x9c, 0x0c,
+ 0xba, 0x92, 0x93, 0x63, 0xb8, 0xe2, 0x95, 0x0f,
+ 0x0f, 0x16, 0x39, 0x52, 0xfd, 0x3a, 0x6d, 0x02,
+ 0x4b, 0xdf, 0x13, 0xd3, 0x2a, 0x22, 0xb4, 0x03,
+ 0x7c, 0x54, 0x49, 0x96, 0x68, 0x54, 0x10, 0xfa,
+ 0xef, 0xaa, 0x6c, 0xe8, 0x22, 0xdc, 0x71, 0x16,
+ 0x13, 0x1a, 0xf6, 0x28, 0xe5, 0x6d, 0x77, 0x3d,
+ 0xcd, 0x30, 0x63, 0xb1, 0x70, 0x52, 0xa1, 0xc5,
+ 0x94, 0x5f, 0xcf, 0xe8, 0xb8, 0x26, 0x98, 0xf7,
+ 0x06, 0xa0, 0x0a, 0x70, 0xfa, 0x03, 0x80, 0xac,
+ 0xc1, 0xec, 0xd6, 0x4c, 0x54, 0xd7, 0xfe, 0x47,
+ 0xb6, 0x88, 0x4a, 0xf7, 0x71, 0x24, 0xee, 0xf3,
+ 0xd2, 0xc2, 0x4a, 0x7f, 0xfe, 0x61, 0xc7, 0x35,
+ 0xc9, 0x37, 0x67, 0xcb, 0x24, 0x35, 0xda, 0x7e,
+ 0xca, 0x5f, 0xf3, 0x8d, 0xd4, 0x13, 0x8e, 0xd6,
+ 0xcb, 0x4d, 0x53, 0x8f, 0x53, 0x1f, 0xc0, 0x74,
+ 0xf7, 0x53, 0xb9, 0x5e, 0x23, 0x37, 0xba, 0x6e,
+ 0xe3, 0x9d, 0x07, 0x55, 0x25, 0x7b, 0xe6, 0x2a,
+ 0x64, 0xd1, 0x32, 0xdd, 0x54, 0x1b, 0x4b, 0xc0,
+ 0xe1, 0xd7, 0x69, 0x58, 0xf8, 0x93, 0x29, 0xc4,
+ 0xdd, 0x23, 0x2f, 0xa5, 0xfc, 0x9d, 0x7e, 0xf8,
+ 0xd4, 0x90, 0xcd, 0x82, 0x55, 0xdc, 0x16, 0x16,
+ 0x9f, 0x07, 0x52, 0x9b, 0x9d, 0x25, 0xed, 0x32,
+ 0xc5, 0x7b, 0xdf, 0xf6, 0x83, 0x46, 0x3d, 0x65,
+ 0xb7, 0xef, 0x87, 0x7a, 0x12, 0x69, 0x8f, 0x06,
+ 0x7c, 0x51, 0x15, 0x4a, 0x08, 0xe8, 0xac, 0x9a,
+ 0x0c, 0x24, 0xa7, 0x27, 0xd8, 0x46, 0x2f, 0xe7,
+ 0x01, 0x0e, 0x1c, 0xc6, 0x91, 0xb0, 0x6e, 0x85,
+ 0x65, 0xf0, 0x29, 0x0d, 0x2e, 0x6b, 0x3b, 0xfb,
+ 0x4b, 0xdf, 0xe4, 0x80, 0x93, 0x03, 0x66, 0x46,
+ 0x3e, 0x8a, 0x6e, 0xf3, 0x5e, 0x4d, 0x62, 0x0e,
+ 0x49, 0x05, 0xaf, 0xd4, 0xf8, 0x21, 0x20, 0x61,
+ 0x1d, 0x39, 0x17, 0xf4, 0x61, 0x47, 0x95, 0xfb,
+ 0x15, 0x2e, 0xb3, 0x4f, 0xd0, 0x5d, 0xf5, 0x7d,
+ 0x40, 0xda, 0x90, 0x3c, 0x6b, 0xcb, 0x17, 0x00,
+ 0x13, 0x3b, 0x64, 0x34, 0x1b, 0xf0, 0xf2, 0xe5,
+ 0x3b, 0xb2, 0xc7, 0xd3, 0x5f, 0x3a, 0x44, 0xa6,
+ 0x9b, 0xb7, 0x78, 0x0e, 0x42, 0x5d, 0x4c, 0xc1,
+ 0xe9, 0xd2, 0xcb, 0xb7, 0x78, 0xd1, 0xfe, 0x9a,
+ 0xb5, 0x07, 0xe9, 0xe0, 0xbe, 0xe2, 0x8a, 0xa7,
+ 0x01, 0x83, 0x00, 0x8c, 0x5c, 0x08, 0xe6, 0x63,
+ 0x12, 0x92, 0xb7, 0xb7, 0xa6, 0x19, 0x7d, 0x38,
+ 0x13, 0x38, 0x92, 0x87, 0x24, 0xf9, 0x48, 0xb3,
+ 0x5e, 0x87, 0x6a, 0x40, 0x39, 0x5c, 0x3f, 0xed,
+ 0x8f, 0xee, 0xdb, 0x15, 0x82, 0x06, 0xda, 0x49,
+ 0x21, 0x2b, 0xb5, 0xbf, 0x32, 0x7c, 0x9f, 0x42,
+ 0x28, 0x63, 0xcf, 0xaf, 0x1e, 0xf8, 0xc6, 0xa0,
+ 0xd1, 0x02, 0x43, 0x57, 0x62, 0xec, 0x9b, 0x0f,
+ 0x01, 0x9e, 0x71, 0xd8, 0x87, 0x9d, 0x01, 0xc1,
+ 0x58, 0x77, 0xd9, 0xaf, 0xb1, 0x10, 0x7e, 0xdd,
+ 0xa6, 0x50, 0x96, 0xe5, 0xf0, 0x72, 0x00, 0x6d,
+ 0x4b, 0xf8, 0x2a, 0x8f, 0x19, 0xf3, 0x22, 0x88,
+ 0x11, 0x4a, 0x8b, 0x7c, 0xfd, 0xb7, 0xed, 0xe1,
+ 0xf6, 0x40, 0x39, 0xe0, 0xe9, 0xf6, 0x3d, 0x25,
+ 0xe6, 0x74, 0x3c, 0x58, 0x57, 0x7f, 0xe1, 0x22,
+ 0x96, 0x47, 0x31, 0x91, 0xba, 0x70, 0x85, 0x28,
+ 0x6b, 0x9f, 0x6e, 0x25, 0xac, 0x23, 0x66, 0x2f,
+ 0x29, 0x88, 0x28, 0xce, 0x8c, 0x5c, 0x88, 0x53,
+ 0xd1, 0x3b, 0xcc, 0x6a, 0x51, 0xb2, 0xe1, 0x28,
+ 0x3f, 0x91, 0xb4, 0x0d, 0x00, 0x3a, 0xe3, 0xf8,
+ 0xc3, 0x8f, 0xd7, 0x96, 0x62, 0x0e, 0x2e, 0xfc,
+ 0xc8, 0x6c, 0x77, 0xa6, 0x1d, 0x22, 0xc1, 0xb8,
+ 0xe6, 0x61, 0xd7, 0x67, 0x36, 0x13, 0x7b, 0xbb,
+ 0x9b, 0x59, 0x09, 0xa6, 0xdf, 0xf7, 0x6b, 0xa3,
+ 0x40, 0x1a, 0xf5, 0x4f, 0xb4, 0xda, 0xd3, 0xf3,
+ 0x81, 0x93, 0xc6, 0x18, 0xd9, 0x26, 0xee, 0xac,
+ 0xf0, 0xaa, 0xdf, 0xc5, 0x9c, 0xca, 0xc2, 0xa2,
+ 0xcc, 0x7b, 0x5c, 0x24, 0xb0, 0xbc, 0xd0, 0x6a,
+ 0x4d, 0x89, 0x09, 0xb8, 0x07, 0xfe, 0x87, 0xad,
+ 0x0a, 0xea, 0xb8, 0x42, 0xf9, 0x5e, 0xb3, 0x3e,
+ 0x36, 0x4c, 0xaf, 0x75, 0x9e, 0x1c, 0xeb, 0xbd,
+ 0xbc, 0xbb, 0x80, 0x40, 0xa7, 0x3a, 0x30, 0xbf,
+ 0xa8, 0x44, 0xf4, 0xeb, 0x38, 0xad, 0x29, 0xba,
+ 0x23, 0xed, 0x41, 0x0c, 0xea, 0xd2, 0xbb, 0x41,
+ 0x18, 0xd6, 0xb9, 0xba, 0x65, 0x2b, 0xa3, 0x91,
+ 0x6d, 0x1f, 0xa9, 0xf4, 0xd1, 0x25, 0x8d, 0x4d,
+ 0x38, 0xff, 0x64, 0xa0, 0xec, 0xde, 0xa6, 0xb6,
+ 0x79, 0xab, 0x8e, 0x33, 0x6c, 0x47, 0xde, 0xaf,
+ 0x94, 0xa4, 0xa5, 0x86, 0x77, 0x55, 0x09, 0x92,
+ 0x81, 0x31, 0x76, 0xc7, 0x34, 0x22, 0x89, 0x8e,
+ 0x3d, 0x26, 0x26, 0xd7, 0xfc, 0x1e, 0x16, 0x72,
+ 0x13, 0x33, 0x63, 0xd5, 0x22, 0xbe, 0xb8, 0x04,
+ 0x34, 0x84, 0x41, 0xbb, 0x80, 0xd0, 0x9f, 0x46,
+ 0x48, 0x07, 0xa7, 0xfc, 0x2b, 0x3a, 0x75, 0x55,
+ 0x8c, 0xc7, 0x6a, 0xbd, 0x7e, 0x46, 0x08, 0x84,
+ 0x0f, 0xd5, 0x74, 0xc0, 0x82, 0x8e, 0xaa, 0x61,
+ 0x05, 0x01, 0xb2, 0x47, 0x6e, 0x20, 0x6a, 0x2d,
+ 0x58, 0x70, 0x48, 0x32, 0xa7, 0x37, 0xd2, 0xb8,
+ 0x82, 0x1a, 0x51, 0xb9, 0x61, 0xdd, 0xfd, 0x9d,
+ 0x6b, 0x0e, 0x18, 0x97, 0xf8, 0x45, 0x5f, 0x87,
+ 0x10, 0xcf, 0x34, 0x72, 0x45, 0x26, 0x49, 0x70,
+ 0xe7, 0xa3, 0x78, 0xe0, 0x52, 0x89, 0x84, 0x94,
+ 0x83, 0x82, 0xc2, 0x69, 0x8f, 0xe3, 0xe1, 0x3f,
+ 0x60, 0x74, 0x88, 0xc4, 0xf7, 0x75, 0x2c, 0xfb,
+ 0xbd, 0xb6, 0xc4, 0x7e, 0x10, 0x0a, 0x6c, 0x90,
+ 0x04, 0x9e, 0xc3, 0x3f, 0x59, 0x7c, 0xce, 0x31,
+ 0x18, 0x60, 0x57, 0x73, 0x46, 0x94, 0x7d, 0x06,
+ 0xa0, 0x6d, 0x44, 0xec, 0xa2, 0x0a, 0x9e, 0x05,
+ 0x15, 0xef, 0xca, 0x5c, 0xbf, 0x00, 0xeb, 0xf7,
+ 0x3d, 0x32, 0xd4, 0xa5, 0xef, 0x49, 0x89, 0x5e,
+ 0x46, 0xb0, 0xa6, 0x63, 0x5b, 0x8a, 0x73, 0xae,
+ 0x6f, 0xd5, 0x9d, 0xf8, 0x4f, 0x40, 0xb5, 0xb2,
+ 0x6e, 0xd3, 0xb6, 0x01, 0xa9, 0x26, 0xa2, 0x21,
+ 0xcf, 0x33, 0x7a, 0x3a, 0xa4, 0x23, 0x13, 0xb0,
+ 0x69, 0x6a, 0xee, 0xce, 0xd8, 0x9d, 0x01, 0x1d,
+ 0x50, 0xc1, 0x30, 0x6c, 0xb1, 0xcd, 0xa0, 0xf0,
+ 0xf0, 0xa2, 0x64, 0x6f, 0xbb, 0xbf, 0x5e, 0xe6,
+ 0xab, 0x87, 0xb4, 0x0f, 0x4f, 0x15, 0xaf, 0xb5,
+ 0x25, 0xa1, 0xb2, 0xd0, 0x80, 0x2c, 0xfb, 0xf9,
+ 0xfe, 0xd2, 0x33, 0xbb, 0x76, 0xfe, 0x7c, 0xa8,
+ 0x66, 0xf7, 0xe7, 0x85, 0x9f, 0x1f, 0x85, 0x57,
+ 0x88, 0xe1, 0xe9, 0x63, 0xe4, 0xd8, 0x1c, 0xa1,
+ 0xfb, 0xda, 0x44, 0x05, 0x2e, 0x1d, 0x3a, 0x1c,
+ 0xff, 0xc8, 0x3b, 0xc0, 0xfe, 0xda, 0x22, 0x0b,
+ 0x43, 0xd6, 0x88, 0x39, 0x4c, 0x4a, 0xa6, 0x69,
+ 0x18, 0x93, 0x42, 0x4e, 0xb5, 0xcc, 0x66, 0x0d,
+ 0x09, 0xf8, 0x1e, 0x7c, 0xd3, 0x3c, 0x99, 0x0d,
+ 0x50, 0x1d, 0x62, 0xe9, 0x57, 0x06, 0xbf, 0x19,
+ 0x88, 0xdd, 0xad, 0x7b, 0x4f, 0xf9, 0xc7, 0x82,
+ 0x6d, 0x8d, 0xc8, 0xc4, 0xc5, 0x78, 0x17, 0x20,
+ 0x15, 0xc5, 0x52, 0x41, 0xcf, 0x5b, 0xd6, 0x7f,
+ 0x94, 0x02, 0x41, 0xe0, 0x40, 0x22, 0x03, 0x5e,
+ 0xd1, 0x53, 0xd4, 0x86, 0xd3, 0x2c, 0x9f, 0x0f,
+ 0x96, 0xe3, 0x6b, 0x9a, 0x76, 0x32, 0x06, 0x47,
+ 0x4b, 0x11, 0xb3, 0xdd, 0x03, 0x65, 0xbd, 0x9b,
+ 0x01, 0xda, 0x9c, 0xb9, 0x7e, 0x3f, 0x6a, 0xc4,
+ 0x7b, 0xea, 0xd4, 0x3c, 0xb9, 0xfb, 0x5c, 0x6b,
+ 0x64, 0x33, 0x52, 0xba, 0x64, 0x78, 0x8f, 0xa4,
+ 0xaf, 0x7a, 0x61, 0x8d, 0xbc, 0xc5, 0x73, 0xe9,
+ 0x6b, 0x58, 0x97, 0x4b, 0xbf, 0x63, 0x22, 0xd3,
+ 0x37, 0x02, 0x54, 0xc5, 0xb9, 0x16, 0x4a, 0xf0,
+ 0x19, 0xd8, 0x94, 0x57, 0xb8, 0x8a, 0xb3, 0x16,
+ 0x3b, 0xd0, 0x84, 0x8e, 0x67, 0xa6, 0xa3, 0x7d,
+ 0x78, 0xec, 0x00
+};
+static const u8 enc_output012[] __initconst = {
+ 0x52, 0x34, 0xb3, 0x65, 0x3b, 0xb7, 0xe5, 0xd3,
+ 0xab, 0x49, 0x17, 0x60, 0xd2, 0x52, 0x56, 0xdf,
+ 0xdf, 0x34, 0x56, 0x82, 0xe2, 0xbe, 0xe5, 0xe1,
+ 0x28, 0xd1, 0x4e, 0x5f, 0x4f, 0x01, 0x7d, 0x3f,
+ 0x99, 0x6b, 0x30, 0x6e, 0x1a, 0x7c, 0x4c, 0x8e,
+ 0x62, 0x81, 0xae, 0x86, 0x3f, 0x6b, 0xd0, 0xb5,
+ 0xa9, 0xcf, 0x50, 0xf1, 0x02, 0x12, 0xa0, 0x0b,
+ 0x24, 0xe9, 0xe6, 0x72, 0x89, 0x2c, 0x52, 0x1b,
+ 0x34, 0x38, 0xf8, 0x75, 0x5f, 0xa0, 0x74, 0xe2,
+ 0x99, 0xdd, 0xa6, 0x4b, 0x14, 0x50, 0x4e, 0xf1,
+ 0xbe, 0xd6, 0x9e, 0xdb, 0xb2, 0x24, 0x27, 0x74,
+ 0x12, 0x4a, 0x78, 0x78, 0x17, 0xa5, 0x58, 0x8e,
+ 0x2f, 0xf9, 0xf4, 0x8d, 0xee, 0x03, 0x88, 0xae,
+ 0xb8, 0x29, 0xa1, 0x2f, 0x4b, 0xee, 0x92, 0xbd,
+ 0x87, 0xb3, 0xce, 0x34, 0x21, 0x57, 0x46, 0x04,
+ 0x49, 0x0c, 0x80, 0xf2, 0x01, 0x13, 0xa1, 0x55,
+ 0xb3, 0xff, 0x44, 0x30, 0x3c, 0x1c, 0xd0, 0xef,
+ 0xbc, 0x18, 0x74, 0x26, 0xad, 0x41, 0x5b, 0x5b,
+ 0x3e, 0x9a, 0x7a, 0x46, 0x4f, 0x16, 0xd6, 0x74,
+ 0x5a, 0xb7, 0x3a, 0x28, 0x31, 0xd8, 0xae, 0x26,
+ 0xac, 0x50, 0x53, 0x86, 0xf2, 0x56, 0xd7, 0x3f,
+ 0x29, 0xbc, 0x45, 0x68, 0x8e, 0xcb, 0x98, 0x64,
+ 0xdd, 0xc9, 0xba, 0xb8, 0x4b, 0x7b, 0x82, 0xdd,
+ 0x14, 0xa7, 0xcb, 0x71, 0x72, 0x00, 0x5c, 0xad,
+ 0x7b, 0x6a, 0x89, 0xa4, 0x3d, 0xbf, 0xb5, 0x4b,
+ 0x3e, 0x7c, 0x5a, 0xcf, 0xb8, 0xa1, 0xc5, 0x6e,
+ 0xc8, 0xb6, 0x31, 0x57, 0x7b, 0xdf, 0xa5, 0x7e,
+ 0xb1, 0xd6, 0x42, 0x2a, 0x31, 0x36, 0xd1, 0xd0,
+ 0x3f, 0x7a, 0xe5, 0x94, 0xd6, 0x36, 0xa0, 0x6f,
+ 0xb7, 0x40, 0x7d, 0x37, 0xc6, 0x55, 0x7c, 0x50,
+ 0x40, 0x6d, 0x29, 0x89, 0xe3, 0x5a, 0xae, 0x97,
+ 0xe7, 0x44, 0x49, 0x6e, 0xbd, 0x81, 0x3d, 0x03,
+ 0x93, 0x06, 0x12, 0x06, 0xe2, 0x41, 0x12, 0x4a,
+ 0xf1, 0x6a, 0xa4, 0x58, 0xa2, 0xfb, 0xd2, 0x15,
+ 0xba, 0xc9, 0x79, 0xc9, 0xce, 0x5e, 0x13, 0xbb,
+ 0xf1, 0x09, 0x04, 0xcc, 0xfd, 0xe8, 0x51, 0x34,
+ 0x6a, 0xe8, 0x61, 0x88, 0xda, 0xed, 0x01, 0x47,
+ 0x84, 0xf5, 0x73, 0x25, 0xf9, 0x1c, 0x42, 0x86,
+ 0x07, 0xf3, 0x5b, 0x1a, 0x01, 0xb3, 0xeb, 0x24,
+ 0x32, 0x8d, 0xf6, 0xed, 0x7c, 0x4b, 0xeb, 0x3c,
+ 0x36, 0x42, 0x28, 0xdf, 0xdf, 0xb6, 0xbe, 0xd9,
+ 0x8c, 0x52, 0xd3, 0x2b, 0x08, 0x90, 0x8c, 0xe7,
+ 0x98, 0x31, 0xe2, 0x32, 0x8e, 0xfc, 0x11, 0x48,
+ 0x00, 0xa8, 0x6a, 0x42, 0x4a, 0x02, 0xc6, 0x4b,
+ 0x09, 0xf1, 0xe3, 0x49, 0xf3, 0x45, 0x1f, 0x0e,
+ 0xbc, 0x56, 0xe2, 0xe4, 0xdf, 0xfb, 0xeb, 0x61,
+ 0xfa, 0x24, 0xc1, 0x63, 0x75, 0xbb, 0x47, 0x75,
+ 0xaf, 0xe1, 0x53, 0x16, 0x96, 0x21, 0x85, 0x26,
+ 0x11, 0xb3, 0x76, 0xe3, 0x23, 0xa1, 0x6b, 0x74,
+ 0x37, 0xd0, 0xde, 0x06, 0x90, 0x71, 0x5d, 0x43,
+ 0x88, 0x9b, 0x00, 0x54, 0xa6, 0x75, 0x2f, 0xa1,
+ 0xc2, 0x0b, 0x73, 0x20, 0x1d, 0xb6, 0x21, 0x79,
+ 0x57, 0x3f, 0xfa, 0x09, 0xbe, 0x8a, 0x33, 0xc3,
+ 0x52, 0xf0, 0x1d, 0x82, 0x31, 0xd1, 0x55, 0xb5,
+ 0x6c, 0x99, 0x25, 0xcf, 0x5c, 0x32, 0xce, 0xe9,
+ 0x0d, 0xfa, 0x69, 0x2c, 0xd5, 0x0d, 0xc5, 0x6d,
+ 0x86, 0xd0, 0x0c, 0x3b, 0x06, 0x50, 0x79, 0xe8,
+ 0xc3, 0xae, 0x04, 0xe6, 0xcd, 0x51, 0xe4, 0x26,
+ 0x9b, 0x4f, 0x7e, 0xa6, 0x0f, 0xab, 0xd8, 0xe5,
+ 0xde, 0xa9, 0x00, 0x95, 0xbe, 0xa3, 0x9d, 0x5d,
+ 0xb2, 0x09, 0x70, 0x18, 0x1c, 0xf0, 0xac, 0x29,
+ 0x23, 0x02, 0x29, 0x28, 0xd2, 0x74, 0x35, 0x57,
+ 0x62, 0x0f, 0x24, 0xea, 0x5e, 0x33, 0xc2, 0x92,
+ 0xf3, 0x78, 0x4d, 0x30, 0x1e, 0xa1, 0x99, 0xa9,
+ 0x82, 0xb0, 0x42, 0x31, 0x8d, 0xad, 0x8a, 0xbc,
+ 0xfc, 0xd4, 0x57, 0x47, 0x3e, 0xb4, 0x50, 0xdd,
+ 0x6e, 0x2c, 0x80, 0x4d, 0x22, 0xf1, 0xfb, 0x57,
+ 0xc4, 0xdd, 0x17, 0xe1, 0x8a, 0x36, 0x4a, 0xb3,
+ 0x37, 0xca, 0xc9, 0x4e, 0xab, 0xd5, 0x69, 0xc4,
+ 0xf4, 0xbc, 0x0b, 0x3b, 0x44, 0x4b, 0x29, 0x9c,
+ 0xee, 0xd4, 0x35, 0x22, 0x21, 0xb0, 0x1f, 0x27,
+ 0x64, 0xa8, 0x51, 0x1b, 0xf0, 0x9f, 0x19, 0x5c,
+ 0xfb, 0x5a, 0x64, 0x74, 0x70, 0x45, 0x09, 0xf5,
+ 0x64, 0xfe, 0x1a, 0x2d, 0xc9, 0x14, 0x04, 0x14,
+ 0xcf, 0xd5, 0x7d, 0x60, 0xaf, 0x94, 0x39, 0x94,
+ 0xe2, 0x7d, 0x79, 0x82, 0xd0, 0x65, 0x3b, 0x6b,
+ 0x9c, 0x19, 0x84, 0xb4, 0x6d, 0xb3, 0x0c, 0x99,
+ 0xc0, 0x56, 0xa8, 0xbd, 0x73, 0xce, 0x05, 0x84,
+ 0x3e, 0x30, 0xaa, 0xc4, 0x9b, 0x1b, 0x04, 0x2a,
+ 0x9f, 0xd7, 0x43, 0x2b, 0x23, 0xdf, 0xbf, 0xaa,
+ 0xd5, 0xc2, 0x43, 0x2d, 0x70, 0xab, 0xdc, 0x75,
+ 0xad, 0xac, 0xf7, 0xc0, 0xbe, 0x67, 0xb2, 0x74,
+ 0xed, 0x67, 0x10, 0x4a, 0x92, 0x60, 0xc1, 0x40,
+ 0x50, 0x19, 0x8a, 0x8a, 0x8c, 0x09, 0x0e, 0x72,
+ 0xe1, 0x73, 0x5e, 0xe8, 0x41, 0x85, 0x63, 0x9f,
+ 0x3f, 0xd7, 0x7d, 0xc4, 0xfb, 0x22, 0x5d, 0x92,
+ 0x6c, 0xb3, 0x1e, 0xe2, 0x50, 0x2f, 0x82, 0xa8,
+ 0x28, 0xc0, 0xb5, 0xd7, 0x5f, 0x68, 0x0d, 0x2c,
+ 0x2d, 0xaf, 0x7e, 0xfa, 0x2e, 0x08, 0x0f, 0x1f,
+ 0x70, 0x9f, 0xe9, 0x19, 0x72, 0x55, 0xf8, 0xfb,
+ 0x51, 0xd2, 0x33, 0x5d, 0xa0, 0xd3, 0x2b, 0x0a,
+ 0x6c, 0xbc, 0x4e, 0xcf, 0x36, 0x4d, 0xdc, 0x3b,
+ 0xe9, 0x3e, 0x81, 0x7c, 0x61, 0xdb, 0x20, 0x2d,
+ 0x3a, 0xc3, 0xb3, 0x0c, 0x1e, 0x00, 0xb9, 0x7c,
+ 0xf5, 0xca, 0x10, 0x5f, 0x3a, 0x71, 0xb3, 0xe4,
+ 0x20, 0xdb, 0x0c, 0x2a, 0x98, 0x63, 0x45, 0x00,
+ 0x58, 0xf6, 0x68, 0xe4, 0x0b, 0xda, 0x13, 0x3b,
+ 0x60, 0x5c, 0x76, 0xdb, 0xb9, 0x97, 0x71, 0xe4,
+ 0xd9, 0xb7, 0xdb, 0xbd, 0x68, 0xc7, 0x84, 0x84,
+ 0xaa, 0x7c, 0x68, 0x62, 0x5e, 0x16, 0xfc, 0xba,
+ 0x72, 0xaa, 0x9a, 0xa9, 0xeb, 0x7c, 0x75, 0x47,
+ 0x97, 0x7e, 0xad, 0xe2, 0xd9, 0x91, 0xe8, 0xe4,
+ 0xa5, 0x31, 0xd7, 0x01, 0x8e, 0xa2, 0x11, 0x88,
+ 0x95, 0xb9, 0xf2, 0x9b, 0xd3, 0x7f, 0x1b, 0x81,
+ 0x22, 0xf7, 0x98, 0x60, 0x0a, 0x64, 0xa6, 0xc1,
+ 0xf6, 0x49, 0xc7, 0xe3, 0x07, 0x4d, 0x94, 0x7a,
+ 0xcf, 0x6e, 0x68, 0x0c, 0x1b, 0x3f, 0x6e, 0x2e,
+ 0xee, 0x92, 0xfa, 0x52, 0xb3, 0x59, 0xf8, 0xf1,
+ 0x8f, 0x6a, 0x66, 0xa3, 0x82, 0x76, 0x4a, 0x07,
+ 0x1a, 0xc7, 0xdd, 0xf5, 0xda, 0x9c, 0x3c, 0x24,
+ 0xbf, 0xfd, 0x42, 0xa1, 0x10, 0x64, 0x6a, 0x0f,
+ 0x89, 0xee, 0x36, 0xa5, 0xce, 0x99, 0x48, 0x6a,
+ 0xf0, 0x9f, 0x9e, 0x69, 0xa4, 0x40, 0x20, 0xe9,
+ 0x16, 0x15, 0xf7, 0xdb, 0x75, 0x02, 0xcb, 0xe9,
+ 0x73, 0x8b, 0x3b, 0x49, 0x2f, 0xf0, 0xaf, 0x51,
+ 0x06, 0x5c, 0xdf, 0x27, 0x27, 0x49, 0x6a, 0xd1,
+ 0xcc, 0xc7, 0xb5, 0x63, 0xb5, 0xfc, 0xb8, 0x5c,
+ 0x87, 0x7f, 0x84, 0xb4, 0xcc, 0x14, 0xa9, 0x53,
+ 0xda, 0xa4, 0x56, 0xf8, 0xb6, 0x1b, 0xcc, 0x40,
+ 0x27, 0x52, 0x06, 0x5a, 0x13, 0x81, 0xd7, 0x3a,
+ 0xd4, 0x3b, 0xfb, 0x49, 0x65, 0x31, 0x33, 0xb2,
+ 0xfa, 0xcd, 0xad, 0x58, 0x4e, 0x2b, 0xae, 0xd2,
+ 0x20, 0xfb, 0x1a, 0x48, 0xb4, 0x3f, 0x9a, 0xd8,
+ 0x7a, 0x35, 0x4a, 0xc8, 0xee, 0x88, 0x5e, 0x07,
+ 0x66, 0x54, 0xb9, 0xec, 0x9f, 0xa3, 0xe3, 0xb9,
+ 0x37, 0xaa, 0x49, 0x76, 0x31, 0xda, 0x74, 0x2d,
+ 0x3c, 0xa4, 0x65, 0x10, 0x32, 0x38, 0xf0, 0xde,
+ 0xd3, 0x99, 0x17, 0xaa, 0x71, 0xaa, 0x8f, 0x0f,
+ 0x8c, 0xaf, 0xa2, 0xf8, 0x5d, 0x64, 0xba, 0x1d,
+ 0xa3, 0xef, 0x96, 0x73, 0xe8, 0xa1, 0x02, 0x8d,
+ 0x0c, 0x6d, 0xb8, 0x06, 0x90, 0xb8, 0x08, 0x56,
+ 0x2c, 0xa7, 0x06, 0xc9, 0xc2, 0x38, 0xdb, 0x7c,
+ 0x63, 0xb1, 0x57, 0x8e, 0xea, 0x7c, 0x79, 0xf3,
+ 0x49, 0x1d, 0xfe, 0x9f, 0xf3, 0x6e, 0xb1, 0x1d,
+ 0xba, 0x19, 0x80, 0x1a, 0x0a, 0xd3, 0xb0, 0x26,
+ 0x21, 0x40, 0xb1, 0x7c, 0xf9, 0x4d, 0x8d, 0x10,
+ 0xc1, 0x7e, 0xf4, 0xf6, 0x3c, 0xa8, 0xfd, 0x7c,
+ 0xa3, 0x92, 0xb2, 0x0f, 0xaa, 0xcc, 0xa6, 0x11,
+ 0xfe, 0x04, 0xe3, 0xd1, 0x7a, 0x32, 0x89, 0xdf,
+ 0x0d, 0xc4, 0x8f, 0x79, 0x6b, 0xca, 0x16, 0x7c,
+ 0x6e, 0xf9, 0xad, 0x0f, 0xf6, 0xfe, 0x27, 0xdb,
+ 0xc4, 0x13, 0x70, 0xf1, 0x62, 0x1a, 0x4f, 0x79,
+ 0x40, 0xc9, 0x9b, 0x8b, 0x21, 0xea, 0x84, 0xfa,
+ 0xf5, 0xf1, 0x89, 0xce, 0xb7, 0x55, 0x0a, 0x80,
+ 0x39, 0x2f, 0x55, 0x36, 0x16, 0x9c, 0x7b, 0x08,
+ 0xbd, 0x87, 0x0d, 0xa5, 0x32, 0xf1, 0x52, 0x7c,
+ 0xe8, 0x55, 0x60, 0x5b, 0xd7, 0x69, 0xe4, 0xfc,
+ 0xfa, 0x12, 0x85, 0x96, 0xea, 0x50, 0x28, 0xab,
+ 0x8a, 0xf7, 0xbb, 0x0e, 0x53, 0x74, 0xca, 0xa6,
+ 0x27, 0x09, 0xc2, 0xb5, 0xde, 0x18, 0x14, 0xd9,
+ 0xea, 0xe5, 0x29, 0x1c, 0x40, 0x56, 0xcf, 0xd7,
+ 0xae, 0x05, 0x3f, 0x65, 0xaf, 0x05, 0x73, 0xe2,
+ 0x35, 0x96, 0x27, 0x07, 0x14, 0xc0, 0xad, 0x33,
+ 0xf1, 0xdc, 0x44, 0x7a, 0x89, 0x17, 0x77, 0xd2,
+ 0x9c, 0x58, 0x60, 0xf0, 0x3f, 0x7b, 0x2d, 0x2e,
+ 0x57, 0x95, 0x54, 0x87, 0xed, 0xf2, 0xc7, 0x4c,
+ 0xf0, 0xae, 0x56, 0x29, 0x19, 0x7d, 0x66, 0x4b,
+ 0x9b, 0x83, 0x84, 0x42, 0x3b, 0x01, 0x25, 0x66,
+ 0x8e, 0x02, 0xde, 0xb9, 0x83, 0x54, 0x19, 0xf6,
+ 0x9f, 0x79, 0x0d, 0x67, 0xc5, 0x1d, 0x7a, 0x44,
+ 0x02, 0x98, 0xa7, 0x16, 0x1c, 0x29, 0x0d, 0x74,
+ 0xff, 0x85, 0x40, 0x06, 0xef, 0x2c, 0xa9, 0xc6,
+ 0xf5, 0x53, 0x07, 0x06, 0xae, 0xe4, 0xfa, 0x5f,
+ 0xd8, 0x39, 0x4d, 0xf1, 0x9b, 0x6b, 0xd9, 0x24,
+ 0x84, 0xfe, 0x03, 0x4c, 0xb2, 0x3f, 0xdf, 0xa1,
+ 0x05, 0x9e, 0x50, 0x14, 0x5a, 0xd9, 0x1a, 0xa2,
+ 0xa7, 0xfa, 0xfa, 0x17, 0xf7, 0x78, 0xd6, 0xb5,
+ 0x92, 0x61, 0x91, 0xac, 0x36, 0xfa, 0x56, 0x0d,
+ 0x38, 0x32, 0x18, 0x85, 0x08, 0x58, 0x37, 0xf0,
+ 0x4b, 0xdb, 0x59, 0xe7, 0xa4, 0x34, 0xc0, 0x1b,
+ 0x01, 0xaf, 0x2d, 0xde, 0xa1, 0xaa, 0x5d, 0xd3,
+ 0xec, 0xe1, 0xd4, 0xf7, 0xe6, 0x54, 0x68, 0xf0,
+ 0x51, 0x97, 0xa7, 0x89, 0xea, 0x24, 0xad, 0xd3,
+ 0x6e, 0x47, 0x93, 0x8b, 0x4b, 0xb4, 0xf7, 0x1c,
+ 0x42, 0x06, 0x67, 0xe8, 0x99, 0xf6, 0xf5, 0x7b,
+ 0x85, 0xb5, 0x65, 0xb5, 0xb5, 0xd2, 0x37, 0xf5,
+ 0xf3, 0x02, 0xa6, 0x4d, 0x11, 0xa7, 0xdc, 0x51,
+ 0x09, 0x7f, 0xa0, 0xd8, 0x88, 0x1c, 0x13, 0x71,
+ 0xae, 0x9c, 0xb7, 0x7b, 0x34, 0xd6, 0x4e, 0x68,
+ 0x26, 0x83, 0x51, 0xaf, 0x1d, 0xee, 0x8b, 0xbb,
+ 0x69, 0x43, 0x2b, 0x9e, 0x8a, 0xbc, 0x02, 0x0e,
+ 0xa0, 0x1b, 0xe0, 0xa8, 0x5f, 0x6f, 0xaf, 0x1b,
+ 0x8f, 0xe7, 0x64, 0x71, 0x74, 0x11, 0x7e, 0xa8,
+ 0xd8, 0xf9, 0x97, 0x06, 0xc3, 0xb6, 0xfb, 0xfb,
+ 0xb7, 0x3d, 0x35, 0x9d, 0x3b, 0x52, 0xed, 0x54,
+ 0xca, 0xf4, 0x81, 0x01, 0x2d, 0x1b, 0xc3, 0xa7,
+ 0x00, 0x3d, 0x1a, 0x39, 0x54, 0xe1, 0xf6, 0xff,
+ 0xed, 0x6f, 0x0b, 0x5a, 0x68, 0xda, 0x58, 0xdd,
+ 0xa9, 0xcf, 0x5c, 0x4a, 0xe5, 0x09, 0x4e, 0xde,
+ 0x9d, 0xbc, 0x3e, 0xee, 0x5a, 0x00, 0x3b, 0x2c,
+ 0x87, 0x10, 0x65, 0x60, 0xdd, 0xd7, 0x56, 0xd1,
+ 0x4c, 0x64, 0x45, 0xe4, 0x21, 0xec, 0x78, 0xf8,
+ 0x25, 0x7a, 0x3e, 0x16, 0x5d, 0x09, 0x53, 0x14,
+ 0xbe, 0x4f, 0xae, 0x87, 0xd8, 0xd1, 0xaa, 0x3c,
+ 0xf6, 0x3e, 0xa4, 0x70, 0x8c, 0x5e, 0x70, 0xa4,
+ 0xb3, 0x6b, 0x66, 0x73, 0xd3, 0xbf, 0x31, 0x06,
+ 0x19, 0x62, 0x93, 0x15, 0xf2, 0x86, 0xe4, 0x52,
+ 0x7e, 0x53, 0x4c, 0x12, 0x38, 0xcc, 0x34, 0x7d,
+ 0x57, 0xf6, 0x42, 0x93, 0x8a, 0xc4, 0xee, 0x5c,
+ 0x8a, 0xe1, 0x52, 0x8f, 0x56, 0x64, 0xf6, 0xa6,
+ 0xd1, 0x91, 0x57, 0x70, 0xcd, 0x11, 0x76, 0xf5,
+ 0x59, 0x60, 0x60, 0x3c, 0xc1, 0xc3, 0x0b, 0x7f,
+ 0x58, 0x1a, 0x50, 0x91, 0xf1, 0x68, 0x8f, 0x6e,
+ 0x74, 0x74, 0xa8, 0x51, 0x0b, 0xf7, 0x7a, 0x98,
+ 0x37, 0xf2, 0x0a, 0x0e, 0xa4, 0x97, 0x04, 0xb8,
+ 0x9b, 0xfd, 0xa0, 0xea, 0xf7, 0x0d, 0xe1, 0xdb,
+ 0x03, 0xf0, 0x31, 0x29, 0xf8, 0xdd, 0x6b, 0x8b,
+ 0x5d, 0xd8, 0x59, 0xa9, 0x29, 0xcf, 0x9a, 0x79,
+ 0x89, 0x19, 0x63, 0x46, 0x09, 0x79, 0x6a, 0x11,
+ 0xda, 0x63, 0x68, 0x48, 0x77, 0x23, 0xfb, 0x7d,
+ 0x3a, 0x43, 0xcb, 0x02, 0x3b, 0x7a, 0x6d, 0x10,
+ 0x2a, 0x9e, 0xac, 0xf1, 0xd4, 0x19, 0xf8, 0x23,
+ 0x64, 0x1d, 0x2c, 0x5f, 0xf2, 0xb0, 0x5c, 0x23,
+ 0x27, 0xf7, 0x27, 0x30, 0x16, 0x37, 0xb1, 0x90,
+ 0xab, 0x38, 0xfb, 0x55, 0xcd, 0x78, 0x58, 0xd4,
+ 0x7d, 0x43, 0xf6, 0x45, 0x5e, 0x55, 0x8d, 0xb1,
+ 0x02, 0x65, 0x58, 0xb4, 0x13, 0x4b, 0x36, 0xf7,
+ 0xcc, 0xfe, 0x3d, 0x0b, 0x82, 0xe2, 0x12, 0x11,
+ 0xbb, 0xe6, 0xb8, 0x3a, 0x48, 0x71, 0xc7, 0x50,
+ 0x06, 0x16, 0x3a, 0xe6, 0x7c, 0x05, 0xc7, 0xc8,
+ 0x4d, 0x2f, 0x08, 0x6a, 0x17, 0x9a, 0x95, 0x97,
+ 0x50, 0x68, 0xdc, 0x28, 0x18, 0xc4, 0x61, 0x38,
+ 0xb9, 0xe0, 0x3e, 0x78, 0xdb, 0x29, 0xe0, 0x9f,
+ 0x52, 0xdd, 0xf8, 0x4f, 0x91, 0xc1, 0xd0, 0x33,
+ 0xa1, 0x7a, 0x8e, 0x30, 0x13, 0x82, 0x07, 0x9f,
+ 0xd3, 0x31, 0x0f, 0x23, 0xbe, 0x32, 0x5a, 0x75,
+ 0xcf, 0x96, 0xb2, 0xec, 0xb5, 0x32, 0xac, 0x21,
+ 0xd1, 0x82, 0x33, 0xd3, 0x15, 0x74, 0xbd, 0x90,
+ 0xf1, 0x2c, 0xe6, 0x5f, 0x8d, 0xe3, 0x02, 0xe8,
+ 0xe9, 0xc4, 0xca, 0x96, 0xeb, 0x0e, 0xbc, 0x91,
+ 0xf4, 0xb9, 0xea, 0xd9, 0x1b, 0x75, 0xbd, 0xe1,
+ 0xac, 0x2a, 0x05, 0x37, 0x52, 0x9b, 0x1b, 0x3f,
+ 0x5a, 0xdc, 0x21, 0xc3, 0x98, 0xbb, 0xaf, 0xa3,
+ 0xf2, 0x00, 0xbf, 0x0d, 0x30, 0x89, 0x05, 0xcc,
+ 0xa5, 0x76, 0xf5, 0x06, 0xf0, 0xc6, 0x54, 0x8a,
+ 0x5d, 0xd4, 0x1e, 0xc1, 0xf2, 0xce, 0xb0, 0x62,
+ 0xc8, 0xfc, 0x59, 0x42, 0x9a, 0x90, 0x60, 0x55,
+ 0xfe, 0x88, 0xa5, 0x8b, 0xb8, 0x33, 0x0c, 0x23,
+ 0x24, 0x0d, 0x15, 0x70, 0x37, 0x1e, 0x3d, 0xf6,
+ 0xd2, 0xea, 0x92, 0x10, 0xb2, 0xc4, 0x51, 0xac,
+ 0xf2, 0xac, 0xf3, 0x6b, 0x6c, 0xaa, 0xcf, 0x12,
+ 0xc5, 0x6c, 0x90, 0x50, 0xb5, 0x0c, 0xfc, 0x1a,
+ 0x15, 0x52, 0xe9, 0x26, 0xc6, 0x52, 0xa4, 0xe7,
+ 0x81, 0x69, 0xe1, 0xe7, 0x9e, 0x30, 0x01, 0xec,
+ 0x84, 0x89, 0xb2, 0x0d, 0x66, 0xdd, 0xce, 0x28,
+ 0x5c, 0xec, 0x98, 0x46, 0x68, 0x21, 0x9f, 0x88,
+ 0x3f, 0x1f, 0x42, 0x77, 0xce, 0xd0, 0x61, 0xd4,
+ 0x20, 0xa7, 0xff, 0x53, 0xad, 0x37, 0xd0, 0x17,
+ 0x35, 0xc9, 0xfc, 0xba, 0x0a, 0x78, 0x3f, 0xf2,
+ 0xcc, 0x86, 0x89, 0xe8, 0x4b, 0x3c, 0x48, 0x33,
+ 0x09, 0x7f, 0xc6, 0xc0, 0xdd, 0xb8, 0xfd, 0x7a,
+ 0x66, 0x66, 0x65, 0xeb, 0x47, 0xa7, 0x04, 0x28,
+ 0xa3, 0x19, 0x8e, 0xa9, 0xb1, 0x13, 0x67, 0x62,
+ 0x70, 0xcf, 0xd6
+};
+static const u8 enc_assoc012[] __initconst = {
+ 0xb1, 0x69, 0x83, 0x87, 0x30, 0xaa, 0x5d, 0xb8,
+ 0x77, 0xe8, 0x21, 0xff, 0x06, 0x59, 0x35, 0xce,
+ 0x75, 0xfe, 0x38, 0xef, 0xb8, 0x91, 0x43, 0x8c,
+ 0xcf, 0x70, 0xdd, 0x0a, 0x68, 0xbf, 0xd4, 0xbc,
+ 0x16, 0x76, 0x99, 0x36, 0x1e, 0x58, 0x79, 0x5e,
+ 0xd4, 0x29, 0xf7, 0x33, 0x93, 0x48, 0xdb, 0x5f,
+ 0x01, 0xae, 0x9c, 0xb6, 0xe4, 0x88, 0x6d, 0x2b,
+ 0x76, 0x75, 0xe0, 0xf3, 0x74, 0xe2, 0xc9
+};
+static const u8 enc_nonce012[] __initconst = {
+ 0x05, 0xa3, 0x93, 0xed, 0x30, 0xc5, 0xa2, 0x06
+};
+static const u8 enc_key012[] __initconst = {
+ 0xb3, 0x35, 0x50, 0x03, 0x54, 0x2e, 0x40, 0x5e,
+ 0x8f, 0x59, 0x8e, 0xc5, 0x90, 0xd5, 0x27, 0x2d,
+ 0xba, 0x29, 0x2e, 0xcb, 0x1b, 0x70, 0x44, 0x1e,
+ 0x65, 0x91, 0x6e, 0x2a, 0x79, 0x22, 0xda, 0x64
+};
+
+/* wycheproof - rfc7539 */
+static const u8 enc_input013[] __initconst = {
+ 0x4c, 0x61, 0x64, 0x69, 0x65, 0x73, 0x20, 0x61,
+ 0x6e, 0x64, 0x20, 0x47, 0x65, 0x6e, 0x74, 0x6c,
+ 0x65, 0x6d, 0x65, 0x6e, 0x20, 0x6f, 0x66, 0x20,
+ 0x74, 0x68, 0x65, 0x20, 0x63, 0x6c, 0x61, 0x73,
+ 0x73, 0x20, 0x6f, 0x66, 0x20, 0x27, 0x39, 0x39,
+ 0x3a, 0x20, 0x49, 0x66, 0x20, 0x49, 0x20, 0x63,
+ 0x6f, 0x75, 0x6c, 0x64, 0x20, 0x6f, 0x66, 0x66,
+ 0x65, 0x72, 0x20, 0x79, 0x6f, 0x75, 0x20, 0x6f,
+ 0x6e, 0x6c, 0x79, 0x20, 0x6f, 0x6e, 0x65, 0x20,
+ 0x74, 0x69, 0x70, 0x20, 0x66, 0x6f, 0x72, 0x20,
+ 0x74, 0x68, 0x65, 0x20, 0x66, 0x75, 0x74, 0x75,
+ 0x72, 0x65, 0x2c, 0x20, 0x73, 0x75, 0x6e, 0x73,
+ 0x63, 0x72, 0x65, 0x65, 0x6e, 0x20, 0x77, 0x6f,
+ 0x75, 0x6c, 0x64, 0x20, 0x62, 0x65, 0x20, 0x69,
+ 0x74, 0x2e
+};
+static const u8 enc_output013[] __initconst = {
+ 0xd3, 0x1a, 0x8d, 0x34, 0x64, 0x8e, 0x60, 0xdb,
+ 0x7b, 0x86, 0xaf, 0xbc, 0x53, 0xef, 0x7e, 0xc2,
+ 0xa4, 0xad, 0xed, 0x51, 0x29, 0x6e, 0x08, 0xfe,
+ 0xa9, 0xe2, 0xb5, 0xa7, 0x36, 0xee, 0x62, 0xd6,
+ 0x3d, 0xbe, 0xa4, 0x5e, 0x8c, 0xa9, 0x67, 0x12,
+ 0x82, 0xfa, 0xfb, 0x69, 0xda, 0x92, 0x72, 0x8b,
+ 0x1a, 0x71, 0xde, 0x0a, 0x9e, 0x06, 0x0b, 0x29,
+ 0x05, 0xd6, 0xa5, 0xb6, 0x7e, 0xcd, 0x3b, 0x36,
+ 0x92, 0xdd, 0xbd, 0x7f, 0x2d, 0x77, 0x8b, 0x8c,
+ 0x98, 0x03, 0xae, 0xe3, 0x28, 0x09, 0x1b, 0x58,
+ 0xfa, 0xb3, 0x24, 0xe4, 0xfa, 0xd6, 0x75, 0x94,
+ 0x55, 0x85, 0x80, 0x8b, 0x48, 0x31, 0xd7, 0xbc,
+ 0x3f, 0xf4, 0xde, 0xf0, 0x8e, 0x4b, 0x7a, 0x9d,
+ 0xe5, 0x76, 0xd2, 0x65, 0x86, 0xce, 0xc6, 0x4b,
+ 0x61, 0x16, 0x1a, 0xe1, 0x0b, 0x59, 0x4f, 0x09,
+ 0xe2, 0x6a, 0x7e, 0x90, 0x2e, 0xcb, 0xd0, 0x60,
+ 0x06, 0x91
+};
+static const u8 enc_assoc013[] __initconst = {
+ 0x50, 0x51, 0x52, 0x53, 0xc0, 0xc1, 0xc2, 0xc3,
+ 0xc4, 0xc5, 0xc6, 0xc7
+};
+static const u8 enc_nonce013[] __initconst = {
+ 0x07, 0x00, 0x00, 0x00, 0x40, 0x41, 0x42, 0x43,
+ 0x44, 0x45, 0x46, 0x47
+};
+static const u8 enc_key013[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - misc */
+static const u8 enc_input014[] __initconst = { };
+static const u8 enc_output014[] __initconst = {
+ 0x76, 0xac, 0xb3, 0x42, 0xcf, 0x31, 0x66, 0xa5,
+ 0xb6, 0x3c, 0x0c, 0x0e, 0xa1, 0x38, 0x3c, 0x8d
+};
+static const u8 enc_assoc014[] __initconst = { };
+static const u8 enc_nonce014[] __initconst = {
+ 0x4d, 0xa5, 0xbf, 0x8d, 0xfd, 0x58, 0x52, 0xc1,
+ 0xea, 0x12, 0x37, 0x9d
+};
+static const u8 enc_key014[] __initconst = {
+ 0x80, 0xba, 0x31, 0x92, 0xc8, 0x03, 0xce, 0x96,
+ 0x5e, 0xa3, 0x71, 0xd5, 0xff, 0x07, 0x3c, 0xf0,
+ 0xf4, 0x3b, 0x6a, 0x2a, 0xb5, 0x76, 0xb2, 0x08,
+ 0x42, 0x6e, 0x11, 0x40, 0x9c, 0x09, 0xb9, 0xb0
+};
+
+/* wycheproof - misc */
+static const u8 enc_input015[] __initconst = { };
+static const u8 enc_output015[] __initconst = {
+ 0x90, 0x6f, 0xa6, 0x28, 0x4b, 0x52, 0xf8, 0x7b,
+ 0x73, 0x59, 0xcb, 0xaa, 0x75, 0x63, 0xc7, 0x09
+};
+static const u8 enc_assoc015[] __initconst = {
+ 0xbd, 0x50, 0x67, 0x64, 0xf2, 0xd2, 0xc4, 0x10
+};
+static const u8 enc_nonce015[] __initconst = {
+ 0xa9, 0x2e, 0xf0, 0xac, 0x99, 0x1d, 0xd5, 0x16,
+ 0xa3, 0xc6, 0xf6, 0x89
+};
+static const u8 enc_key015[] __initconst = {
+ 0x7a, 0x4c, 0xd7, 0x59, 0x17, 0x2e, 0x02, 0xeb,
+ 0x20, 0x4d, 0xb2, 0xc3, 0xf5, 0xc7, 0x46, 0x22,
+ 0x7d, 0xf5, 0x84, 0xfc, 0x13, 0x45, 0x19, 0x63,
+ 0x91, 0xdb, 0xb9, 0x57, 0x7a, 0x25, 0x07, 0x42
+};
+
+/* wycheproof - misc */
+static const u8 enc_input016[] __initconst = {
+ 0x2a
+};
+static const u8 enc_output016[] __initconst = {
+ 0x3a, 0xca, 0xc2, 0x7d, 0xec, 0x09, 0x68, 0x80,
+ 0x1e, 0x9f, 0x6e, 0xde, 0xd6, 0x9d, 0x80, 0x75,
+ 0x22
+};
+static const u8 enc_assoc016[] __initconst = { };
+static const u8 enc_nonce016[] __initconst = {
+ 0x99, 0xe2, 0x3e, 0xc4, 0x89, 0x85, 0xbc, 0xcd,
+ 0xee, 0xab, 0x60, 0xf1
+};
+static const u8 enc_key016[] __initconst = {
+ 0xcc, 0x56, 0xb6, 0x80, 0x55, 0x2e, 0xb7, 0x50,
+ 0x08, 0xf5, 0x48, 0x4b, 0x4c, 0xb8, 0x03, 0xfa,
+ 0x50, 0x63, 0xeb, 0xd6, 0xea, 0xb9, 0x1f, 0x6a,
+ 0xb6, 0xae, 0xf4, 0x91, 0x6a, 0x76, 0x62, 0x73
+};
+
+/* wycheproof - misc */
+static const u8 enc_input017[] __initconst = {
+ 0x51
+};
+static const u8 enc_output017[] __initconst = {
+ 0xc4, 0x16, 0x83, 0x10, 0xca, 0x45, 0xb1, 0xf7,
+ 0xc6, 0x6c, 0xad, 0x4e, 0x99, 0xe4, 0x3f, 0x72,
+ 0xb9
+};
+static const u8 enc_assoc017[] __initconst = {
+ 0x91, 0xca, 0x6c, 0x59, 0x2c, 0xbc, 0xca, 0x53
+};
+static const u8 enc_nonce017[] __initconst = {
+ 0xab, 0x0d, 0xca, 0x71, 0x6e, 0xe0, 0x51, 0xd2,
+ 0x78, 0x2f, 0x44, 0x03
+};
+static const u8 enc_key017[] __initconst = {
+ 0x46, 0xf0, 0x25, 0x49, 0x65, 0xf7, 0x69, 0xd5,
+ 0x2b, 0xdb, 0x4a, 0x70, 0xb4, 0x43, 0x19, 0x9f,
+ 0x8e, 0xf2, 0x07, 0x52, 0x0d, 0x12, 0x20, 0xc5,
+ 0x5e, 0x4b, 0x70, 0xf0, 0xfd, 0xa6, 0x20, 0xee
+};
+
+/* wycheproof - misc */
+static const u8 enc_input018[] __initconst = {
+ 0x5c, 0x60
+};
+static const u8 enc_output018[] __initconst = {
+ 0x4d, 0x13, 0x91, 0xe8, 0xb6, 0x1e, 0xfb, 0x39,
+ 0xc1, 0x22, 0x19, 0x54, 0x53, 0x07, 0x7b, 0x22,
+ 0xe5, 0xe2
+};
+static const u8 enc_assoc018[] __initconst = { };
+static const u8 enc_nonce018[] __initconst = {
+ 0x46, 0x1a, 0xf1, 0x22, 0xe9, 0xf2, 0xe0, 0x34,
+ 0x7e, 0x03, 0xf2, 0xdb
+};
+static const u8 enc_key018[] __initconst = {
+ 0x2f, 0x7f, 0x7e, 0x4f, 0x59, 0x2b, 0xb3, 0x89,
+ 0x19, 0x49, 0x89, 0x74, 0x35, 0x07, 0xbf, 0x3e,
+ 0xe9, 0xcb, 0xde, 0x17, 0x86, 0xb6, 0x69, 0x5f,
+ 0xe6, 0xc0, 0x25, 0xfd, 0x9b, 0xa4, 0xc1, 0x00
+};
+
+/* wycheproof - misc */
+static const u8 enc_input019[] __initconst = {
+ 0xdd, 0xf2
+};
+static const u8 enc_output019[] __initconst = {
+ 0xb6, 0x0d, 0xea, 0xd0, 0xfd, 0x46, 0x97, 0xec,
+ 0x2e, 0x55, 0x58, 0x23, 0x77, 0x19, 0xd0, 0x24,
+ 0x37, 0xa2
+};
+static const u8 enc_assoc019[] __initconst = {
+ 0x88, 0x36, 0x4f, 0xc8, 0x06, 0x05, 0x18, 0xbf
+};
+static const u8 enc_nonce019[] __initconst = {
+ 0x61, 0x54, 0x6b, 0xa5, 0xf1, 0x72, 0x05, 0x90,
+ 0xb6, 0x04, 0x0a, 0xc6
+};
+static const u8 enc_key019[] __initconst = {
+ 0xc8, 0x83, 0x3d, 0xce, 0x5e, 0xa9, 0xf2, 0x48,
+ 0xaa, 0x20, 0x30, 0xea, 0xcf, 0xe7, 0x2b, 0xff,
+ 0xe6, 0x9a, 0x62, 0x0c, 0xaf, 0x79, 0x33, 0x44,
+ 0xe5, 0x71, 0x8f, 0xe0, 0xd7, 0xab, 0x1a, 0x58
+};
+
+/* wycheproof - misc */
+static const u8 enc_input020[] __initconst = {
+ 0xab, 0x85, 0xe9, 0xc1, 0x57, 0x17, 0x31
+};
+static const u8 enc_output020[] __initconst = {
+ 0x5d, 0xfe, 0x34, 0x40, 0xdb, 0xb3, 0xc3, 0xed,
+ 0x7a, 0x43, 0x4e, 0x26, 0x02, 0xd3, 0x94, 0x28,
+ 0x1e, 0x0a, 0xfa, 0x9f, 0xb7, 0xaa, 0x42
+};
+static const u8 enc_assoc020[] __initconst = { };
+static const u8 enc_nonce020[] __initconst = {
+ 0x3c, 0x4e, 0x65, 0x4d, 0x66, 0x3f, 0xa4, 0x59,
+ 0x6d, 0xc5, 0x5b, 0xb7
+};
+static const u8 enc_key020[] __initconst = {
+ 0x55, 0x56, 0x81, 0x58, 0xd3, 0xa6, 0x48, 0x3f,
+ 0x1f, 0x70, 0x21, 0xea, 0xb6, 0x9b, 0x70, 0x3f,
+ 0x61, 0x42, 0x51, 0xca, 0xdc, 0x1a, 0xf5, 0xd3,
+ 0x4a, 0x37, 0x4f, 0xdb, 0xfc, 0x5a, 0xda, 0xc7
+};
+
+/* wycheproof - misc */
+static const u8 enc_input021[] __initconst = {
+ 0x4e, 0xe5, 0xcd, 0xa2, 0x0d, 0x42, 0x90
+};
+static const u8 enc_output021[] __initconst = {
+ 0x4b, 0xd4, 0x72, 0x12, 0x94, 0x1c, 0xe3, 0x18,
+ 0x5f, 0x14, 0x08, 0xee, 0x7f, 0xbf, 0x18, 0xf5,
+ 0xab, 0xad, 0x6e, 0x22, 0x53, 0xa1, 0xba
+};
+static const u8 enc_assoc021[] __initconst = {
+ 0x84, 0xe4, 0x6b, 0xe8, 0xc0, 0x91, 0x90, 0x53
+};
+static const u8 enc_nonce021[] __initconst = {
+ 0x58, 0x38, 0x93, 0x75, 0xc6, 0x9e, 0xe3, 0x98,
+ 0xde, 0x94, 0x83, 0x96
+};
+static const u8 enc_key021[] __initconst = {
+ 0xe3, 0xc0, 0x9e, 0x7f, 0xab, 0x1a, 0xef, 0xb5,
+ 0x16, 0xda, 0x6a, 0x33, 0x02, 0x2a, 0x1d, 0xd4,
+ 0xeb, 0x27, 0x2c, 0x80, 0xd5, 0x40, 0xc5, 0xda,
+ 0x52, 0xa7, 0x30, 0xf3, 0x4d, 0x84, 0x0d, 0x7f
+};
+
+/* wycheproof - misc */
+static const u8 enc_input022[] __initconst = {
+ 0xbe, 0x33, 0x08, 0xf7, 0x2a, 0x2c, 0x6a, 0xed
+};
+static const u8 enc_output022[] __initconst = {
+ 0x8e, 0x94, 0x39, 0xa5, 0x6e, 0xee, 0xc8, 0x17,
+ 0xfb, 0xe8, 0xa6, 0xed, 0x8f, 0xab, 0xb1, 0x93,
+ 0x75, 0x39, 0xdd, 0x6c, 0x00, 0xe9, 0x00, 0x21
+};
+static const u8 enc_assoc022[] __initconst = { };
+static const u8 enc_nonce022[] __initconst = {
+ 0x4f, 0x07, 0xaf, 0xed, 0xfd, 0xc3, 0xb6, 0xc2,
+ 0x36, 0x18, 0x23, 0xd3
+};
+static const u8 enc_key022[] __initconst = {
+ 0x51, 0xe4, 0xbf, 0x2b, 0xad, 0x92, 0xb7, 0xaf,
+ 0xf1, 0xa4, 0xbc, 0x05, 0x55, 0x0b, 0xa8, 0x1d,
+ 0xf4, 0xb9, 0x6f, 0xab, 0xf4, 0x1c, 0x12, 0xc7,
+ 0xb0, 0x0e, 0x60, 0xe4, 0x8d, 0xb7, 0xe1, 0x52
+};
+
+/* wycheproof - misc */
+static const u8 enc_input023[] __initconst = {
+ 0xa4, 0xc9, 0xc2, 0x80, 0x1b, 0x71, 0xf7, 0xdf
+};
+static const u8 enc_output023[] __initconst = {
+ 0xb9, 0xb9, 0x10, 0x43, 0x3a, 0xf0, 0x52, 0xb0,
+ 0x45, 0x30, 0xf5, 0x1a, 0xee, 0xe0, 0x24, 0xe0,
+ 0xa4, 0x45, 0xa6, 0x32, 0x8f, 0xa6, 0x7a, 0x18
+};
+static const u8 enc_assoc023[] __initconst = {
+ 0x66, 0xc0, 0xae, 0x70, 0x07, 0x6c, 0xb1, 0x4d
+};
+static const u8 enc_nonce023[] __initconst = {
+ 0xb4, 0xea, 0x66, 0x6e, 0xe1, 0x19, 0x56, 0x33,
+ 0x66, 0x48, 0x4a, 0x78
+};
+static const u8 enc_key023[] __initconst = {
+ 0x11, 0x31, 0xc1, 0x41, 0x85, 0x77, 0xa0, 0x54,
+ 0xde, 0x7a, 0x4a, 0xc5, 0x51, 0x95, 0x0f, 0x1a,
+ 0x05, 0x3f, 0x9a, 0xe4, 0x6e, 0x5b, 0x75, 0xfe,
+ 0x4a, 0xbd, 0x56, 0x08, 0xd7, 0xcd, 0xda, 0xdd
+};
+
+/* wycheproof - misc */
+static const u8 enc_input024[] __initconst = {
+ 0x42, 0xba, 0xae, 0x59, 0x78, 0xfe, 0xaf, 0x5c,
+ 0x36, 0x8d, 0x14, 0xe0
+};
+static const u8 enc_output024[] __initconst = {
+ 0xff, 0x7d, 0xc2, 0x03, 0xb2, 0x6c, 0x46, 0x7a,
+ 0x6b, 0x50, 0xdb, 0x33, 0x57, 0x8c, 0x0f, 0x27,
+ 0x58, 0xc2, 0xe1, 0x4e, 0x36, 0xd4, 0xfc, 0x10,
+ 0x6d, 0xcb, 0x29, 0xb4
+};
+static const u8 enc_assoc024[] __initconst = { };
+static const u8 enc_nonce024[] __initconst = {
+ 0x9a, 0x59, 0xfc, 0xe2, 0x6d, 0xf0, 0x00, 0x5e,
+ 0x07, 0x53, 0x86, 0x56
+};
+static const u8 enc_key024[] __initconst = {
+ 0x99, 0xb6, 0x2b, 0xd5, 0xaf, 0xbe, 0x3f, 0xb0,
+ 0x15, 0xbd, 0xe9, 0x3f, 0x0a, 0xbf, 0x48, 0x39,
+ 0x57, 0xa1, 0xc3, 0xeb, 0x3c, 0xa5, 0x9c, 0xb5,
+ 0x0b, 0x39, 0xf7, 0xf8, 0xa9, 0xcc, 0x51, 0xbe
+};
+
+/* wycheproof - misc */
+static const u8 enc_input025[] __initconst = {
+ 0xfd, 0xc8, 0x5b, 0x94, 0xa4, 0xb2, 0xa6, 0xb7,
+ 0x59, 0xb1, 0xa0, 0xda
+};
+static const u8 enc_output025[] __initconst = {
+ 0x9f, 0x88, 0x16, 0xde, 0x09, 0x94, 0xe9, 0x38,
+ 0xd9, 0xe5, 0x3f, 0x95, 0xd0, 0x86, 0xfc, 0x6c,
+ 0x9d, 0x8f, 0xa9, 0x15, 0xfd, 0x84, 0x23, 0xa7,
+ 0xcf, 0x05, 0x07, 0x2f
+};
+static const u8 enc_assoc025[] __initconst = {
+ 0xa5, 0x06, 0xe1, 0xa5, 0xc6, 0x90, 0x93, 0xf9
+};
+static const u8 enc_nonce025[] __initconst = {
+ 0x58, 0xdb, 0xd4, 0xad, 0x2c, 0x4a, 0xd3, 0x5d,
+ 0xd9, 0x06, 0xe9, 0xce
+};
+static const u8 enc_key025[] __initconst = {
+ 0x85, 0xf3, 0x5b, 0x62, 0x82, 0xcf, 0xf4, 0x40,
+ 0xbc, 0x10, 0x20, 0xc8, 0x13, 0x6f, 0xf2, 0x70,
+ 0x31, 0x11, 0x0f, 0xa6, 0x3e, 0xc1, 0x6f, 0x1e,
+ 0x82, 0x51, 0x18, 0xb0, 0x06, 0xb9, 0x12, 0x57
+};
+
+/* wycheproof - misc */
+static const u8 enc_input026[] __initconst = {
+ 0x51, 0xf8, 0xc1, 0xf7, 0x31, 0xea, 0x14, 0xac,
+ 0xdb, 0x21, 0x0a, 0x6d, 0x97, 0x3e, 0x07
+};
+static const u8 enc_output026[] __initconst = {
+ 0x0b, 0x29, 0x63, 0x8e, 0x1f, 0xbd, 0xd6, 0xdf,
+ 0x53, 0x97, 0x0b, 0xe2, 0x21, 0x00, 0x42, 0x2a,
+ 0x91, 0x34, 0x08, 0x7d, 0x67, 0xa4, 0x6e, 0x79,
+ 0x17, 0x8d, 0x0a, 0x93, 0xf5, 0xe1, 0xd2
+};
+static const u8 enc_assoc026[] __initconst = { };
+static const u8 enc_nonce026[] __initconst = {
+ 0x68, 0xab, 0x7f, 0xdb, 0xf6, 0x19, 0x01, 0xda,
+ 0xd4, 0x61, 0xd2, 0x3c
+};
+static const u8 enc_key026[] __initconst = {
+ 0x67, 0x11, 0x96, 0x27, 0xbd, 0x98, 0x8e, 0xda,
+ 0x90, 0x62, 0x19, 0xe0, 0x8c, 0x0d, 0x0d, 0x77,
+ 0x9a, 0x07, 0xd2, 0x08, 0xce, 0x8a, 0x4f, 0xe0,
+ 0x70, 0x9a, 0xf7, 0x55, 0xee, 0xec, 0x6d, 0xcb
+};
+
+/* wycheproof - misc */
+static const u8 enc_input027[] __initconst = {
+ 0x97, 0x46, 0x9d, 0xa6, 0x67, 0xd6, 0x11, 0x0f,
+ 0x9c, 0xbd, 0xa1, 0xd1, 0xa2, 0x06, 0x73
+};
+static const u8 enc_output027[] __initconst = {
+ 0x32, 0xdb, 0x66, 0xc4, 0xa3, 0x81, 0x9d, 0x81,
+ 0x55, 0x74, 0x55, 0xe5, 0x98, 0x0f, 0xed, 0xfe,
+ 0xae, 0x30, 0xde, 0xc9, 0x4e, 0x6a, 0xd3, 0xa9,
+ 0xee, 0xa0, 0x6a, 0x0d, 0x70, 0x39, 0x17
+};
+static const u8 enc_assoc027[] __initconst = {
+ 0x64, 0x53, 0xa5, 0x33, 0x84, 0x63, 0x22, 0x12
+};
+static const u8 enc_nonce027[] __initconst = {
+ 0xd9, 0x5b, 0x32, 0x43, 0xaf, 0xae, 0xf7, 0x14,
+ 0xc5, 0x03, 0x5b, 0x6a
+};
+static const u8 enc_key027[] __initconst = {
+ 0xe6, 0xf1, 0x11, 0x8d, 0x41, 0xe4, 0xb4, 0x3f,
+ 0xb5, 0x82, 0x21, 0xb7, 0xed, 0x79, 0x67, 0x38,
+ 0x34, 0xe0, 0xd8, 0xac, 0x5c, 0x4f, 0xa6, 0x0b,
+ 0xbc, 0x8b, 0xc4, 0x89, 0x3a, 0x58, 0x89, 0x4d
+};
+
+/* wycheproof - misc */
+static const u8 enc_input028[] __initconst = {
+ 0x54, 0x9b, 0x36, 0x5a, 0xf9, 0x13, 0xf3, 0xb0,
+ 0x81, 0x13, 0x1c, 0xcb, 0x6b, 0x82, 0x55, 0x88
+};
+static const u8 enc_output028[] __initconst = {
+ 0xe9, 0x11, 0x0e, 0x9f, 0x56, 0xab, 0x3c, 0xa4,
+ 0x83, 0x50, 0x0c, 0xea, 0xba, 0xb6, 0x7a, 0x13,
+ 0x83, 0x6c, 0xca, 0xbf, 0x15, 0xa6, 0xa2, 0x2a,
+ 0x51, 0xc1, 0x07, 0x1c, 0xfa, 0x68, 0xfa, 0x0c
+};
+static const u8 enc_assoc028[] __initconst = { };
+static const u8 enc_nonce028[] __initconst = {
+ 0x2f, 0xcb, 0x1b, 0x38, 0xa9, 0x9e, 0x71, 0xb8,
+ 0x47, 0x40, 0xad, 0x9b
+};
+static const u8 enc_key028[] __initconst = {
+ 0x59, 0xd4, 0xea, 0xfb, 0x4d, 0xe0, 0xcf, 0xc7,
+ 0xd3, 0xdb, 0x99, 0xa8, 0xf5, 0x4b, 0x15, 0xd7,
+ 0xb3, 0x9f, 0x0a, 0xcc, 0x8d, 0xa6, 0x97, 0x63,
+ 0xb0, 0x19, 0xc1, 0x69, 0x9f, 0x87, 0x67, 0x4a
+};
+
+/* wycheproof - misc */
+static const u8 enc_input029[] __initconst = {
+ 0x55, 0xa4, 0x65, 0x64, 0x4f, 0x5b, 0x65, 0x09,
+ 0x28, 0xcb, 0xee, 0x7c, 0x06, 0x32, 0x14, 0xd6
+};
+static const u8 enc_output029[] __initconst = {
+ 0xe4, 0xb1, 0x13, 0xcb, 0x77, 0x59, 0x45, 0xf3,
+ 0xd3, 0xa8, 0xae, 0x9e, 0xc1, 0x41, 0xc0, 0x0c,
+ 0x7c, 0x43, 0xf1, 0x6c, 0xe0, 0x96, 0xd0, 0xdc,
+ 0x27, 0xc9, 0x58, 0x49, 0xdc, 0x38, 0x3b, 0x7d
+};
+static const u8 enc_assoc029[] __initconst = {
+ 0x03, 0x45, 0x85, 0x62, 0x1a, 0xf8, 0xd7, 0xff
+};
+static const u8 enc_nonce029[] __initconst = {
+ 0x11, 0x8a, 0x69, 0x64, 0xc2, 0xd3, 0xe3, 0x80,
+ 0x07, 0x1f, 0x52, 0x66
+};
+static const u8 enc_key029[] __initconst = {
+ 0xb9, 0x07, 0xa4, 0x50, 0x75, 0x51, 0x3f, 0xe8,
+ 0xa8, 0x01, 0x9e, 0xde, 0xe3, 0xf2, 0x59, 0x14,
+ 0x87, 0xb2, 0xa0, 0x30, 0xb0, 0x3c, 0x6e, 0x1d,
+ 0x77, 0x1c, 0x86, 0x25, 0x71, 0xd2, 0xea, 0x1e
+};
+
+/* wycheproof - misc */
+static const u8 enc_input030[] __initconst = {
+ 0x3f, 0xf1, 0x51, 0x4b, 0x1c, 0x50, 0x39, 0x15,
+ 0x91, 0x8f, 0x0c, 0x0c, 0x31, 0x09, 0x4a, 0x6e,
+ 0x1f
+};
+static const u8 enc_output030[] __initconst = {
+ 0x02, 0xcc, 0x3a, 0xcb, 0x5e, 0xe1, 0xfc, 0xdd,
+ 0x12, 0xa0, 0x3b, 0xb8, 0x57, 0x97, 0x64, 0x74,
+ 0xd3, 0xd8, 0x3b, 0x74, 0x63, 0xa2, 0xc3, 0x80,
+ 0x0f, 0xe9, 0x58, 0xc2, 0x8e, 0xaa, 0x29, 0x08,
+ 0x13
+};
+static const u8 enc_assoc030[] __initconst = { };
+static const u8 enc_nonce030[] __initconst = {
+ 0x45, 0xaa, 0xa3, 0xe5, 0xd1, 0x6d, 0x2d, 0x42,
+ 0xdc, 0x03, 0x44, 0x5d
+};
+static const u8 enc_key030[] __initconst = {
+ 0x3b, 0x24, 0x58, 0xd8, 0x17, 0x6e, 0x16, 0x21,
+ 0xc0, 0xcc, 0x24, 0xc0, 0xc0, 0xe2, 0x4c, 0x1e,
+ 0x80, 0xd7, 0x2f, 0x7e, 0xe9, 0x14, 0x9a, 0x4b,
+ 0x16, 0x61, 0x76, 0x62, 0x96, 0x16, 0xd0, 0x11
+};
+
+/* wycheproof - misc */
+static const u8 enc_input031[] __initconst = {
+ 0x63, 0x85, 0x8c, 0xa3, 0xe2, 0xce, 0x69, 0x88,
+ 0x7b, 0x57, 0x8a, 0x3c, 0x16, 0x7b, 0x42, 0x1c,
+ 0x9c
+};
+static const u8 enc_output031[] __initconst = {
+ 0x35, 0x76, 0x64, 0x88, 0xd2, 0xbc, 0x7c, 0x2b,
+ 0x8d, 0x17, 0xcb, 0xbb, 0x9a, 0xbf, 0xad, 0x9e,
+ 0x6d, 0x1f, 0x39, 0x1e, 0x65, 0x7b, 0x27, 0x38,
+ 0xdd, 0xa0, 0x84, 0x48, 0xcb, 0xa2, 0x81, 0x1c,
+ 0xeb
+};
+static const u8 enc_assoc031[] __initconst = {
+ 0x9a, 0xaf, 0x29, 0x9e, 0xee, 0xa7, 0x8f, 0x79
+};
+static const u8 enc_nonce031[] __initconst = {
+ 0xf0, 0x38, 0x4f, 0xb8, 0x76, 0x12, 0x14, 0x10,
+ 0x63, 0x3d, 0x99, 0x3d
+};
+static const u8 enc_key031[] __initconst = {
+ 0xf6, 0x0c, 0x6a, 0x1b, 0x62, 0x57, 0x25, 0xf7,
+ 0x6c, 0x70, 0x37, 0xb4, 0x8f, 0xe3, 0x57, 0x7f,
+ 0xa7, 0xf7, 0xb8, 0x7b, 0x1b, 0xd5, 0xa9, 0x82,
+ 0x17, 0x6d, 0x18, 0x23, 0x06, 0xff, 0xb8, 0x70
+};
+
+/* wycheproof - misc */
+static const u8 enc_input032[] __initconst = {
+ 0x10, 0xf1, 0xec, 0xf9, 0xc6, 0x05, 0x84, 0x66,
+ 0x5d, 0x9a, 0xe5, 0xef, 0xe2, 0x79, 0xe7, 0xf7,
+ 0x37, 0x7e, 0xea, 0x69, 0x16, 0xd2, 0xb1, 0x11
+};
+static const u8 enc_output032[] __initconst = {
+ 0x42, 0xf2, 0x6c, 0x56, 0xcb, 0x4b, 0xe2, 0x1d,
+ 0x9d, 0x8d, 0x0c, 0x80, 0xfc, 0x99, 0xdd, 0xe0,
+ 0x0d, 0x75, 0xf3, 0x80, 0x74, 0xbf, 0xe7, 0x64,
+ 0x54, 0xaa, 0x7e, 0x13, 0xd4, 0x8f, 0xff, 0x7d,
+ 0x75, 0x57, 0x03, 0x94, 0x57, 0x04, 0x0a, 0x3a
+};
+static const u8 enc_assoc032[] __initconst = { };
+static const u8 enc_nonce032[] __initconst = {
+ 0xe6, 0xb1, 0xad, 0xf2, 0xfd, 0x58, 0xa8, 0x76,
+ 0x2c, 0x65, 0xf3, 0x1b
+};
+static const u8 enc_key032[] __initconst = {
+ 0x02, 0x12, 0xa8, 0xde, 0x50, 0x07, 0xed, 0x87,
+ 0xb3, 0x3f, 0x1a, 0x70, 0x90, 0xb6, 0x11, 0x4f,
+ 0x9e, 0x08, 0xce, 0xfd, 0x96, 0x07, 0xf2, 0xc2,
+ 0x76, 0xbd, 0xcf, 0xdb, 0xc5, 0xce, 0x9c, 0xd7
+};
+
+/* wycheproof - misc */
+static const u8 enc_input033[] __initconst = {
+ 0x92, 0x22, 0xf9, 0x01, 0x8e, 0x54, 0xfd, 0x6d,
+ 0xe1, 0x20, 0x08, 0x06, 0xa9, 0xee, 0x8e, 0x4c,
+ 0xc9, 0x04, 0xd2, 0x9f, 0x25, 0xcb, 0xa1, 0x93
+};
+static const u8 enc_output033[] __initconst = {
+ 0x12, 0x30, 0x32, 0x43, 0x7b, 0x4b, 0xfd, 0x69,
+ 0x20, 0xe8, 0xf7, 0xe7, 0xe0, 0x08, 0x7a, 0xe4,
+ 0x88, 0x9e, 0xbe, 0x7a, 0x0a, 0xd0, 0xe9, 0x00,
+ 0x3c, 0xf6, 0x8f, 0x17, 0x95, 0x50, 0xda, 0x63,
+ 0xd3, 0xb9, 0x6c, 0x2d, 0x55, 0x41, 0x18, 0x65
+};
+static const u8 enc_assoc033[] __initconst = {
+ 0x3e, 0x8b, 0xc5, 0xad, 0xe1, 0x82, 0xff, 0x08
+};
+static const u8 enc_nonce033[] __initconst = {
+ 0x6b, 0x28, 0x2e, 0xbe, 0xcc, 0x54, 0x1b, 0xcd,
+ 0x78, 0x34, 0xed, 0x55
+};
+static const u8 enc_key033[] __initconst = {
+ 0xc5, 0xbc, 0x09, 0x56, 0x56, 0x46, 0xe7, 0xed,
+ 0xda, 0x95, 0x4f, 0x1f, 0x73, 0x92, 0x23, 0xda,
+ 0xda, 0x20, 0xb9, 0x5c, 0x44, 0xab, 0x03, 0x3d,
+ 0x0f, 0xae, 0x4b, 0x02, 0x83, 0xd1, 0x8b, 0xe3
+};
+
+/* wycheproof - misc */
+static const u8 enc_input034[] __initconst = {
+ 0xb0, 0x53, 0x99, 0x92, 0x86, 0xa2, 0x82, 0x4f,
+ 0x42, 0xcc, 0x8c, 0x20, 0x3a, 0xb2, 0x4e, 0x2c,
+ 0x97, 0xa6, 0x85, 0xad, 0xcc, 0x2a, 0xd3, 0x26,
+ 0x62, 0x55, 0x8e, 0x55, 0xa5, 0xc7, 0x29
+};
+static const u8 enc_output034[] __initconst = {
+ 0x45, 0xc7, 0xd6, 0xb5, 0x3a, 0xca, 0xd4, 0xab,
+ 0xb6, 0x88, 0x76, 0xa6, 0xe9, 0x6a, 0x48, 0xfb,
+ 0x59, 0x52, 0x4d, 0x2c, 0x92, 0xc9, 0xd8, 0xa1,
+ 0x89, 0xc9, 0xfd, 0x2d, 0xb9, 0x17, 0x46, 0x56,
+ 0x6d, 0x3c, 0xa1, 0x0e, 0x31, 0x1b, 0x69, 0x5f,
+ 0x3e, 0xae, 0x15, 0x51, 0x65, 0x24, 0x93
+};
+static const u8 enc_assoc034[] __initconst = { };
+static const u8 enc_nonce034[] __initconst = {
+ 0x04, 0xa9, 0xbe, 0x03, 0x50, 0x8a, 0x5f, 0x31,
+ 0x37, 0x1a, 0x6f, 0xd2
+};
+static const u8 enc_key034[] __initconst = {
+ 0x2e, 0xb5, 0x1c, 0x46, 0x9a, 0xa8, 0xeb, 0x9e,
+ 0x6c, 0x54, 0xa8, 0x34, 0x9b, 0xae, 0x50, 0xa2,
+ 0x0f, 0x0e, 0x38, 0x27, 0x11, 0xbb, 0xa1, 0x15,
+ 0x2c, 0x42, 0x4f, 0x03, 0xb6, 0x67, 0x1d, 0x71
+};
+
+/* wycheproof - misc */
+static const u8 enc_input035[] __initconst = {
+ 0xf4, 0x52, 0x06, 0xab, 0xc2, 0x55, 0x52, 0xb2,
+ 0xab, 0xc9, 0xab, 0x7f, 0xa2, 0x43, 0x03, 0x5f,
+ 0xed, 0xaa, 0xdd, 0xc3, 0xb2, 0x29, 0x39, 0x56,
+ 0xf1, 0xea, 0x6e, 0x71, 0x56, 0xe7, 0xeb
+};
+static const u8 enc_output035[] __initconst = {
+ 0x46, 0xa8, 0x0c, 0x41, 0x87, 0x02, 0x47, 0x20,
+ 0x08, 0x46, 0x27, 0x58, 0x00, 0x80, 0xdd, 0xe5,
+ 0xa3, 0xf4, 0xa1, 0x10, 0x93, 0xa7, 0x07, 0x6e,
+ 0xd6, 0xf3, 0xd3, 0x26, 0xbc, 0x7b, 0x70, 0x53,
+ 0x4d, 0x4a, 0xa2, 0x83, 0x5a, 0x52, 0xe7, 0x2d,
+ 0x14, 0xdf, 0x0e, 0x4f, 0x47, 0xf2, 0x5f
+};
+static const u8 enc_assoc035[] __initconst = {
+ 0x37, 0x46, 0x18, 0xa0, 0x6e, 0xa9, 0x8a, 0x48
+};
+static const u8 enc_nonce035[] __initconst = {
+ 0x47, 0x0a, 0x33, 0x9e, 0xcb, 0x32, 0x19, 0xb8,
+ 0xb8, 0x1a, 0x1f, 0x8b
+};
+static const u8 enc_key035[] __initconst = {
+ 0x7f, 0x5b, 0x74, 0xc0, 0x7e, 0xd1, 0xb4, 0x0f,
+ 0xd1, 0x43, 0x58, 0xfe, 0x2f, 0xf2, 0xa7, 0x40,
+ 0xc1, 0x16, 0xc7, 0x70, 0x65, 0x10, 0xe6, 0xa4,
+ 0x37, 0xf1, 0x9e, 0xa4, 0x99, 0x11, 0xce, 0xc4
+};
+
+/* wycheproof - misc */
+static const u8 enc_input036[] __initconst = {
+ 0xb9, 0xc5, 0x54, 0xcb, 0xc3, 0x6a, 0xc1, 0x8a,
+ 0xe8, 0x97, 0xdf, 0x7b, 0xee, 0xca, 0xc1, 0xdb,
+ 0xeb, 0x4e, 0xaf, 0xa1, 0x56, 0xbb, 0x60, 0xce,
+ 0x2e, 0x5d, 0x48, 0xf0, 0x57, 0x15, 0xe6, 0x78
+};
+static const u8 enc_output036[] __initconst = {
+ 0xea, 0x29, 0xaf, 0xa4, 0x9d, 0x36, 0xe8, 0x76,
+ 0x0f, 0x5f, 0xe1, 0x97, 0x23, 0xb9, 0x81, 0x1e,
+ 0xd5, 0xd5, 0x19, 0x93, 0x4a, 0x44, 0x0f, 0x50,
+ 0x81, 0xac, 0x43, 0x0b, 0x95, 0x3b, 0x0e, 0x21,
+ 0x22, 0x25, 0x41, 0xaf, 0x46, 0xb8, 0x65, 0x33,
+ 0xc6, 0xb6, 0x8d, 0x2f, 0xf1, 0x08, 0xa7, 0xea
+};
+static const u8 enc_assoc036[] __initconst = { };
+static const u8 enc_nonce036[] __initconst = {
+ 0x72, 0xcf, 0xd9, 0x0e, 0xf3, 0x02, 0x6c, 0xa2,
+ 0x2b, 0x7e, 0x6e, 0x6a
+};
+static const u8 enc_key036[] __initconst = {
+ 0xe1, 0x73, 0x1d, 0x58, 0x54, 0xe1, 0xb7, 0x0c,
+ 0xb3, 0xff, 0xe8, 0xb7, 0x86, 0xa2, 0xb3, 0xeb,
+ 0xf0, 0x99, 0x43, 0x70, 0x95, 0x47, 0x57, 0xb9,
+ 0xdc, 0x8c, 0x7b, 0xc5, 0x35, 0x46, 0x34, 0xa3
+};
+
+/* wycheproof - misc */
+static const u8 enc_input037[] __initconst = {
+ 0x6b, 0x26, 0x04, 0x99, 0x6c, 0xd3, 0x0c, 0x14,
+ 0xa1, 0x3a, 0x52, 0x57, 0xed, 0x6c, 0xff, 0xd3,
+ 0xbc, 0x5e, 0x29, 0xd6, 0xb9, 0x7e, 0xb1, 0x79,
+ 0x9e, 0xb3, 0x35, 0xe2, 0x81, 0xea, 0x45, 0x1e
+};
+static const u8 enc_output037[] __initconst = {
+ 0x6d, 0xad, 0x63, 0x78, 0x97, 0x54, 0x4d, 0x8b,
+ 0xf6, 0xbe, 0x95, 0x07, 0xed, 0x4d, 0x1b, 0xb2,
+ 0xe9, 0x54, 0xbc, 0x42, 0x7e, 0x5d, 0xe7, 0x29,
+ 0xda, 0xf5, 0x07, 0x62, 0x84, 0x6f, 0xf2, 0xf4,
+ 0x7b, 0x99, 0x7d, 0x93, 0xc9, 0x82, 0x18, 0x9d,
+ 0x70, 0x95, 0xdc, 0x79, 0x4c, 0x74, 0x62, 0x32
+};
+static const u8 enc_assoc037[] __initconst = {
+ 0x23, 0x33, 0xe5, 0xce, 0x0f, 0x93, 0xb0, 0x59
+};
+static const u8 enc_nonce037[] __initconst = {
+ 0x26, 0x28, 0x80, 0xd4, 0x75, 0xf3, 0xda, 0xc5,
+ 0x34, 0x0d, 0xd1, 0xb8
+};
+static const u8 enc_key037[] __initconst = {
+ 0x27, 0xd8, 0x60, 0x63, 0x1b, 0x04, 0x85, 0xa4,
+ 0x10, 0x70, 0x2f, 0xea, 0x61, 0xbc, 0x87, 0x3f,
+ 0x34, 0x42, 0x26, 0x0c, 0xad, 0xed, 0x4a, 0xbd,
+ 0xe2, 0x5b, 0x78, 0x6a, 0x2d, 0x97, 0xf1, 0x45
+};
+
+/* wycheproof - misc */
+static const u8 enc_input038[] __initconst = {
+ 0x97, 0x3d, 0x0c, 0x75, 0x38, 0x26, 0xba, 0xe4,
+ 0x66, 0xcf, 0x9a, 0xbb, 0x34, 0x93, 0x15, 0x2e,
+ 0x9d, 0xe7, 0x81, 0x9e, 0x2b, 0xd0, 0xc7, 0x11,
+ 0x71, 0x34, 0x6b, 0x4d, 0x2c, 0xeb, 0xf8, 0x04,
+ 0x1a, 0xa3, 0xce, 0xdc, 0x0d, 0xfd, 0x7b, 0x46,
+ 0x7e, 0x26, 0x22, 0x8b, 0xc8, 0x6c, 0x9a
+};
+static const u8 enc_output038[] __initconst = {
+ 0xfb, 0xa7, 0x8a, 0xe4, 0xf9, 0xd8, 0x08, 0xa6,
+ 0x2e, 0x3d, 0xa4, 0x0b, 0xe2, 0xcb, 0x77, 0x00,
+ 0xc3, 0x61, 0x3d, 0x9e, 0xb2, 0xc5, 0x29, 0xc6,
+ 0x52, 0xe7, 0x6a, 0x43, 0x2c, 0x65, 0x8d, 0x27,
+ 0x09, 0x5f, 0x0e, 0xb8, 0xf9, 0x40, 0xc3, 0x24,
+ 0x98, 0x1e, 0xa9, 0x35, 0xe5, 0x07, 0xf9, 0x8f,
+ 0x04, 0x69, 0x56, 0xdb, 0x3a, 0x51, 0x29, 0x08,
+ 0xbd, 0x7a, 0xfc, 0x8f, 0x2a, 0xb0, 0xa9
+};
+static const u8 enc_assoc038[] __initconst = { };
+static const u8 enc_nonce038[] __initconst = {
+ 0xe7, 0x4a, 0x51, 0x5e, 0x7e, 0x21, 0x02, 0xb9,
+ 0x0b, 0xef, 0x55, 0xd2
+};
+static const u8 enc_key038[] __initconst = {
+ 0xcf, 0x0d, 0x40, 0xa4, 0x64, 0x4e, 0x5f, 0x51,
+ 0x81, 0x51, 0x65, 0xd5, 0x30, 0x1b, 0x22, 0x63,
+ 0x1f, 0x45, 0x44, 0xc4, 0x9a, 0x18, 0x78, 0xe3,
+ 0xa0, 0xa5, 0xe8, 0xe1, 0xaa, 0xe0, 0xf2, 0x64
+};
+
+/* wycheproof - misc */
+static const u8 enc_input039[] __initconst = {
+ 0xa9, 0x89, 0x95, 0x50, 0x4d, 0xf1, 0x6f, 0x74,
+ 0x8b, 0xfb, 0x77, 0x85, 0xff, 0x91, 0xee, 0xb3,
+ 0xb6, 0x60, 0xea, 0x9e, 0xd3, 0x45, 0x0c, 0x3d,
+ 0x5e, 0x7b, 0x0e, 0x79, 0xef, 0x65, 0x36, 0x59,
+ 0xa9, 0x97, 0x8d, 0x75, 0x54, 0x2e, 0xf9, 0x1c,
+ 0x45, 0x67, 0x62, 0x21, 0x56, 0x40, 0xb9
+};
+static const u8 enc_output039[] __initconst = {
+ 0xa1, 0xff, 0xed, 0x80, 0x76, 0x18, 0x29, 0xec,
+ 0xce, 0x24, 0x2e, 0x0e, 0x88, 0xb1, 0x38, 0x04,
+ 0x90, 0x16, 0xbc, 0xa0, 0x18, 0xda, 0x2b, 0x6e,
+ 0x19, 0x98, 0x6b, 0x3e, 0x31, 0x8c, 0xae, 0x8d,
+ 0x80, 0x61, 0x98, 0xfb, 0x4c, 0x52, 0x7c, 0xc3,
+ 0x93, 0x50, 0xeb, 0xdd, 0xea, 0xc5, 0x73, 0xc4,
+ 0xcb, 0xf0, 0xbe, 0xfd, 0xa0, 0xb7, 0x02, 0x42,
+ 0xc6, 0x40, 0xd7, 0xcd, 0x02, 0xd7, 0xa3
+};
+static const u8 enc_assoc039[] __initconst = {
+ 0xb3, 0xe4, 0x06, 0x46, 0x83, 0xb0, 0x2d, 0x84
+};
+static const u8 enc_nonce039[] __initconst = {
+ 0xd4, 0xd8, 0x07, 0x34, 0x16, 0x83, 0x82, 0x5b,
+ 0x31, 0xcd, 0x4d, 0x95
+};
+static const u8 enc_key039[] __initconst = {
+ 0x6c, 0xbf, 0xd7, 0x1c, 0x64, 0x5d, 0x18, 0x4c,
+ 0xf5, 0xd2, 0x3c, 0x40, 0x2b, 0xdb, 0x0d, 0x25,
+ 0xec, 0x54, 0x89, 0x8c, 0x8a, 0x02, 0x73, 0xd4,
+ 0x2e, 0xb5, 0xbe, 0x10, 0x9f, 0xdc, 0xb2, 0xac
+};
+
+/* wycheproof - misc */
+static const u8 enc_input040[] __initconst = {
+ 0xd0, 0x96, 0x80, 0x31, 0x81, 0xbe, 0xef, 0x9e,
+ 0x00, 0x8f, 0xf8, 0x5d, 0x5d, 0xdc, 0x38, 0xdd,
+ 0xac, 0xf0, 0xf0, 0x9e, 0xe5, 0xf7, 0xe0, 0x7f,
+ 0x1e, 0x40, 0x79, 0xcb, 0x64, 0xd0, 0xdc, 0x8f,
+ 0x5e, 0x67, 0x11, 0xcd, 0x49, 0x21, 0xa7, 0x88,
+ 0x7d, 0xe7, 0x6e, 0x26, 0x78, 0xfd, 0xc6, 0x76,
+ 0x18, 0xf1, 0x18, 0x55, 0x86, 0xbf, 0xea, 0x9d,
+ 0x4c, 0x68, 0x5d, 0x50, 0xe4, 0xbb, 0x9a, 0x82
+};
+static const u8 enc_output040[] __initconst = {
+ 0x9a, 0x4e, 0xf2, 0x2b, 0x18, 0x16, 0x77, 0xb5,
+ 0x75, 0x5c, 0x08, 0xf7, 0x47, 0xc0, 0xf8, 0xd8,
+ 0xe8, 0xd4, 0xc1, 0x8a, 0x9c, 0xc2, 0x40, 0x5c,
+ 0x12, 0xbb, 0x51, 0xbb, 0x18, 0x72, 0xc8, 0xe8,
+ 0xb8, 0x77, 0x67, 0x8b, 0xec, 0x44, 0x2c, 0xfc,
+ 0xbb, 0x0f, 0xf4, 0x64, 0xa6, 0x4b, 0x74, 0x33,
+ 0x2c, 0xf0, 0x72, 0x89, 0x8c, 0x7e, 0x0e, 0xdd,
+ 0xf6, 0x23, 0x2e, 0xa6, 0xe2, 0x7e, 0xfe, 0x50,
+ 0x9f, 0xf3, 0x42, 0x7a, 0x0f, 0x32, 0xfa, 0x56,
+ 0x6d, 0x9c, 0xa0, 0xa7, 0x8a, 0xef, 0xc0, 0x13
+};
+static const u8 enc_assoc040[] __initconst = { };
+static const u8 enc_nonce040[] __initconst = {
+ 0xd6, 0x10, 0x40, 0xa3, 0x13, 0xed, 0x49, 0x28,
+ 0x23, 0xcc, 0x06, 0x5b
+};
+static const u8 enc_key040[] __initconst = {
+ 0x5b, 0x1d, 0x10, 0x35, 0xc0, 0xb1, 0x7e, 0xe0,
+ 0xb0, 0x44, 0x47, 0x67, 0xf8, 0x0a, 0x25, 0xb8,
+ 0xc1, 0xb7, 0x41, 0xf4, 0xb5, 0x0a, 0x4d, 0x30,
+ 0x52, 0x22, 0x6b, 0xaa, 0x1c, 0x6f, 0xb7, 0x01
+};
+
+/* wycheproof - misc */
+static const u8 enc_input041[] __initconst = {
+ 0x94, 0xee, 0x16, 0x6d, 0x6d, 0x6e, 0xcf, 0x88,
+ 0x32, 0x43, 0x71, 0x36, 0xb4, 0xae, 0x80, 0x5d,
+ 0x42, 0x88, 0x64, 0x35, 0x95, 0x86, 0xd9, 0x19,
+ 0x3a, 0x25, 0x01, 0x62, 0x93, 0xed, 0xba, 0x44,
+ 0x3c, 0x58, 0xe0, 0x7e, 0x7b, 0x71, 0x95, 0xec,
+ 0x5b, 0xd8, 0x45, 0x82, 0xa9, 0xd5, 0x6c, 0x8d,
+ 0x4a, 0x10, 0x8c, 0x7d, 0x7c, 0xe3, 0x4e, 0x6c,
+ 0x6f, 0x8e, 0xa1, 0xbe, 0xc0, 0x56, 0x73, 0x17
+};
+static const u8 enc_output041[] __initconst = {
+ 0x5f, 0xbb, 0xde, 0xcc, 0x34, 0xbe, 0x20, 0x16,
+ 0x14, 0xf6, 0x36, 0x03, 0x1e, 0xeb, 0x42, 0xf1,
+ 0xca, 0xce, 0x3c, 0x79, 0xa1, 0x2c, 0xff, 0xd8,
+ 0x71, 0xee, 0x8e, 0x73, 0x82, 0x0c, 0x82, 0x97,
+ 0x49, 0xf1, 0xab, 0xb4, 0x29, 0x43, 0x67, 0x84,
+ 0x9f, 0xb6, 0xc2, 0xaa, 0x56, 0xbd, 0xa8, 0xa3,
+ 0x07, 0x8f, 0x72, 0x3d, 0x7c, 0x1c, 0x85, 0x20,
+ 0x24, 0xb0, 0x17, 0xb5, 0x89, 0x73, 0xfb, 0x1e,
+ 0x09, 0x26, 0x3d, 0xa7, 0xb4, 0xcb, 0x92, 0x14,
+ 0x52, 0xf9, 0x7d, 0xca, 0x40, 0xf5, 0x80, 0xec
+};
+static const u8 enc_assoc041[] __initconst = {
+ 0x71, 0x93, 0xf6, 0x23, 0x66, 0x33, 0x21, 0xa2
+};
+static const u8 enc_nonce041[] __initconst = {
+ 0xd3, 0x1c, 0x21, 0xab, 0xa1, 0x75, 0xb7, 0x0d,
+ 0xe4, 0xeb, 0xb1, 0x9c
+};
+static const u8 enc_key041[] __initconst = {
+ 0x97, 0xd6, 0x35, 0xc4, 0xf4, 0x75, 0x74, 0xd9,
+ 0x99, 0x8a, 0x90, 0x87, 0x5d, 0xa1, 0xd3, 0xa2,
+ 0x84, 0xb7, 0x55, 0xb2, 0xd3, 0x92, 0x97, 0xa5,
+ 0x72, 0x52, 0x35, 0x19, 0x0e, 0x10, 0xa9, 0x7e
+};
+
+/* wycheproof - misc */
+static const u8 enc_input042[] __initconst = {
+ 0xb4, 0x29, 0xeb, 0x80, 0xfb, 0x8f, 0xe8, 0xba,
+ 0xed, 0xa0, 0xc8, 0x5b, 0x9c, 0x33, 0x34, 0x58,
+ 0xe7, 0xc2, 0x99, 0x2e, 0x55, 0x84, 0x75, 0x06,
+ 0x9d, 0x12, 0xd4, 0x5c, 0x22, 0x21, 0x75, 0x64,
+ 0x12, 0x15, 0x88, 0x03, 0x22, 0x97, 0xef, 0xf5,
+ 0x67, 0x83, 0x74, 0x2a, 0x5f, 0xc2, 0x2d, 0x74,
+ 0x10, 0xff, 0xb2, 0x9d, 0x66, 0x09, 0x86, 0x61,
+ 0xd7, 0x6f, 0x12, 0x6c, 0x3c, 0x27, 0x68, 0x9e,
+ 0x43, 0xb3, 0x72, 0x67, 0xca, 0xc5, 0xa3, 0xa6,
+ 0xd3, 0xab, 0x49, 0xe3, 0x91, 0xda, 0x29, 0xcd,
+ 0x30, 0x54, 0xa5, 0x69, 0x2e, 0x28, 0x07, 0xe4,
+ 0xc3, 0xea, 0x46, 0xc8, 0x76, 0x1d, 0x50, 0xf5,
+ 0x92
+};
+static const u8 enc_output042[] __initconst = {
+ 0xd0, 0x10, 0x2f, 0x6c, 0x25, 0x8b, 0xf4, 0x97,
+ 0x42, 0xce, 0xc3, 0x4c, 0xf2, 0xd0, 0xfe, 0xdf,
+ 0x23, 0xd1, 0x05, 0xfb, 0x4c, 0x84, 0xcf, 0x98,
+ 0x51, 0x5e, 0x1b, 0xc9, 0xa6, 0x4f, 0x8a, 0xd5,
+ 0xbe, 0x8f, 0x07, 0x21, 0xbd, 0xe5, 0x06, 0x45,
+ 0xd0, 0x00, 0x83, 0xc3, 0xa2, 0x63, 0xa3, 0x10,
+ 0x53, 0xb7, 0x60, 0x24, 0x5f, 0x52, 0xae, 0x28,
+ 0x66, 0xa5, 0xec, 0x83, 0xb1, 0x9f, 0x61, 0xbe,
+ 0x1d, 0x30, 0xd5, 0xc5, 0xd9, 0xfe, 0xcc, 0x4c,
+ 0xbb, 0xe0, 0x8f, 0xd3, 0x85, 0x81, 0x3a, 0x2a,
+ 0xa3, 0x9a, 0x00, 0xff, 0x9c, 0x10, 0xf7, 0xf2,
+ 0x37, 0x02, 0xad, 0xd1, 0xe4, 0xb2, 0xff, 0xa3,
+ 0x1c, 0x41, 0x86, 0x5f, 0xc7, 0x1d, 0xe1, 0x2b,
+ 0x19, 0x61, 0x21, 0x27, 0xce, 0x49, 0x99, 0x3b,
+ 0xb0
+};
+static const u8 enc_assoc042[] __initconst = { };
+static const u8 enc_nonce042[] __initconst = {
+ 0x17, 0xc8, 0x6a, 0x8a, 0xbb, 0xb7, 0xe0, 0x03,
+ 0xac, 0xde, 0x27, 0x99
+};
+static const u8 enc_key042[] __initconst = {
+ 0xfe, 0x6e, 0x55, 0xbd, 0xae, 0xd1, 0xf7, 0x28,
+ 0x4c, 0xa5, 0xfc, 0x0f, 0x8c, 0x5f, 0x2b, 0x8d,
+ 0xf5, 0x6d, 0xc0, 0xf4, 0x9e, 0x8c, 0xa6, 0x6a,
+ 0x41, 0x99, 0x5e, 0x78, 0x33, 0x51, 0xf9, 0x01
+};
+
+/* wycheproof - misc */
+static const u8 enc_input043[] __initconst = {
+ 0xce, 0xb5, 0x34, 0xce, 0x50, 0xdc, 0x23, 0xff,
+ 0x63, 0x8a, 0xce, 0x3e, 0xf6, 0x3a, 0xb2, 0xcc,
+ 0x29, 0x73, 0xee, 0xad, 0xa8, 0x07, 0x85, 0xfc,
+ 0x16, 0x5d, 0x06, 0xc2, 0xf5, 0x10, 0x0f, 0xf5,
+ 0xe8, 0xab, 0x28, 0x82, 0xc4, 0x75, 0xaf, 0xcd,
+ 0x05, 0xcc, 0xd4, 0x9f, 0x2e, 0x7d, 0x8f, 0x55,
+ 0xef, 0x3a, 0x72, 0xe3, 0xdc, 0x51, 0xd6, 0x85,
+ 0x2b, 0x8e, 0x6b, 0x9e, 0x7a, 0xec, 0xe5, 0x7b,
+ 0xe6, 0x55, 0x6b, 0x0b, 0x6d, 0x94, 0x13, 0xe3,
+ 0x3f, 0xc5, 0xfc, 0x24, 0xa9, 0xa2, 0x05, 0xad,
+ 0x59, 0x57, 0x4b, 0xb3, 0x9d, 0x94, 0x4a, 0x92,
+ 0xdc, 0x47, 0x97, 0x0d, 0x84, 0xa6, 0xad, 0x31,
+ 0x76
+};
+static const u8 enc_output043[] __initconst = {
+ 0x75, 0x45, 0x39, 0x1b, 0x51, 0xde, 0x01, 0xd5,
+ 0xc5, 0x3d, 0xfa, 0xca, 0x77, 0x79, 0x09, 0x06,
+ 0x3e, 0x58, 0xed, 0xee, 0x4b, 0xb1, 0x22, 0x7e,
+ 0x71, 0x10, 0xac, 0x4d, 0x26, 0x20, 0xc2, 0xae,
+ 0xc2, 0xf8, 0x48, 0xf5, 0x6d, 0xee, 0xb0, 0x37,
+ 0xa8, 0xdc, 0xed, 0x75, 0xaf, 0xa8, 0xa6, 0xc8,
+ 0x90, 0xe2, 0xde, 0xe4, 0x2f, 0x95, 0x0b, 0xb3,
+ 0x3d, 0x9e, 0x24, 0x24, 0xd0, 0x8a, 0x50, 0x5d,
+ 0x89, 0x95, 0x63, 0x97, 0x3e, 0xd3, 0x88, 0x70,
+ 0xf3, 0xde, 0x6e, 0xe2, 0xad, 0xc7, 0xfe, 0x07,
+ 0x2c, 0x36, 0x6c, 0x14, 0xe2, 0xcf, 0x7c, 0xa6,
+ 0x2f, 0xb3, 0xd3, 0x6b, 0xee, 0x11, 0x68, 0x54,
+ 0x61, 0xb7, 0x0d, 0x44, 0xef, 0x8c, 0x66, 0xc5,
+ 0xc7, 0xbb, 0xf1, 0x0d, 0xca, 0xdd, 0x7f, 0xac,
+ 0xf6
+};
+static const u8 enc_assoc043[] __initconst = {
+ 0xa1, 0x1c, 0x40, 0xb6, 0x03, 0x76, 0x73, 0x30
+};
+static const u8 enc_nonce043[] __initconst = {
+ 0x46, 0x36, 0x2f, 0x45, 0xd6, 0x37, 0x9e, 0x63,
+ 0xe5, 0x22, 0x94, 0x60
+};
+static const u8 enc_key043[] __initconst = {
+ 0xaa, 0xbc, 0x06, 0x34, 0x74, 0xe6, 0x5c, 0x4c,
+ 0x3e, 0x9b, 0xdc, 0x48, 0x0d, 0xea, 0x97, 0xb4,
+ 0x51, 0x10, 0xc8, 0x61, 0x88, 0x46, 0xff, 0x6b,
+ 0x15, 0xbd, 0xd2, 0xa4, 0xa5, 0x68, 0x2c, 0x4e
+};
+
+/* wycheproof - misc */
+static const u8 enc_input044[] __initconst = {
+ 0xe5, 0xcc, 0xaa, 0x44, 0x1b, 0xc8, 0x14, 0x68,
+ 0x8f, 0x8f, 0x6e, 0x8f, 0x28, 0xb5, 0x00, 0xb2
+};
+static const u8 enc_output044[] __initconst = {
+ 0x7e, 0x72, 0xf5, 0xa1, 0x85, 0xaf, 0x16, 0xa6,
+ 0x11, 0x92, 0x1b, 0x43, 0x8f, 0x74, 0x9f, 0x0b,
+ 0x12, 0x42, 0xc6, 0x70, 0x73, 0x23, 0x34, 0x02,
+ 0x9a, 0xdf, 0xe1, 0xc5, 0x00, 0x16, 0x51, 0xe4
+};
+static const u8 enc_assoc044[] __initconst = {
+ 0x02
+};
+static const u8 enc_nonce044[] __initconst = {
+ 0x87, 0x34, 0x5f, 0x10, 0x55, 0xfd, 0x9e, 0x21,
+ 0x02, 0xd5, 0x06, 0x56
+};
+static const u8 enc_key044[] __initconst = {
+ 0x7d, 0x00, 0xb4, 0x80, 0x95, 0xad, 0xfa, 0x32,
+ 0x72, 0x05, 0x06, 0x07, 0xb2, 0x64, 0x18, 0x50,
+ 0x02, 0xba, 0x99, 0x95, 0x7c, 0x49, 0x8b, 0xe0,
+ 0x22, 0x77, 0x0f, 0x2c, 0xe2, 0xf3, 0x14, 0x3c
+};
+
+/* wycheproof - misc */
+static const u8 enc_input045[] __initconst = {
+ 0x02, 0xcd, 0xe1, 0x68, 0xfb, 0xa3, 0xf5, 0x44,
+ 0xbb, 0xd0, 0x33, 0x2f, 0x7a, 0xde, 0xad, 0xa8
+};
+static const u8 enc_output045[] __initconst = {
+ 0x85, 0xf2, 0x9a, 0x71, 0x95, 0x57, 0xcd, 0xd1,
+ 0x4d, 0x1f, 0x8f, 0xff, 0xab, 0x6d, 0x9e, 0x60,
+ 0x73, 0x2c, 0xa3, 0x2b, 0xec, 0xd5, 0x15, 0xa1,
+ 0xed, 0x35, 0x3f, 0x54, 0x2e, 0x99, 0x98, 0x58
+};
+static const u8 enc_assoc045[] __initconst = {
+ 0xb6, 0x48
+};
+static const u8 enc_nonce045[] __initconst = {
+ 0x87, 0xa3, 0x16, 0x3e, 0xc0, 0x59, 0x8a, 0xd9,
+ 0x5b, 0x3a, 0xa7, 0x13
+};
+static const u8 enc_key045[] __initconst = {
+ 0x64, 0x32, 0x71, 0x7f, 0x1d, 0xb8, 0x5e, 0x41,
+ 0xac, 0x78, 0x36, 0xbc, 0xe2, 0x51, 0x85, 0xa0,
+ 0x80, 0xd5, 0x76, 0x2b, 0x9e, 0x2b, 0x18, 0x44,
+ 0x4b, 0x6e, 0xc7, 0x2c, 0x3b, 0xd8, 0xe4, 0xdc
+};
+
+/* wycheproof - misc */
+static const u8 enc_input046[] __initconst = {
+ 0x16, 0xdd, 0xd2, 0x3f, 0xf5, 0x3f, 0x3d, 0x23,
+ 0xc0, 0x63, 0x34, 0x48, 0x70, 0x40, 0xeb, 0x47
+};
+static const u8 enc_output046[] __initconst = {
+ 0xc1, 0xb2, 0x95, 0x93, 0x6d, 0x56, 0xfa, 0xda,
+ 0xc0, 0x3e, 0x5f, 0x74, 0x2b, 0xff, 0x73, 0xa1,
+ 0x39, 0xc4, 0x57, 0xdb, 0xab, 0x66, 0x38, 0x2b,
+ 0xab, 0xb3, 0xb5, 0x58, 0x00, 0xcd, 0xa5, 0xb8
+};
+static const u8 enc_assoc046[] __initconst = {
+ 0xbd, 0x4c, 0xd0, 0x2f, 0xc7, 0x50, 0x2b, 0xbd,
+ 0xbd, 0xf6, 0xc9, 0xa3, 0xcb, 0xe8, 0xf0
+};
+static const u8 enc_nonce046[] __initconst = {
+ 0x6f, 0x57, 0x3a, 0xa8, 0x6b, 0xaa, 0x49, 0x2b,
+ 0xa4, 0x65, 0x96, 0xdf
+};
+static const u8 enc_key046[] __initconst = {
+ 0x8e, 0x34, 0xcf, 0x73, 0xd2, 0x45, 0xa1, 0x08,
+ 0x2a, 0x92, 0x0b, 0x86, 0x36, 0x4e, 0xb8, 0x96,
+ 0xc4, 0x94, 0x64, 0x67, 0xbc, 0xb3, 0xd5, 0x89,
+ 0x29, 0xfc, 0xb3, 0x66, 0x90, 0xe6, 0x39, 0x4f
+};
+
+/* wycheproof - misc */
+static const u8 enc_input047[] __initconst = {
+ 0x62, 0x3b, 0x78, 0x50, 0xc3, 0x21, 0xe2, 0xcf,
+ 0x0c, 0x6f, 0xbc, 0xc8, 0xdf, 0xd1, 0xaf, 0xf2
+};
+static const u8 enc_output047[] __initconst = {
+ 0xc8, 0x4c, 0x9b, 0xb7, 0xc6, 0x1c, 0x1b, 0xcb,
+ 0x17, 0x77, 0x2a, 0x1c, 0x50, 0x0c, 0x50, 0x95,
+ 0xdb, 0xad, 0xf7, 0xa5, 0x13, 0x8c, 0xa0, 0x34,
+ 0x59, 0xa2, 0xcd, 0x65, 0x83, 0x1e, 0x09, 0x2f
+};
+static const u8 enc_assoc047[] __initconst = {
+ 0x89, 0xcc, 0xe9, 0xfb, 0x47, 0x44, 0x1d, 0x07,
+ 0xe0, 0x24, 0x5a, 0x66, 0xfe, 0x8b, 0x77, 0x8b
+};
+static const u8 enc_nonce047[] __initconst = {
+ 0x1a, 0x65, 0x18, 0xf0, 0x2e, 0xde, 0x1d, 0xa6,
+ 0x80, 0x92, 0x66, 0xd9
+};
+static const u8 enc_key047[] __initconst = {
+ 0xcb, 0x55, 0x75, 0xf5, 0xc7, 0xc4, 0x5c, 0x91,
+ 0xcf, 0x32, 0x0b, 0x13, 0x9f, 0xb5, 0x94, 0x23,
+ 0x75, 0x60, 0xd0, 0xa3, 0xe6, 0xf8, 0x65, 0xa6,
+ 0x7d, 0x4f, 0x63, 0x3f, 0x2c, 0x08, 0xf0, 0x16
+};
+
+/* wycheproof - misc */
+static const u8 enc_input048[] __initconst = {
+ 0x87, 0xb3, 0xa4, 0xd7, 0xb2, 0x6d, 0x8d, 0x32,
+ 0x03, 0xa0, 0xde, 0x1d, 0x64, 0xef, 0x82, 0xe3
+};
+static const u8 enc_output048[] __initconst = {
+ 0x94, 0xbc, 0x80, 0x62, 0x1e, 0xd1, 0xe7, 0x1b,
+ 0x1f, 0xd2, 0xb5, 0xc3, 0xa1, 0x5e, 0x35, 0x68,
+ 0x33, 0x35, 0x11, 0x86, 0x17, 0x96, 0x97, 0x84,
+ 0x01, 0x59, 0x8b, 0x96, 0x37, 0x22, 0xf5, 0xb3
+};
+static const u8 enc_assoc048[] __initconst = {
+ 0xd1, 0x9f, 0x2d, 0x98, 0x90, 0x95, 0xf7, 0xab,
+ 0x03, 0xa5, 0xfd, 0xe8, 0x44, 0x16, 0xe0, 0x0c,
+ 0x0e
+};
+static const u8 enc_nonce048[] __initconst = {
+ 0x56, 0x4d, 0xee, 0x49, 0xab, 0x00, 0xd2, 0x40,
+ 0xfc, 0x10, 0x68, 0xc3
+};
+static const u8 enc_key048[] __initconst = {
+ 0xa5, 0x56, 0x9e, 0x72, 0x9a, 0x69, 0xb2, 0x4b,
+ 0xa6, 0xe0, 0xff, 0x15, 0xc4, 0x62, 0x78, 0x97,
+ 0x43, 0x68, 0x24, 0xc9, 0x41, 0xe9, 0xd0, 0x0b,
+ 0x2e, 0x93, 0xfd, 0xdc, 0x4b, 0xa7, 0x76, 0x57
+};
+
+/* wycheproof - misc */
+static const u8 enc_input049[] __initconst = {
+ 0xe6, 0x01, 0xb3, 0x85, 0x57, 0x79, 0x7d, 0xa2,
+ 0xf8, 0xa4, 0x10, 0x6a, 0x08, 0x9d, 0x1d, 0xa6
+};
+static const u8 enc_output049[] __initconst = {
+ 0x29, 0x9b, 0x5d, 0x3f, 0x3d, 0x03, 0xc0, 0x87,
+ 0x20, 0x9a, 0x16, 0xe2, 0x85, 0x14, 0x31, 0x11,
+ 0x4b, 0x45, 0x4e, 0xd1, 0x98, 0xde, 0x11, 0x7e,
+ 0x83, 0xec, 0x49, 0xfa, 0x8d, 0x85, 0x08, 0xd6
+};
+static const u8 enc_assoc049[] __initconst = {
+ 0x5e, 0x64, 0x70, 0xfa, 0xcd, 0x99, 0xc1, 0xd8,
+ 0x1e, 0x37, 0xcd, 0x44, 0x01, 0x5f, 0xe1, 0x94,
+ 0x80, 0xa2, 0xa4, 0xd3, 0x35, 0x2a, 0x4f, 0xf5,
+ 0x60, 0xc0, 0x64, 0x0f, 0xdb, 0xda
+};
+static const u8 enc_nonce049[] __initconst = {
+ 0xdf, 0x87, 0x13, 0xe8, 0x7e, 0xc3, 0xdb, 0xcf,
+ 0xad, 0x14, 0xd5, 0x3e
+};
+static const u8 enc_key049[] __initconst = {
+ 0x56, 0x20, 0x74, 0x65, 0xb4, 0xe4, 0x8e, 0x6d,
+ 0x04, 0x63, 0x0f, 0x4a, 0x42, 0xf3, 0x5c, 0xfc,
+ 0x16, 0x3a, 0xb2, 0x89, 0xc2, 0x2a, 0x2b, 0x47,
+ 0x84, 0xf6, 0xf9, 0x29, 0x03, 0x30, 0xbe, 0xe0
+};
+
+/* wycheproof - misc */
+static const u8 enc_input050[] __initconst = {
+ 0xdc, 0x9e, 0x9e, 0xaf, 0x11, 0xe3, 0x14, 0x18,
+ 0x2d, 0xf6, 0xa4, 0xeb, 0xa1, 0x7a, 0xec, 0x9c
+};
+static const u8 enc_output050[] __initconst = {
+ 0x60, 0x5b, 0xbf, 0x90, 0xae, 0xb9, 0x74, 0xf6,
+ 0x60, 0x2b, 0xc7, 0x78, 0x05, 0x6f, 0x0d, 0xca,
+ 0x38, 0xea, 0x23, 0xd9, 0x90, 0x54, 0xb4, 0x6b,
+ 0x42, 0xff, 0xe0, 0x04, 0x12, 0x9d, 0x22, 0x04
+};
+static const u8 enc_assoc050[] __initconst = {
+ 0xba, 0x44, 0x6f, 0x6f, 0x9a, 0x0c, 0xed, 0x22,
+ 0x45, 0x0f, 0xeb, 0x10, 0x73, 0x7d, 0x90, 0x07,
+ 0xfd, 0x69, 0xab, 0xc1, 0x9b, 0x1d, 0x4d, 0x90,
+ 0x49, 0xa5, 0x55, 0x1e, 0x86, 0xec, 0x2b, 0x37
+};
+static const u8 enc_nonce050[] __initconst = {
+ 0x8d, 0xf4, 0xb1, 0x5a, 0x88, 0x8c, 0x33, 0x28,
+ 0x6a, 0x7b, 0x76, 0x51
+};
+static const u8 enc_key050[] __initconst = {
+ 0x39, 0x37, 0x98, 0x6a, 0xf8, 0x6d, 0xaf, 0xc1,
+ 0xba, 0x0c, 0x46, 0x72, 0xd8, 0xab, 0xc4, 0x6c,
+ 0x20, 0x70, 0x62, 0x68, 0x2d, 0x9c, 0x26, 0x4a,
+ 0xb0, 0x6d, 0x6c, 0x58, 0x07, 0x20, 0x51, 0x30
+};
+
+/* wycheproof - misc */
+static const u8 enc_input051[] __initconst = {
+ 0x81, 0xce, 0x84, 0xed, 0xe9, 0xb3, 0x58, 0x59,
+ 0xcc, 0x8c, 0x49, 0xa8, 0xf6, 0xbe, 0x7d, 0xc6
+};
+static const u8 enc_output051[] __initconst = {
+ 0x7b, 0x7c, 0xe0, 0xd8, 0x24, 0x80, 0x9a, 0x70,
+ 0xde, 0x32, 0x56, 0x2c, 0xcf, 0x2c, 0x2b, 0xbd,
+ 0x15, 0xd4, 0x4a, 0x00, 0xce, 0x0d, 0x19, 0xb4,
+ 0x23, 0x1f, 0x92, 0x1e, 0x22, 0xbc, 0x0a, 0x43
+};
+static const u8 enc_assoc051[] __initconst = {
+ 0xd4, 0x1a, 0x82, 0x8d, 0x5e, 0x71, 0x82, 0x92,
+ 0x47, 0x02, 0x19, 0x05, 0x40, 0x2e, 0xa2, 0x57,
+ 0xdc, 0xcb, 0xc3, 0xb8, 0x0f, 0xcd, 0x56, 0x75,
+ 0x05, 0x6b, 0x68, 0xbb, 0x59, 0xe6, 0x2e, 0x88,
+ 0x73
+};
+static const u8 enc_nonce051[] __initconst = {
+ 0xbe, 0x40, 0xe5, 0xf1, 0xa1, 0x18, 0x17, 0xa0,
+ 0xa8, 0xfa, 0x89, 0x49
+};
+static const u8 enc_key051[] __initconst = {
+ 0x36, 0x37, 0x2a, 0xbc, 0xdb, 0x78, 0xe0, 0x27,
+ 0x96, 0x46, 0xac, 0x3d, 0x17, 0x6b, 0x96, 0x74,
+ 0xe9, 0x15, 0x4e, 0xec, 0xf0, 0xd5, 0x46, 0x9c,
+ 0x65, 0x1e, 0xc7, 0xe1, 0x6b, 0x4c, 0x11, 0x99
+};
+
+/* wycheproof - misc */
+static const u8 enc_input052[] __initconst = {
+ 0xa6, 0x67, 0x47, 0xc8, 0x9e, 0x85, 0x7a, 0xf3,
+ 0xa1, 0x8e, 0x2c, 0x79, 0x50, 0x00, 0x87, 0xed
+};
+static const u8 enc_output052[] __initconst = {
+ 0xca, 0x82, 0xbf, 0xf3, 0xe2, 0xf3, 0x10, 0xcc,
+ 0xc9, 0x76, 0x67, 0x2c, 0x44, 0x15, 0xe6, 0x9b,
+ 0x57, 0x63, 0x8c, 0x62, 0xa5, 0xd8, 0x5d, 0xed,
+ 0x77, 0x4f, 0x91, 0x3c, 0x81, 0x3e, 0xa0, 0x32
+};
+static const u8 enc_assoc052[] __initconst = {
+ 0x3f, 0x2d, 0xd4, 0x9b, 0xbf, 0x09, 0xd6, 0x9a,
+ 0x78, 0xa3, 0xd8, 0x0e, 0xa2, 0x56, 0x66, 0x14,
+ 0xfc, 0x37, 0x94, 0x74, 0x19, 0x6c, 0x1a, 0xae,
+ 0x84, 0x58, 0x3d, 0xa7, 0x3d, 0x7f, 0xf8, 0x5c,
+ 0x6f, 0x42, 0xca, 0x42, 0x05, 0x6a, 0x97, 0x92,
+ 0xcc, 0x1b, 0x9f, 0xb3, 0xc7, 0xd2, 0x61
+};
+static const u8 enc_nonce052[] __initconst = {
+ 0x84, 0xc8, 0x7d, 0xae, 0x4e, 0xee, 0x27, 0x73,
+ 0x0e, 0xc3, 0x5d, 0x12
+};
+static const u8 enc_key052[] __initconst = {
+ 0x9f, 0x14, 0x79, 0xed, 0x09, 0x7d, 0x7f, 0xe5,
+ 0x29, 0xc1, 0x1f, 0x2f, 0x5a, 0xdd, 0x9a, 0xaf,
+ 0xf4, 0xa1, 0xca, 0x0b, 0x68, 0x99, 0x7a, 0x2c,
+ 0xb7, 0xf7, 0x97, 0x49, 0xbd, 0x90, 0xaa, 0xf4
+};
+
+/* wycheproof - misc */
+static const u8 enc_input053[] __initconst = {
+ 0x25, 0x6d, 0x40, 0x88, 0x80, 0x94, 0x17, 0x83,
+ 0x55, 0xd3, 0x04, 0x84, 0x64, 0x43, 0xfe, 0xe8,
+ 0xdf, 0x99, 0x47, 0x03, 0x03, 0xfb, 0x3b, 0x7b,
+ 0x80, 0xe0, 0x30, 0xbe, 0xeb, 0xd3, 0x29, 0xbe
+};
+static const u8 enc_output053[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0xe6, 0xd3, 0xd7, 0x32, 0x4a, 0x1c, 0xbb, 0xa7,
+ 0x77, 0xbb, 0xb0, 0xec, 0xdd, 0xa3, 0x78, 0x07
+};
+static const u8 enc_assoc053[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+};
+static const u8 enc_nonce053[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x01, 0xee, 0x32, 0x00
+};
+static const u8 enc_key053[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - misc */
+static const u8 enc_input054[] __initconst = {
+ 0x25, 0x6d, 0x40, 0x88, 0x80, 0x94, 0x17, 0x83,
+ 0x55, 0xd3, 0x04, 0x84, 0x64, 0x43, 0xfe, 0xe8,
+ 0xdf, 0x99, 0x47, 0x03, 0x03, 0xfb, 0x3b, 0x7b,
+ 0x80, 0xe0, 0x30, 0xbe, 0xeb, 0xd3, 0x29, 0xbe,
+ 0xe3, 0xbc, 0xdb, 0x5b, 0x1e, 0xde, 0xfc, 0xfe,
+ 0x8b, 0xcd, 0xa1, 0xb6, 0xa1, 0x5c, 0x8c, 0x2b,
+ 0x08, 0x69, 0xff, 0xd2, 0xec, 0x5e, 0x26, 0xe5,
+ 0x53, 0xb7, 0xb2, 0x27, 0xfe, 0x87, 0xfd, 0xbd
+};
+static const u8 enc_output054[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x06, 0x2d, 0xe6, 0x79, 0x5f, 0x27, 0x4f, 0xd2,
+ 0xa3, 0x05, 0xd7, 0x69, 0x80, 0xbc, 0x9c, 0xce
+};
+static const u8 enc_assoc054[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+};
+static const u8 enc_nonce054[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x01, 0xee, 0x32, 0x00
+};
+static const u8 enc_key054[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - misc */
+static const u8 enc_input055[] __initconst = {
+ 0x25, 0x6d, 0x40, 0x88, 0x80, 0x94, 0x17, 0x83,
+ 0x55, 0xd3, 0x04, 0x84, 0x64, 0x43, 0xfe, 0xe8,
+ 0xdf, 0x99, 0x47, 0x03, 0x03, 0xfb, 0x3b, 0x7b,
+ 0x80, 0xe0, 0x30, 0xbe, 0xeb, 0xd3, 0x29, 0xbe,
+ 0xe3, 0xbc, 0xdb, 0x5b, 0x1e, 0xde, 0xfc, 0xfe,
+ 0x8b, 0xcd, 0xa1, 0xb6, 0xa1, 0x5c, 0x8c, 0x2b,
+ 0x08, 0x69, 0xff, 0xd2, 0xec, 0x5e, 0x26, 0xe5,
+ 0x53, 0xb7, 0xb2, 0x27, 0xfe, 0x87, 0xfd, 0xbd,
+ 0x7a, 0xda, 0x44, 0x42, 0x42, 0x69, 0xbf, 0xfa,
+ 0x55, 0x27, 0xf2, 0x70, 0xac, 0xf6, 0x85, 0x02,
+ 0xb7, 0x4c, 0x5a, 0xe2, 0xe6, 0x0c, 0x05, 0x80,
+ 0x98, 0x1a, 0x49, 0x38, 0x45, 0x93, 0x92, 0xc4,
+ 0x9b, 0xb2, 0xf2, 0x84, 0xb6, 0x46, 0xef, 0xc7,
+ 0xf3, 0xf0, 0xb1, 0x36, 0x1d, 0xc3, 0x48, 0xed,
+ 0x77, 0xd3, 0x0b, 0xc5, 0x76, 0x92, 0xed, 0x38,
+ 0xfb, 0xac, 0x01, 0x88, 0x38, 0x04, 0x88, 0xc7
+};
+static const u8 enc_output055[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0xd8, 0xb4, 0x79, 0x02, 0xba, 0xae, 0xaf, 0xb3,
+ 0x42, 0x03, 0x05, 0x15, 0x29, 0xaf, 0x28, 0x2e
+};
+static const u8 enc_assoc055[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+};
+static const u8 enc_nonce055[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x01, 0xee, 0x32, 0x00
+};
+static const u8 enc_key055[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - misc */
+static const u8 enc_input056[] __initconst = {
+ 0xda, 0x92, 0xbf, 0x77, 0x7f, 0x6b, 0xe8, 0x7c,
+ 0xaa, 0x2c, 0xfb, 0x7b, 0x9b, 0xbc, 0x01, 0x17,
+ 0x20, 0x66, 0xb8, 0xfc, 0xfc, 0x04, 0xc4, 0x84,
+ 0x7f, 0x1f, 0xcf, 0x41, 0x14, 0x2c, 0xd6, 0x41
+};
+static const u8 enc_output056[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xb3, 0x89, 0x1c, 0x84, 0x9c, 0xb5, 0x2c, 0x27,
+ 0x74, 0x7e, 0xdf, 0xcf, 0x31, 0x21, 0x3b, 0xb6
+};
+static const u8 enc_assoc056[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce056[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x01, 0xee, 0x32, 0x00
+};
+static const u8 enc_key056[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - misc */
+static const u8 enc_input057[] __initconst = {
+ 0xda, 0x92, 0xbf, 0x77, 0x7f, 0x6b, 0xe8, 0x7c,
+ 0xaa, 0x2c, 0xfb, 0x7b, 0x9b, 0xbc, 0x01, 0x17,
+ 0x20, 0x66, 0xb8, 0xfc, 0xfc, 0x04, 0xc4, 0x84,
+ 0x7f, 0x1f, 0xcf, 0x41, 0x14, 0x2c, 0xd6, 0x41,
+ 0x1c, 0x43, 0x24, 0xa4, 0xe1, 0x21, 0x03, 0x01,
+ 0x74, 0x32, 0x5e, 0x49, 0x5e, 0xa3, 0x73, 0xd4,
+ 0xf7, 0x96, 0x00, 0x2d, 0x13, 0xa1, 0xd9, 0x1a,
+ 0xac, 0x48, 0x4d, 0xd8, 0x01, 0x78, 0x02, 0x42
+};
+static const u8 enc_output057[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xf0, 0xc1, 0x2d, 0x26, 0xef, 0x03, 0x02, 0x9b,
+ 0x62, 0xc0, 0x08, 0xda, 0x27, 0xc5, 0xdc, 0x68
+};
+static const u8 enc_assoc057[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce057[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x01, 0xee, 0x32, 0x00
+};
+static const u8 enc_key057[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - misc */
+static const u8 enc_input058[] __initconst = {
+ 0xda, 0x92, 0xbf, 0x77, 0x7f, 0x6b, 0xe8, 0x7c,
+ 0xaa, 0x2c, 0xfb, 0x7b, 0x9b, 0xbc, 0x01, 0x17,
+ 0x20, 0x66, 0xb8, 0xfc, 0xfc, 0x04, 0xc4, 0x84,
+ 0x7f, 0x1f, 0xcf, 0x41, 0x14, 0x2c, 0xd6, 0x41,
+ 0x1c, 0x43, 0x24, 0xa4, 0xe1, 0x21, 0x03, 0x01,
+ 0x74, 0x32, 0x5e, 0x49, 0x5e, 0xa3, 0x73, 0xd4,
+ 0xf7, 0x96, 0x00, 0x2d, 0x13, 0xa1, 0xd9, 0x1a,
+ 0xac, 0x48, 0x4d, 0xd8, 0x01, 0x78, 0x02, 0x42,
+ 0x85, 0x25, 0xbb, 0xbd, 0xbd, 0x96, 0x40, 0x05,
+ 0xaa, 0xd8, 0x0d, 0x8f, 0x53, 0x09, 0x7a, 0xfd,
+ 0x48, 0xb3, 0xa5, 0x1d, 0x19, 0xf3, 0xfa, 0x7f,
+ 0x67, 0xe5, 0xb6, 0xc7, 0xba, 0x6c, 0x6d, 0x3b,
+ 0x64, 0x4d, 0x0d, 0x7b, 0x49, 0xb9, 0x10, 0x38,
+ 0x0c, 0x0f, 0x4e, 0xc9, 0xe2, 0x3c, 0xb7, 0x12,
+ 0x88, 0x2c, 0xf4, 0x3a, 0x89, 0x6d, 0x12, 0xc7,
+ 0x04, 0x53, 0xfe, 0x77, 0xc7, 0xfb, 0x77, 0x38
+};
+static const u8 enc_output058[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xee, 0x65, 0x78, 0x30, 0x01, 0xc2, 0x56, 0x91,
+ 0xfa, 0x28, 0xd0, 0xf5, 0xf1, 0xc1, 0xd7, 0x62
+};
+static const u8 enc_assoc058[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce058[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x01, 0xee, 0x32, 0x00
+};
+static const u8 enc_key058[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - misc */
+static const u8 enc_input059[] __initconst = {
+ 0x25, 0x6d, 0x40, 0x08, 0x80, 0x94, 0x17, 0x03,
+ 0x55, 0xd3, 0x04, 0x04, 0x64, 0x43, 0xfe, 0x68,
+ 0xdf, 0x99, 0x47, 0x83, 0x03, 0xfb, 0x3b, 0xfb,
+ 0x80, 0xe0, 0x30, 0x3e, 0xeb, 0xd3, 0x29, 0x3e
+};
+static const u8 enc_output059[] __initconst = {
+ 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80,
+ 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80,
+ 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80,
+ 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80,
+ 0x79, 0xba, 0x7a, 0x29, 0xf5, 0xa7, 0xbb, 0x75,
+ 0x79, 0x7a, 0xf8, 0x7a, 0x61, 0x01, 0x29, 0xa4
+};
+static const u8 enc_assoc059[] __initconst = {
+ 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80,
+ 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80
+};
+static const u8 enc_nonce059[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x01, 0xee, 0x32, 0x00
+};
+static const u8 enc_key059[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - misc */
+static const u8 enc_input060[] __initconst = {
+ 0x25, 0x6d, 0x40, 0x08, 0x80, 0x94, 0x17, 0x03,
+ 0x55, 0xd3, 0x04, 0x04, 0x64, 0x43, 0xfe, 0x68,
+ 0xdf, 0x99, 0x47, 0x83, 0x03, 0xfb, 0x3b, 0xfb,
+ 0x80, 0xe0, 0x30, 0x3e, 0xeb, 0xd3, 0x29, 0x3e,
+ 0xe3, 0xbc, 0xdb, 0xdb, 0x1e, 0xde, 0xfc, 0x7e,
+ 0x8b, 0xcd, 0xa1, 0x36, 0xa1, 0x5c, 0x8c, 0xab,
+ 0x08, 0x69, 0xff, 0x52, 0xec, 0x5e, 0x26, 0x65,
+ 0x53, 0xb7, 0xb2, 0xa7, 0xfe, 0x87, 0xfd, 0x3d
+};
+static const u8 enc_output060[] __initconst = {
+ 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80,
+ 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80,
+ 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80,
+ 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80,
+ 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80,
+ 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80,
+ 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80,
+ 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80,
+ 0x36, 0xb1, 0x74, 0x38, 0x19, 0xe1, 0xb9, 0xba,
+ 0x15, 0x51, 0xe8, 0xed, 0x92, 0x2a, 0x95, 0x9a
+};
+static const u8 enc_assoc060[] __initconst = {
+ 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80,
+ 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80
+};
+static const u8 enc_nonce060[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x01, 0xee, 0x32, 0x00
+};
+static const u8 enc_key060[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - misc */
+static const u8 enc_input061[] __initconst = {
+ 0x25, 0x6d, 0x40, 0x08, 0x80, 0x94, 0x17, 0x03,
+ 0x55, 0xd3, 0x04, 0x04, 0x64, 0x43, 0xfe, 0x68,
+ 0xdf, 0x99, 0x47, 0x83, 0x03, 0xfb, 0x3b, 0xfb,
+ 0x80, 0xe0, 0x30, 0x3e, 0xeb, 0xd3, 0x29, 0x3e,
+ 0xe3, 0xbc, 0xdb, 0xdb, 0x1e, 0xde, 0xfc, 0x7e,
+ 0x8b, 0xcd, 0xa1, 0x36, 0xa1, 0x5c, 0x8c, 0xab,
+ 0x08, 0x69, 0xff, 0x52, 0xec, 0x5e, 0x26, 0x65,
+ 0x53, 0xb7, 0xb2, 0xa7, 0xfe, 0x87, 0xfd, 0x3d,
+ 0x7a, 0xda, 0x44, 0xc2, 0x42, 0x69, 0xbf, 0x7a,
+ 0x55, 0x27, 0xf2, 0xf0, 0xac, 0xf6, 0x85, 0x82,
+ 0xb7, 0x4c, 0x5a, 0x62, 0xe6, 0x0c, 0x05, 0x00,
+ 0x98, 0x1a, 0x49, 0xb8, 0x45, 0x93, 0x92, 0x44,
+ 0x9b, 0xb2, 0xf2, 0x04, 0xb6, 0x46, 0xef, 0x47,
+ 0xf3, 0xf0, 0xb1, 0xb6, 0x1d, 0xc3, 0x48, 0x6d,
+ 0x77, 0xd3, 0x0b, 0x45, 0x76, 0x92, 0xed, 0xb8,
+ 0xfb, 0xac, 0x01, 0x08, 0x38, 0x04, 0x88, 0x47
+};
+static const u8 enc_output061[] __initconst = {
+ 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80,
+ 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80,
+ 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80,
+ 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80,
+ 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80,
+ 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80,
+ 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80,
+ 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80,
+ 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80,
+ 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80,
+ 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80,
+ 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80,
+ 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80,
+ 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80,
+ 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80,
+ 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80,
+ 0xfe, 0xac, 0x49, 0x55, 0x55, 0x4e, 0x80, 0x6f,
+ 0x3a, 0x19, 0x02, 0xe2, 0x44, 0x32, 0xc0, 0x8a
+};
+static const u8 enc_assoc061[] __initconst = {
+ 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80,
+ 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80
+};
+static const u8 enc_nonce061[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x01, 0xee, 0x32, 0x00
+};
+static const u8 enc_key061[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - misc */
+static const u8 enc_input062[] __initconst = {
+ 0xda, 0x92, 0xbf, 0xf7, 0x7f, 0x6b, 0xe8, 0xfc,
+ 0xaa, 0x2c, 0xfb, 0xfb, 0x9b, 0xbc, 0x01, 0x97,
+ 0x20, 0x66, 0xb8, 0x7c, 0xfc, 0x04, 0xc4, 0x04,
+ 0x7f, 0x1f, 0xcf, 0xc1, 0x14, 0x2c, 0xd6, 0xc1
+};
+static const u8 enc_output062[] __initconst = {
+ 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f,
+ 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f,
+ 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f,
+ 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f,
+ 0x20, 0xa3, 0x79, 0x8d, 0xf1, 0x29, 0x2c, 0x59,
+ 0x72, 0xbf, 0x97, 0x41, 0xae, 0xc3, 0x8a, 0x19
+};
+static const u8 enc_assoc062[] __initconst = {
+ 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f,
+ 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f
+};
+static const u8 enc_nonce062[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x01, 0xee, 0x32, 0x00
+};
+static const u8 enc_key062[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - misc */
+static const u8 enc_input063[] __initconst = {
+ 0xda, 0x92, 0xbf, 0xf7, 0x7f, 0x6b, 0xe8, 0xfc,
+ 0xaa, 0x2c, 0xfb, 0xfb, 0x9b, 0xbc, 0x01, 0x97,
+ 0x20, 0x66, 0xb8, 0x7c, 0xfc, 0x04, 0xc4, 0x04,
+ 0x7f, 0x1f, 0xcf, 0xc1, 0x14, 0x2c, 0xd6, 0xc1,
+ 0x1c, 0x43, 0x24, 0x24, 0xe1, 0x21, 0x03, 0x81,
+ 0x74, 0x32, 0x5e, 0xc9, 0x5e, 0xa3, 0x73, 0x54,
+ 0xf7, 0x96, 0x00, 0xad, 0x13, 0xa1, 0xd9, 0x9a,
+ 0xac, 0x48, 0x4d, 0x58, 0x01, 0x78, 0x02, 0xc2
+};
+static const u8 enc_output063[] __initconst = {
+ 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f,
+ 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f,
+ 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f,
+ 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f,
+ 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f,
+ 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f,
+ 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f,
+ 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f,
+ 0xc0, 0x3d, 0x9f, 0x67, 0x35, 0x4a, 0x97, 0xb2,
+ 0xf0, 0x74, 0xf7, 0x55, 0x15, 0x57, 0xe4, 0x9c
+};
+static const u8 enc_assoc063[] __initconst = {
+ 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f,
+ 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f
+};
+static const u8 enc_nonce063[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x01, 0xee, 0x32, 0x00
+};
+static const u8 enc_key063[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - misc */
+static const u8 enc_input064[] __initconst = {
+ 0xda, 0x92, 0xbf, 0xf7, 0x7f, 0x6b, 0xe8, 0xfc,
+ 0xaa, 0x2c, 0xfb, 0xfb, 0x9b, 0xbc, 0x01, 0x97,
+ 0x20, 0x66, 0xb8, 0x7c, 0xfc, 0x04, 0xc4, 0x04,
+ 0x7f, 0x1f, 0xcf, 0xc1, 0x14, 0x2c, 0xd6, 0xc1,
+ 0x1c, 0x43, 0x24, 0x24, 0xe1, 0x21, 0x03, 0x81,
+ 0x74, 0x32, 0x5e, 0xc9, 0x5e, 0xa3, 0x73, 0x54,
+ 0xf7, 0x96, 0x00, 0xad, 0x13, 0xa1, 0xd9, 0x9a,
+ 0xac, 0x48, 0x4d, 0x58, 0x01, 0x78, 0x02, 0xc2,
+ 0x85, 0x25, 0xbb, 0x3d, 0xbd, 0x96, 0x40, 0x85,
+ 0xaa, 0xd8, 0x0d, 0x0f, 0x53, 0x09, 0x7a, 0x7d,
+ 0x48, 0xb3, 0xa5, 0x9d, 0x19, 0xf3, 0xfa, 0xff,
+ 0x67, 0xe5, 0xb6, 0x47, 0xba, 0x6c, 0x6d, 0xbb,
+ 0x64, 0x4d, 0x0d, 0xfb, 0x49, 0xb9, 0x10, 0xb8,
+ 0x0c, 0x0f, 0x4e, 0x49, 0xe2, 0x3c, 0xb7, 0x92,
+ 0x88, 0x2c, 0xf4, 0xba, 0x89, 0x6d, 0x12, 0x47,
+ 0x04, 0x53, 0xfe, 0xf7, 0xc7, 0xfb, 0x77, 0xb8
+};
+static const u8 enc_output064[] __initconst = {
+ 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f,
+ 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f,
+ 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f,
+ 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f,
+ 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f,
+ 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f,
+ 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f,
+ 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f,
+ 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f,
+ 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f,
+ 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f,
+ 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f,
+ 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f,
+ 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f,
+ 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f,
+ 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f,
+ 0xc8, 0x6d, 0xa8, 0xdd, 0x65, 0x22, 0x86, 0xd5,
+ 0x02, 0x13, 0xd3, 0x28, 0xd6, 0x3e, 0x40, 0x06
+};
+static const u8 enc_assoc064[] __initconst = {
+ 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f,
+ 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f
+};
+static const u8 enc_nonce064[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x01, 0xee, 0x32, 0x00
+};
+static const u8 enc_key064[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - misc */
+static const u8 enc_input065[] __initconst = {
+ 0x5a, 0x92, 0xbf, 0x77, 0xff, 0x6b, 0xe8, 0x7c,
+ 0x2a, 0x2c, 0xfb, 0x7b, 0x1b, 0xbc, 0x01, 0x17,
+ 0xa0, 0x66, 0xb8, 0xfc, 0x7c, 0x04, 0xc4, 0x84,
+ 0xff, 0x1f, 0xcf, 0x41, 0x94, 0x2c, 0xd6, 0x41
+};
+static const u8 enc_output065[] __initconst = {
+ 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff,
+ 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff,
+ 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff,
+ 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff,
+ 0xbe, 0xde, 0x90, 0x83, 0xce, 0xb3, 0x6d, 0xdf,
+ 0xe5, 0xfa, 0x81, 0x1f, 0x95, 0x47, 0x1c, 0x67
+};
+static const u8 enc_assoc065[] __initconst = {
+ 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff,
+ 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce065[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x01, 0xee, 0x32, 0x00
+};
+static const u8 enc_key065[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - misc */
+static const u8 enc_input066[] __initconst = {
+ 0x5a, 0x92, 0xbf, 0x77, 0xff, 0x6b, 0xe8, 0x7c,
+ 0x2a, 0x2c, 0xfb, 0x7b, 0x1b, 0xbc, 0x01, 0x17,
+ 0xa0, 0x66, 0xb8, 0xfc, 0x7c, 0x04, 0xc4, 0x84,
+ 0xff, 0x1f, 0xcf, 0x41, 0x94, 0x2c, 0xd6, 0x41,
+ 0x9c, 0x43, 0x24, 0xa4, 0x61, 0x21, 0x03, 0x01,
+ 0xf4, 0x32, 0x5e, 0x49, 0xde, 0xa3, 0x73, 0xd4,
+ 0x77, 0x96, 0x00, 0x2d, 0x93, 0xa1, 0xd9, 0x1a,
+ 0x2c, 0x48, 0x4d, 0xd8, 0x81, 0x78, 0x02, 0x42
+};
+static const u8 enc_output066[] __initconst = {
+ 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff,
+ 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff,
+ 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff,
+ 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff,
+ 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff,
+ 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff,
+ 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff,
+ 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff,
+ 0x30, 0x08, 0x74, 0xbb, 0x06, 0x92, 0xb6, 0x89,
+ 0xde, 0xad, 0x9a, 0xe1, 0x5b, 0x06, 0x73, 0x90
+};
+static const u8 enc_assoc066[] __initconst = {
+ 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff,
+ 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce066[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x01, 0xee, 0x32, 0x00
+};
+static const u8 enc_key066[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - misc */
+static const u8 enc_input067[] __initconst = {
+ 0x5a, 0x92, 0xbf, 0x77, 0xff, 0x6b, 0xe8, 0x7c,
+ 0x2a, 0x2c, 0xfb, 0x7b, 0x1b, 0xbc, 0x01, 0x17,
+ 0xa0, 0x66, 0xb8, 0xfc, 0x7c, 0x04, 0xc4, 0x84,
+ 0xff, 0x1f, 0xcf, 0x41, 0x94, 0x2c, 0xd6, 0x41,
+ 0x9c, 0x43, 0x24, 0xa4, 0x61, 0x21, 0x03, 0x01,
+ 0xf4, 0x32, 0x5e, 0x49, 0xde, 0xa3, 0x73, 0xd4,
+ 0x77, 0x96, 0x00, 0x2d, 0x93, 0xa1, 0xd9, 0x1a,
+ 0x2c, 0x48, 0x4d, 0xd8, 0x81, 0x78, 0x02, 0x42,
+ 0x05, 0x25, 0xbb, 0xbd, 0x3d, 0x96, 0x40, 0x05,
+ 0x2a, 0xd8, 0x0d, 0x8f, 0xd3, 0x09, 0x7a, 0xfd,
+ 0xc8, 0xb3, 0xa5, 0x1d, 0x99, 0xf3, 0xfa, 0x7f,
+ 0xe7, 0xe5, 0xb6, 0xc7, 0x3a, 0x6c, 0x6d, 0x3b,
+ 0xe4, 0x4d, 0x0d, 0x7b, 0xc9, 0xb9, 0x10, 0x38,
+ 0x8c, 0x0f, 0x4e, 0xc9, 0x62, 0x3c, 0xb7, 0x12,
+ 0x08, 0x2c, 0xf4, 0x3a, 0x09, 0x6d, 0x12, 0xc7,
+ 0x84, 0x53, 0xfe, 0x77, 0x47, 0xfb, 0x77, 0x38
+};
+static const u8 enc_output067[] __initconst = {
+ 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff,
+ 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff,
+ 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff,
+ 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff,
+ 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff,
+ 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff,
+ 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff,
+ 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff,
+ 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff,
+ 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff,
+ 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff,
+ 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff,
+ 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff,
+ 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff,
+ 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff,
+ 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff,
+ 0x99, 0xca, 0xd8, 0x5f, 0x45, 0xca, 0x40, 0x94,
+ 0x2d, 0x0d, 0x4d, 0x5e, 0x95, 0x0a, 0xde, 0x22
+};
+static const u8 enc_assoc067[] __initconst = {
+ 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff,
+ 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce067[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x01, 0xee, 0x32, 0x00
+};
+static const u8 enc_key067[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - misc */
+static const u8 enc_input068[] __initconst = {
+ 0x25, 0x6d, 0x40, 0x88, 0x7f, 0x6b, 0xe8, 0x7c,
+ 0x55, 0xd3, 0x04, 0x84, 0x9b, 0xbc, 0x01, 0x17,
+ 0xdf, 0x99, 0x47, 0x03, 0xfc, 0x04, 0xc4, 0x84,
+ 0x80, 0xe0, 0x30, 0xbe, 0x14, 0x2c, 0xd6, 0x41
+};
+static const u8 enc_output068[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff,
+ 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff,
+ 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff,
+ 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff,
+ 0x8b, 0xbe, 0x14, 0x52, 0x72, 0xe7, 0xc2, 0xd9,
+ 0xa1, 0x89, 0x1a, 0x3a, 0xb0, 0x98, 0x3d, 0x9d
+};
+static const u8 enc_assoc068[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff,
+ 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce068[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x01, 0xee, 0x32, 0x00
+};
+static const u8 enc_key068[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - misc */
+static const u8 enc_input069[] __initconst = {
+ 0x25, 0x6d, 0x40, 0x88, 0x7f, 0x6b, 0xe8, 0x7c,
+ 0x55, 0xd3, 0x04, 0x84, 0x9b, 0xbc, 0x01, 0x17,
+ 0xdf, 0x99, 0x47, 0x03, 0xfc, 0x04, 0xc4, 0x84,
+ 0x80, 0xe0, 0x30, 0xbe, 0x14, 0x2c, 0xd6, 0x41,
+ 0xe3, 0xbc, 0xdb, 0x5b, 0xe1, 0x21, 0x03, 0x01,
+ 0x8b, 0xcd, 0xa1, 0xb6, 0x5e, 0xa3, 0x73, 0xd4,
+ 0x08, 0x69, 0xff, 0xd2, 0x13, 0xa1, 0xd9, 0x1a,
+ 0x53, 0xb7, 0xb2, 0x27, 0x01, 0x78, 0x02, 0x42
+};
+static const u8 enc_output069[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff,
+ 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff,
+ 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff,
+ 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff,
+ 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff,
+ 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff,
+ 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff,
+ 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff,
+ 0x3b, 0x41, 0x86, 0x19, 0x13, 0xa8, 0xf6, 0xde,
+ 0x7f, 0x61, 0xe2, 0x25, 0x63, 0x1b, 0xc3, 0x82
+};
+static const u8 enc_assoc069[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff,
+ 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce069[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x01, 0xee, 0x32, 0x00
+};
+static const u8 enc_key069[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - misc */
+static const u8 enc_input070[] __initconst = {
+ 0x25, 0x6d, 0x40, 0x88, 0x7f, 0x6b, 0xe8, 0x7c,
+ 0x55, 0xd3, 0x04, 0x84, 0x9b, 0xbc, 0x01, 0x17,
+ 0xdf, 0x99, 0x47, 0x03, 0xfc, 0x04, 0xc4, 0x84,
+ 0x80, 0xe0, 0x30, 0xbe, 0x14, 0x2c, 0xd6, 0x41,
+ 0xe3, 0xbc, 0xdb, 0x5b, 0xe1, 0x21, 0x03, 0x01,
+ 0x8b, 0xcd, 0xa1, 0xb6, 0x5e, 0xa3, 0x73, 0xd4,
+ 0x08, 0x69, 0xff, 0xd2, 0x13, 0xa1, 0xd9, 0x1a,
+ 0x53, 0xb7, 0xb2, 0x27, 0x01, 0x78, 0x02, 0x42,
+ 0x7a, 0xda, 0x44, 0x42, 0xbd, 0x96, 0x40, 0x05,
+ 0x55, 0x27, 0xf2, 0x70, 0x53, 0x09, 0x7a, 0xfd,
+ 0xb7, 0x4c, 0x5a, 0xe2, 0x19, 0xf3, 0xfa, 0x7f,
+ 0x98, 0x1a, 0x49, 0x38, 0xba, 0x6c, 0x6d, 0x3b,
+ 0x9b, 0xb2, 0xf2, 0x84, 0x49, 0xb9, 0x10, 0x38,
+ 0xf3, 0xf0, 0xb1, 0x36, 0xe2, 0x3c, 0xb7, 0x12,
+ 0x77, 0xd3, 0x0b, 0xc5, 0x89, 0x6d, 0x12, 0xc7,
+ 0xfb, 0xac, 0x01, 0x88, 0xc7, 0xfb, 0x77, 0x38
+};
+static const u8 enc_output070[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff,
+ 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff,
+ 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff,
+ 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff,
+ 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff,
+ 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff,
+ 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff,
+ 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff,
+ 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff,
+ 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff,
+ 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff,
+ 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff,
+ 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff,
+ 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff,
+ 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff,
+ 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff,
+ 0x84, 0x28, 0xbc, 0xf0, 0x23, 0xec, 0x6b, 0xf3,
+ 0x1f, 0xd9, 0xef, 0xb2, 0x03, 0xff, 0x08, 0x71
+};
+static const u8 enc_assoc070[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff,
+ 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce070[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x01, 0xee, 0x32, 0x00
+};
+static const u8 enc_key070[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - misc */
+static const u8 enc_input071[] __initconst = {
+ 0xda, 0x92, 0xbf, 0x77, 0x80, 0x94, 0x17, 0x83,
+ 0xaa, 0x2c, 0xfb, 0x7b, 0x64, 0x43, 0xfe, 0xe8,
+ 0x20, 0x66, 0xb8, 0xfc, 0x03, 0xfb, 0x3b, 0x7b,
+ 0x7f, 0x1f, 0xcf, 0x41, 0xeb, 0xd3, 0x29, 0xbe
+};
+static const u8 enc_output071[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
+ 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
+ 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
+ 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
+ 0x13, 0x9f, 0xdf, 0x64, 0x74, 0xea, 0x24, 0xf5,
+ 0x49, 0xb0, 0x75, 0x82, 0x5f, 0x2c, 0x76, 0x20
+};
+static const u8 enc_assoc071[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
+ 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00
+};
+static const u8 enc_nonce071[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x01, 0xee, 0x32, 0x00
+};
+static const u8 enc_key071[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - misc */
+static const u8 enc_input072[] __initconst = {
+ 0xda, 0x92, 0xbf, 0x77, 0x80, 0x94, 0x17, 0x83,
+ 0xaa, 0x2c, 0xfb, 0x7b, 0x64, 0x43, 0xfe, 0xe8,
+ 0x20, 0x66, 0xb8, 0xfc, 0x03, 0xfb, 0x3b, 0x7b,
+ 0x7f, 0x1f, 0xcf, 0x41, 0xeb, 0xd3, 0x29, 0xbe,
+ 0x1c, 0x43, 0x24, 0xa4, 0x1e, 0xde, 0xfc, 0xfe,
+ 0x74, 0x32, 0x5e, 0x49, 0xa1, 0x5c, 0x8c, 0x2b,
+ 0xf7, 0x96, 0x00, 0x2d, 0xec, 0x5e, 0x26, 0xe5,
+ 0xac, 0x48, 0x4d, 0xd8, 0xfe, 0x87, 0xfd, 0xbd
+};
+static const u8 enc_output072[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
+ 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
+ 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
+ 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
+ 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
+ 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
+ 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
+ 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
+ 0xbb, 0xad, 0x8d, 0x86, 0x3b, 0x83, 0x5a, 0x8e,
+ 0x86, 0x64, 0xfd, 0x1d, 0x45, 0x66, 0xb6, 0xb4
+};
+static const u8 enc_assoc072[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
+ 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00
+};
+static const u8 enc_nonce072[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x01, 0xee, 0x32, 0x00
+};
+static const u8 enc_key072[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - misc */
+static const u8 enc_input073[] __initconst = {
+ 0xda, 0x92, 0xbf, 0x77, 0x80, 0x94, 0x17, 0x83,
+ 0xaa, 0x2c, 0xfb, 0x7b, 0x64, 0x43, 0xfe, 0xe8,
+ 0x20, 0x66, 0xb8, 0xfc, 0x03, 0xfb, 0x3b, 0x7b,
+ 0x7f, 0x1f, 0xcf, 0x41, 0xeb, 0xd3, 0x29, 0xbe,
+ 0x1c, 0x43, 0x24, 0xa4, 0x1e, 0xde, 0xfc, 0xfe,
+ 0x74, 0x32, 0x5e, 0x49, 0xa1, 0x5c, 0x8c, 0x2b,
+ 0xf7, 0x96, 0x00, 0x2d, 0xec, 0x5e, 0x26, 0xe5,
+ 0xac, 0x48, 0x4d, 0xd8, 0xfe, 0x87, 0xfd, 0xbd,
+ 0x85, 0x25, 0xbb, 0xbd, 0x42, 0x69, 0xbf, 0xfa,
+ 0xaa, 0xd8, 0x0d, 0x8f, 0xac, 0xf6, 0x85, 0x02,
+ 0x48, 0xb3, 0xa5, 0x1d, 0xe6, 0x0c, 0x05, 0x80,
+ 0x67, 0xe5, 0xb6, 0xc7, 0x45, 0x93, 0x92, 0xc4,
+ 0x64, 0x4d, 0x0d, 0x7b, 0xb6, 0x46, 0xef, 0xc7,
+ 0x0c, 0x0f, 0x4e, 0xc9, 0x1d, 0xc3, 0x48, 0xed,
+ 0x88, 0x2c, 0xf4, 0x3a, 0x76, 0x92, 0xed, 0x38,
+ 0x04, 0x53, 0xfe, 0x77, 0x38, 0x04, 0x88, 0xc7
+};
+static const u8 enc_output073[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
+ 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
+ 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
+ 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
+ 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
+ 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
+ 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
+ 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
+ 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
+ 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
+ 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
+ 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
+ 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
+ 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
+ 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
+ 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
+ 0x42, 0xf2, 0x35, 0x42, 0x97, 0x84, 0x9a, 0x51,
+ 0x1d, 0x53, 0xe5, 0x57, 0x17, 0x72, 0xf7, 0x1f
+};
+static const u8 enc_assoc073[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
+ 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00
+};
+static const u8 enc_nonce073[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x01, 0xee, 0x32, 0x00
+};
+static const u8 enc_key073[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - checking for int overflows */
+static const u8 enc_input074[] __initconst = {
+ 0xd4, 0x50, 0x0b, 0xf0, 0x09, 0x49, 0x35, 0x51,
+ 0xc3, 0x80, 0xad, 0xf5, 0x2c, 0x57, 0x3a, 0x69,
+ 0xdf, 0x7e, 0x8b, 0x76, 0x24, 0x63, 0x33, 0x0f,
+ 0xac, 0xc1, 0x6a, 0x57, 0x26, 0xbe, 0x71, 0x90,
+ 0xc6, 0x3c, 0x5a, 0x1c, 0x92, 0x65, 0x84, 0xa0,
+ 0x96, 0x75, 0x68, 0x28, 0xdc, 0xdc, 0x64, 0xac,
+ 0xdf, 0x96, 0x3d, 0x93, 0x1b, 0xf1, 0xda, 0xe2,
+ 0x38, 0xf3, 0xf1, 0x57, 0x22, 0x4a, 0xc4, 0xb5,
+ 0x42, 0xd7, 0x85, 0xb0, 0xdd, 0x84, 0xdb, 0x6b,
+ 0xe3, 0xbc, 0x5a, 0x36, 0x63, 0xe8, 0x41, 0x49,
+ 0xff, 0xbe, 0xd0, 0x9e, 0x54, 0xf7, 0x8f, 0x16,
+ 0xa8, 0x22, 0x3b, 0x24, 0xcb, 0x01, 0x9f, 0x58,
+ 0xb2, 0x1b, 0x0e, 0x55, 0x1e, 0x7a, 0xa0, 0x73,
+ 0x27, 0x62, 0x95, 0x51, 0x37, 0x6c, 0xcb, 0xc3,
+ 0x93, 0x76, 0x71, 0xa0, 0x62, 0x9b, 0xd9, 0x5c,
+ 0x99, 0x15, 0xc7, 0x85, 0x55, 0x77, 0x1e, 0x7a
+};
+static const u8 enc_output074[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x0b, 0x30, 0x0d, 0x8d, 0xa5, 0x6c, 0x21, 0x85,
+ 0x75, 0x52, 0x79, 0x55, 0x3c, 0x4c, 0x82, 0xca
+};
+static const u8 enc_assoc074[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce074[] __initconst = {
+ 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30,
+ 0x00, 0x02, 0x50, 0x6e
+};
+static const u8 enc_key074[] __initconst = {
+ 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30,
+ 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30,
+ 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30,
+ 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30
+};
+
+/* wycheproof - checking for int overflows */
+static const u8 enc_input075[] __initconst = {
+ 0x7d, 0xe8, 0x7f, 0x67, 0x29, 0x94, 0x52, 0x75,
+ 0xd0, 0x65, 0x5d, 0xa4, 0xc7, 0xfd, 0xe4, 0x56,
+ 0x9e, 0x16, 0xf1, 0x11, 0xb5, 0xeb, 0x26, 0xc2,
+ 0x2d, 0x85, 0x9e, 0x3f, 0xf8, 0x22, 0xec, 0xed,
+ 0x3a, 0x6d, 0xd9, 0xa6, 0x0f, 0x22, 0x95, 0x7f,
+ 0x7b, 0x7c, 0x85, 0x7e, 0x88, 0x22, 0xeb, 0x9f,
+ 0xe0, 0xb8, 0xd7, 0x02, 0x21, 0x41, 0xf2, 0xd0,
+ 0xb4, 0x8f, 0x4b, 0x56, 0x12, 0xd3, 0x22, 0xa8,
+ 0x8d, 0xd0, 0xfe, 0x0b, 0x4d, 0x91, 0x79, 0x32,
+ 0x4f, 0x7c, 0x6c, 0x9e, 0x99, 0x0e, 0xfb, 0xd8,
+ 0x0e, 0x5e, 0xd6, 0x77, 0x58, 0x26, 0x49, 0x8b,
+ 0x1e, 0xfe, 0x0f, 0x71, 0xa0, 0xf3, 0xec, 0x5b,
+ 0x29, 0xcb, 0x28, 0xc2, 0x54, 0x0a, 0x7d, 0xcd,
+ 0x51, 0xb7, 0xda, 0xae, 0xe0, 0xff, 0x4a, 0x7f,
+ 0x3a, 0xc1, 0xee, 0x54, 0xc2, 0x9e, 0xe4, 0xc1,
+ 0x70, 0xde, 0x40, 0x8f, 0x66, 0x69, 0x21, 0x94
+};
+static const u8 enc_output075[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xc5, 0x78, 0xe2, 0xaa, 0x44, 0xd3, 0x09, 0xb7,
+ 0xb6, 0xa5, 0x19, 0x3b, 0xdc, 0x61, 0x18, 0xf5
+};
+static const u8 enc_assoc075[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce075[] __initconst = {
+ 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30,
+ 0x00, 0x03, 0x18, 0xa5
+};
+static const u8 enc_key075[] __initconst = {
+ 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30,
+ 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30,
+ 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30,
+ 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30
+};
+
+/* wycheproof - checking for int overflows */
+static const u8 enc_input076[] __initconst = {
+ 0x1b, 0x99, 0x6f, 0x9a, 0x3c, 0xcc, 0x67, 0x85,
+ 0xde, 0x22, 0xff, 0x5b, 0x8a, 0xdd, 0x95, 0x02,
+ 0xce, 0x03, 0xa0, 0xfa, 0xf5, 0x99, 0x2a, 0x09,
+ 0x52, 0x2c, 0xdd, 0x12, 0x06, 0xd2, 0x20, 0xb8,
+ 0xf8, 0xbd, 0x07, 0xd1, 0xf1, 0xf5, 0xa1, 0xbd,
+ 0x9a, 0x71, 0xd1, 0x1c, 0x7f, 0x57, 0x9b, 0x85,
+ 0x58, 0x18, 0xc0, 0x8d, 0x4d, 0xe0, 0x36, 0x39,
+ 0x31, 0x83, 0xb7, 0xf5, 0x90, 0xb3, 0x35, 0xae,
+ 0xd8, 0xde, 0x5b, 0x57, 0xb1, 0x3c, 0x5f, 0xed,
+ 0xe2, 0x44, 0x1c, 0x3e, 0x18, 0x4a, 0xa9, 0xd4,
+ 0x6e, 0x61, 0x59, 0x85, 0x06, 0xb3, 0xe1, 0x1c,
+ 0x43, 0xc6, 0x2c, 0xbc, 0xac, 0xec, 0xed, 0x33,
+ 0x19, 0x08, 0x75, 0xb0, 0x12, 0x21, 0x8b, 0x19,
+ 0x30, 0xfb, 0x7c, 0x38, 0xec, 0x45, 0xac, 0x11,
+ 0xc3, 0x53, 0xd0, 0xcf, 0x93, 0x8d, 0xcc, 0xb9,
+ 0xef, 0xad, 0x8f, 0xed, 0xbe, 0x46, 0xda, 0xa5
+};
+static const u8 enc_output076[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x4b, 0x0b, 0xda, 0x8a, 0xd0, 0x43, 0x83, 0x0d,
+ 0x83, 0x19, 0xab, 0x82, 0xc5, 0x0c, 0x76, 0x63
+};
+static const u8 enc_assoc076[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce076[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x07, 0xb4, 0xf0
+};
+static const u8 enc_key076[] __initconst = {
+ 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30,
+ 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30,
+ 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30,
+ 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30
+};
+
+/* wycheproof - checking for int overflows */
+static const u8 enc_input077[] __initconst = {
+ 0x86, 0xcb, 0xac, 0xae, 0x4d, 0x3f, 0x74, 0xae,
+ 0x01, 0x21, 0x3e, 0x05, 0x51, 0xcc, 0x15, 0x16,
+ 0x0e, 0xa1, 0xbe, 0x84, 0x08, 0xe3, 0xd5, 0xd7,
+ 0x4f, 0x01, 0x46, 0x49, 0x95, 0xa6, 0x9e, 0x61,
+ 0x76, 0xcb, 0x9e, 0x02, 0xb2, 0x24, 0x7e, 0xd2,
+ 0x99, 0x89, 0x2f, 0x91, 0x82, 0xa4, 0x5c, 0xaf,
+ 0x4c, 0x69, 0x40, 0x56, 0x11, 0x76, 0x6e, 0xdf,
+ 0xaf, 0xdc, 0x28, 0x55, 0x19, 0xea, 0x30, 0x48,
+ 0x0c, 0x44, 0xf0, 0x5e, 0x78, 0x1e, 0xac, 0xf8,
+ 0xfc, 0xec, 0xc7, 0x09, 0x0a, 0xbb, 0x28, 0xfa,
+ 0x5f, 0xd5, 0x85, 0xac, 0x8c, 0xda, 0x7e, 0x87,
+ 0x72, 0xe5, 0x94, 0xe4, 0xce, 0x6c, 0x88, 0x32,
+ 0x81, 0x93, 0x2e, 0x0f, 0x89, 0xf8, 0x77, 0xa1,
+ 0xf0, 0x4d, 0x9c, 0x32, 0xb0, 0x6c, 0xf9, 0x0b,
+ 0x0e, 0x76, 0x2b, 0x43, 0x0c, 0x4d, 0x51, 0x7c,
+ 0x97, 0x10, 0x70, 0x68, 0xf4, 0x98, 0xef, 0x7f
+};
+static const u8 enc_output077[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x4b, 0xc9, 0x8f, 0x72, 0xc4, 0x94, 0xc2, 0xa4,
+ 0x3c, 0x2b, 0x15, 0xa1, 0x04, 0x3f, 0x1c, 0xfa
+};
+static const u8 enc_assoc077[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce077[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x20, 0xfb, 0x66
+};
+static const u8 enc_key077[] __initconst = {
+ 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30,
+ 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30,
+ 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30,
+ 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30
+};
+
+/* wycheproof - checking for int overflows */
+static const u8 enc_input078[] __initconst = {
+ 0xfa, 0xb1, 0xcd, 0xdf, 0x4f, 0xe1, 0x98, 0xef,
+ 0x63, 0xad, 0xd8, 0x81, 0xd6, 0xea, 0xd6, 0xc5,
+ 0x76, 0x37, 0xbb, 0xe9, 0x20, 0x18, 0xca, 0x7c,
+ 0x0b, 0x96, 0xfb, 0xa0, 0x87, 0x1e, 0x93, 0x2d,
+ 0xb1, 0xfb, 0xf9, 0x07, 0x61, 0xbe, 0x25, 0xdf,
+ 0x8d, 0xfa, 0xf9, 0x31, 0xce, 0x57, 0x57, 0xe6,
+ 0x17, 0xb3, 0xd7, 0xa9, 0xf0, 0xbf, 0x0f, 0xfe,
+ 0x5d, 0x59, 0x1a, 0x33, 0xc1, 0x43, 0xb8, 0xf5,
+ 0x3f, 0xd0, 0xb5, 0xa1, 0x96, 0x09, 0xfd, 0x62,
+ 0xe5, 0xc2, 0x51, 0xa4, 0x28, 0x1a, 0x20, 0x0c,
+ 0xfd, 0xc3, 0x4f, 0x28, 0x17, 0x10, 0x40, 0x6f,
+ 0x4e, 0x37, 0x62, 0x54, 0x46, 0xff, 0x6e, 0xf2,
+ 0x24, 0x91, 0x3d, 0xeb, 0x0d, 0x89, 0xaf, 0x33,
+ 0x71, 0x28, 0xe3, 0xd1, 0x55, 0xd1, 0x6d, 0x3e,
+ 0xc3, 0x24, 0x60, 0x41, 0x43, 0x21, 0x43, 0xe9,
+ 0xab, 0x3a, 0x6d, 0x2c, 0xcc, 0x2f, 0x4d, 0x62
+};
+static const u8 enc_output078[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xf7, 0xe9, 0xe1, 0x51, 0xb0, 0x25, 0x33, 0xc7,
+ 0x46, 0x58, 0xbf, 0xc7, 0x73, 0x7c, 0x68, 0x0d
+};
+static const u8 enc_assoc078[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce078[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x38, 0xbb, 0x90
+};
+static const u8 enc_key078[] __initconst = {
+ 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30,
+ 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30,
+ 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30,
+ 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30
+};
+
+/* wycheproof - checking for int overflows */
+static const u8 enc_input079[] __initconst = {
+ 0x22, 0x72, 0x02, 0xbe, 0x7f, 0x35, 0x15, 0xe9,
+ 0xd1, 0xc0, 0x2e, 0xea, 0x2f, 0x19, 0x50, 0xb6,
+ 0x48, 0x1b, 0x04, 0x8a, 0x4c, 0x91, 0x50, 0x6c,
+ 0xb4, 0x0d, 0x50, 0x4e, 0x6c, 0x94, 0x9f, 0x82,
+ 0xd1, 0x97, 0xc2, 0x5a, 0xd1, 0x7d, 0xc7, 0x21,
+ 0x65, 0x11, 0x25, 0x78, 0x2a, 0xc7, 0xa7, 0x12,
+ 0x47, 0xfe, 0xae, 0xf3, 0x2f, 0x1f, 0x25, 0x0c,
+ 0xe4, 0xbb, 0x8f, 0x79, 0xac, 0xaa, 0x17, 0x9d,
+ 0x45, 0xa7, 0xb0, 0x54, 0x5f, 0x09, 0x24, 0x32,
+ 0x5e, 0xfa, 0x87, 0xd5, 0xe4, 0x41, 0xd2, 0x84,
+ 0x78, 0xc6, 0x1f, 0x22, 0x23, 0xee, 0x67, 0xc3,
+ 0xb4, 0x1f, 0x43, 0x94, 0x53, 0x5e, 0x2a, 0x24,
+ 0x36, 0x9a, 0x2e, 0x16, 0x61, 0x3c, 0x45, 0x94,
+ 0x90, 0xc1, 0x4f, 0xb1, 0xd7, 0x55, 0xfe, 0x53,
+ 0xfb, 0xe1, 0xee, 0x45, 0xb1, 0xb2, 0x1f, 0x71,
+ 0x62, 0xe2, 0xfc, 0xaa, 0x74, 0x2a, 0xbe, 0xfd
+};
+static const u8 enc_output079[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x79, 0x5b, 0xcf, 0xf6, 0x47, 0xc5, 0x53, 0xc2,
+ 0xe4, 0xeb, 0x6e, 0x0e, 0xaf, 0xd9, 0xe0, 0x4e
+};
+static const u8 enc_assoc079[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce079[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x70, 0x48, 0x4a
+};
+static const u8 enc_key079[] __initconst = {
+ 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30,
+ 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30,
+ 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30,
+ 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30
+};
+
+/* wycheproof - checking for int overflows */
+static const u8 enc_input080[] __initconst = {
+ 0xfa, 0xe5, 0x83, 0x45, 0xc1, 0x6c, 0xb0, 0xf5,
+ 0xcc, 0x53, 0x7f, 0x2b, 0x1b, 0x34, 0x69, 0xc9,
+ 0x69, 0x46, 0x3b, 0x3e, 0xa7, 0x1b, 0xcf, 0x6b,
+ 0x98, 0xd6, 0x69, 0xa8, 0xe6, 0x0e, 0x04, 0xfc,
+ 0x08, 0xd5, 0xfd, 0x06, 0x9c, 0x36, 0x26, 0x38,
+ 0xe3, 0x40, 0x0e, 0xf4, 0xcb, 0x24, 0x2e, 0x27,
+ 0xe2, 0x24, 0x5e, 0x68, 0xcb, 0x9e, 0xc5, 0x83,
+ 0xda, 0x53, 0x40, 0xb1, 0x2e, 0xdf, 0x42, 0x3b,
+ 0x73, 0x26, 0xad, 0x20, 0xfe, 0xeb, 0x57, 0xda,
+ 0xca, 0x2e, 0x04, 0x67, 0xa3, 0x28, 0x99, 0xb4,
+ 0x2d, 0xf8, 0xe5, 0x6d, 0x84, 0xe0, 0x06, 0xbc,
+ 0x8a, 0x7a, 0xcc, 0x73, 0x1e, 0x7c, 0x1f, 0x6b,
+ 0xec, 0xb5, 0x71, 0x9f, 0x70, 0x77, 0xf0, 0xd4,
+ 0xf4, 0xc6, 0x1a, 0xb1, 0x1e, 0xba, 0xc1, 0x00,
+ 0x18, 0x01, 0xce, 0x33, 0xc4, 0xe4, 0xa7, 0x7d,
+ 0x83, 0x1d, 0x3c, 0xe3, 0x4e, 0x84, 0x10, 0xe1
+};
+static const u8 enc_output080[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x19, 0x46, 0xd6, 0x53, 0x96, 0x0f, 0x94, 0x7a,
+ 0x74, 0xd3, 0xe8, 0x09, 0x3c, 0xf4, 0x85, 0x02
+};
+static const u8 enc_assoc080[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce080[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x93, 0x2f, 0x40
+};
+static const u8 enc_key080[] __initconst = {
+ 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30,
+ 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30,
+ 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30,
+ 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30
+};
+
+/* wycheproof - checking for int overflows */
+static const u8 enc_input081[] __initconst = {
+ 0xeb, 0xb2, 0x16, 0xdd, 0xd7, 0xca, 0x70, 0x92,
+ 0x15, 0xf5, 0x03, 0xdf, 0x9c, 0xe6, 0x3c, 0x5c,
+ 0xd2, 0x19, 0x4e, 0x7d, 0x90, 0x99, 0xe8, 0xa9,
+ 0x0b, 0x2a, 0xfa, 0xad, 0x5e, 0xba, 0x35, 0x06,
+ 0x99, 0x25, 0xa6, 0x03, 0xfd, 0xbc, 0x34, 0x1a,
+ 0xae, 0xd4, 0x15, 0x05, 0xb1, 0x09, 0x41, 0xfa,
+ 0x38, 0x56, 0xa7, 0xe2, 0x47, 0xb1, 0x04, 0x07,
+ 0x09, 0x74, 0x6c, 0xfc, 0x20, 0x96, 0xca, 0xa6,
+ 0x31, 0xb2, 0xff, 0xf4, 0x1c, 0x25, 0x05, 0x06,
+ 0xd8, 0x89, 0xc1, 0xc9, 0x06, 0x71, 0xad, 0xe8,
+ 0x53, 0xee, 0x63, 0x94, 0xc1, 0x91, 0x92, 0xa5,
+ 0xcf, 0x37, 0x10, 0xd1, 0x07, 0x30, 0x99, 0xe5,
+ 0xbc, 0x94, 0x65, 0x82, 0xfc, 0x0f, 0xab, 0x9f,
+ 0x54, 0x3c, 0x71, 0x6a, 0xe2, 0x48, 0x6a, 0x86,
+ 0x83, 0xfd, 0xca, 0x39, 0xd2, 0xe1, 0x4f, 0x23,
+ 0xd0, 0x0a, 0x58, 0x26, 0x64, 0xf4, 0xec, 0xb1
+};
+static const u8 enc_output081[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x36, 0xc3, 0x00, 0x29, 0x85, 0xdd, 0x21, 0xba,
+ 0xf8, 0x95, 0xd6, 0x33, 0x57, 0x3f, 0x12, 0xc0
+};
+static const u8 enc_assoc081[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce081[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0xe2, 0x93, 0x35
+};
+static const u8 enc_key081[] __initconst = {
+ 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30,
+ 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30,
+ 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30,
+ 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30
+};
+
+/* wycheproof - checking for int overflows */
+static const u8 enc_input082[] __initconst = {
+ 0x40, 0x8a, 0xe6, 0xef, 0x1c, 0x7e, 0xf0, 0xfb,
+ 0x2c, 0x2d, 0x61, 0x08, 0x16, 0xfc, 0x78, 0x49,
+ 0xef, 0xa5, 0x8f, 0x78, 0x27, 0x3f, 0x5f, 0x16,
+ 0x6e, 0xa6, 0x5f, 0x81, 0xb5, 0x75, 0x74, 0x7d,
+ 0x03, 0x5b, 0x30, 0x40, 0xfe, 0xde, 0x1e, 0xb9,
+ 0x45, 0x97, 0x88, 0x66, 0x97, 0x88, 0x40, 0x8e,
+ 0x00, 0x41, 0x3b, 0x3e, 0x37, 0x6d, 0x15, 0x2d,
+ 0x20, 0x4a, 0xa2, 0xb7, 0xa8, 0x35, 0x58, 0xfc,
+ 0xd4, 0x8a, 0x0e, 0xf7, 0xa2, 0x6b, 0x1c, 0xd6,
+ 0xd3, 0x5d, 0x23, 0xb3, 0xf5, 0xdf, 0xe0, 0xca,
+ 0x77, 0xa4, 0xce, 0x32, 0xb9, 0x4a, 0xbf, 0x83,
+ 0xda, 0x2a, 0xef, 0xca, 0xf0, 0x68, 0x38, 0x08,
+ 0x79, 0xe8, 0x9f, 0xb0, 0xa3, 0x82, 0x95, 0x95,
+ 0xcf, 0x44, 0xc3, 0x85, 0x2a, 0xe2, 0xcc, 0x66,
+ 0x2b, 0x68, 0x9f, 0x93, 0x55, 0xd9, 0xc1, 0x83,
+ 0x80, 0x1f, 0x6a, 0xcc, 0x31, 0x3f, 0x89, 0x07
+};
+static const u8 enc_output082[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x65, 0x14, 0x51, 0x8e, 0x0a, 0x26, 0x41, 0x42,
+ 0xe0, 0xb7, 0x35, 0x1f, 0x96, 0x7f, 0xc2, 0xae
+};
+static const u8 enc_assoc082[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce082[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x0e, 0xf7, 0xd5
+};
+static const u8 enc_key082[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - checking for int overflows */
+static const u8 enc_input083[] __initconst = {
+ 0x0a, 0x0a, 0x24, 0x49, 0x9b, 0xca, 0xde, 0x58,
+ 0xcf, 0x15, 0x76, 0xc3, 0x12, 0xac, 0xa9, 0x84,
+ 0x71, 0x8c, 0xb4, 0xcc, 0x7e, 0x01, 0x53, 0xf5,
+ 0xa9, 0x01, 0x58, 0x10, 0x85, 0x96, 0x44, 0xdf,
+ 0xc0, 0x21, 0x17, 0x4e, 0x0b, 0x06, 0x0a, 0x39,
+ 0x74, 0x48, 0xde, 0x8b, 0x48, 0x4a, 0x86, 0x03,
+ 0xbe, 0x68, 0x0a, 0x69, 0x34, 0xc0, 0x90, 0x6f,
+ 0x30, 0xdd, 0x17, 0xea, 0xe2, 0xd4, 0xc5, 0xfa,
+ 0xa7, 0x77, 0xf8, 0xca, 0x53, 0x37, 0x0e, 0x08,
+ 0x33, 0x1b, 0x88, 0xc3, 0x42, 0xba, 0xc9, 0x59,
+ 0x78, 0x7b, 0xbb, 0x33, 0x93, 0x0e, 0x3b, 0x56,
+ 0xbe, 0x86, 0xda, 0x7f, 0x2a, 0x6e, 0xb1, 0xf9,
+ 0x40, 0x89, 0xd1, 0xd1, 0x81, 0x07, 0x4d, 0x43,
+ 0x02, 0xf8, 0xe0, 0x55, 0x2d, 0x0d, 0xe1, 0xfa,
+ 0xb3, 0x06, 0xa2, 0x1b, 0x42, 0xd4, 0xc3, 0xba,
+ 0x6e, 0x6f, 0x0c, 0xbc, 0xc8, 0x1e, 0x87, 0x7a
+};
+static const u8 enc_output083[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x4c, 0x19, 0x4d, 0xa6, 0xa9, 0x9f, 0xd6, 0x5b,
+ 0x40, 0xe9, 0xca, 0xd7, 0x98, 0xf4, 0x4b, 0x19
+};
+static const u8 enc_assoc083[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce083[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x3d, 0xfc, 0xe4
+};
+static const u8 enc_key083[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - checking for int overflows */
+static const u8 enc_input084[] __initconst = {
+ 0x4a, 0x0a, 0xaf, 0xf8, 0x49, 0x47, 0x29, 0x18,
+ 0x86, 0x91, 0x70, 0x13, 0x40, 0xf3, 0xce, 0x2b,
+ 0x8a, 0x78, 0xee, 0xd3, 0xa0, 0xf0, 0x65, 0x99,
+ 0x4b, 0x72, 0x48, 0x4e, 0x79, 0x91, 0xd2, 0x5c,
+ 0x29, 0xaa, 0x07, 0x5e, 0xb1, 0xfc, 0x16, 0xde,
+ 0x93, 0xfe, 0x06, 0x90, 0x58, 0x11, 0x2a, 0xb2,
+ 0x84, 0xa3, 0xed, 0x18, 0x78, 0x03, 0x26, 0xd1,
+ 0x25, 0x8a, 0x47, 0x22, 0x2f, 0xa6, 0x33, 0xd8,
+ 0xb2, 0x9f, 0x3b, 0xd9, 0x15, 0x0b, 0x23, 0x9b,
+ 0x15, 0x46, 0xc2, 0xbb, 0x9b, 0x9f, 0x41, 0x0f,
+ 0xeb, 0xea, 0xd3, 0x96, 0x00, 0x0e, 0xe4, 0x77,
+ 0x70, 0x15, 0x32, 0xc3, 0xd0, 0xf5, 0xfb, 0xf8,
+ 0x95, 0xd2, 0x80, 0x19, 0x6d, 0x2f, 0x73, 0x7c,
+ 0x5e, 0x9f, 0xec, 0x50, 0xd9, 0x2b, 0xb0, 0xdf,
+ 0x5d, 0x7e, 0x51, 0x3b, 0xe5, 0xb8, 0xea, 0x97,
+ 0x13, 0x10, 0xd5, 0xbf, 0x16, 0xba, 0x7a, 0xee
+};
+static const u8 enc_output084[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xc8, 0xae, 0x77, 0x88, 0xcd, 0x28, 0x74, 0xab,
+ 0xc1, 0x38, 0x54, 0x1e, 0x11, 0xfd, 0x05, 0x87
+};
+static const u8 enc_assoc084[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce084[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x01, 0x84, 0x86, 0xa8
+};
+static const u8 enc_key084[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - checking for int overflows */
+static const u8 enc_input085[] __initconst = {
+ 0xff, 0x94, 0x28, 0xd0, 0x79, 0x35, 0x1f, 0x66,
+ 0x5c, 0xd0, 0x01, 0x35, 0x43, 0x19, 0x87, 0x5c,
+ 0x78, 0x3d, 0x35, 0xf6, 0x13, 0xe6, 0xd9, 0x09,
+ 0x3d, 0x38, 0xe9, 0x75, 0xc3, 0x8f, 0xe3, 0xb8,
+ 0x9f, 0x7a, 0xed, 0x35, 0xcb, 0x5a, 0x2f, 0xca,
+ 0xa0, 0x34, 0x6e, 0xfb, 0x93, 0x65, 0x54, 0x64,
+ 0x9c, 0xf6, 0x37, 0x81, 0x71, 0xea, 0xe4, 0x39,
+ 0x6e, 0xa1, 0x5d, 0xc2, 0x40, 0xd1, 0xab, 0xf4,
+ 0x47, 0x2d, 0x90, 0x96, 0x52, 0x4f, 0xa1, 0xb2,
+ 0xb0, 0x23, 0xb8, 0xb2, 0x88, 0x22, 0x27, 0x73,
+ 0xd4, 0xd2, 0x06, 0x61, 0x6f, 0x92, 0x93, 0xf6,
+ 0x5b, 0x45, 0xdb, 0xbc, 0x74, 0xe7, 0xc2, 0xed,
+ 0xfb, 0xcb, 0xbf, 0x1c, 0xfb, 0x67, 0x9b, 0xb7,
+ 0x39, 0xa5, 0x86, 0x2d, 0xe2, 0xbc, 0xb9, 0x37,
+ 0xf7, 0x4d, 0x5b, 0xf8, 0x67, 0x1c, 0x5a, 0x8a,
+ 0x50, 0x92, 0xf6, 0x1d, 0x54, 0xc9, 0xaa, 0x5b
+};
+static const u8 enc_output085[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x93, 0x3a, 0x51, 0x63, 0xc7, 0xf6, 0x23, 0x68,
+ 0x32, 0x7b, 0x3f, 0xbc, 0x10, 0x36, 0xc9, 0x43
+};
+static const u8 enc_assoc085[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce085[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52
+};
+static const u8 enc_key085[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - special case tag */
+static const u8 enc_input086[] __initconst = {
+ 0x9a, 0x49, 0xc4, 0x0f, 0x8b, 0x48, 0xd7, 0xc6,
+ 0x6d, 0x1d, 0xb4, 0xe5, 0x3f, 0x20, 0xf2, 0xdd,
+ 0x4a, 0xaa, 0x24, 0x1d, 0xda, 0xb2, 0x6b, 0x5b,
+ 0xc0, 0xe2, 0x18, 0xb7, 0x2c, 0x33, 0x90, 0xf2,
+ 0xdf, 0x3e, 0xbd, 0x01, 0x76, 0x70, 0x44, 0x19,
+ 0x97, 0x2b, 0xcd, 0xbc, 0x6b, 0xbc, 0xb3, 0xe4,
+ 0xe7, 0x4a, 0x71, 0x52, 0x8e, 0xf5, 0x12, 0x63,
+ 0xce, 0x24, 0xe0, 0xd5, 0x75, 0xe0, 0xe4, 0x4d
+};
+static const u8 enc_output086[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
+ 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f
+};
+static const u8 enc_assoc086[] __initconst = {
+ 0x85, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xa6, 0x90, 0x2f, 0xcb, 0xc8, 0x83, 0xbb, 0xc1,
+ 0x80, 0xb2, 0x56, 0xae, 0x34, 0xad, 0x7f, 0x00
+};
+static const u8 enc_nonce086[] __initconst = {
+ 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
+ 0x08, 0x09, 0x0a, 0x0b
+};
+static const u8 enc_key086[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - special case tag */
+static const u8 enc_input087[] __initconst = {
+ 0x9a, 0x49, 0xc4, 0x0f, 0x8b, 0x48, 0xd7, 0xc6,
+ 0x6d, 0x1d, 0xb4, 0xe5, 0x3f, 0x20, 0xf2, 0xdd,
+ 0x4a, 0xaa, 0x24, 0x1d, 0xda, 0xb2, 0x6b, 0x5b,
+ 0xc0, 0xe2, 0x18, 0xb7, 0x2c, 0x33, 0x90, 0xf2,
+ 0xdf, 0x3e, 0xbd, 0x01, 0x76, 0x70, 0x44, 0x19,
+ 0x97, 0x2b, 0xcd, 0xbc, 0x6b, 0xbc, 0xb3, 0xe4,
+ 0xe7, 0x4a, 0x71, 0x52, 0x8e, 0xf5, 0x12, 0x63,
+ 0xce, 0x24, 0xe0, 0xd5, 0x75, 0xe0, 0xe4, 0x4d
+};
+static const u8 enc_output087[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+};
+static const u8 enc_assoc087[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x24, 0x7e, 0x50, 0x64, 0x2a, 0x1c, 0x0a, 0x2f,
+ 0x8f, 0x77, 0x21, 0x96, 0x09, 0xdb, 0xa9, 0x58
+};
+static const u8 enc_nonce087[] __initconst = {
+ 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
+ 0x08, 0x09, 0x0a, 0x0b
+};
+static const u8 enc_key087[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - special case tag */
+static const u8 enc_input088[] __initconst = {
+ 0x9a, 0x49, 0xc4, 0x0f, 0x8b, 0x48, 0xd7, 0xc6,
+ 0x6d, 0x1d, 0xb4, 0xe5, 0x3f, 0x20, 0xf2, 0xdd,
+ 0x4a, 0xaa, 0x24, 0x1d, 0xda, 0xb2, 0x6b, 0x5b,
+ 0xc0, 0xe2, 0x18, 0xb7, 0x2c, 0x33, 0x90, 0xf2,
+ 0xdf, 0x3e, 0xbd, 0x01, 0x76, 0x70, 0x44, 0x19,
+ 0x97, 0x2b, 0xcd, 0xbc, 0x6b, 0xbc, 0xb3, 0xe4,
+ 0xe7, 0x4a, 0x71, 0x52, 0x8e, 0xf5, 0x12, 0x63,
+ 0xce, 0x24, 0xe0, 0xd5, 0x75, 0xe0, 0xe4, 0x4d
+};
+static const u8 enc_output088[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_assoc088[] __initconst = {
+ 0x7c, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xd9, 0xe7, 0x2c, 0x06, 0x4a, 0xc8, 0x96, 0x1f,
+ 0x3f, 0xa5, 0x85, 0xe0, 0xe2, 0xab, 0xd6, 0x00
+};
+static const u8 enc_nonce088[] __initconst = {
+ 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
+ 0x08, 0x09, 0x0a, 0x0b
+};
+static const u8 enc_key088[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - special case tag */
+static const u8 enc_input089[] __initconst = {
+ 0x9a, 0x49, 0xc4, 0x0f, 0x8b, 0x48, 0xd7, 0xc6,
+ 0x6d, 0x1d, 0xb4, 0xe5, 0x3f, 0x20, 0xf2, 0xdd,
+ 0x4a, 0xaa, 0x24, 0x1d, 0xda, 0xb2, 0x6b, 0x5b,
+ 0xc0, 0xe2, 0x18, 0xb7, 0x2c, 0x33, 0x90, 0xf2,
+ 0xdf, 0x3e, 0xbd, 0x01, 0x76, 0x70, 0x44, 0x19,
+ 0x97, 0x2b, 0xcd, 0xbc, 0x6b, 0xbc, 0xb3, 0xe4,
+ 0xe7, 0x4a, 0x71, 0x52, 0x8e, 0xf5, 0x12, 0x63,
+ 0xce, 0x24, 0xe0, 0xd5, 0x75, 0xe0, 0xe4, 0x4d
+};
+static const u8 enc_output089[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80,
+ 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80
+};
+static const u8 enc_assoc089[] __initconst = {
+ 0x65, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x95, 0xaf, 0x0f, 0x4d, 0x0b, 0x68, 0x6e, 0xae,
+ 0xcc, 0xca, 0x43, 0x07, 0xd5, 0x96, 0xf5, 0x02
+};
+static const u8 enc_nonce089[] __initconst = {
+ 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
+ 0x08, 0x09, 0x0a, 0x0b
+};
+static const u8 enc_key089[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - special case tag */
+static const u8 enc_input090[] __initconst = {
+ 0x9a, 0x49, 0xc4, 0x0f, 0x8b, 0x48, 0xd7, 0xc6,
+ 0x6d, 0x1d, 0xb4, 0xe5, 0x3f, 0x20, 0xf2, 0xdd,
+ 0x4a, 0xaa, 0x24, 0x1d, 0xda, 0xb2, 0x6b, 0x5b,
+ 0xc0, 0xe2, 0x18, 0xb7, 0x2c, 0x33, 0x90, 0xf2,
+ 0xdf, 0x3e, 0xbd, 0x01, 0x76, 0x70, 0x44, 0x19,
+ 0x97, 0x2b, 0xcd, 0xbc, 0x6b, 0xbc, 0xb3, 0xe4,
+ 0xe7, 0x4a, 0x71, 0x52, 0x8e, 0xf5, 0x12, 0x63,
+ 0xce, 0x24, 0xe0, 0xd5, 0x75, 0xe0, 0xe4, 0x4d
+};
+static const u8 enc_output090[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f,
+ 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f
+};
+static const u8 enc_assoc090[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x85, 0x40, 0xb4, 0x64, 0x35, 0x77, 0x07, 0xbe,
+ 0x3a, 0x39, 0xd5, 0x5c, 0x34, 0xf8, 0xbc, 0xb3
+};
+static const u8 enc_nonce090[] __initconst = {
+ 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
+ 0x08, 0x09, 0x0a, 0x0b
+};
+static const u8 enc_key090[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - special case tag */
+static const u8 enc_input091[] __initconst = {
+ 0x9a, 0x49, 0xc4, 0x0f, 0x8b, 0x48, 0xd7, 0xc6,
+ 0x6d, 0x1d, 0xb4, 0xe5, 0x3f, 0x20, 0xf2, 0xdd,
+ 0x4a, 0xaa, 0x24, 0x1d, 0xda, 0xb2, 0x6b, 0x5b,
+ 0xc0, 0xe2, 0x18, 0xb7, 0x2c, 0x33, 0x90, 0xf2,
+ 0xdf, 0x3e, 0xbd, 0x01, 0x76, 0x70, 0x44, 0x19,
+ 0x97, 0x2b, 0xcd, 0xbc, 0x6b, 0xbc, 0xb3, 0xe4,
+ 0xe7, 0x4a, 0x71, 0x52, 0x8e, 0xf5, 0x12, 0x63,
+ 0xce, 0x24, 0xe0, 0xd5, 0x75, 0xe0, 0xe4, 0x4d
+};
+static const u8 enc_output091[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x01, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00,
+ 0x01, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00
+};
+static const u8 enc_assoc091[] __initconst = {
+ 0x4f, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x66, 0x23, 0xd9, 0x90, 0xb8, 0x98, 0xd8, 0x30,
+ 0xd2, 0x12, 0xaf, 0x23, 0x83, 0x33, 0x07, 0x01
+};
+static const u8 enc_nonce091[] __initconst = {
+ 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
+ 0x08, 0x09, 0x0a, 0x0b
+};
+static const u8 enc_key091[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - special case tag */
+static const u8 enc_input092[] __initconst = {
+ 0x9a, 0x49, 0xc4, 0x0f, 0x8b, 0x48, 0xd7, 0xc6,
+ 0x6d, 0x1d, 0xb4, 0xe5, 0x3f, 0x20, 0xf2, 0xdd,
+ 0x4a, 0xaa, 0x24, 0x1d, 0xda, 0xb2, 0x6b, 0x5b,
+ 0xc0, 0xe2, 0x18, 0xb7, 0x2c, 0x33, 0x90, 0xf2,
+ 0xdf, 0x3e, 0xbd, 0x01, 0x76, 0x70, 0x44, 0x19,
+ 0x97, 0x2b, 0xcd, 0xbc, 0x6b, 0xbc, 0xb3, 0xe4,
+ 0xe7, 0x4a, 0x71, 0x52, 0x8e, 0xf5, 0x12, 0x63,
+ 0xce, 0x24, 0xe0, 0xd5, 0x75, 0xe0, 0xe4, 0x4d
+};
+static const u8 enc_output092[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+};
+static const u8 enc_assoc092[] __initconst = {
+ 0x83, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x5f, 0x16, 0xd0, 0x9f, 0x17, 0x78, 0x72, 0x11,
+ 0xb7, 0xd4, 0x84, 0xe0, 0x24, 0xf8, 0x97, 0x01
+};
+static const u8 enc_nonce092[] __initconst = {
+ 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
+ 0x08, 0x09, 0x0a, 0x0b
+};
+static const u8 enc_key092[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - edge case intermediate sums in poly1305 */
+static const u8 enc_input093[] __initconst = {
+ 0x00, 0x52, 0x35, 0xd2, 0xa9, 0x19, 0xf2, 0x8d,
+ 0x3d, 0xb7, 0x66, 0x4a, 0x34, 0xae, 0x6b, 0x44,
+ 0x4d, 0x3d, 0x35, 0xf6, 0x13, 0xe6, 0xd9, 0x09,
+ 0x3d, 0x38, 0xe9, 0x75, 0xc3, 0x8f, 0xe3, 0xb8,
+ 0x5b, 0x8b, 0x94, 0x50, 0x9e, 0x2b, 0x74, 0xa3,
+ 0x6d, 0x34, 0x6e, 0x33, 0xd5, 0x72, 0x65, 0x9b,
+ 0xa9, 0xf6, 0x37, 0x81, 0x71, 0xea, 0xe4, 0x39,
+ 0x6e, 0xa1, 0x5d, 0xc2, 0x40, 0xd1, 0xab, 0xf4,
+ 0x83, 0xdc, 0xe9, 0xf3, 0x07, 0x3e, 0xfa, 0xdb,
+ 0x7d, 0x23, 0xb8, 0x7a, 0xce, 0x35, 0x16, 0x8c
+};
+static const u8 enc_output093[] __initconst = {
+ 0x00, 0x39, 0xe2, 0xfd, 0x2f, 0xd3, 0x12, 0x14,
+ 0x9e, 0x98, 0x98, 0x80, 0x88, 0x48, 0x13, 0xe7,
+ 0xca, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x3b, 0x0e, 0x86, 0x9a, 0xaa, 0x8e, 0xa4, 0x96,
+ 0x32, 0xff, 0xff, 0x37, 0xb9, 0xe8, 0xce, 0x00,
+ 0xca, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x3b, 0x0e, 0x86, 0x9a, 0xaa, 0x8e, 0xa4, 0x96,
+ 0x32, 0xff, 0xff, 0x37, 0xb9, 0xe8, 0xce, 0x00,
+ 0xa5, 0x19, 0xac, 0x1a, 0x35, 0xb4, 0xa5, 0x77,
+ 0x87, 0x51, 0x0a, 0xf7, 0x8d, 0x8d, 0x20, 0x0a
+};
+static const u8 enc_assoc093[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce093[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52
+};
+static const u8 enc_key093[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - edge case intermediate sums in poly1305 */
+static const u8 enc_input094[] __initconst = {
+ 0xd3, 0x94, 0x28, 0xd0, 0x79, 0x35, 0x1f, 0x66,
+ 0x5c, 0xd0, 0x01, 0x35, 0x43, 0x19, 0x87, 0x5c,
+ 0xe5, 0xda, 0x78, 0x76, 0x6f, 0xa1, 0x92, 0x90,
+ 0xc0, 0x31, 0xf7, 0x52, 0x08, 0x50, 0x67, 0x45,
+ 0xae, 0x7a, 0xed, 0x35, 0xcb, 0x5a, 0x2f, 0xca,
+ 0xa0, 0x34, 0x6e, 0xfb, 0x93, 0x65, 0x54, 0x64,
+ 0x49, 0x6d, 0xde, 0xb0, 0x55, 0x09, 0xc6, 0xef,
+ 0xff, 0xab, 0x75, 0xeb, 0x2d, 0xf4, 0xab, 0x09,
+ 0x76, 0x2d, 0x90, 0x96, 0x52, 0x4f, 0xa1, 0xb2,
+ 0xb0, 0x23, 0xb8, 0xb2, 0x88, 0x22, 0x27, 0x73,
+ 0x01, 0x49, 0xef, 0x50, 0x4b, 0x71, 0xb1, 0x20,
+ 0xca, 0x4f, 0xf3, 0x95, 0x19, 0xc2, 0xc2, 0x10
+};
+static const u8 enc_output094[] __initconst = {
+ 0xd3, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x62, 0x18, 0xb2, 0x7f, 0x83, 0xb8, 0xb4, 0x66,
+ 0x02, 0xf6, 0xe1, 0xd8, 0x34, 0x20, 0x7b, 0x02,
+ 0xce, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x2a, 0x64, 0x16, 0xce, 0xdb, 0x1c, 0xdd, 0x29,
+ 0x6e, 0xf5, 0xd7, 0xd6, 0x92, 0xda, 0xff, 0x02,
+ 0xce, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x2a, 0x64, 0x16, 0xce, 0xdb, 0x1c, 0xdd, 0x29,
+ 0x6e, 0xf5, 0xd7, 0xd6, 0x92, 0xda, 0xff, 0x02,
+ 0x30, 0x2f, 0xe8, 0x2a, 0xb0, 0xa0, 0x9a, 0xf6,
+ 0x44, 0x00, 0xd0, 0x15, 0xae, 0x83, 0xd9, 0xcc
+};
+static const u8 enc_assoc094[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce094[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52
+};
+static const u8 enc_key094[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - edge case intermediate sums in poly1305 */
+static const u8 enc_input095[] __initconst = {
+ 0xe9, 0x94, 0x28, 0xd0, 0x79, 0x35, 0x1f, 0x66,
+ 0x5c, 0xd0, 0x01, 0x35, 0x43, 0x19, 0x87, 0x5c,
+ 0x6d, 0xf1, 0x39, 0x4e, 0xdc, 0x53, 0x9b, 0x5b,
+ 0x3a, 0x09, 0x57, 0xbe, 0x0f, 0xb8, 0x59, 0x46,
+ 0x80, 0x7a, 0xed, 0x35, 0xcb, 0x5a, 0x2f, 0xca,
+ 0xa0, 0x34, 0x6e, 0xfb, 0x93, 0x65, 0x54, 0x64,
+ 0xd1, 0x76, 0x9f, 0xe8, 0x06, 0xbb, 0xfe, 0xb6,
+ 0xf5, 0x90, 0x95, 0x0f, 0x2e, 0xac, 0x9e, 0x0a,
+ 0x58, 0x2d, 0x90, 0x96, 0x52, 0x4f, 0xa1, 0xb2,
+ 0xb0, 0x23, 0xb8, 0xb2, 0x88, 0x22, 0x27, 0x73,
+ 0x99, 0x52, 0xae, 0x08, 0x18, 0xc3, 0x89, 0x79,
+ 0xc0, 0x74, 0x13, 0x71, 0x1a, 0x9a, 0xf7, 0x13
+};
+static const u8 enc_output095[] __initconst = {
+ 0xe9, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xea, 0x33, 0xf3, 0x47, 0x30, 0x4a, 0xbd, 0xad,
+ 0xf8, 0xce, 0x41, 0x34, 0x33, 0xc8, 0x45, 0x01,
+ 0xe0, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xb2, 0x7f, 0x57, 0x96, 0x88, 0xae, 0xe5, 0x70,
+ 0x64, 0xce, 0x37, 0x32, 0x91, 0x82, 0xca, 0x01,
+ 0xe0, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xb2, 0x7f, 0x57, 0x96, 0x88, 0xae, 0xe5, 0x70,
+ 0x64, 0xce, 0x37, 0x32, 0x91, 0x82, 0xca, 0x01,
+ 0x98, 0xa7, 0xe8, 0x36, 0xe0, 0xee, 0x4d, 0x02,
+ 0x35, 0x00, 0xd0, 0x55, 0x7e, 0xc2, 0xcb, 0xe0
+};
+static const u8 enc_assoc095[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce095[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52
+};
+static const u8 enc_key095[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - edge case intermediate sums in poly1305 */
+static const u8 enc_input096[] __initconst = {
+ 0xff, 0x94, 0x28, 0xd0, 0x79, 0x35, 0x1f, 0x66,
+ 0x5c, 0xd0, 0x01, 0x35, 0x43, 0x19, 0x87, 0x5c,
+ 0x64, 0xf9, 0x0f, 0x5b, 0x26, 0x92, 0xb8, 0x60,
+ 0xd4, 0x59, 0x6f, 0xf4, 0xb3, 0x40, 0x2c, 0x5c,
+ 0x00, 0xb9, 0xbb, 0x53, 0x70, 0x7a, 0xa6, 0x67,
+ 0xd3, 0x56, 0xfe, 0x50, 0xc7, 0x19, 0x96, 0x94,
+ 0x03, 0x35, 0x61, 0xe7, 0xca, 0xca, 0x6d, 0x94,
+ 0x1d, 0xc3, 0xcd, 0x69, 0x14, 0xad, 0x69, 0x04
+};
+static const u8 enc_output096[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xe3, 0x3b, 0xc5, 0x52, 0xca, 0x8b, 0x9e, 0x96,
+ 0x16, 0x9e, 0x79, 0x7e, 0x8f, 0x30, 0x30, 0x1b,
+ 0x60, 0x3c, 0xa9, 0x99, 0x44, 0xdf, 0x76, 0x52,
+ 0x8c, 0x9d, 0x6f, 0x54, 0xab, 0x83, 0x3d, 0x0f,
+ 0x60, 0x3c, 0xa9, 0x99, 0x44, 0xdf, 0x76, 0x52,
+ 0x8c, 0x9d, 0x6f, 0x54, 0xab, 0x83, 0x3d, 0x0f,
+ 0x6a, 0xb8, 0xdc, 0xe2, 0xc5, 0x9d, 0xa4, 0x73,
+ 0x71, 0x30, 0xb0, 0x25, 0x2f, 0x68, 0xa8, 0xd8
+};
+static const u8 enc_assoc096[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce096[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52
+};
+static const u8 enc_key096[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - edge case intermediate sums in poly1305 */
+static const u8 enc_input097[] __initconst = {
+ 0x68, 0x94, 0x28, 0xd0, 0x79, 0x35, 0x1f, 0x66,
+ 0x5c, 0xd0, 0x01, 0x35, 0x43, 0x19, 0x87, 0x5c,
+ 0xb0, 0x8f, 0x25, 0x67, 0x5b, 0x9b, 0xcb, 0xf6,
+ 0xe3, 0x84, 0x07, 0xde, 0x2e, 0xc7, 0x5a, 0x47,
+ 0x9f, 0x7a, 0xed, 0x35, 0xcb, 0x5a, 0x2f, 0xca,
+ 0xa0, 0x34, 0x6e, 0xfb, 0x93, 0x65, 0x54, 0x64,
+ 0x2d, 0x2a, 0xf7, 0xcd, 0x6b, 0x08, 0x05, 0x01,
+ 0xd3, 0x1b, 0xa5, 0x4f, 0xb2, 0xeb, 0x75, 0x96,
+ 0x47, 0x2d, 0x90, 0x96, 0x52, 0x4f, 0xa1, 0xb2,
+ 0xb0, 0x23, 0xb8, 0xb2, 0x88, 0x22, 0x27, 0x73,
+ 0x65, 0x0e, 0xc6, 0x2d, 0x75, 0x70, 0x72, 0xce,
+ 0xe6, 0xff, 0x23, 0x31, 0x86, 0xdd, 0x1c, 0x8f
+};
+static const u8 enc_output097[] __initconst = {
+ 0x68, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x37, 0x4d, 0xef, 0x6e, 0xb7, 0x82, 0xed, 0x00,
+ 0x21, 0x43, 0x11, 0x54, 0x12, 0xb7, 0x46, 0x00,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x4e, 0x23, 0x3f, 0xb3, 0xe5, 0x1d, 0x1e, 0xc7,
+ 0x42, 0x45, 0x07, 0x72, 0x0d, 0xc5, 0x21, 0x9d,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x4e, 0x23, 0x3f, 0xb3, 0xe5, 0x1d, 0x1e, 0xc7,
+ 0x42, 0x45, 0x07, 0x72, 0x0d, 0xc5, 0x21, 0x9d,
+ 0x04, 0x4d, 0xea, 0x60, 0x88, 0x80, 0x41, 0x2b,
+ 0xfd, 0xff, 0xcf, 0x35, 0x57, 0x9e, 0x9b, 0x26
+};
+static const u8 enc_assoc097[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce097[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52
+};
+static const u8 enc_key097[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - edge case intermediate sums in poly1305 */
+static const u8 enc_input098[] __initconst = {
+ 0x6d, 0x94, 0x28, 0xd0, 0x79, 0x35, 0x1f, 0x66,
+ 0x5c, 0xd0, 0x01, 0x35, 0x43, 0x19, 0x87, 0x5c,
+ 0xa1, 0x61, 0xb5, 0xab, 0x04, 0x09, 0x00, 0x62,
+ 0x9e, 0xfe, 0xff, 0x78, 0xd7, 0xd8, 0x6b, 0x45,
+ 0x9f, 0x7a, 0xed, 0x35, 0xcb, 0x5a, 0x2f, 0xca,
+ 0xa0, 0x34, 0x6e, 0xfb, 0x93, 0x65, 0x54, 0x64,
+ 0xc6, 0xf8, 0x07, 0x8c, 0xc8, 0xef, 0x12, 0xa0,
+ 0xff, 0x65, 0x7d, 0x6d, 0x08, 0xdb, 0x10, 0xb8,
+ 0x47, 0x2d, 0x90, 0x96, 0x52, 0x4f, 0xa1, 0xb2,
+ 0xb0, 0x23, 0xb8, 0xb2, 0x88, 0x22, 0x27, 0x73,
+ 0x8e, 0xdc, 0x36, 0x6c, 0xd6, 0x97, 0x65, 0x6f,
+ 0xca, 0x81, 0xfb, 0x13, 0x3c, 0xed, 0x79, 0xa1
+};
+static const u8 enc_output098[] __initconst = {
+ 0x6d, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x26, 0xa3, 0x7f, 0xa2, 0xe8, 0x10, 0x26, 0x94,
+ 0x5c, 0x39, 0xe9, 0xf2, 0xeb, 0xa8, 0x77, 0x02,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xa5, 0xf1, 0xcf, 0xf2, 0x46, 0xfa, 0x09, 0x66,
+ 0x6e, 0x3b, 0xdf, 0x50, 0xb7, 0xf5, 0x44, 0xb3,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xa5, 0xf1, 0xcf, 0xf2, 0x46, 0xfa, 0x09, 0x66,
+ 0x6e, 0x3b, 0xdf, 0x50, 0xb7, 0xf5, 0x44, 0xb3,
+ 0x1e, 0x6b, 0xea, 0x63, 0x14, 0x54, 0x2e, 0x2e,
+ 0xf9, 0xff, 0xcf, 0x45, 0x0b, 0x2e, 0x98, 0x2b
+};
+static const u8 enc_assoc098[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce098[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52
+};
+static const u8 enc_key098[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - edge case intermediate sums in poly1305 */
+static const u8 enc_input099[] __initconst = {
+ 0xff, 0x94, 0x28, 0xd0, 0x79, 0x35, 0x1f, 0x66,
+ 0x5c, 0xd0, 0x01, 0x35, 0x43, 0x19, 0x87, 0x5c,
+ 0xfc, 0x01, 0xb8, 0x91, 0xe5, 0xf0, 0xf9, 0x12,
+ 0x8d, 0x7d, 0x1c, 0x57, 0x91, 0x92, 0xb6, 0x98,
+ 0x63, 0x41, 0x44, 0x15, 0xb6, 0x99, 0x68, 0x95,
+ 0x9a, 0x72, 0x91, 0xb7, 0xa5, 0xaf, 0x13, 0x48,
+ 0x60, 0xcd, 0x9e, 0xa1, 0x0c, 0x29, 0xa3, 0x66,
+ 0x54, 0xe7, 0xa2, 0x8e, 0x76, 0x1b, 0xec, 0xd8
+};
+static const u8 enc_output099[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x7b, 0xc3, 0x72, 0x98, 0x09, 0xe9, 0xdf, 0xe4,
+ 0x4f, 0xba, 0x0a, 0xdd, 0xad, 0xe2, 0xaa, 0xdf,
+ 0x03, 0xc4, 0x56, 0xdf, 0x82, 0x3c, 0xb8, 0xa0,
+ 0xc5, 0xb9, 0x00, 0xb3, 0xc9, 0x35, 0xb8, 0xd3,
+ 0x03, 0xc4, 0x56, 0xdf, 0x82, 0x3c, 0xb8, 0xa0,
+ 0xc5, 0xb9, 0x00, 0xb3, 0xc9, 0x35, 0xb8, 0xd3,
+ 0xed, 0x20, 0x17, 0xc8, 0xdb, 0xa4, 0x77, 0x56,
+ 0x29, 0x04, 0x9d, 0x78, 0x6e, 0x3b, 0xce, 0xb1
+};
+static const u8 enc_assoc099[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce099[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52
+};
+static const u8 enc_key099[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - edge case intermediate sums in poly1305 */
+static const u8 enc_input100[] __initconst = {
+ 0xff, 0x94, 0x28, 0xd0, 0x79, 0x35, 0x1f, 0x66,
+ 0x5c, 0xd0, 0x01, 0x35, 0x43, 0x19, 0x87, 0x5c,
+ 0x6b, 0x6d, 0xc9, 0xd2, 0x1a, 0x81, 0x9e, 0x70,
+ 0xb5, 0x77, 0xf4, 0x41, 0x37, 0xd3, 0xd6, 0xbd,
+ 0x13, 0x35, 0xf5, 0xeb, 0x44, 0x49, 0x40, 0x77,
+ 0xb2, 0x64, 0x49, 0xa5, 0x4b, 0x6c, 0x7c, 0x75,
+ 0x10, 0xb9, 0x2f, 0x5f, 0xfe, 0xf9, 0x8b, 0x84,
+ 0x7c, 0xf1, 0x7a, 0x9c, 0x98, 0xd8, 0x83, 0xe5
+};
+static const u8 enc_output100[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xec, 0xaf, 0x03, 0xdb, 0xf6, 0x98, 0xb8, 0x86,
+ 0x77, 0xb0, 0xe2, 0xcb, 0x0b, 0xa3, 0xca, 0xfa,
+ 0x73, 0xb0, 0xe7, 0x21, 0x70, 0xec, 0x90, 0x42,
+ 0xed, 0xaf, 0xd8, 0xa1, 0x27, 0xf6, 0xd7, 0xee,
+ 0x73, 0xb0, 0xe7, 0x21, 0x70, 0xec, 0x90, 0x42,
+ 0xed, 0xaf, 0xd8, 0xa1, 0x27, 0xf6, 0xd7, 0xee,
+ 0x07, 0x3f, 0x17, 0xcb, 0x67, 0x78, 0x64, 0x59,
+ 0x25, 0x04, 0x9d, 0x88, 0x22, 0xcb, 0xca, 0xb6
+};
+static const u8 enc_assoc100[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce100[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52
+};
+static const u8 enc_key100[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - edge case intermediate sums in poly1305 */
+static const u8 enc_input101[] __initconst = {
+ 0xff, 0xcb, 0x2b, 0x11, 0x06, 0xf8, 0x23, 0x4c,
+ 0x5e, 0x99, 0xd4, 0xdb, 0x4c, 0x70, 0x48, 0xde,
+ 0x32, 0x3d, 0x35, 0xf6, 0x13, 0xe6, 0xd9, 0x09,
+ 0x3d, 0x38, 0xe9, 0x75, 0xc3, 0x8f, 0xe3, 0xb8,
+ 0x16, 0xe9, 0x88, 0x4a, 0x11, 0x4f, 0x0e, 0x92,
+ 0x66, 0xce, 0xa3, 0x88, 0x5f, 0xe3, 0x6b, 0x9f,
+ 0xd6, 0xf6, 0x37, 0x81, 0x71, 0xea, 0xe4, 0x39,
+ 0x6e, 0xa1, 0x5d, 0xc2, 0x40, 0xd1, 0xab, 0xf4,
+ 0xce, 0xbe, 0xf5, 0xe9, 0x88, 0x5a, 0x80, 0xea,
+ 0x76, 0xd9, 0x75, 0xc1, 0x44, 0xa4, 0x18, 0x88
+};
+static const u8 enc_output101[] __initconst = {
+ 0xff, 0xa0, 0xfc, 0x3e, 0x80, 0x32, 0xc3, 0xd5,
+ 0xfd, 0xb6, 0x2a, 0x11, 0xf0, 0x96, 0x30, 0x7d,
+ 0xb5, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x76, 0x6c, 0x9a, 0x80, 0x25, 0xea, 0xde, 0xa7,
+ 0x39, 0x05, 0x32, 0x8c, 0x33, 0x79, 0xc0, 0x04,
+ 0xb5, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x76, 0x6c, 0x9a, 0x80, 0x25, 0xea, 0xde, 0xa7,
+ 0x39, 0x05, 0x32, 0x8c, 0x33, 0x79, 0xc0, 0x04,
+ 0x8b, 0x9b, 0xb4, 0xb4, 0x86, 0x12, 0x89, 0x65,
+ 0x8c, 0x69, 0x6a, 0x83, 0x40, 0x15, 0x04, 0x05
+};
+static const u8 enc_assoc101[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce101[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52
+};
+static const u8 enc_key101[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - edge case intermediate sums in poly1305 */
+static const u8 enc_input102[] __initconst = {
+ 0x6f, 0x9e, 0x70, 0xed, 0x3b, 0x8b, 0xac, 0xa0,
+ 0x26, 0xe4, 0x6a, 0x5a, 0x09, 0x43, 0x15, 0x8d,
+ 0x21, 0x3d, 0x35, 0xf6, 0x13, 0xe6, 0xd9, 0x09,
+ 0x3d, 0x38, 0xe9, 0x75, 0xc3, 0x8f, 0xe3, 0xb8,
+ 0x0c, 0x61, 0x2c, 0x5e, 0x8d, 0x89, 0xa8, 0x73,
+ 0xdb, 0xca, 0xad, 0x5b, 0x73, 0x46, 0x42, 0x9b,
+ 0xc5, 0xf6, 0x37, 0x81, 0x71, 0xea, 0xe4, 0x39,
+ 0x6e, 0xa1, 0x5d, 0xc2, 0x40, 0xd1, 0xab, 0xf4,
+ 0xd4, 0x36, 0x51, 0xfd, 0x14, 0x9c, 0x26, 0x0b,
+ 0xcb, 0xdd, 0x7b, 0x12, 0x68, 0x01, 0x31, 0x8c
+};
+static const u8 enc_output102[] __initconst = {
+ 0x6f, 0xf5, 0xa7, 0xc2, 0xbd, 0x41, 0x4c, 0x39,
+ 0x85, 0xcb, 0x94, 0x90, 0xb5, 0xa5, 0x6d, 0x2e,
+ 0xa6, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x6c, 0xe4, 0x3e, 0x94, 0xb9, 0x2c, 0x78, 0x46,
+ 0x84, 0x01, 0x3c, 0x5f, 0x1f, 0xdc, 0xe9, 0x00,
+ 0xa6, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x6c, 0xe4, 0x3e, 0x94, 0xb9, 0x2c, 0x78, 0x46,
+ 0x84, 0x01, 0x3c, 0x5f, 0x1f, 0xdc, 0xe9, 0x00,
+ 0x8b, 0x3b, 0xbd, 0x51, 0x64, 0x44, 0x59, 0x56,
+ 0x8d, 0x81, 0xca, 0x1f, 0xa7, 0x2c, 0xe4, 0x04
+};
+static const u8 enc_assoc102[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce102[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52
+};
+static const u8 enc_key102[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - edge case intermediate sums in poly1305 */
+static const u8 enc_input103[] __initconst = {
+ 0x41, 0x2b, 0x08, 0x0a, 0x3e, 0x19, 0xc1, 0x0d,
+ 0x44, 0xa1, 0xaf, 0x1e, 0xab, 0xde, 0xb4, 0xce,
+ 0x35, 0x3d, 0x35, 0xf6, 0x13, 0xe6, 0xd9, 0x09,
+ 0x3d, 0x38, 0xe9, 0x75, 0xc3, 0x8f, 0xe3, 0xb8,
+ 0x6b, 0x83, 0x94, 0x33, 0x09, 0x21, 0x48, 0x6c,
+ 0xa1, 0x1d, 0x29, 0x1c, 0x3e, 0x97, 0xee, 0x9a,
+ 0xd1, 0xf6, 0x37, 0x81, 0x71, 0xea, 0xe4, 0x39,
+ 0x6e, 0xa1, 0x5d, 0xc2, 0x40, 0xd1, 0xab, 0xf4,
+ 0xb3, 0xd4, 0xe9, 0x90, 0x90, 0x34, 0xc6, 0x14,
+ 0xb1, 0x0a, 0xff, 0x55, 0x25, 0xd0, 0x9d, 0x8d
+};
+static const u8 enc_output103[] __initconst = {
+ 0x41, 0x40, 0xdf, 0x25, 0xb8, 0xd3, 0x21, 0x94,
+ 0xe7, 0x8e, 0x51, 0xd4, 0x17, 0x38, 0xcc, 0x6d,
+ 0xb2, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x0b, 0x06, 0x86, 0xf9, 0x3d, 0x84, 0x98, 0x59,
+ 0xfe, 0xd6, 0xb8, 0x18, 0x52, 0x0d, 0x45, 0x01,
+ 0xb2, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x0b, 0x06, 0x86, 0xf9, 0x3d, 0x84, 0x98, 0x59,
+ 0xfe, 0xd6, 0xb8, 0x18, 0x52, 0x0d, 0x45, 0x01,
+ 0x86, 0xfb, 0xab, 0x2b, 0x4a, 0x94, 0xf4, 0x7a,
+ 0xa5, 0x6f, 0x0a, 0xea, 0x65, 0xd1, 0x10, 0x08
+};
+static const u8 enc_assoc103[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce103[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52
+};
+static const u8 enc_key103[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - edge case intermediate sums in poly1305 */
+static const u8 enc_input104[] __initconst = {
+ 0xb2, 0x47, 0xa7, 0x47, 0x23, 0x49, 0x1a, 0xac,
+ 0xac, 0xaa, 0xd7, 0x09, 0xc9, 0x1e, 0x93, 0x2b,
+ 0x31, 0x3d, 0x35, 0xf6, 0x13, 0xe6, 0xd9, 0x09,
+ 0x3d, 0x38, 0xe9, 0x75, 0xc3, 0x8f, 0xe3, 0xb8,
+ 0x9a, 0xde, 0x04, 0xe7, 0x5b, 0xb7, 0x01, 0xd9,
+ 0x66, 0x06, 0x01, 0xb3, 0x47, 0x65, 0xde, 0x98,
+ 0xd5, 0xf6, 0x37, 0x81, 0x71, 0xea, 0xe4, 0x39,
+ 0x6e, 0xa1, 0x5d, 0xc2, 0x40, 0xd1, 0xab, 0xf4,
+ 0x42, 0x89, 0x79, 0x44, 0xc2, 0xa2, 0x8f, 0xa1,
+ 0x76, 0x11, 0xd7, 0xfa, 0x5c, 0x22, 0xad, 0x8f
+};
+static const u8 enc_output104[] __initconst = {
+ 0xb2, 0x2c, 0x70, 0x68, 0xa5, 0x83, 0xfa, 0x35,
+ 0x0f, 0x85, 0x29, 0xc3, 0x75, 0xf8, 0xeb, 0x88,
+ 0xb6, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xfa, 0x5b, 0x16, 0x2d, 0x6f, 0x12, 0xd1, 0xec,
+ 0x39, 0xcd, 0x90, 0xb7, 0x2b, 0xff, 0x75, 0x03,
+ 0xb6, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xfa, 0x5b, 0x16, 0x2d, 0x6f, 0x12, 0xd1, 0xec,
+ 0x39, 0xcd, 0x90, 0xb7, 0x2b, 0xff, 0x75, 0x03,
+ 0xa0, 0x19, 0xac, 0x2e, 0xd6, 0x67, 0xe1, 0x7d,
+ 0xa1, 0x6f, 0x0a, 0xfa, 0x19, 0x61, 0x0d, 0x0d
+};
+static const u8 enc_assoc104[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce104[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52
+};
+static const u8 enc_key104[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - edge case intermediate sums in poly1305 */
+static const u8 enc_input105[] __initconst = {
+ 0x74, 0x0f, 0x9e, 0x49, 0xf6, 0x10, 0xef, 0xa5,
+ 0x85, 0xb6, 0x59, 0xca, 0x6e, 0xd8, 0xb4, 0x99,
+ 0x2d, 0x3d, 0x35, 0xf6, 0x13, 0xe6, 0xd9, 0x09,
+ 0x3d, 0x38, 0xe9, 0x75, 0xc3, 0x8f, 0xe3, 0xb8,
+ 0x41, 0x2d, 0x96, 0xaf, 0xbe, 0x80, 0xec, 0x3e,
+ 0x79, 0xd4, 0x51, 0xb0, 0x0a, 0x2d, 0xb2, 0x9a,
+ 0xc9, 0xf6, 0x37, 0x81, 0x71, 0xea, 0xe4, 0x39,
+ 0x6e, 0xa1, 0x5d, 0xc2, 0x40, 0xd1, 0xab, 0xf4,
+ 0x99, 0x7a, 0xeb, 0x0c, 0x27, 0x95, 0x62, 0x46,
+ 0x69, 0xc3, 0x87, 0xf9, 0x11, 0x6a, 0xc1, 0x8d
+};
+static const u8 enc_output105[] __initconst = {
+ 0x74, 0x64, 0x49, 0x66, 0x70, 0xda, 0x0f, 0x3c,
+ 0x26, 0x99, 0xa7, 0x00, 0xd2, 0x3e, 0xcc, 0x3a,
+ 0xaa, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x21, 0xa8, 0x84, 0x65, 0x8a, 0x25, 0x3c, 0x0b,
+ 0x26, 0x1f, 0xc0, 0xb4, 0x66, 0xb7, 0x19, 0x01,
+ 0xaa, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x21, 0xa8, 0x84, 0x65, 0x8a, 0x25, 0x3c, 0x0b,
+ 0x26, 0x1f, 0xc0, 0xb4, 0x66, 0xb7, 0x19, 0x01,
+ 0x73, 0x6e, 0x18, 0x18, 0x16, 0x96, 0xa5, 0x88,
+ 0x9c, 0x31, 0x59, 0xfa, 0xab, 0xab, 0x20, 0xfd
+};
+static const u8 enc_assoc105[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce105[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52
+};
+static const u8 enc_key105[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - edge case intermediate sums in poly1305 */
+static const u8 enc_input106[] __initconst = {
+ 0xad, 0xba, 0x5d, 0x10, 0x5b, 0xc8, 0xaa, 0x06,
+ 0x2c, 0x23, 0x36, 0xcb, 0x88, 0x9d, 0xdb, 0xd5,
+ 0x37, 0x3d, 0x35, 0xf6, 0x13, 0xe6, 0xd9, 0x09,
+ 0x3d, 0x38, 0xe9, 0x75, 0xc3, 0x8f, 0xe3, 0xb8,
+ 0x17, 0x7c, 0x5f, 0xfe, 0x28, 0x75, 0xf4, 0x68,
+ 0xf6, 0xc2, 0x96, 0x57, 0x48, 0xf3, 0x59, 0x9a,
+ 0xd3, 0xf6, 0x37, 0x81, 0x71, 0xea, 0xe4, 0x39,
+ 0x6e, 0xa1, 0x5d, 0xc2, 0x40, 0xd1, 0xab, 0xf4,
+ 0xcf, 0x2b, 0x22, 0x5d, 0xb1, 0x60, 0x7a, 0x10,
+ 0xe6, 0xd5, 0x40, 0x1e, 0x53, 0xb4, 0x2a, 0x8d
+};
+static const u8 enc_output106[] __initconst = {
+ 0xad, 0xd1, 0x8a, 0x3f, 0xdd, 0x02, 0x4a, 0x9f,
+ 0x8f, 0x0c, 0xc8, 0x01, 0x34, 0x7b, 0xa3, 0x76,
+ 0xb0, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x77, 0xf9, 0x4d, 0x34, 0x1c, 0xd0, 0x24, 0x5d,
+ 0xa9, 0x09, 0x07, 0x53, 0x24, 0x69, 0xf2, 0x01,
+ 0xb0, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x77, 0xf9, 0x4d, 0x34, 0x1c, 0xd0, 0x24, 0x5d,
+ 0xa9, 0x09, 0x07, 0x53, 0x24, 0x69, 0xf2, 0x01,
+ 0xba, 0xd5, 0x8f, 0x10, 0xa9, 0x1e, 0x6a, 0x88,
+ 0x9a, 0xba, 0x32, 0xfd, 0x17, 0xd8, 0x33, 0x1a
+};
+static const u8 enc_assoc106[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce106[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52
+};
+static const u8 enc_key106[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - edge case intermediate sums in poly1305 */
+static const u8 enc_input107[] __initconst = {
+ 0xfe, 0x94, 0x28, 0xd0, 0x79, 0x35, 0x1f, 0x66,
+ 0x5c, 0xd0, 0x01, 0x35, 0x43, 0x19, 0x87, 0x5c,
+ 0xc0, 0x01, 0xed, 0xc5, 0xda, 0x44, 0x2e, 0x71,
+ 0x9b, 0xce, 0x9a, 0xbe, 0x27, 0x3a, 0xf1, 0x44,
+ 0xb4, 0x7a, 0xed, 0x35, 0xcb, 0x5a, 0x2f, 0xca,
+ 0xa0, 0x34, 0x6e, 0xfb, 0x93, 0x65, 0x54, 0x64,
+ 0x48, 0x02, 0x5f, 0x41, 0xfa, 0x4e, 0x33, 0x6c,
+ 0x78, 0x69, 0x57, 0xa2, 0xa7, 0xc4, 0x93, 0x0a,
+ 0x6c, 0x2d, 0x90, 0x96, 0x52, 0x4f, 0xa1, 0xb2,
+ 0xb0, 0x23, 0xb8, 0xb2, 0x88, 0x22, 0x27, 0x73,
+ 0x00, 0x26, 0x6e, 0xa1, 0xe4, 0x36, 0x44, 0xa3,
+ 0x4d, 0x8d, 0xd1, 0xdc, 0x93, 0xf2, 0xfa, 0x13
+};
+static const u8 enc_output107[] __initconst = {
+ 0xfe, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x47, 0xc3, 0x27, 0xcc, 0x36, 0x5d, 0x08, 0x87,
+ 0x59, 0x09, 0x8c, 0x34, 0x1b, 0x4a, 0xed, 0x03,
+ 0xd4, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x2b, 0x0b, 0x97, 0x3f, 0x74, 0x5b, 0x28, 0xaa,
+ 0xe9, 0x37, 0xf5, 0x9f, 0x18, 0xea, 0xc7, 0x01,
+ 0xd4, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x2b, 0x0b, 0x97, 0x3f, 0x74, 0x5b, 0x28, 0xaa,
+ 0xe9, 0x37, 0xf5, 0x9f, 0x18, 0xea, 0xc7, 0x01,
+ 0xd6, 0x8c, 0xe1, 0x74, 0x07, 0x9a, 0xdd, 0x02,
+ 0x8d, 0xd0, 0x5c, 0xf8, 0x14, 0x63, 0x04, 0x88
+};
+static const u8 enc_assoc107[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce107[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52
+};
+static const u8 enc_key107[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - edge case intermediate sums in poly1305 */
+static const u8 enc_input108[] __initconst = {
+ 0xb5, 0x13, 0xb0, 0x6a, 0xb9, 0xac, 0x14, 0x43,
+ 0x5a, 0xcb, 0x8a, 0xa3, 0xa3, 0x7a, 0xfd, 0xb6,
+ 0x54, 0x3d, 0x35, 0xf6, 0x13, 0xe6, 0xd9, 0x09,
+ 0x3d, 0x38, 0xe9, 0x75, 0xc3, 0x8f, 0xe3, 0xb8,
+ 0x61, 0x95, 0x01, 0x93, 0xb1, 0xbf, 0x03, 0x11,
+ 0xff, 0x11, 0x79, 0x89, 0xae, 0xd9, 0xa9, 0x99,
+ 0xb0, 0xf6, 0x37, 0x81, 0x71, 0xea, 0xe4, 0x39,
+ 0x6e, 0xa1, 0x5d, 0xc2, 0x40, 0xd1, 0xab, 0xf4,
+ 0xb9, 0xc2, 0x7c, 0x30, 0x28, 0xaa, 0x8d, 0x69,
+ 0xef, 0x06, 0xaf, 0xc0, 0xb5, 0x9e, 0xda, 0x8e
+};
+static const u8 enc_output108[] __initconst = {
+ 0xb5, 0x78, 0x67, 0x45, 0x3f, 0x66, 0xf4, 0xda,
+ 0xf9, 0xe4, 0x74, 0x69, 0x1f, 0x9c, 0x85, 0x15,
+ 0xd3, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x01, 0x10, 0x13, 0x59, 0x85, 0x1a, 0xd3, 0x24,
+ 0xa0, 0xda, 0xe8, 0x8d, 0xc2, 0x43, 0x02, 0x02,
+ 0xd3, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x01, 0x10, 0x13, 0x59, 0x85, 0x1a, 0xd3, 0x24,
+ 0xa0, 0xda, 0xe8, 0x8d, 0xc2, 0x43, 0x02, 0x02,
+ 0xaa, 0x48, 0xa3, 0x88, 0x7d, 0x4b, 0x05, 0x96,
+ 0x99, 0xc2, 0xfd, 0xf9, 0xc6, 0x78, 0x7e, 0x0a
+};
+static const u8 enc_assoc108[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce108[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52
+};
+static const u8 enc_key108[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - edge case intermediate sums in poly1305 */
+static const u8 enc_input109[] __initconst = {
+ 0xff, 0x94, 0x28, 0xd0, 0x79, 0x35, 0x1f, 0x66,
+ 0x5c, 0xd0, 0x01, 0x35, 0x43, 0x19, 0x87, 0x5c,
+ 0xd4, 0xf1, 0x09, 0xe8, 0x14, 0xce, 0xa8, 0x5a,
+ 0x08, 0xc0, 0x11, 0xd8, 0x50, 0xdd, 0x1d, 0xcb,
+ 0xcf, 0x7a, 0xed, 0x35, 0xcb, 0x5a, 0x2f, 0xca,
+ 0xa0, 0x34, 0x6e, 0xfb, 0x93, 0x65, 0x54, 0x64,
+ 0x53, 0x40, 0xb8, 0x5a, 0x9a, 0xa0, 0x82, 0x96,
+ 0xb7, 0x7a, 0x5f, 0xc3, 0x96, 0x1f, 0x66, 0x0f,
+ 0x17, 0x2d, 0x90, 0x96, 0x52, 0x4f, 0xa1, 0xb2,
+ 0xb0, 0x23, 0xb8, 0xb2, 0x88, 0x22, 0x27, 0x73,
+ 0x1b, 0x64, 0x89, 0xba, 0x84, 0xd8, 0xf5, 0x59,
+ 0x82, 0x9e, 0xd9, 0xbd, 0xa2, 0x29, 0x0f, 0x16
+};
+static const u8 enc_output109[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x53, 0x33, 0xc3, 0xe1, 0xf8, 0xd7, 0x8e, 0xac,
+ 0xca, 0x07, 0x07, 0x52, 0x6c, 0xad, 0x01, 0x8c,
+ 0xaf, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x30, 0x49, 0x70, 0x24, 0x14, 0xb5, 0x99, 0x50,
+ 0x26, 0x24, 0xfd, 0xfe, 0x29, 0x31, 0x32, 0x04,
+ 0xaf, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x30, 0x49, 0x70, 0x24, 0x14, 0xb5, 0x99, 0x50,
+ 0x26, 0x24, 0xfd, 0xfe, 0x29, 0x31, 0x32, 0x04,
+ 0xb9, 0x36, 0xa8, 0x17, 0xf2, 0x21, 0x1a, 0xf1,
+ 0x29, 0xe2, 0xcf, 0x16, 0x0f, 0xd4, 0x2b, 0xcb
+};
+static const u8 enc_assoc109[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce109[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52
+};
+static const u8 enc_key109[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - edge case intermediate sums in poly1305 */
+static const u8 enc_input110[] __initconst = {
+ 0xff, 0x94, 0x28, 0xd0, 0x79, 0x35, 0x1f, 0x66,
+ 0x5c, 0xd0, 0x01, 0x35, 0x43, 0x19, 0x87, 0x5c,
+ 0xdf, 0x4c, 0x62, 0x03, 0x2d, 0x41, 0x19, 0xb5,
+ 0x88, 0x47, 0x7e, 0x99, 0x92, 0x5a, 0x56, 0xd9,
+ 0xd6, 0x7a, 0xed, 0x35, 0xcb, 0x5a, 0x2f, 0xca,
+ 0xa0, 0x34, 0x6e, 0xfb, 0x93, 0x65, 0x54, 0x64,
+ 0xfa, 0x84, 0xf0, 0x64, 0x55, 0x36, 0x42, 0x1b,
+ 0x2b, 0xb9, 0x24, 0x6e, 0xc2, 0x19, 0xed, 0x0b,
+ 0x0e, 0x2d, 0x90, 0x96, 0x52, 0x4f, 0xa1, 0xb2,
+ 0xb0, 0x23, 0xb8, 0xb2, 0x88, 0x22, 0x27, 0x73,
+ 0xb2, 0xa0, 0xc1, 0x84, 0x4b, 0x4e, 0x35, 0xd4,
+ 0x1e, 0x5d, 0xa2, 0x10, 0xf6, 0x2f, 0x84, 0x12
+};
+static const u8 enc_output110[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x58, 0x8e, 0xa8, 0x0a, 0xc1, 0x58, 0x3f, 0x43,
+ 0x4a, 0x80, 0x68, 0x13, 0xae, 0x2a, 0x4a, 0x9e,
+ 0xb6, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x99, 0x8d, 0x38, 0x1a, 0xdb, 0x23, 0x59, 0xdd,
+ 0xba, 0xe7, 0x86, 0x53, 0x7d, 0x37, 0xb9, 0x00,
+ 0xb6, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x99, 0x8d, 0x38, 0x1a, 0xdb, 0x23, 0x59, 0xdd,
+ 0xba, 0xe7, 0x86, 0x53, 0x7d, 0x37, 0xb9, 0x00,
+ 0x9f, 0x7a, 0xc4, 0x35, 0x1f, 0x6b, 0x91, 0xe6,
+ 0x30, 0x97, 0xa7, 0x13, 0x11, 0x5d, 0x05, 0xbe
+};
+static const u8 enc_assoc110[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce110[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52
+};
+static const u8 enc_key110[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - edge case intermediate sums in poly1305 */
+static const u8 enc_input111[] __initconst = {
+ 0xff, 0x94, 0x28, 0xd0, 0x79, 0x35, 0x1f, 0x66,
+ 0x5c, 0xd0, 0x01, 0x35, 0x43, 0x19, 0x87, 0x5c,
+ 0x13, 0xf8, 0x0a, 0x00, 0x6d, 0xc1, 0xbb, 0xda,
+ 0xd6, 0x39, 0xa9, 0x2f, 0xc7, 0xec, 0xa6, 0x55,
+ 0xf7, 0x7a, 0xed, 0x35, 0xcb, 0x5a, 0x2f, 0xca,
+ 0xa0, 0x34, 0x6e, 0xfb, 0x93, 0x65, 0x54, 0x64,
+ 0x63, 0x48, 0xb8, 0xfd, 0x29, 0xbf, 0x96, 0xd5,
+ 0x63, 0xa5, 0x17, 0xe2, 0x7d, 0x7b, 0xfc, 0x0f,
+ 0x2f, 0x2d, 0x90, 0x96, 0x52, 0x4f, 0xa1, 0xb2,
+ 0xb0, 0x23, 0xb8, 0xb2, 0x88, 0x22, 0x27, 0x73,
+ 0x2b, 0x6c, 0x89, 0x1d, 0x37, 0xc7, 0xe1, 0x1a,
+ 0x56, 0x41, 0x91, 0x9c, 0x49, 0x4d, 0x95, 0x16
+};
+static const u8 enc_output111[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x94, 0x3a, 0xc0, 0x09, 0x81, 0xd8, 0x9d, 0x2c,
+ 0x14, 0xfe, 0xbf, 0xa5, 0xfb, 0x9c, 0xba, 0x12,
+ 0x97, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x00, 0x41, 0x70, 0x83, 0xa7, 0xaa, 0x8d, 0x13,
+ 0xf2, 0xfb, 0xb5, 0xdf, 0xc2, 0x55, 0xa8, 0x04,
+ 0x97, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x00, 0x41, 0x70, 0x83, 0xa7, 0xaa, 0x8d, 0x13,
+ 0xf2, 0xfb, 0xb5, 0xdf, 0xc2, 0x55, 0xa8, 0x04,
+ 0x9a, 0x18, 0xa8, 0x28, 0x07, 0x02, 0x69, 0xf4,
+ 0x47, 0x00, 0xd0, 0x09, 0xe7, 0x17, 0x1c, 0xc9
+};
+static const u8 enc_assoc111[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce111[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52
+};
+static const u8 enc_key111[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - edge case intermediate sums in poly1305 */
+static const u8 enc_input112[] __initconst = {
+ 0xff, 0x94, 0x28, 0xd0, 0x79, 0x35, 0x1f, 0x66,
+ 0x5c, 0xd0, 0x01, 0x35, 0x43, 0x19, 0x87, 0x5c,
+ 0x82, 0xe5, 0x9b, 0x45, 0x82, 0x91, 0x50, 0x38,
+ 0xf9, 0x33, 0x81, 0x1e, 0x65, 0x2d, 0xc6, 0x6a,
+ 0xfc, 0x7a, 0xed, 0x35, 0xcb, 0x5a, 0x2f, 0xca,
+ 0xa0, 0x34, 0x6e, 0xfb, 0x93, 0x65, 0x54, 0x64,
+ 0xb6, 0x71, 0xc8, 0xca, 0xc2, 0x70, 0xc2, 0x65,
+ 0xa0, 0xac, 0x2f, 0x53, 0x57, 0x99, 0x88, 0x0a,
+ 0x24, 0x2d, 0x90, 0x96, 0x52, 0x4f, 0xa1, 0xb2,
+ 0xb0, 0x23, 0xb8, 0xb2, 0x88, 0x22, 0x27, 0x73,
+ 0xfe, 0x55, 0xf9, 0x2a, 0xdc, 0x08, 0xb5, 0xaa,
+ 0x95, 0x48, 0xa9, 0x2d, 0x63, 0xaf, 0xe1, 0x13
+};
+static const u8 enc_output112[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x05, 0x27, 0x51, 0x4c, 0x6e, 0x88, 0x76, 0xce,
+ 0x3b, 0xf4, 0x97, 0x94, 0x59, 0x5d, 0xda, 0x2d,
+ 0x9c, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xd5, 0x78, 0x00, 0xb4, 0x4c, 0x65, 0xd9, 0xa3,
+ 0x31, 0xf2, 0x8d, 0x6e, 0xe8, 0xb7, 0xdc, 0x01,
+ 0x9c, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xd5, 0x78, 0x00, 0xb4, 0x4c, 0x65, 0xd9, 0xa3,
+ 0x31, 0xf2, 0x8d, 0x6e, 0xe8, 0xb7, 0xdc, 0x01,
+ 0xb4, 0x36, 0xa8, 0x2b, 0x93, 0xd5, 0x55, 0xf7,
+ 0x43, 0x00, 0xd0, 0x19, 0x9b, 0xa7, 0x18, 0xce
+};
+static const u8 enc_assoc112[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce112[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52
+};
+static const u8 enc_key112[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - edge case intermediate sums in poly1305 */
+static const u8 enc_input113[] __initconst = {
+ 0xff, 0x94, 0x28, 0xd0, 0x79, 0x35, 0x1f, 0x66,
+ 0x5c, 0xd0, 0x01, 0x35, 0x43, 0x19, 0x87, 0x5c,
+ 0xf1, 0xd1, 0x28, 0x87, 0xb7, 0x21, 0x69, 0x86,
+ 0xa1, 0x2d, 0x79, 0x09, 0x8b, 0x6d, 0xe6, 0x0f,
+ 0xc0, 0x7a, 0xed, 0x35, 0xcb, 0x5a, 0x2f, 0xca,
+ 0xa0, 0x34, 0x6e, 0xfb, 0x93, 0x65, 0x54, 0x64,
+ 0xa7, 0xc7, 0x58, 0x99, 0xf3, 0xe6, 0x0a, 0xf1,
+ 0xfc, 0xb6, 0xc7, 0x30, 0x7d, 0x87, 0x59, 0x0f,
+ 0x18, 0x2d, 0x90, 0x96, 0x52, 0x4f, 0xa1, 0xb2,
+ 0xb0, 0x23, 0xb8, 0xb2, 0x88, 0x22, 0x27, 0x73,
+ 0xef, 0xe3, 0x69, 0x79, 0xed, 0x9e, 0x7d, 0x3e,
+ 0xc9, 0x52, 0x41, 0x4e, 0x49, 0xb1, 0x30, 0x16
+};
+static const u8 enc_output113[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x76, 0x13, 0xe2, 0x8e, 0x5b, 0x38, 0x4f, 0x70,
+ 0x63, 0xea, 0x6f, 0x83, 0xb7, 0x1d, 0xfa, 0x48,
+ 0xa0, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xc4, 0xce, 0x90, 0xe7, 0x7d, 0xf3, 0x11, 0x37,
+ 0x6d, 0xe8, 0x65, 0x0d, 0xc2, 0xa9, 0x0d, 0x04,
+ 0xa0, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xc4, 0xce, 0x90, 0xe7, 0x7d, 0xf3, 0x11, 0x37,
+ 0x6d, 0xe8, 0x65, 0x0d, 0xc2, 0xa9, 0x0d, 0x04,
+ 0xce, 0x54, 0xa8, 0x2e, 0x1f, 0xa9, 0x42, 0xfa,
+ 0x3f, 0x00, 0xd0, 0x29, 0x4f, 0x37, 0x15, 0xd3
+};
+static const u8 enc_assoc113[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce113[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52
+};
+static const u8 enc_key113[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - edge case intermediate sums in poly1305 */
+static const u8 enc_input114[] __initconst = {
+ 0xcb, 0xf1, 0xda, 0x9e, 0x0b, 0xa9, 0x37, 0x73,
+ 0x74, 0xe6, 0x9e, 0x1c, 0x0e, 0x60, 0x0c, 0xfc,
+ 0x34, 0x3d, 0x35, 0xf6, 0x13, 0xe6, 0xd9, 0x09,
+ 0x3d, 0x38, 0xe9, 0x75, 0xc3, 0x8f, 0xe3, 0xb8,
+ 0xbe, 0x3f, 0xa6, 0x6b, 0x6c, 0xe7, 0x80, 0x8a,
+ 0xa3, 0xe4, 0x59, 0x49, 0xf9, 0x44, 0x64, 0x9f,
+ 0xd0, 0xf6, 0x37, 0x81, 0x71, 0xea, 0xe4, 0x39,
+ 0x6e, 0xa1, 0x5d, 0xc2, 0x40, 0xd1, 0xab, 0xf4,
+ 0x66, 0x68, 0xdb, 0xc8, 0xf5, 0xf2, 0x0e, 0xf2,
+ 0xb3, 0xf3, 0x8f, 0x00, 0xe2, 0x03, 0x17, 0x88
+};
+static const u8 enc_output114[] __initconst = {
+ 0xcb, 0x9a, 0x0d, 0xb1, 0x8d, 0x63, 0xd7, 0xea,
+ 0xd7, 0xc9, 0x60, 0xd6, 0xb2, 0x86, 0x74, 0x5f,
+ 0xb3, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xde, 0xba, 0xb4, 0xa1, 0x58, 0x42, 0x50, 0xbf,
+ 0xfc, 0x2f, 0xc8, 0x4d, 0x95, 0xde, 0xcf, 0x04,
+ 0xb3, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xde, 0xba, 0xb4, 0xa1, 0x58, 0x42, 0x50, 0xbf,
+ 0xfc, 0x2f, 0xc8, 0x4d, 0x95, 0xde, 0xcf, 0x04,
+ 0x23, 0x83, 0xab, 0x0b, 0x79, 0x92, 0x05, 0x69,
+ 0x9b, 0x51, 0x0a, 0xa7, 0x09, 0xbf, 0x31, 0xf1
+};
+static const u8 enc_assoc114[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce114[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52
+};
+static const u8 enc_key114[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - edge case intermediate sums in poly1305 */
+static const u8 enc_input115[] __initconst = {
+ 0x8f, 0x27, 0x86, 0x94, 0xc4, 0xe9, 0xda, 0xeb,
+ 0xd5, 0x8d, 0x3e, 0x5b, 0x96, 0x6e, 0x8b, 0x68,
+ 0x42, 0x3d, 0x35, 0xf6, 0x13, 0xe6, 0xd9, 0x09,
+ 0x3d, 0x38, 0xe9, 0x75, 0xc3, 0x8f, 0xe3, 0xb8,
+ 0x06, 0x53, 0xe7, 0xa3, 0x31, 0x71, 0x88, 0x33,
+ 0xac, 0xc3, 0xb9, 0xad, 0xff, 0x1c, 0x31, 0x98,
+ 0xa6, 0xf6, 0x37, 0x81, 0x71, 0xea, 0xe4, 0x39,
+ 0x6e, 0xa1, 0x5d, 0xc2, 0x40, 0xd1, 0xab, 0xf4,
+ 0xde, 0x04, 0x9a, 0x00, 0xa8, 0x64, 0x06, 0x4b,
+ 0xbc, 0xd4, 0x6f, 0xe4, 0xe4, 0x5b, 0x42, 0x8f
+};
+static const u8 enc_output115[] __initconst = {
+ 0x8f, 0x4c, 0x51, 0xbb, 0x42, 0x23, 0x3a, 0x72,
+ 0x76, 0xa2, 0xc0, 0x91, 0x2a, 0x88, 0xf3, 0xcb,
+ 0xc5, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x66, 0xd6, 0xf5, 0x69, 0x05, 0xd4, 0x58, 0x06,
+ 0xf3, 0x08, 0x28, 0xa9, 0x93, 0x86, 0x9a, 0x03,
+ 0xc5, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x66, 0xd6, 0xf5, 0x69, 0x05, 0xd4, 0x58, 0x06,
+ 0xf3, 0x08, 0x28, 0xa9, 0x93, 0x86, 0x9a, 0x03,
+ 0x8b, 0xfb, 0xab, 0x17, 0xa9, 0xe0, 0xb8, 0x74,
+ 0x8b, 0x51, 0x0a, 0xe7, 0xd9, 0xfd, 0x23, 0x05
+};
+static const u8 enc_assoc115[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce115[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52
+};
+static const u8 enc_key115[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - edge case intermediate sums in poly1305 */
+static const u8 enc_input116[] __initconst = {
+ 0xd5, 0x94, 0x28, 0xd0, 0x79, 0x35, 0x1f, 0x66,
+ 0x5c, 0xd0, 0x01, 0x35, 0x43, 0x19, 0x87, 0x5c,
+ 0x9a, 0x22, 0xd7, 0x0a, 0x48, 0xe2, 0x4f, 0xdd,
+ 0xcd, 0xd4, 0x41, 0x9d, 0xe6, 0x4c, 0x8f, 0x44,
+ 0xfc, 0x7a, 0xed, 0x35, 0xcb, 0x5a, 0x2f, 0xca,
+ 0xa0, 0x34, 0x6e, 0xfb, 0x93, 0x65, 0x54, 0x64,
+ 0x77, 0xb5, 0xc9, 0x07, 0xd9, 0xc9, 0xe1, 0xea,
+ 0x51, 0x85, 0x1a, 0x20, 0x4a, 0xad, 0x9f, 0x0a,
+ 0x24, 0x2d, 0x90, 0x96, 0x52, 0x4f, 0xa1, 0xb2,
+ 0xb0, 0x23, 0xb8, 0xb2, 0x88, 0x22, 0x27, 0x73,
+ 0x3f, 0x91, 0xf8, 0xe7, 0xc7, 0xb1, 0x96, 0x25,
+ 0x64, 0x61, 0x9c, 0x5e, 0x7e, 0x9b, 0xf6, 0x13
+};
+static const u8 enc_output116[] __initconst = {
+ 0xd5, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x1d, 0xe0, 0x1d, 0x03, 0xa4, 0xfb, 0x69, 0x2b,
+ 0x0f, 0x13, 0x57, 0x17, 0xda, 0x3c, 0x93, 0x03,
+ 0x9c, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x14, 0xbc, 0x01, 0x79, 0x57, 0xdc, 0xfa, 0x2c,
+ 0xc0, 0xdb, 0xb8, 0x1d, 0xf5, 0x83, 0xcb, 0x01,
+ 0x9c, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x14, 0xbc, 0x01, 0x79, 0x57, 0xdc, 0xfa, 0x2c,
+ 0xc0, 0xdb, 0xb8, 0x1d, 0xf5, 0x83, 0xcb, 0x01,
+ 0x49, 0xbc, 0x6e, 0x9f, 0xc5, 0x1c, 0x4d, 0x50,
+ 0x30, 0x36, 0x64, 0x4d, 0x84, 0x27, 0x73, 0xd2
+};
+static const u8 enc_assoc116[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce116[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52
+};
+static const u8 enc_key116[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - edge case intermediate sums in poly1305 */
+static const u8 enc_input117[] __initconst = {
+ 0xdb, 0x94, 0x28, 0xd0, 0x79, 0x35, 0x1f, 0x66,
+ 0x5c, 0xd0, 0x01, 0x35, 0x43, 0x19, 0x87, 0x5c,
+ 0x75, 0xd5, 0x64, 0x3a, 0xa5, 0xaf, 0x93, 0x4d,
+ 0x8c, 0xce, 0x39, 0x2c, 0xc3, 0xee, 0xdb, 0x47,
+ 0xc0, 0x7a, 0xed, 0x35, 0xcb, 0x5a, 0x2f, 0xca,
+ 0xa0, 0x34, 0x6e, 0xfb, 0x93, 0x65, 0x54, 0x64,
+ 0x60, 0x1b, 0x5a, 0xd2, 0x06, 0x7f, 0x28, 0x06,
+ 0x6a, 0x8f, 0x32, 0x81, 0x71, 0x5b, 0xa8, 0x08,
+ 0x18, 0x2d, 0x90, 0x96, 0x52, 0x4f, 0xa1, 0xb2,
+ 0xb0, 0x23, 0xb8, 0xb2, 0x88, 0x22, 0x27, 0x73,
+ 0x28, 0x3f, 0x6b, 0x32, 0x18, 0x07, 0x5f, 0xc9,
+ 0x5f, 0x6b, 0xb4, 0xff, 0x45, 0x6d, 0xc1, 0x11
+};
+static const u8 enc_output117[] __initconst = {
+ 0xdb, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xf2, 0x17, 0xae, 0x33, 0x49, 0xb6, 0xb5, 0xbb,
+ 0x4e, 0x09, 0x2f, 0xa6, 0xff, 0x9e, 0xc7, 0x00,
+ 0xa0, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x03, 0x12, 0x92, 0xac, 0x88, 0x6a, 0x33, 0xc0,
+ 0xfb, 0xd1, 0x90, 0xbc, 0xce, 0x75, 0xfc, 0x03,
+ 0xa0, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0x03, 0x12, 0x92, 0xac, 0x88, 0x6a, 0x33, 0xc0,
+ 0xfb, 0xd1, 0x90, 0xbc, 0xce, 0x75, 0xfc, 0x03,
+ 0x63, 0xda, 0x6e, 0xa2, 0x51, 0xf0, 0x39, 0x53,
+ 0x2c, 0x36, 0x64, 0x5d, 0x38, 0xb7, 0x6f, 0xd7
+};
+static const u8 enc_assoc117[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce117[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52
+};
+static const u8 enc_key117[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
+/* wycheproof - edge case intermediate sums in poly1305 */
+static const u8 enc_input118[] __initconst = {
+ 0x93, 0x94, 0x28, 0xd0, 0x79, 0x35, 0x1f, 0x66,
+ 0x5c, 0xd0, 0x01, 0x35, 0x43, 0x19, 0x87, 0x5c,
+ 0x62, 0x48, 0x39, 0x60, 0x42, 0x16, 0xe4, 0x03,
+ 0xeb, 0xcc, 0x6a, 0xf5, 0x59, 0xec, 0x8b, 0x43,
+ 0x97, 0x7a, 0xed, 0x35, 0xcb, 0x5a, 0x2f, 0xca,
+ 0xa0, 0x34, 0x6e, 0xfb, 0x93, 0x65, 0x54, 0x64,
+ 0xd8, 0xc8, 0xc3, 0xfa, 0x1a, 0x9e, 0x47, 0x4a,
+ 0xbe, 0x52, 0xd0, 0x2c, 0x81, 0x87, 0xe9, 0x0f,
+ 0x4f, 0x2d, 0x90, 0x96, 0x52, 0x4f, 0xa1, 0xb2,
+ 0xb0, 0x23, 0xb8, 0xb2, 0x88, 0x22, 0x27, 0x73,
+ 0x90, 0xec, 0xf2, 0x1a, 0x04, 0xe6, 0x30, 0x85,
+ 0x8b, 0xb6, 0x56, 0x52, 0xb5, 0xb1, 0x80, 0x16
+};
+static const u8 enc_output118[] __initconst = {
+ 0x93, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xe5, 0x8a, 0xf3, 0x69, 0xae, 0x0f, 0xc2, 0xf5,
+ 0x29, 0x0b, 0x7c, 0x7f, 0x65, 0x9c, 0x97, 0x04,
+ 0xf7, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xbb, 0xc1, 0x0b, 0x84, 0x94, 0x8b, 0x5c, 0x8c,
+ 0x2f, 0x0c, 0x72, 0x11, 0x3e, 0xa9, 0xbd, 0x04,
+ 0xf7, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xbb, 0xc1, 0x0b, 0x84, 0x94, 0x8b, 0x5c, 0x8c,
+ 0x2f, 0x0c, 0x72, 0x11, 0x3e, 0xa9, 0xbd, 0x04,
+ 0x73, 0xeb, 0x27, 0x24, 0xb5, 0xc4, 0x05, 0xf0,
+ 0x4d, 0x00, 0xd0, 0xf1, 0x58, 0x40, 0xa1, 0xc1
+};
+static const u8 enc_assoc118[] __initconst = {
+ 0xff, 0xff, 0xff, 0xff
+};
+static const u8 enc_nonce118[] __initconst = {
+ 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52
+};
+static const u8 enc_key118[] __initconst = {
+ 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
+ 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
+ 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
+ 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f
+};
+
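+/*
+ * Each entry gives the input, expected output, associated data, nonce and
+ * key arrays, followed by the sizes of the input, associated data and nonce.
+ */
+static const struct chacha20poly1305_testvec
+chacha20poly1305_enc_vectors[] __initconst = {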
+ { enc_input001, enc_output001, enc_assoc001, enc_nonce001, enc_key001,
+ sizeof(enc_input001), sizeof(enc_assoc001), sizeof(enc_nonce001) },
+ { enc_input002, enc_output002, enc_assoc002, enc_nonce002, enc_key002,
+ sizeof(enc_input002), sizeof(enc_assoc002), sizeof(enc_nonce002) },
+ { enc_input003, enc_output003, enc_assoc003, enc_nonce003, enc_key003,
+ sizeof(enc_input003), sizeof(enc_assoc003), sizeof(enc_nonce003) },
+ { enc_input004, enc_output004, enc_assoc004, enc_nonce004, enc_key004,
+ sizeof(enc_input004), sizeof(enc_assoc004), sizeof(enc_nonce004) },
+ { enc_input005, enc_output005, enc_assoc005, enc_nonce005, enc_key005,
+ sizeof(enc_input005), sizeof(enc_assoc005), sizeof(enc_nonce005) },
+ { enc_input006, enc_output006, enc_assoc006, enc_nonce006, enc_key006,
+ sizeof(enc_input006), sizeof(enc_assoc006), sizeof(enc_nonce006) },
+ { enc_input007, enc_output007, enc_assoc007, enc_nonce007, enc_key007,
+ sizeof(enc_input007), sizeof(enc_assoc007), sizeof(enc_nonce007) },
+ { enc_input008, enc_output008, enc_assoc008, enc_nonce008, enc_key008,
+ sizeof(enc_input008), sizeof(enc_assoc008), sizeof(enc_nonce008) },
+ { enc_input009, enc_output009, enc_assoc009, enc_nonce009, enc_key009,
+ sizeof(enc_input009), sizeof(enc_assoc009), sizeof(enc_nonce009) },
+ { enc_input010, enc_output010, enc_assoc010, enc_nonce010, enc_key010,
+ sizeof(enc_input010), sizeof(enc_assoc010), sizeof(enc_nonce010) },
+ { enc_input011, enc_output011, enc_assoc011, enc_nonce011, enc_key011,
+ sizeof(enc_input011), sizeof(enc_assoc011), sizeof(enc_nonce011) },
+ { enc_input012, enc_output012, enc_assoc012, enc_nonce012, enc_key012,
+ sizeof(enc_input012), sizeof(enc_assoc012), sizeof(enc_nonce012) },
+ { enc_input013, enc_output013, enc_assoc013, enc_nonce013, enc_key013,
+ sizeof(enc_input013), sizeof(enc_assoc013), sizeof(enc_nonce013) },
+ { enc_input014, enc_output014, enc_assoc014, enc_nonce014, enc_key014,
+ sizeof(enc_input014), sizeof(enc_assoc014), sizeof(enc_nonce014) },
+ { enc_input015, enc_output015, enc_assoc015, enc_nonce015, enc_key015,
+ sizeof(enc_input015), sizeof(enc_assoc015), sizeof(enc_nonce015) },
+ { enc_input016, enc_output016, enc_assoc016, enc_nonce016, enc_key016,
+ sizeof(enc_input016), sizeof(enc_assoc016), sizeof(enc_nonce016) },
+ { enc_input017, enc_output017, enc_assoc017, enc_nonce017, enc_key017,
+ sizeof(enc_input017), sizeof(enc_assoc017), sizeof(enc_nonce017) },
+ { enc_input018, enc_output018, enc_assoc018, enc_nonce018, enc_key018,
+ sizeof(enc_input018), sizeof(enc_assoc018), sizeof(enc_nonce018) },
+ { enc_input019, enc_output019, enc_assoc019, enc_nonce019, enc_key019,
+ sizeof(enc_input019), sizeof(enc_assoc019), sizeof(enc_nonce019) },
+ { enc_input020, enc_output020, enc_assoc020, enc_nonce020, enc_key020,
+ sizeof(enc_input020), sizeof(enc_assoc020), sizeof(enc_nonce020) },
+ { enc_input021, enc_output021, enc_assoc021, enc_nonce021, enc_key021,
+ sizeof(enc_input021), sizeof(enc_assoc021), sizeof(enc_nonce021) },
+ { enc_input022, enc_output022, enc_assoc022, enc_nonce022, enc_key022,
+ sizeof(enc_input022), sizeof(enc_assoc022), sizeof(enc_nonce022) },
+ { enc_input023, enc_output023, enc_assoc023, enc_nonce023, enc_key023,
+ sizeof(enc_input023), sizeof(enc_assoc023), sizeof(enc_nonce023) },
+ { enc_input024, enc_output024, enc_assoc024, enc_nonce024, enc_key024,
+ sizeof(enc_input024), sizeof(enc_assoc024), sizeof(enc_nonce024) },
+ { enc_input025, enc_output025, enc_assoc025, enc_nonce025, enc_key025,
+ sizeof(enc_input025), sizeof(enc_assoc025), sizeof(enc_nonce025) },
+ { enc_input026, enc_output026, enc_assoc026, enc_nonce026, enc_key026,
+ sizeof(enc_input026), sizeof(enc_assoc026), sizeof(enc_nonce026) },
+ { enc_input027, enc_output027, enc_assoc027, enc_nonce027, enc_key027,
+ sizeof(enc_input027), sizeof(enc_assoc027), sizeof(enc_nonce027) },
+ { enc_input028, enc_output028, enc_assoc028, enc_nonce028, enc_key028,
+ sizeof(enc_input028), sizeof(enc_assoc028), sizeof(enc_nonce028) },
+ { enc_input029, enc_output029, enc_assoc029, enc_nonce029, enc_key029,
+ sizeof(enc_input029), sizeof(enc_assoc029), sizeof(enc_nonce029) },
+ { enc_input030, enc_output030, enc_assoc030, enc_nonce030, enc_key030,
+ sizeof(enc_input030), sizeof(enc_assoc030), sizeof(enc_nonce030) },
+ { enc_input031, enc_output031, enc_assoc031, enc_nonce031, enc_key031,
+ sizeof(enc_input031), sizeof(enc_assoc031), sizeof(enc_nonce031) },
+ { enc_input032, enc_output032, enc_assoc032, enc_nonce032, enc_key032,
+ sizeof(enc_input032), sizeof(enc_assoc032), sizeof(enc_nonce032) },
+ { enc_input033, enc_output033, enc_assoc033, enc_nonce033, enc_key033,
+ sizeof(enc_input033), sizeof(enc_assoc033), sizeof(enc_nonce033) },
+ { enc_input034, enc_output034, enc_assoc034, enc_nonce034, enc_key034,
+ sizeof(enc_input034), sizeof(enc_assoc034), sizeof(enc_nonce034) },
+ { enc_input035, enc_output035, enc_assoc035, enc_nonce035, enc_key035,
+ sizeof(enc_input035), sizeof(enc_assoc035), sizeof(enc_nonce035) },
+ { enc_input036, enc_output036, enc_assoc036, enc_nonce036, enc_key036,
+ sizeof(enc_input036), sizeof(enc_assoc036), sizeof(enc_nonce036) },
+ { enc_input037, enc_output037, enc_assoc037, enc_nonce037, enc_key037,
+ sizeof(enc_input037), sizeof(enc_assoc037), sizeof(enc_nonce037) },
+ { enc_input038, enc_output038, enc_assoc038, enc_nonce038, enc_key038,
+ sizeof(enc_input038), sizeof(enc_assoc038), sizeof(enc_nonce038) },
+ { enc_input039, enc_output039, enc_assoc039, enc_nonce039, enc_key039,
+ sizeof(enc_input039), sizeof(enc_assoc039), sizeof(enc_nonce039) },
+ { enc_input040, enc_output040, enc_assoc040, enc_nonce040, enc_key040,
+ sizeof(enc_input040), sizeof(enc_assoc040), sizeof(enc_nonce040) },
+ { enc_input041, enc_output041, enc_assoc041, enc_nonce041, enc_key041,
+ sizeof(enc_input041), sizeof(enc_assoc041), sizeof(enc_nonce041) },
+ { enc_input042, enc_output042, enc_assoc042, enc_nonce042, enc_key042,
+ sizeof(enc_input042), sizeof(enc_assoc042), sizeof(enc_nonce042) },
+ { enc_input043, enc_output043, enc_assoc043, enc_nonce043, enc_key043,
+ sizeof(enc_input043), sizeof(enc_assoc043), sizeof(enc_nonce043) },
+ { enc_input044, enc_output044, enc_assoc044, enc_nonce044, enc_key044,
+ sizeof(enc_input044), sizeof(enc_assoc044), sizeof(enc_nonce044) },
+ { enc_input045, enc_output045, enc_assoc045, enc_nonce045, enc_key045,
+ sizeof(enc_input045), sizeof(enc_assoc045), sizeof(enc_nonce045) },
+ { enc_input046, enc_output046, enc_assoc046, enc_nonce046, enc_key046,
+ sizeof(enc_input046), sizeof(enc_assoc046), sizeof(enc_nonce046) },
+ { enc_input047, enc_output047, enc_assoc047, enc_nonce047, enc_key047,
+ sizeof(enc_input047), sizeof(enc_assoc047), sizeof(enc_nonce047) },
+ { enc_input048, enc_output048, enc_assoc048, enc_nonce048, enc_key048,
+ sizeof(enc_input048), sizeof(enc_assoc048), sizeof(enc_nonce048) },
+ { enc_input049, enc_output049, enc_assoc049, enc_nonce049, enc_key049,
+ sizeof(enc_input049), sizeof(enc_assoc049), sizeof(enc_nonce049) },
+ { enc_input050, enc_output050, enc_assoc050, enc_nonce050, enc_key050,
+ sizeof(enc_input050), sizeof(enc_assoc050), sizeof(enc_nonce050) },
+ { enc_input051, enc_output051, enc_assoc051, enc_nonce051, enc_key051,
+ sizeof(enc_input051), sizeof(enc_assoc051), sizeof(enc_nonce051) },
+ { enc_input052, enc_output052, enc_assoc052, enc_nonce052, enc_key052,
+ sizeof(enc_input052), sizeof(enc_assoc052), sizeof(enc_nonce052) },
+ { enc_input053, enc_output053, enc_assoc053, enc_nonce053, enc_key053,
+ sizeof(enc_input053), sizeof(enc_assoc053), sizeof(enc_nonce053) },
+ { enc_input054, enc_output054, enc_assoc054, enc_nonce054, enc_key054,
+ sizeof(enc_input054), sizeof(enc_assoc054), sizeof(enc_nonce054) },
+ { enc_input055, enc_output055, enc_assoc055, enc_nonce055, enc_key055,
+ sizeof(enc_input055), sizeof(enc_assoc055), sizeof(enc_nonce055) },
+ { enc_input056, enc_output056, enc_assoc056, enc_nonce056, enc_key056,
+ sizeof(enc_input056), sizeof(enc_assoc056), sizeof(enc_nonce056) },
+ { enc_input057, enc_output057, enc_assoc057, enc_nonce057, enc_key057,
+ sizeof(enc_input057), sizeof(enc_assoc057), sizeof(enc_nonce057) },
+ { enc_input058, enc_output058, enc_assoc058, enc_nonce058, enc_key058,
+ sizeof(enc_input058), sizeof(enc_assoc058), sizeof(enc_nonce058) },
+ { enc_input059, enc_output059, enc_assoc059, enc_nonce059, enc_key059,
+ sizeof(enc_input059), sizeof(enc_assoc059), sizeof(enc_nonce059) },
+ { enc_input060, enc_output060, enc_assoc060, enc_nonce060, enc_key060,
+ sizeof(enc_input060), sizeof(enc_assoc060), sizeof(enc_nonce060) },
+ { enc_input061, enc_output061, enc_assoc061, enc_nonce061, enc_key061,
+ sizeof(enc_input061), sizeof(enc_assoc061), sizeof(enc_nonce061) },
+ { enc_input062, enc_output062, enc_assoc062, enc_nonce062, enc_key062,
+ sizeof(enc_input062), sizeof(enc_assoc062), sizeof(enc_nonce062) },
+ { enc_input063, enc_output063, enc_assoc063, enc_nonce063, enc_key063,
+ sizeof(enc_input063), sizeof(enc_assoc063), sizeof(enc_nonce063) },
+ { enc_input064, enc_output064, enc_assoc064, enc_nonce064, enc_key064,
+ sizeof(enc_input064), sizeof(enc_assoc064), sizeof(enc_nonce064) },
+ { enc_input065, enc_output065, enc_assoc065, enc_nonce065, enc_key065,
+ sizeof(enc_input065), sizeof(enc_assoc065), sizeof(enc_nonce065) },
+ { enc_input066, enc_output066, enc_assoc066, enc_nonce066, enc_key066,
+ sizeof(enc_input066), sizeof(enc_assoc066), sizeof(enc_nonce066) },
+ { enc_input067, enc_output067, enc_assoc067, enc_nonce067, enc_key067,
+ sizeof(enc_input067), sizeof(enc_assoc067), sizeof(enc_nonce067) },
+ { enc_input068, enc_output068, enc_assoc068, enc_nonce068, enc_key068,
+ sizeof(enc_input068), sizeof(enc_assoc068), sizeof(enc_nonce068) },
+ { enc_input069, enc_output069, enc_assoc069, enc_nonce069, enc_key069,
+ sizeof(enc_input069), sizeof(enc_assoc069), sizeof(enc_nonce069) },
+ { enc_input070, enc_output070, enc_assoc070, enc_nonce070, enc_key070,
+ sizeof(enc_input070), sizeof(enc_assoc070), sizeof(enc_nonce070) },
+ { enc_input071, enc_output071, enc_assoc071, enc_nonce071, enc_key071,
+ sizeof(enc_input071), sizeof(enc_assoc071), sizeof(enc_nonce071) },
+ { enc_input072, enc_output072, enc_assoc072, enc_nonce072, enc_key072,
+ sizeof(enc_input072), sizeof(enc_assoc072), sizeof(enc_nonce072) },
+ { enc_input073, enc_output073, enc_assoc073, enc_nonce073, enc_key073,
+ sizeof(enc_input073), sizeof(enc_assoc073), sizeof(enc_nonce073) },
+ { enc_input074, enc_output074, enc_assoc074, enc_nonce074, enc_key074,
+ sizeof(enc_input074), sizeof(enc_assoc074), sizeof(enc_nonce074) },
+ { enc_input075, enc_output075, enc_assoc075, enc_nonce075, enc_key075,
+ sizeof(enc_input075), sizeof(enc_assoc075), sizeof(enc_nonce075) },
+ { enc_input076, enc_output076, enc_assoc076, enc_nonce076, enc_key076,
+ sizeof(enc_input076), sizeof(enc_assoc076), sizeof(enc_nonce076) },
+ { enc_input077, enc_output077, enc_assoc077, enc_nonce077, enc_key077,
+ sizeof(enc_input077), sizeof(enc_assoc077), sizeof(enc_nonce077) },
+ { enc_input078, enc_output078, enc_assoc078, enc_nonce078, enc_key078,
+ sizeof(enc_input078), sizeof(enc_assoc078), sizeof(enc_nonce078) },
+ { enc_input079, enc_output079, enc_assoc079, enc_nonce079, enc_key079,
+ sizeof(enc_input079), sizeof(enc_assoc079), sizeof(enc_nonce079) },
+ { enc_input080, enc_output080, enc_assoc080, enc_nonce080, enc_key080,
+ sizeof(enc_input080), sizeof(enc_assoc080), sizeof(enc_nonce080) },
+ { enc_input081, enc_output081, enc_assoc081, enc_nonce081, enc_key081,
+ sizeof(enc_input081), sizeof(enc_assoc081), sizeof(enc_nonce081) },
+ { enc_input082, enc_output082, enc_assoc082, enc_nonce082, enc_key082,
+ sizeof(enc_input082), sizeof(enc_assoc082), sizeof(enc_nonce082) },
+ { enc_input083, enc_output083, enc_assoc083, enc_nonce083, enc_key083,
+ sizeof(enc_input083), sizeof(enc_assoc083), sizeof(enc_nonce083) },
+ { enc_input084, enc_output084, enc_assoc084, enc_nonce084, enc_key084,
+ sizeof(enc_input084), sizeof(enc_assoc084), sizeof(enc_nonce084) },
+ { enc_input085, enc_output085, enc_assoc085, enc_nonce085, enc_key085,
+ sizeof(enc_input085), sizeof(enc_assoc085), sizeof(enc_nonce085) },
+ { enc_input086, enc_output086, enc_assoc086, enc_nonce086, enc_key086,
+ sizeof(enc_input086), sizeof(enc_assoc086), sizeof(enc_nonce086) },
+ { enc_input087, enc_output087, enc_assoc087, enc_nonce087, enc_key087,
+ sizeof(enc_input087), sizeof(enc_assoc087), sizeof(enc_nonce087) },
+ { enc_input088, enc_output088, enc_assoc088, enc_nonce088, enc_key088,
+ sizeof(enc_input088), sizeof(enc_assoc088), sizeof(enc_nonce088) },
+ { enc_input089, enc_output089, enc_assoc089, enc_nonce089, enc_key089,
+ sizeof(enc_input089), sizeof(enc_assoc089), sizeof(enc_nonce089) },
+ { enc_input090, enc_output090, enc_assoc090, enc_nonce090, enc_key090,
+ sizeof(enc_input090), sizeof(enc_assoc090), sizeof(enc_nonce090) },
+ { enc_input091, enc_output091, enc_assoc091, enc_nonce091, enc_key091,
+ sizeof(enc_input091), sizeof(enc_assoc091), sizeof(enc_nonce091) },
+ { enc_input092, enc_output092, enc_assoc092, enc_nonce092, enc_key092,
+ sizeof(enc_input092), sizeof(enc_assoc092), sizeof(enc_nonce092) },
+ { enc_input093, enc_output093, enc_assoc093, enc_nonce093, enc_key093,
+ sizeof(enc_input093), sizeof(enc_assoc093), sizeof(enc_nonce093) },
+ { enc_input094, enc_output094, enc_assoc094, enc_nonce094, enc_key094,
+ sizeof(enc_input094), sizeof(enc_assoc094), sizeof(enc_nonce094) },
+ { enc_input095, enc_output095, enc_assoc095, enc_nonce095, enc_key095,
+ sizeof(enc_input095), sizeof(enc_assoc095), sizeof(enc_nonce095) },
+ { enc_input096, enc_output096, enc_assoc096, enc_nonce096, enc_key096,
+ sizeof(enc_input096), sizeof(enc_assoc096), sizeof(enc_nonce096) },
+ { enc_input097, enc_output097, enc_assoc097, enc_nonce097, enc_key097,
+ sizeof(enc_input097), sizeof(enc_assoc097), sizeof(enc_nonce097) },
+ { enc_input098, enc_output098, enc_assoc098, enc_nonce098, enc_key098,
+ sizeof(enc_input098), sizeof(enc_assoc098), sizeof(enc_nonce098) },
+ { enc_input099, enc_output099, enc_assoc099, enc_nonce099, enc_key099,
+ sizeof(enc_input099), sizeof(enc_assoc099), sizeof(enc_nonce099) },
+ { enc_input100, enc_output100, enc_assoc100, enc_nonce100, enc_key100,
+ sizeof(enc_input100), sizeof(enc_assoc100), sizeof(enc_nonce100) },
+ { enc_input101, enc_output101, enc_assoc101, enc_nonce101, enc_key101,
+ sizeof(enc_input101), sizeof(enc_assoc101), sizeof(enc_nonce101) },
+ { enc_input102, enc_output102, enc_assoc102, enc_nonce102, enc_key102,
+ sizeof(enc_input102), sizeof(enc_assoc102), sizeof(enc_nonce102) },
+ { enc_input103, enc_output103, enc_assoc103, enc_nonce103, enc_key103,
+ sizeof(enc_input103), sizeof(enc_assoc103), sizeof(enc_nonce103) },
+ { enc_input104, enc_output104, enc_assoc104, enc_nonce104, enc_key104,
+ sizeof(enc_input104), sizeof(enc_assoc104), sizeof(enc_nonce104) },
+ { enc_input105, enc_output105, enc_assoc105, enc_nonce105, enc_key105,
+ sizeof(enc_input105), sizeof(enc_assoc105), sizeof(enc_nonce105) },
+ { enc_input106, enc_output106, enc_assoc106, enc_nonce106, enc_key106,
+ sizeof(enc_input106), sizeof(enc_assoc106), sizeof(enc_nonce106) },
+ { enc_input107, enc_output107, enc_assoc107, enc_nonce107, enc_key107,
+ sizeof(enc_input107), sizeof(enc_assoc107), sizeof(enc_nonce107) },
+ { enc_input108, enc_output108, enc_assoc108, enc_nonce108, enc_key108,
+ sizeof(enc_input108), sizeof(enc_assoc108), sizeof(enc_nonce108) },
+ { enc_input109, enc_output109, enc_assoc109, enc_nonce109, enc_key109,
+ sizeof(enc_input109), sizeof(enc_assoc109), sizeof(enc_nonce109) },
+ { enc_input110, enc_output110, enc_assoc110, enc_nonce110, enc_key110,
+ sizeof(enc_input110), sizeof(enc_assoc110), sizeof(enc_nonce110) },
+ { enc_input111, enc_output111, enc_assoc111, enc_nonce111, enc_key111,
+ sizeof(enc_input111), sizeof(enc_assoc111), sizeof(enc_nonce111) },
+ { enc_input112, enc_output112, enc_assoc112, enc_nonce112, enc_key112,
+ sizeof(enc_input112), sizeof(enc_assoc112), sizeof(enc_nonce112) },
+ { enc_input113, enc_output113, enc_assoc113, enc_nonce113, enc_key113,
+ sizeof(enc_input113), sizeof(enc_assoc113), sizeof(enc_nonce113) },
+ { enc_input114, enc_output114, enc_assoc114, enc_nonce114, enc_key114,
+ sizeof(enc_input114), sizeof(enc_assoc114), sizeof(enc_nonce114) },
+ { enc_input115, enc_output115, enc_assoc115, enc_nonce115, enc_key115,
+ sizeof(enc_input115), sizeof(enc_assoc115), sizeof(enc_nonce115) },
+ { enc_input116, enc_output116, enc_assoc116, enc_nonce116, enc_key116,
+ sizeof(enc_input116), sizeof(enc_assoc116), sizeof(enc_nonce116) },
+ { enc_input117, enc_output117, enc_assoc117, enc_nonce117, enc_key117,
+ sizeof(enc_input117), sizeof(enc_assoc117), sizeof(enc_nonce117) },
+ { enc_input118, enc_output118, enc_assoc118, enc_nonce118, enc_key118,
+ sizeof(enc_input118), sizeof(enc_assoc118), sizeof(enc_nonce118) }
+};
+
+static const u8 dec_input001[] __initconst = {
+ 0x64, 0xa0, 0x86, 0x15, 0x75, 0x86, 0x1a, 0xf4,
+ 0x60, 0xf0, 0x62, 0xc7, 0x9b, 0xe6, 0x43, 0xbd,
+ 0x5e, 0x80, 0x5c, 0xfd, 0x34, 0x5c, 0xf3, 0x89,
+ 0xf1, 0x08, 0x67, 0x0a, 0xc7, 0x6c, 0x8c, 0xb2,
+ 0x4c, 0x6c, 0xfc, 0x18, 0x75, 0x5d, 0x43, 0xee,
+ 0xa0, 0x9e, 0xe9, 0x4e, 0x38, 0x2d, 0x26, 0xb0,
+ 0xbd, 0xb7, 0xb7, 0x3c, 0x32, 0x1b, 0x01, 0x00,
+ 0xd4, 0xf0, 0x3b, 0x7f, 0x35, 0x58, 0x94, 0xcf,
+ 0x33, 0x2f, 0x83, 0x0e, 0x71, 0x0b, 0x97, 0xce,
+ 0x98, 0xc8, 0xa8, 0x4a, 0xbd, 0x0b, 0x94, 0x81,
+ 0x14, 0xad, 0x17, 0x6e, 0x00, 0x8d, 0x33, 0xbd,
+ 0x60, 0xf9, 0x82, 0xb1, 0xff, 0x37, 0xc8, 0x55,
+ 0x97, 0x97, 0xa0, 0x6e, 0xf4, 0xf0, 0xef, 0x61,
+ 0xc1, 0x86, 0x32, 0x4e, 0x2b, 0x35, 0x06, 0x38,
+ 0x36, 0x06, 0x90, 0x7b, 0x6a, 0x7c, 0x02, 0xb0,
+ 0xf9, 0xf6, 0x15, 0x7b, 0x53, 0xc8, 0x67, 0xe4,
+ 0xb9, 0x16, 0x6c, 0x76, 0x7b, 0x80, 0x4d, 0x46,
+ 0xa5, 0x9b, 0x52, 0x16, 0xcd, 0xe7, 0xa4, 0xe9,
+ 0x90, 0x40, 0xc5, 0xa4, 0x04, 0x33, 0x22, 0x5e,
+ 0xe2, 0x82, 0xa1, 0xb0, 0xa0, 0x6c, 0x52, 0x3e,
+ 0xaf, 0x45, 0x34, 0xd7, 0xf8, 0x3f, 0xa1, 0x15,
+ 0x5b, 0x00, 0x47, 0x71, 0x8c, 0xbc, 0x54, 0x6a,
+ 0x0d, 0x07, 0x2b, 0x04, 0xb3, 0x56, 0x4e, 0xea,
+ 0x1b, 0x42, 0x22, 0x73, 0xf5, 0x48, 0x27, 0x1a,
+ 0x0b, 0xb2, 0x31, 0x60, 0x53, 0xfa, 0x76, 0x99,
+ 0x19, 0x55, 0xeb, 0xd6, 0x31, 0x59, 0x43, 0x4e,
+ 0xce, 0xbb, 0x4e, 0x46, 0x6d, 0xae, 0x5a, 0x10,
+ 0x73, 0xa6, 0x72, 0x76, 0x27, 0x09, 0x7a, 0x10,
+ 0x49, 0xe6, 0x17, 0xd9, 0x1d, 0x36, 0x10, 0x94,
+ 0xfa, 0x68, 0xf0, 0xff, 0x77, 0x98, 0x71, 0x30,
+ 0x30, 0x5b, 0xea, 0xba, 0x2e, 0xda, 0x04, 0xdf,
+ 0x99, 0x7b, 0x71, 0x4d, 0x6c, 0x6f, 0x2c, 0x29,
+ 0xa6, 0xad, 0x5c, 0xb4, 0x02, 0x2b, 0x02, 0x70,
+ 0x9b, 0xee, 0xad, 0x9d, 0x67, 0x89, 0x0c, 0xbb,
+ 0x22, 0x39, 0x23, 0x36, 0xfe, 0xa1, 0x85, 0x1f,
+ 0x38
+};
+static const u8 dec_output001[] __initconst = {
+ 0x49, 0x6e, 0x74, 0x65, 0x72, 0x6e, 0x65, 0x74,
+ 0x2d, 0x44, 0x72, 0x61, 0x66, 0x74, 0x73, 0x20,
+ 0x61, 0x72, 0x65, 0x20, 0x64, 0x72, 0x61, 0x66,
+ 0x74, 0x20, 0x64, 0x6f, 0x63, 0x75, 0x6d, 0x65,
+ 0x6e, 0x74, 0x73, 0x20, 0x76, 0x61, 0x6c, 0x69,
+ 0x64, 0x20, 0x66, 0x6f, 0x72, 0x20, 0x61, 0x20,
+ 0x6d, 0x61, 0x78, 0x69, 0x6d, 0x75, 0x6d, 0x20,
+ 0x6f, 0x66, 0x20, 0x73, 0x69, 0x78, 0x20, 0x6d,
+ 0x6f, 0x6e, 0x74, 0x68, 0x73, 0x20, 0x61, 0x6e,
+ 0x64, 0x20, 0x6d, 0x61, 0x79, 0x20, 0x62, 0x65,
+ 0x20, 0x75, 0x70, 0x64, 0x61, 0x74, 0x65, 0x64,
+ 0x2c, 0x20, 0x72, 0x65, 0x70, 0x6c, 0x61, 0x63,
+ 0x65, 0x64, 0x2c, 0x20, 0x6f, 0x72, 0x20, 0x6f,
+ 0x62, 0x73, 0x6f, 0x6c, 0x65, 0x74, 0x65, 0x64,
+ 0x20, 0x62, 0x79, 0x20, 0x6f, 0x74, 0x68, 0x65,
+ 0x72, 0x20, 0x64, 0x6f, 0x63, 0x75, 0x6d, 0x65,
+ 0x6e, 0x74, 0x73, 0x20, 0x61, 0x74, 0x20, 0x61,
+ 0x6e, 0x79, 0x20, 0x74, 0x69, 0x6d, 0x65, 0x2e,
+ 0x20, 0x49, 0x74, 0x20, 0x69, 0x73, 0x20, 0x69,
+ 0x6e, 0x61, 0x70, 0x70, 0x72, 0x6f, 0x70, 0x72,
+ 0x69, 0x61, 0x74, 0x65, 0x20, 0x74, 0x6f, 0x20,
+ 0x75, 0x73, 0x65, 0x20, 0x49, 0x6e, 0x74, 0x65,
+ 0x72, 0x6e, 0x65, 0x74, 0x2d, 0x44, 0x72, 0x61,
+ 0x66, 0x74, 0x73, 0x20, 0x61, 0x73, 0x20, 0x72,
+ 0x65, 0x66, 0x65, 0x72, 0x65, 0x6e, 0x63, 0x65,
+ 0x20, 0x6d, 0x61, 0x74, 0x65, 0x72, 0x69, 0x61,
+ 0x6c, 0x20, 0x6f, 0x72, 0x20, 0x74, 0x6f, 0x20,
+ 0x63, 0x69, 0x74, 0x65, 0x20, 0x74, 0x68, 0x65,
+ 0x6d, 0x20, 0x6f, 0x74, 0x68, 0x65, 0x72, 0x20,
+ 0x74, 0x68, 0x61, 0x6e, 0x20, 0x61, 0x73, 0x20,
+ 0x2f, 0xe2, 0x80, 0x9c, 0x77, 0x6f, 0x72, 0x6b,
+ 0x20, 0x69, 0x6e, 0x20, 0x70, 0x72, 0x6f, 0x67,
+ 0x72, 0x65, 0x73, 0x73, 0x2e, 0x2f, 0xe2, 0x80,
+ 0x9d
+};
+static const u8 dec_assoc001[] __initconst = {
+ 0xf3, 0x33, 0x88, 0x86, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x4e, 0x91
+};
+static const u8 dec_nonce001[] __initconst = {
+ 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08
+};
+static const u8 dec_key001[] __initconst = {
+ 0x1c, 0x92, 0x40, 0xa5, 0xeb, 0x55, 0xd3, 0x8a,
+ 0xf3, 0x33, 0x88, 0x86, 0x04, 0xf6, 0xb5, 0xf0,
+ 0x47, 0x39, 0x17, 0xc1, 0x40, 0x2b, 0x80, 0x09,
+ 0x9d, 0xca, 0x5c, 0xbc, 0x20, 0x70, 0x75, 0xc0
+};
+
+static const u8 dec_input002[] __initconst = {
+ 0xea, 0xe0, 0x1e, 0x9e, 0x2c, 0x91, 0xaa, 0xe1,
+ 0xdb, 0x5d, 0x99, 0x3f, 0x8a, 0xf7, 0x69, 0x92
+};
+static const u8 dec_output002[] __initconst = { };
+static const u8 dec_assoc002[] __initconst = { };
+static const u8 dec_nonce002[] __initconst = {
+ 0xca, 0xbf, 0x33, 0x71, 0x32, 0x45, 0x77, 0x8e
+};
+static const u8 dec_key002[] __initconst = {
+ 0x4c, 0xf5, 0x96, 0x83, 0x38, 0xe6, 0xae, 0x7f,
+ 0x2d, 0x29, 0x25, 0x76, 0xd5, 0x75, 0x27, 0x86,
+ 0x91, 0x9a, 0x27, 0x7a, 0xfb, 0x46, 0xc5, 0xef,
+ 0x94, 0x81, 0x79, 0x57, 0x14, 0x59, 0x40, 0x68
+};
+
+static const u8 dec_input003[] __initconst = {
+ 0xdd, 0x6b, 0x3b, 0x82, 0xce, 0x5a, 0xbd, 0xd6,
+ 0xa9, 0x35, 0x83, 0xd8, 0x8c, 0x3d, 0x85, 0x77
+};
+static const u8 dec_output003[] __initconst = { };
+static const u8 dec_assoc003[] __initconst = {
+ 0x33, 0x10, 0x41, 0x12, 0x1f, 0xf3, 0xd2, 0x6b
+};
+static const u8 dec_nonce003[] __initconst = {
+ 0x3d, 0x86, 0xb5, 0x6b, 0xc8, 0xa3, 0x1f, 0x1d
+};
+static const u8 dec_key003[] __initconst = {
+ 0x2d, 0xb0, 0x5d, 0x40, 0xc8, 0xed, 0x44, 0x88,
+ 0x34, 0xd1, 0x13, 0xaf, 0x57, 0xa1, 0xeb, 0x3a,
+ 0x2a, 0x80, 0x51, 0x36, 0xec, 0x5b, 0xbc, 0x08,
+ 0x93, 0x84, 0x21, 0xb5, 0x13, 0x88, 0x3c, 0x0d
+};
+
+static const u8 dec_input004[] __initconst = {
+ 0xb7, 0x1b, 0xb0, 0x73, 0x59, 0xb0, 0x84, 0xb2,
+ 0x6d, 0x8e, 0xab, 0x94, 0x31, 0xa1, 0xae, 0xac,
+ 0x89
+};
+static const u8 dec_output004[] __initconst = {
+ 0xa4
+};
+static const u8 dec_assoc004[] __initconst = {
+ 0x6a, 0xe2, 0xad, 0x3f, 0x88, 0x39, 0x5a, 0x40
+};
+static const u8 dec_nonce004[] __initconst = {
+ 0xd2, 0x32, 0x1f, 0x29, 0x28, 0xc6, 0xc4, 0xc4
+};
+static const u8 dec_key004[] __initconst = {
+ 0x4b, 0x28, 0x4b, 0xa3, 0x7b, 0xbe, 0xe9, 0xf8,
+ 0x31, 0x80, 0x82, 0xd7, 0xd8, 0xe8, 0xb5, 0xa1,
+ 0xe2, 0x18, 0x18, 0x8a, 0x9c, 0xfa, 0xa3, 0x3d,
+ 0x25, 0x71, 0x3e, 0x40, 0xbc, 0x54, 0x7a, 0x3e
+};
+
+static const u8 dec_input005[] __initconst = {
+ 0xbf, 0xe1, 0x5b, 0x0b, 0xdb, 0x6b, 0xf5, 0x5e,
+ 0x6c, 0x5d, 0x84, 0x44, 0x39, 0x81, 0xc1, 0x9c,
+ 0xac
+};
+static const u8 dec_output005[] __initconst = {
+ 0x2d
+};
+static const u8 dec_assoc005[] __initconst = { };
+static const u8 dec_nonce005[] __initconst = {
+ 0x20, 0x1c, 0xaa, 0x5f, 0x9c, 0xbf, 0x92, 0x30
+};
+static const u8 dec_key005[] __initconst = {
+ 0x66, 0xca, 0x9c, 0x23, 0x2a, 0x4b, 0x4b, 0x31,
+ 0x0e, 0x92, 0x89, 0x8b, 0xf4, 0x93, 0xc7, 0x87,
+ 0x98, 0xa3, 0xd8, 0x39, 0xf8, 0xf4, 0xa7, 0x01,
+ 0xc0, 0x2e, 0x0a, 0xa6, 0x7e, 0x5a, 0x78, 0x87
+};
+
+static const u8 dec_input006[] __initconst = {
+ 0x8b, 0x06, 0xd3, 0x31, 0xb0, 0x93, 0x45, 0xb1,
+ 0x75, 0x6e, 0x26, 0xf9, 0x67, 0xbc, 0x90, 0x15,
+ 0x81, 0x2c, 0xb5, 0xf0, 0xc6, 0x2b, 0xc7, 0x8c,
+ 0x56, 0xd1, 0xbf, 0x69, 0x6c, 0x07, 0xa0, 0xda,
+ 0x65, 0x27, 0xc9, 0x90, 0x3d, 0xef, 0x4b, 0x11,
+ 0x0f, 0x19, 0x07, 0xfd, 0x29, 0x92, 0xd9, 0xc8,
+ 0xf7, 0x99, 0x2e, 0x4a, 0xd0, 0xb8, 0x2c, 0xdc,
+ 0x93, 0xf5, 0x9e, 0x33, 0x78, 0xd1, 0x37, 0xc3,
+ 0x66, 0xd7, 0x5e, 0xbc, 0x44, 0xbf, 0x53, 0xa5,
+ 0xbc, 0xc4, 0xcb, 0x7b, 0x3a, 0x8e, 0x7f, 0x02,
+ 0xbd, 0xbb, 0xe7, 0xca, 0xa6, 0x6c, 0x6b, 0x93,
+ 0x21, 0x93, 0x10, 0x61, 0xe7, 0x69, 0xd0, 0x78,
+ 0xf3, 0x07, 0x5a, 0x1a, 0x8f, 0x73, 0xaa, 0xb1,
+ 0x4e, 0xd3, 0xda, 0x4f, 0xf3, 0x32, 0xe1, 0x66,
+ 0x3e, 0x6c, 0xc6, 0x13, 0xba, 0x06, 0x5b, 0xfc,
+ 0x6a, 0xe5, 0x6f, 0x60, 0xfb, 0x07, 0x40, 0xb0,
+ 0x8c, 0x9d, 0x84, 0x43, 0x6b, 0xc1, 0xf7, 0x8d,
+ 0x8d, 0x31, 0xf7, 0x7a, 0x39, 0x4d, 0x8f, 0x9a,
+ 0xeb
+};
+static const u8 dec_output006[] __initconst = {
+ 0x33, 0x2f, 0x94, 0xc1, 0xa4, 0xef, 0xcc, 0x2a,
+ 0x5b, 0xa6, 0xe5, 0x8f, 0x1d, 0x40, 0xf0, 0x92,
+ 0x3c, 0xd9, 0x24, 0x11, 0xa9, 0x71, 0xf9, 0x37,
+ 0x14, 0x99, 0xfa, 0xbe, 0xe6, 0x80, 0xde, 0x50,
+ 0xc9, 0x96, 0xd4, 0xb0, 0xec, 0x9e, 0x17, 0xec,
+ 0xd2, 0x5e, 0x72, 0x99, 0xfc, 0x0a, 0xe1, 0xcb,
+ 0x48, 0xd2, 0x85, 0xdd, 0x2f, 0x90, 0xe0, 0x66,
+ 0x3b, 0xe6, 0x20, 0x74, 0xbe, 0x23, 0x8f, 0xcb,
+ 0xb4, 0xe4, 0xda, 0x48, 0x40, 0xa6, 0xd1, 0x1b,
+ 0xc7, 0x42, 0xce, 0x2f, 0x0c, 0xa6, 0x85, 0x6e,
+ 0x87, 0x37, 0x03, 0xb1, 0x7c, 0x25, 0x96, 0xa3,
+ 0x05, 0xd8, 0xb0, 0xf4, 0xed, 0xea, 0xc2, 0xf0,
+ 0x31, 0x98, 0x6c, 0xd1, 0x14, 0x25, 0xc0, 0xcb,
+ 0x01, 0x74, 0xd0, 0x82, 0xf4, 0x36, 0xf5, 0x41,
+ 0xd5, 0xdc, 0xca, 0xc5, 0xbb, 0x98, 0xfe, 0xfc,
+ 0x69, 0x21, 0x70, 0xd8, 0xa4, 0x4b, 0xc8, 0xde,
+ 0x8f
+};
+static const u8 dec_assoc006[] __initconst = {
+ 0x70, 0xd3, 0x33, 0xf3, 0x8b, 0x18, 0x0b
+};
+static const u8 dec_nonce006[] __initconst = {
+ 0xdf, 0x51, 0x84, 0x82, 0x42, 0x0c, 0x75, 0x9c
+};
+static const u8 dec_key006[] __initconst = {
+ 0x68, 0x7b, 0x8d, 0x8e, 0xe3, 0xc4, 0xdd, 0xae,
+ 0xdf, 0x72, 0x7f, 0x53, 0x72, 0x25, 0x1e, 0x78,
+ 0x91, 0xcb, 0x69, 0x76, 0x1f, 0x49, 0x93, 0xf9,
+ 0x6f, 0x21, 0xcc, 0x39, 0x9c, 0xad, 0xb1, 0x01
+};
+
+static const u8 dec_input007[] __initconst = {
+ 0x85, 0x04, 0xc2, 0xed, 0x8d, 0xfd, 0x97, 0x5c,
+ 0xd2, 0xb7, 0xe2, 0xc1, 0x6b, 0xa3, 0xba, 0xf8,
+ 0xc9, 0x50, 0xc3, 0xc6, 0xa5, 0xe3, 0xa4, 0x7c,
+ 0xc3, 0x23, 0x49, 0x5e, 0xa9, 0xb9, 0x32, 0xeb,
+ 0x8a, 0x7c, 0xca, 0xe5, 0xec, 0xfb, 0x7c, 0xc0,
+ 0xcb, 0x7d, 0xdc, 0x2c, 0x9d, 0x92, 0x55, 0x21,
+ 0x0a, 0xc8, 0x43, 0x63, 0x59, 0x0a, 0x31, 0x70,
+ 0x82, 0x67, 0x41, 0x03, 0xf8, 0xdf, 0xf2, 0xac,
+ 0xa7, 0x02, 0xd4, 0xd5, 0x8a, 0x2d, 0xc8, 0x99,
+ 0x19, 0x66, 0xd0, 0xf6, 0x88, 0x2c, 0x77, 0xd9,
+ 0xd4, 0x0d, 0x6c, 0xbd, 0x98, 0xde, 0xe7, 0x7f,
+ 0xad, 0x7e, 0x8a, 0xfb, 0xe9, 0x4b, 0xe5, 0xf7,
+ 0xe5, 0x50, 0xa0, 0x90, 0x3f, 0xd6, 0x22, 0x53,
+ 0xe3, 0xfe, 0x1b, 0xcc, 0x79, 0x3b, 0xec, 0x12,
+ 0x47, 0x52, 0xa7, 0xd6, 0x04, 0xe3, 0x52, 0xe6,
+ 0x93, 0x90, 0x91, 0x32, 0x73, 0x79, 0xb8, 0xd0,
+ 0x31, 0xde, 0x1f, 0x9f, 0x2f, 0x05, 0x38, 0x54,
+ 0x2f, 0x35, 0x04, 0x39, 0xe0, 0xa7, 0xba, 0xc6,
+ 0x52, 0xf6, 0x37, 0x65, 0x4c, 0x07, 0xa9, 0x7e,
+ 0xb3, 0x21, 0x6f, 0x74, 0x8c, 0xc9, 0xde, 0xdb,
+ 0x65, 0x1b, 0x9b, 0xaa, 0x60, 0xb1, 0x03, 0x30,
+ 0x6b, 0xb2, 0x03, 0xc4, 0x1c, 0x04, 0xf8, 0x0f,
+ 0x64, 0xaf, 0x46, 0xe4, 0x65, 0x99, 0x49, 0xe2,
+ 0xea, 0xce, 0x78, 0x00, 0xd8, 0x8b, 0xd5, 0x2e,
+ 0xcf, 0xfc, 0x40, 0x49, 0xe8, 0x58, 0xdc, 0x34,
+ 0x9c, 0x8c, 0x61, 0xbf, 0x0a, 0x8e, 0xec, 0x39,
+ 0xa9, 0x30, 0x05, 0x5a, 0xd2, 0x56, 0x01, 0xc7,
+ 0xda, 0x8f, 0x4e, 0xbb, 0x43, 0xa3, 0x3a, 0xf9,
+ 0x15, 0x2a, 0xd0, 0xa0, 0x7a, 0x87, 0x34, 0x82,
+ 0xfe, 0x8a, 0xd1, 0x2d, 0x5e, 0xc7, 0xbf, 0x04,
+ 0x53, 0x5f, 0x3b, 0x36, 0xd4, 0x25, 0x5c, 0x34,
+ 0x7a, 0x8d, 0xd5, 0x05, 0xce, 0x72, 0xca, 0xef,
+ 0x7a, 0x4b, 0xbc, 0xb0, 0x10, 0x5c, 0x96, 0x42,
+ 0x3a, 0x00, 0x98, 0xcd, 0x15, 0xe8, 0xb7, 0x53
+};
+static const u8 dec_output007[] __initconst = {
+ 0x9b, 0x18, 0xdb, 0xdd, 0x9a, 0x0f, 0x3e, 0xa5,
+ 0x15, 0x17, 0xde, 0xdf, 0x08, 0x9d, 0x65, 0x0a,
+ 0x67, 0x30, 0x12, 0xe2, 0x34, 0x77, 0x4b, 0xc1,
+ 0xd9, 0xc6, 0x1f, 0xab, 0xc6, 0x18, 0x50, 0x17,
+ 0xa7, 0x9d, 0x3c, 0xa6, 0xc5, 0x35, 0x8c, 0x1c,
+ 0xc0, 0xa1, 0x7c, 0x9f, 0x03, 0x89, 0xca, 0xe1,
+ 0xe6, 0xe9, 0xd4, 0xd3, 0x88, 0xdb, 0xb4, 0x51,
+ 0x9d, 0xec, 0xb4, 0xfc, 0x52, 0xee, 0x6d, 0xf1,
+ 0x75, 0x42, 0xc6, 0xfd, 0xbd, 0x7a, 0x8e, 0x86,
+ 0xfc, 0x44, 0xb3, 0x4f, 0xf3, 0xea, 0x67, 0x5a,
+ 0x41, 0x13, 0xba, 0xb0, 0xdc, 0xe1, 0xd3, 0x2a,
+ 0x7c, 0x22, 0xb3, 0xca, 0xac, 0x6a, 0x37, 0x98,
+ 0x3e, 0x1d, 0x40, 0x97, 0xf7, 0x9b, 0x1d, 0x36,
+ 0x6b, 0xb3, 0x28, 0xbd, 0x60, 0x82, 0x47, 0x34,
+ 0xaa, 0x2f, 0x7d, 0xe9, 0xa8, 0x70, 0x81, 0x57,
+ 0xd4, 0xb9, 0x77, 0x0a, 0x9d, 0x29, 0xa7, 0x84,
+ 0x52, 0x4f, 0xc2, 0x4a, 0x40, 0x3b, 0x3c, 0xd4,
+ 0xc9, 0x2a, 0xdb, 0x4a, 0x53, 0xc4, 0xbe, 0x80,
+ 0xe9, 0x51, 0x7f, 0x8f, 0xc7, 0xa2, 0xce, 0x82,
+ 0x5c, 0x91, 0x1e, 0x74, 0xd9, 0xd0, 0xbd, 0xd5,
+ 0xf3, 0xfd, 0xda, 0x4d, 0x25, 0xb4, 0xbb, 0x2d,
+ 0xac, 0x2f, 0x3d, 0x71, 0x85, 0x7b, 0xcf, 0x3c,
+ 0x7b, 0x3e, 0x0e, 0x22, 0x78, 0x0c, 0x29, 0xbf,
+ 0xe4, 0xf4, 0x57, 0xb3, 0xcb, 0x49, 0xa0, 0xfc,
+ 0x1e, 0x05, 0x4e, 0x16, 0xbc, 0xd5, 0xa8, 0xa3,
+ 0xee, 0x05, 0x35, 0xc6, 0x7c, 0xab, 0x60, 0x14,
+ 0x55, 0x1a, 0x8e, 0xc5, 0x88, 0x5d, 0xd5, 0x81,
+ 0xc2, 0x81, 0xa5, 0xc4, 0x60, 0xdb, 0xaf, 0x77,
+ 0x91, 0xe1, 0xce, 0xa2, 0x7e, 0x7f, 0x42, 0xe3,
+ 0xb0, 0x13, 0x1c, 0x1f, 0x25, 0x60, 0x21, 0xe2,
+ 0x40, 0x5f, 0x99, 0xb7, 0x73, 0xec, 0x9b, 0x2b,
+ 0xf0, 0x65, 0x11, 0xc8, 0xd0, 0x0a, 0x9f, 0xd3
+};
+static const u8 dec_assoc007[] __initconst = { };
+static const u8 dec_nonce007[] __initconst = {
+ 0xde, 0x7b, 0xef, 0xc3, 0x65, 0x1b, 0x68, 0xb0
+};
+static const u8 dec_key007[] __initconst = {
+ 0x8d, 0xb8, 0x91, 0x48, 0xf0, 0xe7, 0x0a, 0xbd,
+ 0xf9, 0x3f, 0xcd, 0xd9, 0xa0, 0x1e, 0x42, 0x4c,
+ 0xe7, 0xde, 0x25, 0x3d, 0xa3, 0xd7, 0x05, 0x80,
+ 0x8d, 0xf2, 0x82, 0xac, 0x44, 0x16, 0x51, 0x01
+};
+
+static const u8 dec_input008[] __initconst = {
+ 0x14, 0xf6, 0x41, 0x37, 0xa6, 0xd4, 0x27, 0xcd,
+ 0xdb, 0x06, 0x3e, 0x9a, 0x4e, 0xab, 0xd5, 0xb1,
+ 0x1e, 0x6b, 0xd2, 0xbc, 0x11, 0xf4, 0x28, 0x93,
+ 0x63, 0x54, 0xef, 0xbb, 0x5e, 0x1d, 0x3a, 0x1d,
+ 0x37, 0x3c, 0x0a, 0x6c, 0x1e, 0xc2, 0xd1, 0x2c,
+ 0xb5, 0xa3, 0xb5, 0x7b, 0xb8, 0x8f, 0x25, 0xa6,
+ 0x1b, 0x61, 0x1c, 0xec, 0x28, 0x58, 0x26, 0xa4,
+ 0xa8, 0x33, 0x28, 0x25, 0x5c, 0x45, 0x05, 0xe5,
+ 0x6c, 0x99, 0xe5, 0x45, 0xc4, 0xa2, 0x03, 0x84,
+ 0x03, 0x73, 0x1e, 0x8c, 0x49, 0xac, 0x20, 0xdd,
+ 0x8d, 0xb3, 0xc4, 0xf5, 0xe7, 0x4f, 0xf1, 0xed,
+ 0xa1, 0x98, 0xde, 0xa4, 0x96, 0xdd, 0x2f, 0xab,
+ 0xab, 0x97, 0xcf, 0x3e, 0xd2, 0x9e, 0xb8, 0x13,
+ 0x07, 0x28, 0x29, 0x19, 0xaf, 0xfd, 0xf2, 0x49,
+ 0x43, 0xea, 0x49, 0x26, 0x91, 0xc1, 0x07, 0xd6,
+ 0xbb, 0x81, 0x75, 0x35, 0x0d, 0x24, 0x7f, 0xc8,
+ 0xda, 0xd4, 0xb7, 0xeb, 0xe8, 0x5c, 0x09, 0xa2,
+ 0x2f, 0xdc, 0x28, 0x7d, 0x3a, 0x03, 0xfa, 0x94,
+ 0xb5, 0x1d, 0x17, 0x99, 0x36, 0xc3, 0x1c, 0x18,
+ 0x34, 0xe3, 0x9f, 0xf5, 0x55, 0x7c, 0xb0, 0x60,
+ 0x9d, 0xff, 0xac, 0xd4, 0x61, 0xf2, 0xad, 0xf8,
+ 0xce, 0xc7, 0xbe, 0x5c, 0xd2, 0x95, 0xa8, 0x4b,
+ 0x77, 0x13, 0x19, 0x59, 0x26, 0xc9, 0xb7, 0x8f,
+ 0x6a, 0xcb, 0x2d, 0x37, 0x91, 0xea, 0x92, 0x9c,
+ 0x94, 0x5b, 0xda, 0x0b, 0xce, 0xfe, 0x30, 0x20,
+ 0xf8, 0x51, 0xad, 0xf2, 0xbe, 0xe7, 0xc7, 0xff,
+ 0xb3, 0x33, 0x91, 0x6a, 0xc9, 0x1a, 0x41, 0xc9,
+ 0x0f, 0xf3, 0x10, 0x0e, 0xfd, 0x53, 0xff, 0x6c,
+ 0x16, 0x52, 0xd9, 0xf3, 0xf7, 0x98, 0x2e, 0xc9,
+ 0x07, 0x31, 0x2c, 0x0c, 0x72, 0xd7, 0xc5, 0xc6,
+ 0x08, 0x2a, 0x7b, 0xda, 0xbd, 0x7e, 0x02, 0xea,
+ 0x1a, 0xbb, 0xf2, 0x04, 0x27, 0x61, 0x28, 0x8e,
+ 0xf5, 0x04, 0x03, 0x1f, 0x4c, 0x07, 0x55, 0x82,
+ 0xec, 0x1e, 0xd7, 0x8b, 0x2f, 0x65, 0x56, 0xd1,
+ 0xd9, 0x1e, 0x3c, 0xe9, 0x1f, 0x5e, 0x98, 0x70,
+ 0x38, 0x4a, 0x8c, 0x49, 0xc5, 0x43, 0xa0, 0xa1,
+ 0x8b, 0x74, 0x9d, 0x4c, 0x62, 0x0d, 0x10, 0x0c,
+ 0xf4, 0x6c, 0x8f, 0xe0, 0xaa, 0x9a, 0x8d, 0xb7,
+ 0xe0, 0xbe, 0x4c, 0x87, 0xf1, 0x98, 0x2f, 0xcc,
+ 0xed, 0xc0, 0x52, 0x29, 0xdc, 0x83, 0xf8, 0xfc,
+ 0x2c, 0x0e, 0xa8, 0x51, 0x4d, 0x80, 0x0d, 0xa3,
+ 0xfe, 0xd8, 0x37, 0xe7, 0x41, 0x24, 0xfc, 0xfb,
+ 0x75, 0xe3, 0x71, 0x7b, 0x57, 0x45, 0xf5, 0x97,
+ 0x73, 0x65, 0x63, 0x14, 0x74, 0xb8, 0x82, 0x9f,
+ 0xf8, 0x60, 0x2f, 0x8a, 0xf2, 0x4e, 0xf1, 0x39,
+ 0xda, 0x33, 0x91, 0xf8, 0x36, 0xe0, 0x8d, 0x3f,
+ 0x1f, 0x3b, 0x56, 0xdc, 0xa0, 0x8f, 0x3c, 0x9d,
+ 0x71, 0x52, 0xa7, 0xb8, 0xc0, 0xa5, 0xc6, 0xa2,
+ 0x73, 0xda, 0xf4, 0x4b, 0x74, 0x5b, 0x00, 0x3d,
+ 0x99, 0xd7, 0x96, 0xba, 0xe6, 0xe1, 0xa6, 0x96,
+ 0x38, 0xad, 0xb3, 0xc0, 0xd2, 0xba, 0x91, 0x6b,
+ 0xf9, 0x19, 0xdd, 0x3b, 0xbe, 0xbe, 0x9c, 0x20,
+ 0x50, 0xba, 0xa1, 0xd0, 0xce, 0x11, 0xbd, 0x95,
+ 0xd8, 0xd1, 0xdd, 0x33, 0x85, 0x74, 0xdc, 0xdb,
+ 0x66, 0x76, 0x44, 0xdc, 0x03, 0x74, 0x48, 0x35,
+ 0x98, 0xb1, 0x18, 0x47, 0x94, 0x7d, 0xff, 0x62,
+ 0xe4, 0x58, 0x78, 0xab, 0xed, 0x95, 0x36, 0xd9,
+ 0x84, 0x91, 0x82, 0x64, 0x41, 0xbb, 0x58, 0xe6,
+ 0x1c, 0x20, 0x6d, 0x15, 0x6b, 0x13, 0x96, 0xe8,
+ 0x35, 0x7f, 0xdc, 0x40, 0x2c, 0xe9, 0xbc, 0x8a,
+ 0x4f, 0x92, 0xec, 0x06, 0x2d, 0x50, 0xdf, 0x93,
+ 0x5d, 0x65, 0x5a, 0xa8, 0xfc, 0x20, 0x50, 0x14,
+ 0xa9, 0x8a, 0x7e, 0x1d, 0x08, 0x1f, 0xe2, 0x99,
+ 0xd0, 0xbe, 0xfb, 0x3a, 0x21, 0x9d, 0xad, 0x86,
+ 0x54, 0xfd, 0x0d, 0x98, 0x1c, 0x5a, 0x6f, 0x1f,
+ 0x9a, 0x40, 0xcd, 0xa2, 0xff, 0x6a, 0xf1, 0x54
+};
+static const u8 dec_output008[] __initconst = {
+ 0xc3, 0x09, 0x94, 0x62, 0xe6, 0x46, 0x2e, 0x10,
+ 0xbe, 0x00, 0xe4, 0xfc, 0xf3, 0x40, 0xa3, 0xe2,
+ 0x0f, 0xc2, 0x8b, 0x28, 0xdc, 0xba, 0xb4, 0x3c,
+ 0xe4, 0x21, 0x58, 0x61, 0xcd, 0x8b, 0xcd, 0xfb,
+ 0xac, 0x94, 0xa1, 0x45, 0xf5, 0x1c, 0xe1, 0x12,
+ 0xe0, 0x3b, 0x67, 0x21, 0x54, 0x5e, 0x8c, 0xaa,
+ 0xcf, 0xdb, 0xb4, 0x51, 0xd4, 0x13, 0xda, 0xe6,
+ 0x83, 0x89, 0xb6, 0x92, 0xe9, 0x21, 0x76, 0xa4,
+ 0x93, 0x7d, 0x0e, 0xfd, 0x96, 0x36, 0x03, 0x91,
+ 0x43, 0x5c, 0x92, 0x49, 0x62, 0x61, 0x7b, 0xeb,
+ 0x43, 0x89, 0xb8, 0x12, 0x20, 0x43, 0xd4, 0x47,
+ 0x06, 0x84, 0xee, 0x47, 0xe9, 0x8a, 0x73, 0x15,
+ 0x0f, 0x72, 0xcf, 0xed, 0xce, 0x96, 0xb2, 0x7f,
+ 0x21, 0x45, 0x76, 0xeb, 0x26, 0x28, 0x83, 0x6a,
+ 0xad, 0xaa, 0xa6, 0x81, 0xd8, 0x55, 0xb1, 0xa3,
+ 0x85, 0xb3, 0x0c, 0xdf, 0xf1, 0x69, 0x2d, 0x97,
+ 0x05, 0x2a, 0xbc, 0x7c, 0x7b, 0x25, 0xf8, 0x80,
+ 0x9d, 0x39, 0x25, 0xf3, 0x62, 0xf0, 0x66, 0x5e,
+ 0xf4, 0xa0, 0xcf, 0xd8, 0xfd, 0x4f, 0xb1, 0x1f,
+ 0x60, 0x3a, 0x08, 0x47, 0xaf, 0xe1, 0xf6, 0x10,
+ 0x77, 0x09, 0xa7, 0x27, 0x8f, 0x9a, 0x97, 0x5a,
+ 0x26, 0xfa, 0xfe, 0x41, 0x32, 0x83, 0x10, 0xe0,
+ 0x1d, 0xbf, 0x64, 0x0d, 0xf4, 0x1c, 0x32, 0x35,
+ 0xe5, 0x1b, 0x36, 0xef, 0xd4, 0x4a, 0x93, 0x4d,
+ 0x00, 0x7c, 0xec, 0x02, 0x07, 0x8b, 0x5d, 0x7d,
+ 0x1b, 0x0e, 0xd1, 0xa6, 0xa5, 0x5d, 0x7d, 0x57,
+ 0x88, 0xa8, 0xcc, 0x81, 0xb4, 0x86, 0x4e, 0xb4,
+ 0x40, 0xe9, 0x1d, 0xc3, 0xb1, 0x24, 0x3e, 0x7f,
+ 0xcc, 0x8a, 0x24, 0x9b, 0xdf, 0x6d, 0xf0, 0x39,
+ 0x69, 0x3e, 0x4c, 0xc0, 0x96, 0xe4, 0x13, 0xda,
+ 0x90, 0xda, 0xf4, 0x95, 0x66, 0x8b, 0x17, 0x17,
+ 0xfe, 0x39, 0x43, 0x25, 0xaa, 0xda, 0xa0, 0x43,
+ 0x3c, 0xb1, 0x41, 0x02, 0xa3, 0xf0, 0xa7, 0x19,
+ 0x59, 0xbc, 0x1d, 0x7d, 0x6c, 0x6d, 0x91, 0x09,
+ 0x5c, 0xb7, 0x5b, 0x01, 0xd1, 0x6f, 0x17, 0x21,
+ 0x97, 0xbf, 0x89, 0x71, 0xa5, 0xb0, 0x6e, 0x07,
+ 0x45, 0xfd, 0x9d, 0xea, 0x07, 0xf6, 0x7a, 0x9f,
+ 0x10, 0x18, 0x22, 0x30, 0x73, 0xac, 0xd4, 0x6b,
+ 0x72, 0x44, 0xed, 0xd9, 0x19, 0x9b, 0x2d, 0x4a,
+ 0x41, 0xdd, 0xd1, 0x85, 0x5e, 0x37, 0x19, 0xed,
+ 0xd2, 0x15, 0x8f, 0x5e, 0x91, 0xdb, 0x33, 0xf2,
+ 0xe4, 0xdb, 0xff, 0x98, 0xfb, 0xa3, 0xb5, 0xca,
+ 0x21, 0x69, 0x08, 0xe7, 0x8a, 0xdf, 0x90, 0xff,
+ 0x3e, 0xe9, 0x20, 0x86, 0x3c, 0xe9, 0xfc, 0x0b,
+ 0xfe, 0x5c, 0x61, 0xaa, 0x13, 0x92, 0x7f, 0x7b,
+ 0xec, 0xe0, 0x6d, 0xa8, 0x23, 0x22, 0xf6, 0x6b,
+ 0x77, 0xc4, 0xfe, 0x40, 0x07, 0x3b, 0xb6, 0xf6,
+ 0x8e, 0x5f, 0xd4, 0xb9, 0xb7, 0x0f, 0x21, 0x04,
+ 0xef, 0x83, 0x63, 0x91, 0x69, 0x40, 0xa3, 0x48,
+ 0x5c, 0xd2, 0x60, 0xf9, 0x4f, 0x6c, 0x47, 0x8b,
+ 0x3b, 0xb1, 0x9f, 0x8e, 0xee, 0x16, 0x8a, 0x13,
+ 0xfc, 0x46, 0x17, 0xc3, 0xc3, 0x32, 0x56, 0xf8,
+ 0x3c, 0x85, 0x3a, 0xb6, 0x3e, 0xaa, 0x89, 0x4f,
+ 0xb3, 0xdf, 0x38, 0xfd, 0xf1, 0xe4, 0x3a, 0xc0,
+ 0xe6, 0x58, 0xb5, 0x8f, 0xc5, 0x29, 0xa2, 0x92,
+ 0x4a, 0xb6, 0xa0, 0x34, 0x7f, 0xab, 0xb5, 0x8a,
+ 0x90, 0xa1, 0xdb, 0x4d, 0xca, 0xb6, 0x2c, 0x41,
+ 0x3c, 0xf7, 0x2b, 0x21, 0xc3, 0xfd, 0xf4, 0x17,
+ 0x5c, 0xb5, 0x33, 0x17, 0x68, 0x2b, 0x08, 0x30,
+ 0xf3, 0xf7, 0x30, 0x3c, 0x96, 0xe6, 0x6a, 0x20,
+ 0x97, 0xe7, 0x4d, 0x10, 0x5f, 0x47, 0x5f, 0x49,
+ 0x96, 0x09, 0xf0, 0x27, 0x91, 0xc8, 0xf8, 0x5a,
+ 0x2e, 0x79, 0xb5, 0xe2, 0xb8, 0xe8, 0xb9, 0x7b,
+ 0xd5, 0x10, 0xcb, 0xff, 0x5d, 0x14, 0x73, 0xf3
+};
+static const u8 dec_assoc008[] __initconst = { };
+static const u8 dec_nonce008[] __initconst = {
+ 0x0e, 0x0d, 0x57, 0xbb, 0x7b, 0x40, 0x54, 0x02
+};
+static const u8 dec_key008[] __initconst = {
+ 0xf2, 0xaa, 0x4f, 0x99, 0xfd, 0x3e, 0xa8, 0x53,
+ 0xc1, 0x44, 0xe9, 0x81, 0x18, 0xdc, 0xf5, 0xf0,
+ 0x3e, 0x44, 0x15, 0x59, 0xe0, 0xc5, 0x44, 0x86,
+ 0xc3, 0x91, 0xa8, 0x75, 0xc0, 0x12, 0x46, 0xba
+};
+
+static const u8 dec_input009[] __initconst = {
+ 0xfd, 0x81, 0x8d, 0xd0, 0x3d, 0xb4, 0xd5, 0xdf,
+ 0xd3, 0x42, 0x47, 0x5a, 0x6d, 0x19, 0x27, 0x66,
+ 0x4b, 0x2e, 0x0c, 0x27, 0x9c, 0x96, 0x4c, 0x72,
+ 0x02, 0xa3, 0x65, 0xc3, 0xb3, 0x6f, 0x2e, 0xbd,
+ 0x63, 0x8a, 0x4a, 0x5d, 0x29, 0xa2, 0xd0, 0x28,
+ 0x48, 0xc5, 0x3d, 0x98, 0xa3, 0xbc, 0xe0, 0xbe,
+ 0x3b, 0x3f, 0xe6, 0x8a, 0xa4, 0x7f, 0x53, 0x06,
+ 0xfa, 0x7f, 0x27, 0x76, 0x72, 0x31, 0xa1, 0xf5,
+ 0xd6, 0x0c, 0x52, 0x47, 0xba, 0xcd, 0x4f, 0xd7,
+ 0xeb, 0x05, 0x48, 0x0d, 0x7c, 0x35, 0x4a, 0x09,
+ 0xc9, 0x76, 0x71, 0x02, 0xa3, 0xfb, 0xb7, 0x1a,
+ 0x65, 0xb7, 0xed, 0x98, 0xc6, 0x30, 0x8a, 0x00,
+ 0xae, 0xa1, 0x31, 0xe5, 0xb5, 0x9e, 0x6d, 0x62,
+ 0xda, 0xda, 0x07, 0x0f, 0x38, 0x38, 0xd3, 0xcb,
+ 0xc1, 0xb0, 0xad, 0xec, 0x72, 0xec, 0xb1, 0xa2,
+ 0x7b, 0x59, 0xf3, 0x3d, 0x2b, 0xef, 0xcd, 0x28,
+ 0x5b, 0x83, 0xcc, 0x18, 0x91, 0x88, 0xb0, 0x2e,
+ 0xf9, 0x29, 0x31, 0x18, 0xf9, 0x4e, 0xe9, 0x0a,
+ 0x91, 0x92, 0x9f, 0xae, 0x2d, 0xad, 0xf4, 0xe6,
+ 0x1a, 0xe2, 0xa4, 0xee, 0x47, 0x15, 0xbf, 0x83,
+ 0x6e, 0xd7, 0x72, 0x12, 0x3b, 0x2d, 0x24, 0xe9,
+ 0xb2, 0x55, 0xcb, 0x3c, 0x10, 0xf0, 0x24, 0x8a,
+ 0x4a, 0x02, 0xea, 0x90, 0x25, 0xf0, 0xb4, 0x79,
+ 0x3a, 0xef, 0x6e, 0xf5, 0x52, 0xdf, 0xb0, 0x0a,
+ 0xcd, 0x24, 0x1c, 0xd3, 0x2e, 0x22, 0x74, 0xea,
+ 0x21, 0x6f, 0xe9, 0xbd, 0xc8, 0x3e, 0x36, 0x5b,
+ 0x19, 0xf1, 0xca, 0x99, 0x0a, 0xb4, 0xa7, 0x52,
+ 0x1a, 0x4e, 0xf2, 0xad, 0x8d, 0x56, 0x85, 0xbb,
+ 0x64, 0x89, 0xba, 0x26, 0xf9, 0xc7, 0xe1, 0x89,
+ 0x19, 0x22, 0x77, 0xc3, 0xa8, 0xfc, 0xff, 0xad,
+ 0xfe, 0xb9, 0x48, 0xae, 0x12, 0x30, 0x9f, 0x19,
+ 0xfb, 0x1b, 0xef, 0x14, 0x87, 0x8a, 0x78, 0x71,
+ 0xf3, 0xf4, 0xb7, 0x00, 0x9c, 0x1d, 0xb5, 0x3d,
+ 0x49, 0x00, 0x0c, 0x06, 0xd4, 0x50, 0xf9, 0x54,
+ 0x45, 0xb2, 0x5b, 0x43, 0xdb, 0x6d, 0xcf, 0x1a,
+ 0xe9, 0x7a, 0x7a, 0xcf, 0xfc, 0x8a, 0x4e, 0x4d,
+ 0x0b, 0x07, 0x63, 0x28, 0xd8, 0xe7, 0x08, 0x95,
+ 0xdf, 0xa6, 0x72, 0x93, 0x2e, 0xbb, 0xa0, 0x42,
+ 0x89, 0x16, 0xf1, 0xd9, 0x0c, 0xf9, 0xa1, 0x16,
+ 0xfd, 0xd9, 0x03, 0xb4, 0x3b, 0x8a, 0xf5, 0xf6,
+ 0xe7, 0x6b, 0x2e, 0x8e, 0x4c, 0x3d, 0xe2, 0xaf,
+ 0x08, 0x45, 0x03, 0xff, 0x09, 0xb6, 0xeb, 0x2d,
+ 0xc6, 0x1b, 0x88, 0x94, 0xac, 0x3e, 0xf1, 0x9f,
+ 0x0e, 0x0e, 0x2b, 0xd5, 0x00, 0x4d, 0x3f, 0x3b,
+ 0x53, 0xae, 0xaf, 0x1c, 0x33, 0x5f, 0x55, 0x6e,
+ 0x8d, 0xaf, 0x05, 0x7a, 0x10, 0x34, 0xc9, 0xf4,
+ 0x66, 0xcb, 0x62, 0x12, 0xa6, 0xee, 0xe8, 0x1c,
+ 0x5d, 0x12, 0x86, 0xdb, 0x6f, 0x1c, 0x33, 0xc4,
+ 0x1c, 0xda, 0x82, 0x2d, 0x3b, 0x59, 0xfe, 0xb1,
+ 0xa4, 0x59, 0x41, 0x86, 0xd0, 0xef, 0xae, 0xfb,
+ 0xda, 0x6d, 0x11, 0xb8, 0xca, 0xe9, 0x6e, 0xff,
+ 0xf7, 0xa9, 0xd9, 0x70, 0x30, 0xfc, 0x53, 0xe2,
+ 0xd7, 0xa2, 0x4e, 0xc7, 0x91, 0xd9, 0x07, 0x06,
+ 0xaa, 0xdd, 0xb0, 0x59, 0x28, 0x1d, 0x00, 0x66,
+ 0xc5, 0x54, 0xc2, 0xfc, 0x06, 0xda, 0x05, 0x90,
+ 0x52, 0x1d, 0x37, 0x66, 0xee, 0xf0, 0xb2, 0x55,
+ 0x8a, 0x5d, 0xd2, 0x38, 0x86, 0x94, 0x9b, 0xfc,
+ 0x10, 0x4c, 0xa1, 0xb9, 0x64, 0x3e, 0x44, 0xb8,
+ 0x5f, 0xb0, 0x0c, 0xec, 0xe0, 0xc9, 0xe5, 0x62,
+ 0x75, 0x3f, 0x09, 0xd5, 0xf5, 0xd9, 0x26, 0xba,
+ 0x9e, 0xd2, 0xf4, 0xb9, 0x48, 0x0a, 0xbc, 0xa2,
+ 0xd6, 0x7c, 0x36, 0x11, 0x7d, 0x26, 0x81, 0x89,
+ 0xcf, 0xa4, 0xad, 0x73, 0x0e, 0xee, 0xcc, 0x06,
+ 0xa9, 0xdb, 0xb1, 0xfd, 0xfb, 0x09, 0x7f, 0x90,
+ 0x42, 0x37, 0x2f, 0xe1, 0x9c, 0x0f, 0x6f, 0xcf,
+ 0x43, 0xb5, 0xd9, 0x90, 0xe1, 0x85, 0xf5, 0xa8,
+ 0xae
+};
+static const u8 dec_output009[] __initconst = {
+ 0xe6, 0xc3, 0xdb, 0x63, 0x55, 0x15, 0xe3, 0x5b,
+ 0xb7, 0x4b, 0x27, 0x8b, 0x5a, 0xdd, 0xc2, 0xe8,
+ 0x3a, 0x6b, 0xd7, 0x81, 0x96, 0x35, 0x97, 0xca,
+ 0xd7, 0x68, 0xe8, 0xef, 0xce, 0xab, 0xda, 0x09,
+ 0x6e, 0xd6, 0x8e, 0xcb, 0x55, 0xb5, 0xe1, 0xe5,
+ 0x57, 0xfd, 0xc4, 0xe3, 0xe0, 0x18, 0x4f, 0x85,
+ 0xf5, 0x3f, 0x7e, 0x4b, 0x88, 0xc9, 0x52, 0x44,
+ 0x0f, 0xea, 0xaf, 0x1f, 0x71, 0x48, 0x9f, 0x97,
+ 0x6d, 0xb9, 0x6f, 0x00, 0xa6, 0xde, 0x2b, 0x77,
+ 0x8b, 0x15, 0xad, 0x10, 0xa0, 0x2b, 0x7b, 0x41,
+ 0x90, 0x03, 0x2d, 0x69, 0xae, 0xcc, 0x77, 0x7c,
+ 0xa5, 0x9d, 0x29, 0x22, 0xc2, 0xea, 0xb4, 0x00,
+ 0x1a, 0xd2, 0x7a, 0x98, 0x8a, 0xf9, 0xf7, 0x82,
+ 0xb0, 0xab, 0xd8, 0xa6, 0x94, 0x8d, 0x58, 0x2f,
+ 0x01, 0x9e, 0x00, 0x20, 0xfc, 0x49, 0xdc, 0x0e,
+ 0x03, 0xe8, 0x45, 0x10, 0xd6, 0xa8, 0xda, 0x55,
+ 0x10, 0x9a, 0xdf, 0x67, 0x22, 0x8b, 0x43, 0xab,
+ 0x00, 0xbb, 0x02, 0xc8, 0xdd, 0x7b, 0x97, 0x17,
+ 0xd7, 0x1d, 0x9e, 0x02, 0x5e, 0x48, 0xde, 0x8e,
+ 0xcf, 0x99, 0x07, 0x95, 0x92, 0x3c, 0x5f, 0x9f,
+ 0xc5, 0x8a, 0xc0, 0x23, 0xaa, 0xd5, 0x8c, 0x82,
+ 0x6e, 0x16, 0x92, 0xb1, 0x12, 0x17, 0x07, 0xc3,
+ 0xfb, 0x36, 0xf5, 0x6c, 0x35, 0xd6, 0x06, 0x1f,
+ 0x9f, 0xa7, 0x94, 0xa2, 0x38, 0x63, 0x9c, 0xb0,
+ 0x71, 0xb3, 0xa5, 0xd2, 0xd8, 0xba, 0x9f, 0x08,
+ 0x01, 0xb3, 0xff, 0x04, 0x97, 0x73, 0x45, 0x1b,
+ 0xd5, 0xa9, 0x9c, 0x80, 0xaf, 0x04, 0x9a, 0x85,
+ 0xdb, 0x32, 0x5b, 0x5d, 0x1a, 0xc1, 0x36, 0x28,
+ 0x10, 0x79, 0xf1, 0x3c, 0xbf, 0x1a, 0x41, 0x5c,
+ 0x4e, 0xdf, 0xb2, 0x7c, 0x79, 0x3b, 0x7a, 0x62,
+ 0x3d, 0x4b, 0xc9, 0x9b, 0x2a, 0x2e, 0x7c, 0xa2,
+ 0xb1, 0x11, 0x98, 0xa7, 0x34, 0x1a, 0x00, 0xf3,
+ 0xd1, 0xbc, 0x18, 0x22, 0xba, 0x02, 0x56, 0x62,
+ 0x31, 0x10, 0x11, 0x6d, 0xe0, 0x54, 0x9d, 0x40,
+ 0x1f, 0x26, 0x80, 0x41, 0xca, 0x3f, 0x68, 0x0f,
+ 0x32, 0x1d, 0x0a, 0x8e, 0x79, 0xd8, 0xa4, 0x1b,
+ 0x29, 0x1c, 0x90, 0x8e, 0xc5, 0xe3, 0xb4, 0x91,
+ 0x37, 0x9a, 0x97, 0x86, 0x99, 0xd5, 0x09, 0xc5,
+ 0xbb, 0xa3, 0x3f, 0x21, 0x29, 0x82, 0x14, 0x5c,
+ 0xab, 0x25, 0xfb, 0xf2, 0x4f, 0x58, 0x26, 0xd4,
+ 0x83, 0xaa, 0x66, 0x89, 0x67, 0x7e, 0xc0, 0x49,
+ 0xe1, 0x11, 0x10, 0x7f, 0x7a, 0xda, 0x29, 0x04,
+ 0xff, 0xf0, 0xcb, 0x09, 0x7c, 0x9d, 0xfa, 0x03,
+ 0x6f, 0x81, 0x09, 0x31, 0x60, 0xfb, 0x08, 0xfa,
+ 0x74, 0xd3, 0x64, 0x44, 0x7c, 0x55, 0x85, 0xec,
+ 0x9c, 0x6e, 0x25, 0xb7, 0x6c, 0xc5, 0x37, 0xb6,
+ 0x83, 0x87, 0x72, 0x95, 0x8b, 0x9d, 0xe1, 0x69,
+ 0x5c, 0x31, 0x95, 0x42, 0xa6, 0x2c, 0xd1, 0x36,
+ 0x47, 0x1f, 0xec, 0x54, 0xab, 0xa2, 0x1c, 0xd8,
+ 0x00, 0xcc, 0xbc, 0x0d, 0x65, 0xe2, 0x67, 0xbf,
+ 0xbc, 0xea, 0xee, 0x9e, 0xe4, 0x36, 0x95, 0xbe,
+ 0x73, 0xd9, 0xa6, 0xd9, 0x0f, 0xa0, 0xcc, 0x82,
+ 0x76, 0x26, 0xad, 0x5b, 0x58, 0x6c, 0x4e, 0xab,
+ 0x29, 0x64, 0xd3, 0xd9, 0xa9, 0x08, 0x8c, 0x1d,
+ 0xa1, 0x4f, 0x80, 0xd8, 0x3f, 0x94, 0xfb, 0xd3,
+ 0x7b, 0xfc, 0xd1, 0x2b, 0xc3, 0x21, 0xeb, 0xe5,
+ 0x1c, 0x84, 0x23, 0x7f, 0x4b, 0xfa, 0xdb, 0x34,
+ 0x18, 0xa2, 0xc2, 0xe5, 0x13, 0xfe, 0x6c, 0x49,
+ 0x81, 0xd2, 0x73, 0xe7, 0xe2, 0xd7, 0xe4, 0x4f,
+ 0x4b, 0x08, 0x6e, 0xb1, 0x12, 0x22, 0x10, 0x9d,
+ 0xac, 0x51, 0x1e, 0x17, 0xd9, 0x8a, 0x0b, 0x42,
+ 0x88, 0x16, 0x81, 0x37, 0x7c, 0x6a, 0xf7, 0xef,
+ 0x2d, 0xe3, 0xd9, 0xf8, 0x5f, 0xe0, 0x53, 0x27,
+ 0x74, 0xb9, 0xe2, 0xd6, 0x1c, 0x80, 0x2c, 0x52,
+ 0x65
+};
+static const u8 dec_assoc009[] __initconst = {
+ 0x5a, 0x27, 0xff, 0xeb, 0xdf, 0x84, 0xb2, 0x9e,
+ 0xef
+};
+static const u8 dec_nonce009[] __initconst = {
+ 0xef, 0x2d, 0x63, 0xee, 0x6b, 0x80, 0x8b, 0x78
+};
+static const u8 dec_key009[] __initconst = {
+ 0xea, 0xbc, 0x56, 0x99, 0xe3, 0x50, 0xff, 0xc5,
+ 0xcc, 0x1a, 0xd7, 0xc1, 0x57, 0x72, 0xea, 0x86,
+ 0x5b, 0x89, 0x88, 0x61, 0x3d, 0x2f, 0x9b, 0xb2,
+ 0xe7, 0x9c, 0xec, 0x74, 0x6e, 0x3e, 0xf4, 0x3b
+};
+
+static const u8 dec_input010[] __initconst = {
+ 0xe5, 0x26, 0xa4, 0x3d, 0xbd, 0x33, 0xd0, 0x4b,
+ 0x6f, 0x05, 0xa7, 0x6e, 0x12, 0x7a, 0xd2, 0x74,
+ 0xa6, 0xdd, 0xbd, 0x95, 0xeb, 0xf9, 0xa4, 0xf1,
+ 0x59, 0x93, 0x91, 0x70, 0xd9, 0xfe, 0x9a, 0xcd,
+ 0x53, 0x1f, 0x3a, 0xab, 0xa6, 0x7c, 0x9f, 0xa6,
+ 0x9e, 0xbd, 0x99, 0xd9, 0xb5, 0x97, 0x44, 0xd5,
+ 0x14, 0x48, 0x4d, 0x9d, 0xc0, 0xd0, 0x05, 0x96,
+ 0xeb, 0x4c, 0x78, 0x55, 0x09, 0x08, 0x01, 0x02,
+ 0x30, 0x90, 0x7b, 0x96, 0x7a, 0x7b, 0x5f, 0x30,
+ 0x41, 0x24, 0xce, 0x68, 0x61, 0x49, 0x86, 0x57,
+ 0x82, 0xdd, 0x53, 0x1c, 0x51, 0x28, 0x2b, 0x53,
+ 0x6e, 0x2d, 0xc2, 0x20, 0x4c, 0xdd, 0x8f, 0x65,
+ 0x10, 0x20, 0x50, 0xdd, 0x9d, 0x50, 0xe5, 0x71,
+ 0x40, 0x53, 0x69, 0xfc, 0x77, 0x48, 0x11, 0xb9,
+ 0xde, 0xa4, 0x8d, 0x58, 0xe4, 0xa6, 0x1a, 0x18,
+ 0x47, 0x81, 0x7e, 0xfc, 0xdd, 0xf6, 0xef, 0xce,
+ 0x2f, 0x43, 0x68, 0xd6, 0x06, 0xe2, 0x74, 0x6a,
+ 0xad, 0x90, 0xf5, 0x37, 0xf3, 0x3d, 0x82, 0x69,
+ 0x40, 0xe9, 0x6b, 0xa7, 0x3d, 0xa8, 0x1e, 0xd2,
+ 0x02, 0x7c, 0xb7, 0x9b, 0xe4, 0xda, 0x8f, 0x95,
+ 0x06, 0xc5, 0xdf, 0x73, 0xa3, 0x20, 0x9a, 0x49,
+ 0xde, 0x9c, 0xbc, 0xee, 0x14, 0x3f, 0x81, 0x5e,
+ 0xf8, 0x3b, 0x59, 0x3c, 0xe1, 0x68, 0x12, 0x5a,
+ 0x3a, 0x76, 0x3a, 0x3f, 0xf7, 0x87, 0x33, 0x0a,
+ 0x01, 0xb8, 0xd4, 0xed, 0xb6, 0xbe, 0x94, 0x5e,
+ 0x70, 0x40, 0x56, 0x67, 0x1f, 0x50, 0x44, 0x19,
+ 0xce, 0x82, 0x70, 0x10, 0x87, 0x13, 0x20, 0x0b,
+ 0x4c, 0x5a, 0xb6, 0xf6, 0xa7, 0xae, 0x81, 0x75,
+ 0x01, 0x81, 0xe6, 0x4b, 0x57, 0x7c, 0xdd, 0x6d,
+ 0xf8, 0x1c, 0x29, 0x32, 0xf7, 0xda, 0x3c, 0x2d,
+ 0xf8, 0x9b, 0x25, 0x6e, 0x00, 0xb4, 0xf7, 0x2f,
+ 0xf7, 0x04, 0xf7, 0xa1, 0x56, 0xac, 0x4f, 0x1a,
+ 0x64, 0xb8, 0x47, 0x55, 0x18, 0x7b, 0x07, 0x4d,
+ 0xbd, 0x47, 0x24, 0x80, 0x5d, 0xa2, 0x70, 0xc5,
+ 0xdd, 0x8e, 0x82, 0xd4, 0xeb, 0xec, 0xb2, 0x0c,
+ 0x39, 0xd2, 0x97, 0xc1, 0xcb, 0xeb, 0xf4, 0x77,
+ 0x59, 0xb4, 0x87, 0xef, 0xcb, 0x43, 0x2d, 0x46,
+ 0x54, 0xd1, 0xa7, 0xd7, 0x15, 0x99, 0x0a, 0x43,
+ 0xa1, 0xe0, 0x99, 0x33, 0x71, 0xc1, 0xed, 0xfe,
+ 0x72, 0x46, 0x33, 0x8e, 0x91, 0x08, 0x9f, 0xc8,
+ 0x2e, 0xca, 0xfa, 0xdc, 0x59, 0xd5, 0xc3, 0x76,
+ 0x84, 0x9f, 0xa3, 0x37, 0x68, 0xc3, 0xf0, 0x47,
+ 0x2c, 0x68, 0xdb, 0x5e, 0xc3, 0x49, 0x4c, 0xe8,
+ 0x92, 0x85, 0xe2, 0x23, 0xd3, 0x3f, 0xad, 0x32,
+ 0xe5, 0x2b, 0x82, 0xd7, 0x8f, 0x99, 0x0a, 0x59,
+ 0x5c, 0x45, 0xd9, 0xb4, 0x51, 0x52, 0xc2, 0xae,
+ 0xbf, 0x80, 0xcf, 0xc9, 0xc9, 0x51, 0x24, 0x2a,
+ 0x3b, 0x3a, 0x4d, 0xae, 0xeb, 0xbd, 0x22, 0xc3,
+ 0x0e, 0x0f, 0x59, 0x25, 0x92, 0x17, 0xe9, 0x74,
+ 0xc7, 0x8b, 0x70, 0x70, 0x36, 0x55, 0x95, 0x75,
+ 0x4b, 0xad, 0x61, 0x2b, 0x09, 0xbc, 0x82, 0xf2,
+ 0x6e, 0x94, 0x43, 0xae, 0xc3, 0xd5, 0xcd, 0x8e,
+ 0xfe, 0x5b, 0x9a, 0x88, 0x43, 0x01, 0x75, 0xb2,
+ 0x23, 0x09, 0xf7, 0x89, 0x83, 0xe7, 0xfa, 0xf9,
+ 0xb4, 0x9b, 0xf8, 0xef, 0xbd, 0x1c, 0x92, 0xc1,
+ 0xda, 0x7e, 0xfe, 0x05, 0xba, 0x5a, 0xcd, 0x07,
+ 0x6a, 0x78, 0x9e, 0x5d, 0xfb, 0x11, 0x2f, 0x79,
+ 0x38, 0xb6, 0xc2, 0x5b, 0x6b, 0x51, 0xb4, 0x71,
+ 0xdd, 0xf7, 0x2a, 0xe4, 0xf4, 0x72, 0x76, 0xad,
+ 0xc2, 0xdd, 0x64, 0x5d, 0x79, 0xb6, 0xf5, 0x7a,
+ 0x77, 0x20, 0x05, 0x3d, 0x30, 0x06, 0xd4, 0x4c,
+ 0x0a, 0x2c, 0x98, 0x5a, 0xb9, 0xd4, 0x98, 0xa9,
+ 0x3f, 0xc6, 0x12, 0xea, 0x3b, 0x4b, 0xc5, 0x79,
+ 0x64, 0x63, 0x6b, 0x09, 0x54, 0x3b, 0x14, 0x27,
+ 0xba, 0x99, 0x80, 0xc8, 0x72, 0xa8, 0x12, 0x90,
+ 0x29, 0xba, 0x40, 0x54, 0x97, 0x2b, 0x7b, 0xfe,
+ 0xeb, 0xcd, 0x01, 0x05, 0x44, 0x72, 0xdb, 0x99,
+ 0xe4, 0x61, 0xc9, 0x69, 0xd6, 0xb9, 0x28, 0xd1,
+ 0x05, 0x3e, 0xf9, 0x0b, 0x49, 0x0a, 0x49, 0xe9,
+ 0x8d, 0x0e, 0xa7, 0x4a, 0x0f, 0xaf, 0x32, 0xd0,
+ 0xe0, 0xb2, 0x3a, 0x55, 0x58, 0xfe, 0x5c, 0x28,
+ 0x70, 0x51, 0x23, 0xb0, 0x7b, 0x6a, 0x5f, 0x1e,
+ 0xb8, 0x17, 0xd7, 0x94, 0x15, 0x8f, 0xee, 0x20,
+ 0xc7, 0x42, 0x25, 0x3e, 0x9a, 0x14, 0xd7, 0x60,
+ 0x72, 0x39, 0x47, 0x48, 0xa9, 0xfe, 0xdd, 0x47,
+ 0x0a, 0xb1, 0xe6, 0x60, 0x28, 0x8c, 0x11, 0x68,
+ 0xe1, 0xff, 0xd7, 0xce, 0xc8, 0xbe, 0xb3, 0xfe,
+ 0x27, 0x30, 0x09, 0x70, 0xd7, 0xfa, 0x02, 0x33,
+ 0x3a, 0x61, 0x2e, 0xc7, 0xff, 0xa4, 0x2a, 0xa8,
+ 0x6e, 0xb4, 0x79, 0x35, 0x6d, 0x4c, 0x1e, 0x38,
+ 0xf8, 0xee, 0xd4, 0x84, 0x4e, 0x6e, 0x28, 0xa7,
+ 0xce, 0xc8, 0xc1, 0xcf, 0x80, 0x05, 0xf3, 0x04,
+ 0xef, 0xc8, 0x18, 0x28, 0x2e, 0x8d, 0x5e, 0x0c,
+ 0xdf, 0xb8, 0x5f, 0x96, 0xe8, 0xc6, 0x9c, 0x2f,
+ 0xe5, 0xa6, 0x44, 0xd7, 0xe7, 0x99, 0x44, 0x0c,
+ 0xec, 0xd7, 0x05, 0x60, 0x97, 0xbb, 0x74, 0x77,
+ 0x58, 0xd5, 0xbb, 0x48, 0xde, 0x5a, 0xb2, 0x54,
+ 0x7f, 0x0e, 0x46, 0x70, 0x6a, 0x6f, 0x78, 0xa5,
+ 0x08, 0x89, 0x05, 0x4e, 0x7e, 0xa0, 0x69, 0xb4,
+ 0x40, 0x60, 0x55, 0x77, 0x75, 0x9b, 0x19, 0xf2,
+ 0xd5, 0x13, 0x80, 0x77, 0xf9, 0x4b, 0x3f, 0x1e,
+ 0xee, 0xe6, 0x76, 0x84, 0x7b, 0x8c, 0xe5, 0x27,
+ 0xa8, 0x0a, 0x91, 0x01, 0x68, 0x71, 0x8a, 0x3f,
+ 0x06, 0xab, 0xf6, 0xa9, 0xa5, 0xe6, 0x72, 0x92,
+ 0xe4, 0x67, 0xe2, 0xa2, 0x46, 0x35, 0x84, 0x55,
+ 0x7d, 0xca, 0xa8, 0x85, 0xd0, 0xf1, 0x3f, 0xbe,
+ 0xd7, 0x34, 0x64, 0xfc, 0xae, 0xe3, 0xe4, 0x04,
+ 0x9f, 0x66, 0x02, 0xb9, 0x88, 0x10, 0xd9, 0xc4,
+ 0x4c, 0x31, 0x43, 0x7a, 0x93, 0xe2, 0x9b, 0x56,
+ 0x43, 0x84, 0xdc, 0xdc, 0xde, 0x1d, 0xa4, 0x02,
+ 0x0e, 0xc2, 0xef, 0xc3, 0xf8, 0x78, 0xd1, 0xb2,
+ 0x6b, 0x63, 0x18, 0xc9, 0xa9, 0xe5, 0x72, 0xd8,
+ 0xf3, 0xb9, 0xd1, 0x8a, 0xc7, 0x1a, 0x02, 0x27,
+ 0x20, 0x77, 0x10, 0xe5, 0xc8, 0xd4, 0x4a, 0x47,
+ 0xe5, 0xdf, 0x5f, 0x01, 0xaa, 0xb0, 0xd4, 0x10,
+ 0xbb, 0x69, 0xe3, 0x36, 0xc8, 0xe1, 0x3d, 0x43,
+ 0xfb, 0x86, 0xcd, 0xcc, 0xbf, 0xf4, 0x88, 0xe0,
+ 0x20, 0xca, 0xb7, 0x1b, 0xf1, 0x2f, 0x5c, 0xee,
+ 0xd4, 0xd3, 0xa3, 0xcc, 0xa4, 0x1e, 0x1c, 0x47,
+ 0xfb, 0xbf, 0xfc, 0xa2, 0x41, 0x55, 0x9d, 0xf6,
+ 0x5a, 0x5e, 0x65, 0x32, 0x34, 0x7b, 0x52, 0x8d,
+ 0xd5, 0xd0, 0x20, 0x60, 0x03, 0xab, 0x3f, 0x8c,
+ 0xd4, 0x21, 0xea, 0x2a, 0xd9, 0xc4, 0xd0, 0xd3,
+ 0x65, 0xd8, 0x7a, 0x13, 0x28, 0x62, 0x32, 0x4b,
+ 0x2c, 0x87, 0x93, 0xa8, 0xb4, 0x52, 0x45, 0x09,
+ 0x44, 0xec, 0xec, 0xc3, 0x17, 0xdb, 0x9a, 0x4d,
+ 0x5c, 0xa9, 0x11, 0xd4, 0x7d, 0xaf, 0x9e, 0xf1,
+ 0x2d, 0xb2, 0x66, 0xc5, 0x1d, 0xed, 0xb7, 0xcd,
+ 0x0b, 0x25, 0x5e, 0x30, 0x47, 0x3f, 0x40, 0xf4,
+ 0xa1, 0xa0, 0x00, 0x94, 0x10, 0xc5, 0x6a, 0x63,
+ 0x1a, 0xd5, 0x88, 0x92, 0x8e, 0x82, 0x39, 0x87,
+ 0x3c, 0x78, 0x65, 0x58, 0x42, 0x75, 0x5b, 0xdd,
+ 0x77, 0x3e, 0x09, 0x4e, 0x76, 0x5b, 0xe6, 0x0e,
+ 0x4d, 0x38, 0xb2, 0xc0, 0xb8, 0x95, 0x01, 0x7a,
+ 0x10, 0xe0, 0xfb, 0x07, 0xf2, 0xab, 0x2d, 0x8c,
+ 0x32, 0xed, 0x2b, 0xc0, 0x46, 0xc2, 0xf5, 0x38,
+ 0x83, 0xf0, 0x17, 0xec, 0xc1, 0x20, 0x6a, 0x9a,
+ 0x0b, 0x00, 0xa0, 0x98, 0x22, 0x50, 0x23, 0xd5,
+ 0x80, 0x6b, 0xf6, 0x1f, 0xc3, 0xcc, 0x97, 0xc9,
+ 0x24, 0x9f, 0xf3, 0xaf, 0x43, 0x14, 0xd5, 0xa0
+};
+static const u8 dec_output010[] __initconst = {
+ 0x42, 0x93, 0xe4, 0xeb, 0x97, 0xb0, 0x57, 0xbf,
+ 0x1a, 0x8b, 0x1f, 0xe4, 0x5f, 0x36, 0x20, 0x3c,
+ 0xef, 0x0a, 0xa9, 0x48, 0x5f, 0x5f, 0x37, 0x22,
+ 0x3a, 0xde, 0xe3, 0xae, 0xbe, 0xad, 0x07, 0xcc,
+ 0xb1, 0xf6, 0xf5, 0xf9, 0x56, 0xdd, 0xe7, 0x16,
+ 0x1e, 0x7f, 0xdf, 0x7a, 0x9e, 0x75, 0xb7, 0xc7,
+ 0xbe, 0xbe, 0x8a, 0x36, 0x04, 0xc0, 0x10, 0xf4,
+ 0x95, 0x20, 0x03, 0xec, 0xdc, 0x05, 0xa1, 0x7d,
+ 0xc4, 0xa9, 0x2c, 0x82, 0xd0, 0xbc, 0x8b, 0xc5,
+ 0xc7, 0x45, 0x50, 0xf6, 0xa2, 0x1a, 0xb5, 0x46,
+ 0x3b, 0x73, 0x02, 0xa6, 0x83, 0x4b, 0x73, 0x82,
+ 0x58, 0x5e, 0x3b, 0x65, 0x2f, 0x0e, 0xfd, 0x2b,
+ 0x59, 0x16, 0xce, 0xa1, 0x60, 0x9c, 0xe8, 0x3a,
+ 0x99, 0xed, 0x8d, 0x5a, 0xcf, 0xf6, 0x83, 0xaf,
+ 0xba, 0xd7, 0x73, 0x73, 0x40, 0x97, 0x3d, 0xca,
+ 0xef, 0x07, 0x57, 0xe6, 0xd9, 0x70, 0x0e, 0x95,
+ 0xae, 0xa6, 0x8d, 0x04, 0xcc, 0xee, 0xf7, 0x09,
+ 0x31, 0x77, 0x12, 0xa3, 0x23, 0x97, 0x62, 0xb3,
+ 0x7b, 0x32, 0xfb, 0x80, 0x14, 0x48, 0x81, 0xc3,
+ 0xe5, 0xea, 0x91, 0x39, 0x52, 0x81, 0xa2, 0x4f,
+ 0xe4, 0xb3, 0x09, 0xff, 0xde, 0x5e, 0xe9, 0x58,
+ 0x84, 0x6e, 0xf9, 0x3d, 0xdf, 0x25, 0xea, 0xad,
+ 0xae, 0xe6, 0x9a, 0xd1, 0x89, 0x55, 0xd3, 0xde,
+ 0x6c, 0x52, 0xdb, 0x70, 0xfe, 0x37, 0xce, 0x44,
+ 0x0a, 0xa8, 0x25, 0x5f, 0x92, 0xc1, 0x33, 0x4a,
+ 0x4f, 0x9b, 0x62, 0x35, 0xff, 0xce, 0xc0, 0xa9,
+ 0x60, 0xce, 0x52, 0x00, 0x97, 0x51, 0x35, 0x26,
+ 0x2e, 0xb9, 0x36, 0xa9, 0x87, 0x6e, 0x1e, 0xcc,
+ 0x91, 0x78, 0x53, 0x98, 0x86, 0x5b, 0x9c, 0x74,
+ 0x7d, 0x88, 0x33, 0xe1, 0xdf, 0x37, 0x69, 0x2b,
+ 0xbb, 0xf1, 0x4d, 0xf4, 0xd1, 0xf1, 0x39, 0x93,
+ 0x17, 0x51, 0x19, 0xe3, 0x19, 0x1e, 0x76, 0x37,
+ 0x25, 0xfb, 0x09, 0x27, 0x6a, 0xab, 0x67, 0x6f,
+ 0x14, 0x12, 0x64, 0xe7, 0xc4, 0x07, 0xdf, 0x4d,
+ 0x17, 0xbb, 0x6d, 0xe0, 0xe9, 0xb9, 0xab, 0xca,
+ 0x10, 0x68, 0xaf, 0x7e, 0xb7, 0x33, 0x54, 0x73,
+ 0x07, 0x6e, 0xf7, 0x81, 0x97, 0x9c, 0x05, 0x6f,
+ 0x84, 0x5f, 0xd2, 0x42, 0xfb, 0x38, 0xcf, 0xd1,
+ 0x2f, 0x14, 0x30, 0x88, 0x98, 0x4d, 0x5a, 0xa9,
+ 0x76, 0xd5, 0x4f, 0x3e, 0x70, 0x6c, 0x85, 0x76,
+ 0xd7, 0x01, 0xa0, 0x1a, 0xc8, 0x4e, 0xaa, 0xac,
+ 0x78, 0xfe, 0x46, 0xde, 0x6a, 0x05, 0x46, 0xa7,
+ 0x43, 0x0c, 0xb9, 0xde, 0xb9, 0x68, 0xfb, 0xce,
+ 0x42, 0x99, 0x07, 0x4d, 0x0b, 0x3b, 0x5a, 0x30,
+ 0x35, 0xa8, 0xf9, 0x3a, 0x73, 0xef, 0x0f, 0xdb,
+ 0x1e, 0x16, 0x42, 0xc4, 0xba, 0xae, 0x58, 0xaa,
+ 0xf8, 0xe5, 0x75, 0x2f, 0x1b, 0x15, 0x5c, 0xfd,
+ 0x0a, 0x97, 0xd0, 0xe4, 0x37, 0x83, 0x61, 0x5f,
+ 0x43, 0xa6, 0xc7, 0x3f, 0x38, 0x59, 0xe6, 0xeb,
+ 0xa3, 0x90, 0xc3, 0xaa, 0xaa, 0x5a, 0xd3, 0x34,
+ 0xd4, 0x17, 0xc8, 0x65, 0x3e, 0x57, 0xbc, 0x5e,
+ 0xdd, 0x9e, 0xb7, 0xf0, 0x2e, 0x5b, 0xb2, 0x1f,
+ 0x8a, 0x08, 0x0d, 0x45, 0x91, 0x0b, 0x29, 0x53,
+ 0x4f, 0x4c, 0x5a, 0x73, 0x56, 0xfe, 0xaf, 0x41,
+ 0x01, 0x39, 0x0a, 0x24, 0x3c, 0x7e, 0xbe, 0x4e,
+ 0x53, 0xf3, 0xeb, 0x06, 0x66, 0x51, 0x28, 0x1d,
+ 0xbd, 0x41, 0x0a, 0x01, 0xab, 0x16, 0x47, 0x27,
+ 0x47, 0x47, 0xf7, 0xcb, 0x46, 0x0a, 0x70, 0x9e,
+ 0x01, 0x9c, 0x09, 0xe1, 0x2a, 0x00, 0x1a, 0xd8,
+ 0xd4, 0x79, 0x9d, 0x80, 0x15, 0x8e, 0x53, 0x2a,
+ 0x65, 0x83, 0x78, 0x3e, 0x03, 0x00, 0x07, 0x12,
+ 0x1f, 0x33, 0x3e, 0x7b, 0x13, 0x37, 0xf1, 0xc3,
+ 0xef, 0xb7, 0xc1, 0x20, 0x3c, 0x3e, 0x67, 0x66,
+ 0x5d, 0x88, 0xa7, 0x7d, 0x33, 0x50, 0x77, 0xb0,
+ 0x28, 0x8e, 0xe7, 0x2c, 0x2e, 0x7a, 0xf4, 0x3c,
+ 0x8d, 0x74, 0x83, 0xaf, 0x8e, 0x87, 0x0f, 0xe4,
+ 0x50, 0xff, 0x84, 0x5c, 0x47, 0x0c, 0x6a, 0x49,
+ 0xbf, 0x42, 0x86, 0x77, 0x15, 0x48, 0xa5, 0x90,
+ 0x5d, 0x93, 0xd6, 0x2a, 0x11, 0xd5, 0xd5, 0x11,
+ 0xaa, 0xce, 0xe7, 0x6f, 0xa5, 0xb0, 0x09, 0x2c,
+ 0x8d, 0xd3, 0x92, 0xf0, 0x5a, 0x2a, 0xda, 0x5b,
+ 0x1e, 0xd5, 0x9a, 0xc4, 0xc4, 0xf3, 0x49, 0x74,
+ 0x41, 0xca, 0xe8, 0xc1, 0xf8, 0x44, 0xd6, 0x3c,
+ 0xae, 0x6c, 0x1d, 0x9a, 0x30, 0x04, 0x4d, 0x27,
+ 0x0e, 0xb1, 0x5f, 0x59, 0xa2, 0x24, 0xe8, 0xe1,
+ 0x98, 0xc5, 0x6a, 0x4c, 0xfe, 0x41, 0xd2, 0x27,
+ 0x42, 0x52, 0xe1, 0xe9, 0x7d, 0x62, 0xe4, 0x88,
+ 0x0f, 0xad, 0xb2, 0x70, 0xcb, 0x9d, 0x4c, 0x27,
+ 0x2e, 0x76, 0x1e, 0x1a, 0x63, 0x65, 0xf5, 0x3b,
+ 0xf8, 0x57, 0x69, 0xeb, 0x5b, 0x38, 0x26, 0x39,
+ 0x33, 0x25, 0x45, 0x3e, 0x91, 0xb8, 0xd8, 0xc7,
+ 0xd5, 0x42, 0xc0, 0x22, 0x31, 0x74, 0xf4, 0xbc,
+ 0x0c, 0x23, 0xf1, 0xca, 0xc1, 0x8d, 0xd7, 0xbe,
+ 0xc9, 0x62, 0xe4, 0x08, 0x1a, 0xcf, 0x36, 0xd5,
+ 0xfe, 0x55, 0x21, 0x59, 0x91, 0x87, 0x87, 0xdf,
+ 0x06, 0xdb, 0xdf, 0x96, 0x45, 0x58, 0xda, 0x05,
+ 0xcd, 0x50, 0x4d, 0xd2, 0x7d, 0x05, 0x18, 0x73,
+ 0x6a, 0x8d, 0x11, 0x85, 0xa6, 0x88, 0xe8, 0xda,
+ 0xe6, 0x30, 0x33, 0xa4, 0x89, 0x31, 0x75, 0xbe,
+ 0x69, 0x43, 0x84, 0x43, 0x50, 0x87, 0xdd, 0x71,
+ 0x36, 0x83, 0xc3, 0x78, 0x74, 0x24, 0x0a, 0xed,
+ 0x7b, 0xdb, 0xa4, 0x24, 0x0b, 0xb9, 0x7e, 0x5d,
+ 0xff, 0xde, 0xb1, 0xef, 0x61, 0x5a, 0x45, 0x33,
+ 0xf6, 0x17, 0x07, 0x08, 0x98, 0x83, 0x92, 0x0f,
+ 0x23, 0x6d, 0xe6, 0xaa, 0x17, 0x54, 0xad, 0x6a,
+ 0xc8, 0xdb, 0x26, 0xbe, 0xb8, 0xb6, 0x08, 0xfa,
+ 0x68, 0xf1, 0xd7, 0x79, 0x6f, 0x18, 0xb4, 0x9e,
+ 0x2d, 0x3f, 0x1b, 0x64, 0xaf, 0x8d, 0x06, 0x0e,
+ 0x49, 0x28, 0xe0, 0x5d, 0x45, 0x68, 0x13, 0x87,
+ 0xfa, 0xde, 0x40, 0x7b, 0xd2, 0xc3, 0x94, 0xd5,
+ 0xe1, 0xd9, 0xc2, 0xaf, 0x55, 0x89, 0xeb, 0xb4,
+ 0x12, 0x59, 0xa8, 0xd4, 0xc5, 0x29, 0x66, 0x38,
+ 0xe6, 0xac, 0x22, 0x22, 0xd9, 0x64, 0x9b, 0x34,
+ 0x0a, 0x32, 0x9f, 0xc2, 0xbf, 0x17, 0x6c, 0x3f,
+ 0x71, 0x7a, 0x38, 0x6b, 0x98, 0xfb, 0x49, 0x36,
+ 0x89, 0xc9, 0xe2, 0xd6, 0xc7, 0x5d, 0xd0, 0x69,
+ 0x5f, 0x23, 0x35, 0xc9, 0x30, 0xe2, 0xfd, 0x44,
+ 0x58, 0x39, 0xd7, 0x97, 0xfb, 0x5c, 0x00, 0xd5,
+ 0x4f, 0x7a, 0x1a, 0x95, 0x8b, 0x62, 0x4b, 0xce,
+ 0xe5, 0x91, 0x21, 0x7b, 0x30, 0x00, 0xd6, 0xdd,
+ 0x6d, 0x02, 0x86, 0x49, 0x0f, 0x3c, 0x1a, 0x27,
+ 0x3c, 0xd3, 0x0e, 0x71, 0xf2, 0xff, 0xf5, 0x2f,
+ 0x87, 0xac, 0x67, 0x59, 0x81, 0xa3, 0xf7, 0xf8,
+ 0xd6, 0x11, 0x0c, 0x84, 0xa9, 0x03, 0xee, 0x2a,
+ 0xc4, 0xf3, 0x22, 0xab, 0x7c, 0xe2, 0x25, 0xf5,
+ 0x67, 0xa3, 0xe4, 0x11, 0xe0, 0x59, 0xb3, 0xca,
+ 0x87, 0xa0, 0xae, 0xc9, 0xa6, 0x62, 0x1b, 0x6e,
+ 0x4d, 0x02, 0x6b, 0x07, 0x9d, 0xfd, 0xd0, 0x92,
+ 0x06, 0xe1, 0xb2, 0x9a, 0x4a, 0x1f, 0x1f, 0x13,
+ 0x49, 0x99, 0x97, 0x08, 0xde, 0x7f, 0x98, 0xaf,
+ 0x51, 0x98, 0xee, 0x2c, 0xcb, 0xf0, 0x0b, 0xc6,
+ 0xb6, 0xb7, 0x2d, 0x9a, 0xb1, 0xac, 0xa6, 0xe3,
+ 0x15, 0x77, 0x9d, 0x6b, 0x1a, 0xe4, 0xfc, 0x8b,
+ 0xf2, 0x17, 0x59, 0x08, 0x04, 0x58, 0x81, 0x9d,
+ 0x1b, 0x1b, 0x69, 0x55, 0xc2, 0xb4, 0x3c, 0x1f,
+ 0x50, 0xf1, 0x7f, 0x77, 0x90, 0x4c, 0x66, 0x40,
+ 0x5a, 0xc0, 0x33, 0x1f, 0xcb, 0x05, 0x6d, 0x5c,
+ 0x06, 0x87, 0x52, 0xa2, 0x8f, 0x26, 0xd5, 0x4f
+};
+static const u8 dec_assoc010[] __initconst = {
+ 0xd2, 0xa1, 0x70, 0xdb, 0x7a, 0xf8, 0xfa, 0x27,
+ 0xba, 0x73, 0x0f, 0xbf, 0x3d, 0x1e, 0x82, 0xb2
+};
+static const u8 dec_nonce010[] __initconst = {
+ 0xdb, 0x92, 0x0f, 0x7f, 0x17, 0x54, 0x0c, 0x30
+};
+static const u8 dec_key010[] __initconst = {
+ 0x47, 0x11, 0xeb, 0x86, 0x2b, 0x2c, 0xab, 0x44,
+ 0x34, 0xda, 0x7f, 0x57, 0x03, 0x39, 0x0c, 0xaf,
+ 0x2c, 0x14, 0xfd, 0x65, 0x23, 0xe9, 0x8e, 0x74,
+ 0xd5, 0x08, 0x68, 0x08, 0xe7, 0xb4, 0x72, 0xd7
+};
+
+static const u8 dec_input011[] __initconst = {
+ 0x6a, 0xfc, 0x4b, 0x25, 0xdf, 0xc0, 0xe4, 0xe8,
+ 0x17, 0x4d, 0x4c, 0xc9, 0x7e, 0xde, 0x3a, 0xcc,
+ 0x3c, 0xba, 0x6a, 0x77, 0x47, 0xdb, 0xe3, 0x74,
+ 0x7a, 0x4d, 0x5f, 0x8d, 0x37, 0x55, 0x80, 0x73,
+ 0x90, 0x66, 0x5d, 0x3a, 0x7d, 0x5d, 0x86, 0x5e,
+ 0x8d, 0xfd, 0x83, 0xff, 0x4e, 0x74, 0x6f, 0xf9,
+ 0xe6, 0x70, 0x17, 0x70, 0x3e, 0x96, 0xa7, 0x7e,
+ 0xcb, 0xab, 0x8f, 0x58, 0x24, 0x9b, 0x01, 0xfd,
+ 0xcb, 0xe6, 0x4d, 0x9b, 0xf0, 0x88, 0x94, 0x57,
+ 0x66, 0xef, 0x72, 0x4c, 0x42, 0x6e, 0x16, 0x19,
+ 0x15, 0xea, 0x70, 0x5b, 0xac, 0x13, 0xdb, 0x9f,
+ 0x18, 0xe2, 0x3c, 0x26, 0x97, 0xbc, 0xdc, 0x45,
+ 0x8c, 0x6c, 0x24, 0x69, 0x9c, 0xf7, 0x65, 0x1e,
+ 0x18, 0x59, 0x31, 0x7c, 0xe4, 0x73, 0xbc, 0x39,
+ 0x62, 0xc6, 0x5c, 0x9f, 0xbf, 0xfa, 0x90, 0x03,
+ 0xc9, 0x72, 0x26, 0xb6, 0x1b, 0xc2, 0xb7, 0x3f,
+ 0xf2, 0x13, 0x77, 0xf2, 0x8d, 0xb9, 0x47, 0xd0,
+ 0x53, 0xdd, 0xc8, 0x91, 0x83, 0x8b, 0xb1, 0xce,
+ 0xa3, 0xfe, 0xcd, 0xd9, 0xdd, 0x92, 0x7b, 0xdb,
+ 0xb8, 0xfb, 0xc9, 0x2d, 0x01, 0x59, 0x39, 0x52,
+ 0xad, 0x1b, 0xec, 0xcf, 0xd7, 0x70, 0x13, 0x21,
+ 0xf5, 0x47, 0xaa, 0x18, 0x21, 0x5c, 0xc9, 0x9a,
+ 0xd2, 0x6b, 0x05, 0x9c, 0x01, 0xa1, 0xda, 0x35,
+ 0x5d, 0xb3, 0x70, 0xe6, 0xa9, 0x80, 0x8b, 0x91,
+ 0xb7, 0xb3, 0x5f, 0x24, 0x9a, 0xb7, 0xd1, 0x6b,
+ 0xa1, 0x1c, 0x50, 0xba, 0x49, 0xe0, 0xee, 0x2e,
+ 0x75, 0xac, 0x69, 0xc0, 0xeb, 0x03, 0xdd, 0x19,
+ 0xe5, 0xf6, 0x06, 0xdd, 0xc3, 0xd7, 0x2b, 0x07,
+ 0x07, 0x30, 0xa7, 0x19, 0x0c, 0xbf, 0xe6, 0x18,
+ 0xcc, 0xb1, 0x01, 0x11, 0x85, 0x77, 0x1d, 0x96,
+ 0xa7, 0xa3, 0x00, 0x84, 0x02, 0xa2, 0x83, 0x68,
+ 0xda, 0x17, 0x27, 0xc8, 0x7f, 0x23, 0xb7, 0xf4,
+ 0x13, 0x85, 0xcf, 0xdd, 0x7a, 0x7d, 0x24, 0x57,
+ 0xfe, 0x05, 0x93, 0xf5, 0x74, 0xce, 0xed, 0x0c,
+ 0x20, 0x98, 0x8d, 0x92, 0x30, 0xa1, 0x29, 0x23,
+ 0x1a, 0xa0, 0x4f, 0x69, 0x56, 0x4c, 0xe1, 0xc8,
+ 0xce, 0xf6, 0x9a, 0x0c, 0xa4, 0xfa, 0x04, 0xf6,
+ 0x62, 0x95, 0xf2, 0xfa, 0xc7, 0x40, 0x68, 0x40,
+ 0x8f, 0x41, 0xda, 0xb4, 0x26, 0x6f, 0x70, 0xab,
+ 0x40, 0x61, 0xa4, 0x0e, 0x75, 0xfb, 0x86, 0xeb,
+ 0x9d, 0x9a, 0x1f, 0xec, 0x76, 0x99, 0xe7, 0xea,
+ 0xaa, 0x1e, 0x2d, 0xb5, 0xd4, 0xa6, 0x1a, 0xb8,
+ 0x61, 0x0a, 0x1d, 0x16, 0x5b, 0x98, 0xc2, 0x31,
+ 0x40, 0xe7, 0x23, 0x1d, 0x66, 0x99, 0xc8, 0xc0,
+ 0xd7, 0xce, 0xf3, 0x57, 0x40, 0x04, 0x3f, 0xfc,
+ 0xea, 0xb3, 0xfc, 0xd2, 0xd3, 0x99, 0xa4, 0x94,
+ 0x69, 0xa0, 0xef, 0xd1, 0x85, 0xb3, 0xa6, 0xb1,
+ 0x28, 0xbf, 0x94, 0x67, 0x22, 0xc3, 0x36, 0x46,
+ 0xf8, 0xd2, 0x0f, 0x5f, 0xf4, 0x59, 0x80, 0xe6,
+ 0x2d, 0x43, 0x08, 0x7d, 0x19, 0x09, 0x97, 0xa7,
+ 0x4c, 0x3d, 0x8d, 0xba, 0x65, 0x62, 0xa3, 0x71,
+ 0x33, 0x29, 0x62, 0xdb, 0xc1, 0x33, 0x34, 0x1a,
+ 0x63, 0x33, 0x16, 0xb6, 0x64, 0x7e, 0xab, 0x33,
+ 0xf0, 0xe6, 0x26, 0x68, 0xba, 0x1d, 0x2e, 0x38,
+ 0x08, 0xe6, 0x02, 0xd3, 0x25, 0x2c, 0x47, 0x23,
+ 0x58, 0x34, 0x0f, 0x9d, 0x63, 0x4f, 0x63, 0xbb,
+ 0x7f, 0x3b, 0x34, 0x38, 0xa7, 0xb5, 0x8d, 0x65,
+ 0xd9, 0x9f, 0x79, 0x55, 0x3e, 0x4d, 0xe7, 0x73,
+ 0xd8, 0xf6, 0x98, 0x97, 0x84, 0x60, 0x9c, 0xc8,
+ 0xa9, 0x3c, 0xf6, 0xdc, 0x12, 0x5c, 0xe1, 0xbb,
+ 0x0b, 0x8b, 0x98, 0x9c, 0x9d, 0x26, 0x7c, 0x4a,
+ 0xe6, 0x46, 0x36, 0x58, 0x21, 0x4a, 0xee, 0xca,
+ 0xd7, 0x3b, 0xc2, 0x6c, 0x49, 0x2f, 0xe5, 0xd5,
+ 0x03, 0x59, 0x84, 0x53, 0xcb, 0xfe, 0x92, 0x71,
+ 0x2e, 0x7c, 0x21, 0xcc, 0x99, 0x85, 0x7f, 0xb8,
+ 0x74, 0x90, 0x13, 0x42, 0x3f, 0xe0, 0x6b, 0x1d,
+ 0xf2, 0x4d, 0x54, 0xd4, 0xfc, 0x3a, 0x05, 0xe6,
+ 0x74, 0xaf, 0xa6, 0xa0, 0x2a, 0x20, 0x23, 0x5d,
+ 0x34, 0x5c, 0xd9, 0x3e, 0x4e, 0xfa, 0x93, 0xe7,
+ 0xaa, 0xe9, 0x6f, 0x08, 0x43, 0x67, 0x41, 0xc5,
+ 0xad, 0xfb, 0x31, 0x95, 0x82, 0x73, 0x32, 0xd8,
+ 0xa6, 0xa3, 0xed, 0x0e, 0x2d, 0xf6, 0x5f, 0xfd,
+ 0x80, 0xa6, 0x7a, 0xe0, 0xdf, 0x78, 0x15, 0x29,
+ 0x74, 0x33, 0xd0, 0x9e, 0x83, 0x86, 0x72, 0x22,
+ 0x57, 0x29, 0xb9, 0x9e, 0x5d, 0xd3, 0x1a, 0xb5,
+ 0x96, 0x72, 0x41, 0x3d, 0xf1, 0x64, 0x43, 0x67,
+ 0xee, 0xaa, 0x5c, 0xd3, 0x9a, 0x96, 0x13, 0x11,
+ 0x5d, 0xf3, 0x0c, 0x87, 0x82, 0x1e, 0x41, 0x9e,
+ 0xd0, 0x27, 0xd7, 0x54, 0x3b, 0x67, 0x73, 0x09,
+ 0x91, 0xe9, 0xd5, 0x36, 0xa7, 0xb5, 0x55, 0xe4,
+ 0xf3, 0x21, 0x51, 0x49, 0x22, 0x07, 0x55, 0x4f,
+ 0x44, 0x4b, 0xd2, 0x15, 0x93, 0x17, 0x2a, 0xfa,
+ 0x4d, 0x4a, 0x57, 0xdb, 0x4c, 0xa6, 0xeb, 0xec,
+ 0x53, 0x25, 0x6c, 0x21, 0xed, 0x00, 0x4c, 0x3b,
+ 0xca, 0x14, 0x57, 0xa9, 0xd6, 0x6a, 0xcd, 0x8d,
+ 0x5e, 0x74, 0xac, 0x72, 0xc1, 0x97, 0xe5, 0x1b,
+ 0x45, 0x4e, 0xda, 0xfc, 0xcc, 0x40, 0xe8, 0x48,
+ 0x88, 0x0b, 0xa3, 0xe3, 0x8d, 0x83, 0x42, 0xc3,
+ 0x23, 0xfd, 0x68, 0xb5, 0x8e, 0xf1, 0x9d, 0x63,
+ 0x77, 0xe9, 0xa3, 0x8e, 0x8c, 0x26, 0x6b, 0xbd,
+ 0x72, 0x73, 0x35, 0x0c, 0x03, 0xf8, 0x43, 0x78,
+ 0x52, 0x71, 0x15, 0x1f, 0x71, 0x5d, 0x6e, 0xed,
+ 0xb9, 0xcc, 0x86, 0x30, 0xdb, 0x2b, 0xd3, 0x82,
+ 0x88, 0x23, 0x71, 0x90, 0x53, 0x5c, 0xa9, 0x2f,
+ 0x76, 0x01, 0xb7, 0x9a, 0xfe, 0x43, 0x55, 0xa3,
+ 0x04, 0x9b, 0x0e, 0xe4, 0x59, 0xdf, 0xc9, 0xe9,
+ 0xb1, 0xea, 0x29, 0x28, 0x3c, 0x5c, 0xae, 0x72,
+ 0x84, 0xb6, 0xc6, 0xeb, 0x0c, 0x27, 0x07, 0x74,
+ 0x90, 0x0d, 0x31, 0xb0, 0x00, 0x77, 0xe9, 0x40,
+ 0x70, 0x6f, 0x68, 0xa7, 0xfd, 0x06, 0xec, 0x4b,
+ 0xc0, 0xb7, 0xac, 0xbc, 0x33, 0xb7, 0x6d, 0x0a,
+ 0xbd, 0x12, 0x1b, 0x59, 0xcb, 0xdd, 0x32, 0xf5,
+ 0x1d, 0x94, 0x57, 0x76, 0x9e, 0x0c, 0x18, 0x98,
+ 0x71, 0xd7, 0x2a, 0xdb, 0x0b, 0x7b, 0xa7, 0x71,
+ 0xb7, 0x67, 0x81, 0x23, 0x96, 0xae, 0xb9, 0x7e,
+ 0x32, 0x43, 0x92, 0x8a, 0x19, 0xa0, 0xc4, 0xd4,
+ 0x3b, 0x57, 0xf9, 0x4a, 0x2c, 0xfb, 0x51, 0x46,
+ 0xbb, 0xcb, 0x5d, 0xb3, 0xef, 0x13, 0x93, 0x6e,
+ 0x68, 0x42, 0x54, 0x57, 0xd3, 0x6a, 0x3a, 0x8f,
+ 0x9d, 0x66, 0xbf, 0xbd, 0x36, 0x23, 0xf5, 0x93,
+ 0x83, 0x7b, 0x9c, 0xc0, 0xdd, 0xc5, 0x49, 0xc0,
+ 0x64, 0xed, 0x07, 0x12, 0xb3, 0xe6, 0xe4, 0xe5,
+ 0x38, 0x95, 0x23, 0xb1, 0xa0, 0x3b, 0x1a, 0x61,
+ 0xda, 0x17, 0xac, 0xc3, 0x58, 0xdd, 0x74, 0x64,
+ 0x22, 0x11, 0xe8, 0x32, 0x1d, 0x16, 0x93, 0x85,
+ 0x99, 0xa5, 0x9c, 0x34, 0x55, 0xb1, 0xe9, 0x20,
+ 0x72, 0xc9, 0x28, 0x7b, 0x79, 0x00, 0xa1, 0xa6,
+ 0xa3, 0x27, 0x40, 0x18, 0x8a, 0x54, 0xe0, 0xcc,
+ 0xe8, 0x4e, 0x8e, 0x43, 0x96, 0xe7, 0x3f, 0xc8,
+ 0xe9, 0xb2, 0xf9, 0xc9, 0xda, 0x04, 0x71, 0x50,
+ 0x47, 0xe4, 0xaa, 0xce, 0xa2, 0x30, 0xc8, 0xe4,
+ 0xac, 0xc7, 0x0d, 0x06, 0x2e, 0xe6, 0xe8, 0x80,
+ 0x36, 0x29, 0x9e, 0x01, 0xb8, 0xc3, 0xf0, 0xa0,
+ 0x5d, 0x7a, 0xca, 0x4d, 0xa0, 0x57, 0xbd, 0x2a,
+ 0x45, 0xa7, 0x7f, 0x9c, 0x93, 0x07, 0x8f, 0x35,
+ 0x67, 0x92, 0xe3, 0xe9, 0x7f, 0xa8, 0x61, 0x43,
+ 0x9e, 0x25, 0x4f, 0x33, 0x76, 0x13, 0x6e, 0x12,
+ 0xb9, 0xdd, 0xa4, 0x7c, 0x08, 0x9f, 0x7c, 0xe7,
+ 0x0a, 0x8d, 0x84, 0x06, 0xa4, 0x33, 0x17, 0x34,
+ 0x5e, 0x10, 0x7c, 0xc0, 0xa8, 0x3d, 0x1f, 0x42,
+ 0x20, 0x51, 0x65, 0x5d, 0x09, 0xc3, 0xaa, 0xc0,
+ 0xc8, 0x0d, 0xf0, 0x79, 0xbc, 0x20, 0x1b, 0x95,
+ 0xe7, 0x06, 0x7d, 0x47, 0x20, 0x03, 0x1a, 0x74,
+ 0xdd, 0xe2, 0xd4, 0xae, 0x38, 0x71, 0x9b, 0xf5,
+ 0x80, 0xec, 0x08, 0x4e, 0x56, 0xba, 0x76, 0x12,
+ 0x1a, 0xdf, 0x48, 0xf3, 0xae, 0xb3, 0xe6, 0xe6,
+ 0xbe, 0xc0, 0x91, 0x2e, 0x01, 0xb3, 0x01, 0x86,
+ 0xa2, 0xb9, 0x52, 0xd1, 0x21, 0xae, 0xd4, 0x97,
+ 0x1d, 0xef, 0x41, 0x12, 0x95, 0x3d, 0x48, 0x45,
+ 0x1c, 0x56, 0x32, 0x8f, 0xb8, 0x43, 0xbb, 0x19,
+ 0xf3, 0xca, 0xe9, 0xeb, 0x6d, 0x84, 0xbe, 0x86,
+ 0x06, 0xe2, 0x36, 0xb2, 0x62, 0x9d, 0xd3, 0x4c,
+ 0x48, 0x18, 0x54, 0x13, 0x4e, 0xcf, 0xfd, 0xba,
+ 0x84, 0xb9, 0x30, 0x53, 0xcf, 0xfb, 0xb9, 0x29,
+ 0x8f, 0xdc, 0x9f, 0xef, 0x60, 0x0b, 0x64, 0xf6,
+ 0x8b, 0xee, 0xa6, 0x91, 0xc2, 0x41, 0x6c, 0xf6,
+ 0xfa, 0x79, 0x67, 0x4b, 0xc1, 0x3f, 0xaf, 0x09,
+ 0x81, 0xd4, 0x5d, 0xcb, 0x09, 0xdf, 0x36, 0x31,
+ 0xc0, 0x14, 0x3c, 0x7c, 0x0e, 0x65, 0x95, 0x99,
+ 0x6d, 0xa3, 0xf4, 0xd7, 0x38, 0xee, 0x1a, 0x2b,
+ 0x37, 0xe2, 0xa4, 0x3b, 0x4b, 0xd0, 0x65, 0xca,
+ 0xf8, 0xc3, 0xe8, 0x15, 0x20, 0xef, 0xf2, 0x00,
+ 0xfd, 0x01, 0x09, 0xc5, 0xc8, 0x17, 0x04, 0x93,
+ 0xd0, 0x93, 0x03, 0x55, 0xc5, 0xfe, 0x32, 0xa3,
+ 0x3e, 0x28, 0x2d, 0x3b, 0x93, 0x8a, 0xcc, 0x07,
+ 0x72, 0x80, 0x8b, 0x74, 0x16, 0x24, 0xbb, 0xda,
+ 0x94, 0x39, 0x30, 0x8f, 0xb1, 0xcd, 0x4a, 0x90,
+ 0x92, 0x7c, 0x14, 0x8f, 0x95, 0x4e, 0xac, 0x9b,
+ 0xd8, 0x8f, 0x1a, 0x87, 0xa4, 0x32, 0x27, 0x8a,
+ 0xba, 0xf7, 0x41, 0xcf, 0x84, 0x37, 0x19, 0xe6,
+ 0x06, 0xf5, 0x0e, 0xcf, 0x36, 0xf5, 0x9e, 0x6c,
+ 0xde, 0xbc, 0xff, 0x64, 0x7e, 0x4e, 0x59, 0x57,
+ 0x48, 0xfe, 0x14, 0xf7, 0x9c, 0x93, 0x5d, 0x15,
+ 0xad, 0xcc, 0x11, 0xb1, 0x17, 0x18, 0xb2, 0x7e,
+ 0xcc, 0xab, 0xe9, 0xce, 0x7d, 0x77, 0x5b, 0x51,
+ 0x1b, 0x1e, 0x20, 0xa8, 0x32, 0x06, 0x0e, 0x75,
+ 0x93, 0xac, 0xdb, 0x35, 0x37, 0x1f, 0xe9, 0x19,
+ 0x1d, 0xb4, 0x71, 0x97, 0xd6, 0x4e, 0x2c, 0x08,
+ 0xa5, 0x13, 0xf9, 0x0e, 0x7e, 0x78, 0x6e, 0x14,
+ 0xe0, 0xa9, 0xb9, 0x96, 0x4c, 0x80, 0x82, 0xba,
+ 0x17, 0xb3, 0x9d, 0x69, 0xb0, 0x84, 0x46, 0xff,
+ 0xf9, 0x52, 0x79, 0x94, 0x58, 0x3a, 0x62, 0x90,
+ 0x15, 0x35, 0x71, 0x10, 0x37, 0xed, 0xa1, 0x8e,
+ 0x53, 0x6e, 0xf4, 0x26, 0x57, 0x93, 0x15, 0x93,
+ 0xf6, 0x81, 0x2c, 0x5a, 0x10, 0xda, 0x92, 0xad,
+ 0x2f, 0xdb, 0x28, 0x31, 0x2d, 0x55, 0x04, 0xd2,
+ 0x06, 0x28, 0x8c, 0x1e, 0xdc, 0xea, 0x54, 0xac,
+ 0xff, 0xb7, 0x6c, 0x30, 0x15, 0xd4, 0xb4, 0x0d,
+ 0x00, 0x93, 0x57, 0xdd, 0xd2, 0x07, 0x07, 0x06,
+ 0xd9, 0x43, 0x9b, 0xcd, 0x3a, 0xf4, 0x7d, 0x4c,
+ 0x36, 0x5d, 0x23, 0xa2, 0xcc, 0x57, 0x40, 0x91,
+ 0xe9, 0x2c, 0x2f, 0x2c, 0xd5, 0x30, 0x9b, 0x17,
+ 0xb0, 0xc9, 0xf7, 0xa7, 0x2f, 0xd1, 0x93, 0x20,
+ 0x6b, 0xc6, 0xc1, 0xe4, 0x6f, 0xcb, 0xd1, 0xe7,
+ 0x09, 0x0f, 0x9e, 0xdc, 0xaa, 0x9f, 0x2f, 0xdf,
+ 0x56, 0x9f, 0xd4, 0x33, 0x04, 0xaf, 0xd3, 0x6c,
+ 0x58, 0x61, 0xf0, 0x30, 0xec, 0xf2, 0x7f, 0xf2,
+ 0x9c, 0xdf, 0x39, 0xbb, 0x6f, 0xa2, 0x8c, 0x7e,
+ 0xc4, 0x22, 0x51, 0x71, 0xc0, 0x4d, 0x14, 0x1a,
+ 0xc4, 0xcd, 0x04, 0xd9, 0x87, 0x08, 0x50, 0x05,
+ 0xcc, 0xaf, 0xf6, 0xf0, 0x8f, 0x92, 0x54, 0x58,
+ 0xc2, 0xc7, 0x09, 0x7a, 0x59, 0x02, 0x05, 0xe8,
+ 0xb0, 0x86, 0xd9, 0xbf, 0x7b, 0x35, 0x51, 0x4d,
+ 0xaf, 0x08, 0x97, 0x2c, 0x65, 0xda, 0x2a, 0x71,
+ 0x3a, 0xa8, 0x51, 0xcc, 0xf2, 0x73, 0x27, 0xc3,
+ 0xfd, 0x62, 0xcf, 0xe3, 0xb2, 0xca, 0xcb, 0xbe,
+ 0x1a, 0x0a, 0xa1, 0x34, 0x7b, 0x77, 0xc4, 0x62,
+ 0x68, 0x78, 0x5f, 0x94, 0x07, 0x04, 0x65, 0x16,
+ 0x4b, 0x61, 0xcb, 0xff, 0x75, 0x26, 0x50, 0x66,
+ 0x1f, 0x6e, 0x93, 0xf8, 0xc5, 0x51, 0xeb, 0xa4,
+ 0x4a, 0x48, 0x68, 0x6b, 0xe2, 0x5e, 0x44, 0xb2,
+ 0x50, 0x2c, 0x6c, 0xae, 0x79, 0x4e, 0x66, 0x35,
+ 0x81, 0x50, 0xac, 0xbc, 0x3f, 0xb1, 0x0c, 0xf3,
+ 0x05, 0x3c, 0x4a, 0xa3, 0x6c, 0x2a, 0x79, 0xb4,
+ 0xb7, 0xab, 0xca, 0xc7, 0x9b, 0x8e, 0xcd, 0x5f,
+ 0x11, 0x03, 0xcb, 0x30, 0xa3, 0xab, 0xda, 0xfe,
+ 0x64, 0xb9, 0xbb, 0xd8, 0x5e, 0x3a, 0x1a, 0x56,
+ 0xe5, 0x05, 0x48, 0x90, 0x1e, 0x61, 0x69, 0x1b,
+ 0x22, 0xe6, 0x1a, 0x3c, 0x75, 0xad, 0x1f, 0x37,
+ 0x28, 0xdc, 0xe4, 0x6d, 0xbd, 0x42, 0xdc, 0xd3,
+ 0xc8, 0xb6, 0x1c, 0x48, 0xfe, 0x94, 0x77, 0x7f,
+ 0xbd, 0x62, 0xac, 0xa3, 0x47, 0x27, 0xcf, 0x5f,
+ 0xd9, 0xdb, 0xaf, 0xec, 0xf7, 0x5e, 0xc1, 0xb0,
+ 0x9d, 0x01, 0x26, 0x99, 0x7e, 0x8f, 0x03, 0x70,
+ 0xb5, 0x42, 0xbe, 0x67, 0x28, 0x1b, 0x7c, 0xbd,
+ 0x61, 0x21, 0x97, 0xcc, 0x5c, 0xe1, 0x97, 0x8f,
+ 0x8d, 0xde, 0x2b, 0xaa, 0xa7, 0x71, 0x1d, 0x1e,
+ 0x02, 0x73, 0x70, 0x58, 0x32, 0x5b, 0x1d, 0x67,
+ 0x3d, 0xe0, 0x74, 0x4f, 0x03, 0xf2, 0x70, 0x51,
+ 0x79, 0xf1, 0x61, 0x70, 0x15, 0x74, 0x9d, 0x23,
+ 0x89, 0xde, 0xac, 0xfd, 0xde, 0xd0, 0x1f, 0xc3,
+ 0x87, 0x44, 0x35, 0x4b, 0xe5, 0xb0, 0x60, 0xc5,
+ 0x22, 0xe4, 0x9e, 0xca, 0xeb, 0xd5, 0x3a, 0x09,
+ 0x45, 0xa4, 0xdb, 0xfa, 0x3f, 0xeb, 0x1b, 0xc7,
+ 0xc8, 0x14, 0x99, 0x51, 0x92, 0x10, 0xed, 0xed,
+ 0x28, 0xe0, 0xa1, 0xf8, 0x26, 0xcf, 0xcd, 0xcb,
+ 0x63, 0xa1, 0x3b, 0xe3, 0xdf, 0x7e, 0xfe, 0xa6,
+ 0xf0, 0x81, 0x9a, 0xbf, 0x55, 0xde, 0x54, 0xd5,
+ 0x56, 0x60, 0x98, 0x10, 0x68, 0xf4, 0x38, 0x96,
+ 0x8e, 0x6f, 0x1d, 0x44, 0x7f, 0xd6, 0x2f, 0xfe,
+ 0x55, 0xfb, 0x0c, 0x7e, 0x67, 0xe2, 0x61, 0x44,
+ 0xed, 0xf2, 0x35, 0x30, 0x5d, 0xe9, 0xc7, 0xd6,
+ 0x6d, 0xe0, 0xa0, 0xed, 0xf3, 0xfc, 0xd8, 0x3e,
+ 0x0a, 0x7b, 0xcd, 0xaf, 0x65, 0x68, 0x18, 0xc0,
+ 0xec, 0x04, 0x1c, 0x74, 0x6d, 0xe2, 0x6e, 0x79,
+ 0xd4, 0x11, 0x2b, 0x62, 0xd5, 0x27, 0xad, 0x4f,
+ 0x01, 0x59, 0x73, 0xcc, 0x6a, 0x53, 0xfb, 0x2d,
+ 0xd5, 0x4e, 0x99, 0x21, 0x65, 0x4d, 0xf5, 0x82,
+ 0xf7, 0xd8, 0x42, 0xce, 0x6f, 0x3d, 0x36, 0x47,
+ 0xf1, 0x05, 0x16, 0xe8, 0x1b, 0x6a, 0x8f, 0x93,
+ 0xf2, 0x8f, 0x37, 0x40, 0x12, 0x28, 0xa3, 0xe6,
+ 0xb9, 0x17, 0x4a, 0x1f, 0xb1, 0xd1, 0x66, 0x69,
+ 0x86, 0xc4, 0xfc, 0x97, 0xae, 0x3f, 0x8f, 0x1e,
+ 0x2b, 0xdf, 0xcd, 0xf9, 0x3c
+};
+static const u8 dec_output011[] __initconst = {
+ 0x7a, 0x57, 0xf2, 0xc7, 0x06, 0x3f, 0x50, 0x7b,
+ 0x36, 0x1a, 0x66, 0x5c, 0xb9, 0x0e, 0x5e, 0x3b,
+ 0x45, 0x60, 0xbe, 0x9a, 0x31, 0x9f, 0xff, 0x5d,
+ 0x66, 0x34, 0xb4, 0xdc, 0xfb, 0x9d, 0x8e, 0xee,
+ 0x6a, 0x33, 0xa4, 0x07, 0x3c, 0xf9, 0x4c, 0x30,
+ 0xa1, 0x24, 0x52, 0xf9, 0x50, 0x46, 0x88, 0x20,
+ 0x02, 0x32, 0x3a, 0x0e, 0x99, 0x63, 0xaf, 0x1f,
+ 0x15, 0x28, 0x2a, 0x05, 0xff, 0x57, 0x59, 0x5e,
+ 0x18, 0xa1, 0x1f, 0xd0, 0x92, 0x5c, 0x88, 0x66,
+ 0x1b, 0x00, 0x64, 0xa5, 0x93, 0x8d, 0x06, 0x46,
+ 0xb0, 0x64, 0x8b, 0x8b, 0xef, 0x99, 0x05, 0x35,
+ 0x85, 0xb3, 0xf3, 0x33, 0xbb, 0xec, 0x66, 0xb6,
+ 0x3d, 0x57, 0x42, 0xe3, 0xb4, 0xc6, 0xaa, 0xb0,
+ 0x41, 0x2a, 0xb9, 0x59, 0xa9, 0xf6, 0x3e, 0x15,
+ 0x26, 0x12, 0x03, 0x21, 0x4c, 0x74, 0x43, 0x13,
+ 0x2a, 0x03, 0x27, 0x09, 0xb4, 0xfb, 0xe7, 0xb7,
+ 0x40, 0xff, 0x5e, 0xce, 0x48, 0x9a, 0x60, 0xe3,
+ 0x8b, 0x80, 0x8c, 0x38, 0x2d, 0xcb, 0x93, 0x37,
+ 0x74, 0x05, 0x52, 0x6f, 0x73, 0x3e, 0xc3, 0xbc,
+ 0xca, 0x72, 0x0a, 0xeb, 0xf1, 0x3b, 0xa0, 0x95,
+ 0xdc, 0x8a, 0xc4, 0xa9, 0xdc, 0xca, 0x44, 0xd8,
+ 0x08, 0x63, 0x6a, 0x36, 0xd3, 0x3c, 0xb8, 0xac,
+ 0x46, 0x7d, 0xfd, 0xaa, 0xeb, 0x3e, 0x0f, 0x45,
+ 0x8f, 0x49, 0xda, 0x2b, 0xf2, 0x12, 0xbd, 0xaf,
+ 0x67, 0x8a, 0x63, 0x48, 0x4b, 0x55, 0x5f, 0x6d,
+ 0x8c, 0xb9, 0x76, 0x34, 0x84, 0xae, 0xc2, 0xfc,
+ 0x52, 0x64, 0x82, 0xf7, 0xb0, 0x06, 0xf0, 0x45,
+ 0x73, 0x12, 0x50, 0x30, 0x72, 0xea, 0x78, 0x9a,
+ 0xa8, 0xaf, 0xb5, 0xe3, 0xbb, 0x77, 0x52, 0xec,
+ 0x59, 0x84, 0xbf, 0x6b, 0x8f, 0xce, 0x86, 0x5e,
+ 0x1f, 0x23, 0xe9, 0xfb, 0x08, 0x86, 0xf7, 0x10,
+ 0xb9, 0xf2, 0x44, 0x96, 0x44, 0x63, 0xa9, 0xa8,
+ 0x78, 0x00, 0x23, 0xd6, 0xc7, 0xe7, 0x6e, 0x66,
+ 0x4f, 0xcc, 0xee, 0x15, 0xb3, 0xbd, 0x1d, 0xa0,
+ 0xe5, 0x9c, 0x1b, 0x24, 0x2c, 0x4d, 0x3c, 0x62,
+ 0x35, 0x9c, 0x88, 0x59, 0x09, 0xdd, 0x82, 0x1b,
+ 0xcf, 0x0a, 0x83, 0x6b, 0x3f, 0xae, 0x03, 0xc4,
+ 0xb4, 0xdd, 0x7e, 0x5b, 0x28, 0x76, 0x25, 0x96,
+ 0xd9, 0xc9, 0x9d, 0x5f, 0x86, 0xfa, 0xf6, 0xd7,
+ 0xd2, 0xe6, 0x76, 0x1d, 0x0f, 0xa1, 0xdc, 0x74,
+ 0x05, 0x1b, 0x1d, 0xe0, 0xcd, 0x16, 0xb0, 0xa8,
+ 0x8a, 0x34, 0x7b, 0x15, 0x11, 0x77, 0xe5, 0x7b,
+ 0x7e, 0x20, 0xf7, 0xda, 0x38, 0xda, 0xce, 0x70,
+ 0xe9, 0xf5, 0x6c, 0xd9, 0xbe, 0x0c, 0x4c, 0x95,
+ 0x4c, 0xc2, 0x9b, 0x34, 0x55, 0x55, 0xe1, 0xf3,
+ 0x46, 0x8e, 0x48, 0x74, 0x14, 0x4f, 0x9d, 0xc9,
+ 0xf5, 0xe8, 0x1a, 0xf0, 0x11, 0x4a, 0xc1, 0x8d,
+ 0xe0, 0x93, 0xa0, 0xbe, 0x09, 0x1c, 0x2b, 0x4e,
+ 0x0f, 0xb2, 0x87, 0x8b, 0x84, 0xfe, 0x92, 0x32,
+ 0x14, 0xd7, 0x93, 0xdf, 0xe7, 0x44, 0xbc, 0xc5,
+ 0xae, 0x53, 0x69, 0xd8, 0xb3, 0x79, 0x37, 0x80,
+ 0xe3, 0x17, 0x5c, 0xec, 0x53, 0x00, 0x9a, 0xe3,
+ 0x8e, 0xdc, 0x38, 0xb8, 0x66, 0xf0, 0xd3, 0xad,
+ 0x1d, 0x02, 0x96, 0x86, 0x3e, 0x9d, 0x3b, 0x5d,
+ 0xa5, 0x7f, 0x21, 0x10, 0xf1, 0x1f, 0x13, 0x20,
+ 0xf9, 0x57, 0x87, 0x20, 0xf5, 0x5f, 0xf1, 0x17,
+ 0x48, 0x0a, 0x51, 0x5a, 0xcd, 0x19, 0x03, 0xa6,
+ 0x5a, 0xd1, 0x12, 0x97, 0xe9, 0x48, 0xe2, 0x1d,
+ 0x83, 0x75, 0x50, 0xd9, 0x75, 0x7d, 0x6a, 0x82,
+ 0xa1, 0xf9, 0x4e, 0x54, 0x87, 0x89, 0xc9, 0x0c,
+ 0xb7, 0x5b, 0x6a, 0x91, 0xc1, 0x9c, 0xb2, 0xa9,
+ 0xdc, 0x9a, 0xa4, 0x49, 0x0a, 0x6d, 0x0d, 0xbb,
+ 0xde, 0x86, 0x44, 0xdd, 0x5d, 0x89, 0x2b, 0x96,
+ 0x0f, 0x23, 0x95, 0xad, 0xcc, 0xa2, 0xb3, 0xb9,
+ 0x7e, 0x74, 0x38, 0xba, 0x9f, 0x73, 0xae, 0x5f,
+ 0xf8, 0x68, 0xa2, 0xe0, 0xa9, 0xce, 0xbd, 0x40,
+ 0xd4, 0x4c, 0x6b, 0xd2, 0x56, 0x62, 0xb0, 0xcc,
+ 0x63, 0x7e, 0x5b, 0xd3, 0xae, 0xd1, 0x75, 0xce,
+ 0xbb, 0xb4, 0x5b, 0xa8, 0xf8, 0xb4, 0xac, 0x71,
+ 0x75, 0xaa, 0xc9, 0x9f, 0xbb, 0x6c, 0xad, 0x0f,
+ 0x55, 0x5d, 0xe8, 0x85, 0x7d, 0xf9, 0x21, 0x35,
+ 0xea, 0x92, 0x85, 0x2b, 0x00, 0xec, 0x84, 0x90,
+ 0x0a, 0x63, 0x96, 0xe4, 0x6b, 0xa9, 0x77, 0xb8,
+ 0x91, 0xf8, 0x46, 0x15, 0x72, 0x63, 0x70, 0x01,
+ 0x40, 0xa3, 0xa5, 0x76, 0x62, 0x2b, 0xbf, 0xf1,
+ 0xe5, 0x8d, 0x9f, 0xa3, 0xfa, 0x9b, 0x03, 0xbe,
+ 0xfe, 0x65, 0x6f, 0xa2, 0x29, 0x0d, 0x54, 0xb4,
+ 0x71, 0xce, 0xa9, 0xd6, 0x3d, 0x88, 0xf9, 0xaf,
+ 0x6b, 0xa8, 0x9e, 0xf4, 0x16, 0x96, 0x36, 0xb9,
+ 0x00, 0xdc, 0x10, 0xab, 0xb5, 0x08, 0x31, 0x1f,
+ 0x00, 0xb1, 0x3c, 0xd9, 0x38, 0x3e, 0xc6, 0x04,
+ 0xa7, 0x4e, 0xe8, 0xae, 0xed, 0x98, 0xc2, 0xf7,
+ 0xb9, 0x00, 0x5f, 0x8c, 0x60, 0xd1, 0xe5, 0x15,
+ 0xf7, 0xae, 0x1e, 0x84, 0x88, 0xd1, 0xf6, 0xbc,
+ 0x3a, 0x89, 0x35, 0x22, 0x83, 0x7c, 0xca, 0xf0,
+ 0x33, 0x82, 0x4c, 0x79, 0x3c, 0xfd, 0xb1, 0xae,
+ 0x52, 0x62, 0x55, 0xd2, 0x41, 0x60, 0xc6, 0xbb,
+ 0xfa, 0x0e, 0x59, 0xd6, 0xa8, 0xfe, 0x5d, 0xed,
+ 0x47, 0x3d, 0xe0, 0xea, 0x1f, 0x6e, 0x43, 0x51,
+ 0xec, 0x10, 0x52, 0x56, 0x77, 0x42, 0x6b, 0x52,
+ 0x87, 0xd8, 0xec, 0xe0, 0xaa, 0x76, 0xa5, 0x84,
+ 0x2a, 0x22, 0x24, 0xfd, 0x92, 0x40, 0x88, 0xd5,
+ 0x85, 0x1c, 0x1f, 0x6b, 0x47, 0xa0, 0xc4, 0xe4,
+ 0xef, 0xf4, 0xea, 0xd7, 0x59, 0xac, 0x2a, 0x9e,
+ 0x8c, 0xfa, 0x1f, 0x42, 0x08, 0xfe, 0x4f, 0x74,
+ 0xa0, 0x26, 0xf5, 0xb3, 0x84, 0xf6, 0x58, 0x5f,
+ 0x26, 0x66, 0x3e, 0xd7, 0xe4, 0x22, 0x91, 0x13,
+ 0xc8, 0xac, 0x25, 0x96, 0x23, 0xd8, 0x09, 0xea,
+ 0x45, 0x75, 0x23, 0xb8, 0x5f, 0xc2, 0x90, 0x8b,
+ 0x09, 0xc4, 0xfc, 0x47, 0x6c, 0x6d, 0x0a, 0xef,
+ 0x69, 0xa4, 0x38, 0x19, 0xcf, 0x7d, 0xf9, 0x09,
+ 0x73, 0x9b, 0x60, 0x5a, 0xf7, 0x37, 0xb5, 0xfe,
+ 0x9f, 0xe3, 0x2b, 0x4c, 0x0d, 0x6e, 0x19, 0xf1,
+ 0xd6, 0xc0, 0x70, 0xf3, 0x9d, 0x22, 0x3c, 0xf9,
+ 0x49, 0xce, 0x30, 0x8e, 0x44, 0xb5, 0x76, 0x15,
+ 0x8f, 0x52, 0xfd, 0xa5, 0x04, 0xb8, 0x55, 0x6a,
+ 0x36, 0x59, 0x7c, 0xc4, 0x48, 0xb8, 0xd7, 0xab,
+ 0x05, 0x66, 0xe9, 0x5e, 0x21, 0x6f, 0x6b, 0x36,
+ 0x29, 0xbb, 0xe9, 0xe3, 0xa2, 0x9a, 0xa8, 0xcd,
+ 0x55, 0x25, 0x11, 0xba, 0x5a, 0x58, 0xa0, 0xde,
+ 0xae, 0x19, 0x2a, 0x48, 0x5a, 0xff, 0x36, 0xcd,
+ 0x6d, 0x16, 0x7a, 0x73, 0x38, 0x46, 0xe5, 0x47,
+ 0x59, 0xc8, 0xa2, 0xf6, 0xe2, 0x6c, 0x83, 0xc5,
+ 0x36, 0x2c, 0x83, 0x7d, 0xb4, 0x01, 0x05, 0x69,
+ 0xe7, 0xaf, 0x5c, 0xc4, 0x64, 0x82, 0x12, 0x21,
+ 0xef, 0xf7, 0xd1, 0x7d, 0xb8, 0x8d, 0x8c, 0x98,
+ 0x7c, 0x5f, 0x7d, 0x92, 0x88, 0xb9, 0x94, 0x07,
+ 0x9c, 0xd8, 0xe9, 0x9c, 0x17, 0x38, 0xe3, 0x57,
+ 0x6c, 0xe0, 0xdc, 0xa5, 0x92, 0x42, 0xb3, 0xbd,
+ 0x50, 0xa2, 0x7e, 0xb5, 0xb1, 0x52, 0x72, 0x03,
+ 0x97, 0xd8, 0xaa, 0x9a, 0x1e, 0x75, 0x41, 0x11,
+ 0xa3, 0x4f, 0xcc, 0xd4, 0xe3, 0x73, 0xad, 0x96,
+ 0xdc, 0x47, 0x41, 0x9f, 0xb0, 0xbe, 0x79, 0x91,
+ 0xf5, 0xb6, 0x18, 0xfe, 0xc2, 0x83, 0x18, 0x7d,
+ 0x73, 0xd9, 0x4f, 0x83, 0x84, 0x03, 0xb3, 0xf0,
+ 0x77, 0x66, 0x3d, 0x83, 0x63, 0x2e, 0x2c, 0xf9,
+ 0xdd, 0xa6, 0x1f, 0x89, 0x82, 0xb8, 0x23, 0x42,
+ 0xeb, 0xe2, 0xca, 0x70, 0x82, 0x61, 0x41, 0x0a,
+ 0x6d, 0x5f, 0x75, 0xc5, 0xe2, 0xc4, 0x91, 0x18,
+ 0x44, 0x22, 0xfa, 0x34, 0x10, 0xf5, 0x20, 0xdc,
+ 0xb7, 0xdd, 0x2a, 0x20, 0x77, 0xf5, 0xf9, 0xce,
+ 0xdb, 0xa0, 0x0a, 0x52, 0x2a, 0x4e, 0xdd, 0xcc,
+ 0x97, 0xdf, 0x05, 0xe4, 0x5e, 0xb7, 0xaa, 0xf0,
+ 0xe2, 0x80, 0xff, 0xba, 0x1a, 0x0f, 0xac, 0xdf,
+ 0x02, 0x32, 0xe6, 0xf7, 0xc7, 0x17, 0x13, 0xb7,
+ 0xfc, 0x98, 0x48, 0x8c, 0x0d, 0x82, 0xc9, 0x80,
+ 0x7a, 0xe2, 0x0a, 0xc5, 0xb4, 0xde, 0x7c, 0x3c,
+ 0x79, 0x81, 0x0e, 0x28, 0x65, 0x79, 0x67, 0x82,
+ 0x69, 0x44, 0x66, 0x09, 0xf7, 0x16, 0x1a, 0xf9,
+ 0x7d, 0x80, 0xa1, 0x79, 0x14, 0xa9, 0xc8, 0x20,
+ 0xfb, 0xa2, 0x46, 0xbe, 0x08, 0x35, 0x17, 0x58,
+ 0xc1, 0x1a, 0xda, 0x2a, 0x6b, 0x2e, 0x1e, 0xe6,
+ 0x27, 0x55, 0x7b, 0x19, 0xe2, 0xfb, 0x64, 0xfc,
+ 0x5e, 0x15, 0x54, 0x3c, 0xe7, 0xc2, 0x11, 0x50,
+ 0x30, 0xb8, 0x72, 0x03, 0x0b, 0x1a, 0x9f, 0x86,
+ 0x27, 0x11, 0x5c, 0x06, 0x2b, 0xbd, 0x75, 0x1a,
+ 0x0a, 0xda, 0x01, 0xfa, 0x5c, 0x4a, 0xc1, 0x80,
+ 0x3a, 0x6e, 0x30, 0xc8, 0x2c, 0xeb, 0x56, 0xec,
+ 0x89, 0xfa, 0x35, 0x7b, 0xb2, 0xf0, 0x97, 0x08,
+ 0x86, 0x53, 0xbe, 0xbd, 0x40, 0x41, 0x38, 0x1c,
+ 0xb4, 0x8b, 0x79, 0x2e, 0x18, 0x96, 0x94, 0xde,
+ 0xe8, 0xca, 0xe5, 0x9f, 0x92, 0x9f, 0x15, 0x5d,
+ 0x56, 0x60, 0x5c, 0x09, 0xf9, 0x16, 0xf4, 0x17,
+ 0x0f, 0xf6, 0x4c, 0xda, 0xe6, 0x67, 0x89, 0x9f,
+ 0xca, 0x6c, 0xe7, 0x9b, 0x04, 0x62, 0x0e, 0x26,
+ 0xa6, 0x52, 0xbd, 0x29, 0xff, 0xc7, 0xa4, 0x96,
+ 0xe6, 0x6a, 0x02, 0xa5, 0x2e, 0x7b, 0xfe, 0x97,
+ 0x68, 0x3e, 0x2e, 0x5f, 0x3b, 0x0f, 0x36, 0xd6,
+ 0x98, 0x19, 0x59, 0x48, 0xd2, 0xc6, 0xe1, 0x55,
+ 0x1a, 0x6e, 0xd6, 0xed, 0x2c, 0xba, 0xc3, 0x9e,
+ 0x64, 0xc9, 0x95, 0x86, 0x35, 0x5e, 0x3e, 0x88,
+ 0x69, 0x99, 0x4b, 0xee, 0xbe, 0x9a, 0x99, 0xb5,
+ 0x6e, 0x58, 0xae, 0xdd, 0x22, 0xdb, 0xdd, 0x6b,
+ 0xfc, 0xaf, 0x90, 0xa3, 0x3d, 0xa4, 0xc1, 0x15,
+ 0x92, 0x18, 0x8d, 0xd2, 0x4b, 0x7b, 0x06, 0xd1,
+ 0x37, 0xb5, 0xe2, 0x7c, 0x2c, 0xf0, 0x25, 0xe4,
+ 0x94, 0x2a, 0xbd, 0xe3, 0x82, 0x70, 0x78, 0xa3,
+ 0x82, 0x10, 0x5a, 0x90, 0xd7, 0xa4, 0xfa, 0xaf,
+ 0x1a, 0x88, 0x59, 0xdc, 0x74, 0x12, 0xb4, 0x8e,
+ 0xd7, 0x19, 0x46, 0xf4, 0x84, 0x69, 0x9f, 0xbb,
+ 0x70, 0xa8, 0x4c, 0x52, 0x81, 0xa9, 0xff, 0x76,
+ 0x1c, 0xae, 0xd8, 0x11, 0x3d, 0x7f, 0x7d, 0xc5,
+ 0x12, 0x59, 0x28, 0x18, 0xc2, 0xa2, 0xb7, 0x1c,
+ 0x88, 0xf8, 0xd6, 0x1b, 0xa6, 0x7d, 0x9e, 0xde,
+ 0x29, 0xf8, 0xed, 0xff, 0xeb, 0x92, 0x24, 0x4f,
+ 0x05, 0xaa, 0xd9, 0x49, 0xba, 0x87, 0x59, 0x51,
+ 0xc9, 0x20, 0x5c, 0x9b, 0x74, 0xcf, 0x03, 0xd9,
+ 0x2d, 0x34, 0xc7, 0x5b, 0xa5, 0x40, 0xb2, 0x99,
+ 0xf5, 0xcb, 0xb4, 0xf6, 0xb7, 0x72, 0x4a, 0xd6,
+ 0xbd, 0xb0, 0xf3, 0x93, 0xe0, 0x1b, 0xa8, 0x04,
+ 0x1e, 0x35, 0xd4, 0x80, 0x20, 0xf4, 0x9c, 0x31,
+ 0x6b, 0x45, 0xb9, 0x15, 0xb0, 0x5e, 0xdd, 0x0a,
+ 0x33, 0x9c, 0x83, 0xcd, 0x58, 0x89, 0x50, 0x56,
+ 0xbb, 0x81, 0x00, 0x91, 0x32, 0xf3, 0x1b, 0x3e,
+ 0xcf, 0x45, 0xe1, 0xf9, 0xe1, 0x2c, 0x26, 0x78,
+ 0x93, 0x9a, 0x60, 0x46, 0xc9, 0xb5, 0x5e, 0x6a,
+ 0x28, 0x92, 0x87, 0x3f, 0x63, 0x7b, 0xdb, 0xf7,
+ 0xd0, 0x13, 0x9d, 0x32, 0x40, 0x5e, 0xcf, 0xfb,
+ 0x79, 0x68, 0x47, 0x4c, 0xfd, 0x01, 0x17, 0xe6,
+ 0x97, 0x93, 0x78, 0xbb, 0xa6, 0x27, 0xa3, 0xe8,
+ 0x1a, 0xe8, 0x94, 0x55, 0x7d, 0x08, 0xe5, 0xdc,
+ 0x66, 0xa3, 0x69, 0xc8, 0xca, 0xc5, 0xa1, 0x84,
+ 0x55, 0xde, 0x08, 0x91, 0x16, 0x3a, 0x0c, 0x86,
+ 0xab, 0x27, 0x2b, 0x64, 0x34, 0x02, 0x6c, 0x76,
+ 0x8b, 0xc6, 0xaf, 0xcc, 0xe1, 0xd6, 0x8c, 0x2a,
+ 0x18, 0x3d, 0xa6, 0x1b, 0x37, 0x75, 0x45, 0x73,
+ 0xc2, 0x75, 0xd7, 0x53, 0x78, 0x3a, 0xd6, 0xe8,
+ 0x29, 0xd2, 0x4a, 0xa8, 0x1e, 0x82, 0xf6, 0xb6,
+ 0x81, 0xde, 0x21, 0xed, 0x2b, 0x56, 0xbb, 0xf2,
+ 0xd0, 0x57, 0xc1, 0x7c, 0xd2, 0x6a, 0xd2, 0x56,
+ 0xf5, 0x13, 0x5f, 0x1c, 0x6a, 0x0b, 0x74, 0xfb,
+ 0xe9, 0xfe, 0x9e, 0xea, 0x95, 0xb2, 0x46, 0xab,
+ 0x0a, 0xfc, 0xfd, 0xf3, 0xbb, 0x04, 0x2b, 0x76,
+ 0x1b, 0xa4, 0x74, 0xb0, 0xc1, 0x78, 0xc3, 0x69,
+ 0xe2, 0xb0, 0x01, 0xe1, 0xde, 0x32, 0x4c, 0x8d,
+ 0x1a, 0xb3, 0x38, 0x08, 0xd5, 0xfc, 0x1f, 0xdc,
+ 0x0e, 0x2c, 0x9c, 0xb1, 0xa1, 0x63, 0x17, 0x22,
+ 0xf5, 0x6c, 0x93, 0x70, 0x74, 0x00, 0xf8, 0x39,
+ 0x01, 0x94, 0xd1, 0x32, 0x23, 0x56, 0x5d, 0xa6,
+ 0x02, 0x76, 0x76, 0x93, 0xce, 0x2f, 0x19, 0xe9,
+ 0x17, 0x52, 0xae, 0x6e, 0x2c, 0x6d, 0x61, 0x7f,
+ 0x3b, 0xaa, 0xe0, 0x52, 0x85, 0xc5, 0x65, 0xc1,
+ 0xbb, 0x8e, 0x5b, 0x21, 0xd5, 0xc9, 0x78, 0x83,
+ 0x07, 0x97, 0x4c, 0x62, 0x61, 0x41, 0xd4, 0xfc,
+ 0xc9, 0x39, 0xe3, 0x9b, 0xd0, 0xcc, 0x75, 0xc4,
+ 0x97, 0xe6, 0xdd, 0x2a, 0x5f, 0xa6, 0xe8, 0x59,
+ 0x6c, 0x98, 0xb9, 0x02, 0xe2, 0xa2, 0xd6, 0x68,
+ 0xee, 0x3b, 0x1d, 0xe3, 0x4d, 0x5b, 0x30, 0xef,
+ 0x03, 0xf2, 0xeb, 0x18, 0x57, 0x36, 0xe8, 0xa1,
+ 0xf4, 0x47, 0xfb, 0xcb, 0x8f, 0xcb, 0xc8, 0xf3,
+ 0x4f, 0x74, 0x9d, 0x9d, 0xb1, 0x8d, 0x14, 0x44,
+ 0xd9, 0x19, 0xb4, 0x54, 0x4f, 0x75, 0x19, 0x09,
+ 0xa0, 0x75, 0xbc, 0x3b, 0x82, 0xc6, 0x3f, 0xb8,
+ 0x83, 0x19, 0x6e, 0xd6, 0x37, 0xfe, 0x6e, 0x8a,
+ 0x4e, 0xe0, 0x4a, 0xab, 0x7b, 0xc8, 0xb4, 0x1d,
+ 0xf4, 0xed, 0x27, 0x03, 0x65, 0xa2, 0xa1, 0xae,
+ 0x11, 0xe7, 0x98, 0x78, 0x48, 0x91, 0xd2, 0xd2,
+ 0xd4, 0x23, 0x78, 0x50, 0xb1, 0x5b, 0x85, 0x10,
+ 0x8d, 0xca, 0x5f, 0x0f, 0x71, 0xae, 0x72, 0x9a,
+ 0xf6, 0x25, 0x19, 0x60, 0x06, 0xf7, 0x10, 0x34,
+ 0x18, 0x0d, 0xc9, 0x9f, 0x7b, 0x0c, 0x9b, 0x8f,
+ 0x91, 0x1b, 0x9f, 0xcd, 0x10, 0xee, 0x75, 0xf9,
+ 0x97, 0x66, 0xfc, 0x4d, 0x33, 0x6e, 0x28, 0x2b,
+ 0x92, 0x85, 0x4f, 0xab, 0x43, 0x8d, 0x8f, 0x7d,
+ 0x86, 0xa7, 0xc7, 0xd8, 0xd3, 0x0b, 0x8b, 0x57,
+ 0xb6, 0x1d, 0x95, 0x0d, 0xe9, 0xbc, 0xd9, 0x03,
+ 0xd9, 0x10, 0x19, 0xc3, 0x46, 0x63, 0x55, 0x87,
+ 0x61, 0x79, 0x6c, 0x95, 0x0e, 0x9c, 0xdd, 0xca,
+ 0xc3, 0xf3, 0x64, 0xf0, 0x7d, 0x76, 0xb7, 0x53,
+ 0x67, 0x2b, 0x1e, 0x44, 0x56, 0x81, 0xea, 0x8f,
+ 0x5c, 0x42, 0x16, 0xb8, 0x28, 0xeb, 0x1b, 0x61,
+ 0x10, 0x1e, 0xbf, 0xec, 0xa8
+};
+static const u8 dec_assoc011[] __initconst = {
+ 0xd6, 0x31, 0xda, 0x5d, 0x42, 0x5e, 0xd7
+};
+static const u8 dec_nonce011[] __initconst = {
+ 0xfd, 0x87, 0xd4, 0xd8, 0x62, 0xfd, 0xec, 0xaa
+};
+static const u8 dec_key011[] __initconst = {
+ 0x35, 0x4e, 0xb5, 0x70, 0x50, 0x42, 0x8a, 0x85,
+ 0xf2, 0xfb, 0xed, 0x7b, 0xd0, 0x9e, 0x97, 0xca,
+ 0xfa, 0x98, 0x66, 0x63, 0xee, 0x37, 0xcc, 0x52,
+ 0xfe, 0xd1, 0xdf, 0x95, 0x15, 0x34, 0x29, 0x38
+};
+
+static const u8 dec_input012[] __initconst = {
+ 0x52, 0x34, 0xb3, 0x65, 0x3b, 0xb7, 0xe5, 0xd3,
+ 0xab, 0x49, 0x17, 0x60, 0xd2, 0x52, 0x56, 0xdf,
+ 0xdf, 0x34, 0x56, 0x82, 0xe2, 0xbe, 0xe5, 0xe1,
+ 0x28, 0xd1, 0x4e, 0x5f, 0x4f, 0x01, 0x7d, 0x3f,
+ 0x99, 0x6b, 0x30, 0x6e, 0x1a, 0x7c, 0x4c, 0x8e,
+ 0x62, 0x81, 0xae, 0x86, 0x3f, 0x6b, 0xd0, 0xb5,
+ 0xa9, 0xcf, 0x50, 0xf1, 0x02, 0x12, 0xa0, 0x0b,
+ 0x24, 0xe9, 0xe6, 0x72, 0x89, 0x2c, 0x52, 0x1b,
+ 0x34, 0x38, 0xf8, 0x75, 0x5f, 0xa0, 0x74, 0xe2,
+ 0x99, 0xdd, 0xa6, 0x4b, 0x14, 0x50, 0x4e, 0xf1,
+ 0xbe, 0xd6, 0x9e, 0xdb, 0xb2, 0x24, 0x27, 0x74,
+ 0x12, 0x4a, 0x78, 0x78, 0x17, 0xa5, 0x58, 0x8e,
+ 0x2f, 0xf9, 0xf4, 0x8d, 0xee, 0x03, 0x88, 0xae,
+ 0xb8, 0x29, 0xa1, 0x2f, 0x4b, 0xee, 0x92, 0xbd,
+ 0x87, 0xb3, 0xce, 0x34, 0x21, 0x57, 0x46, 0x04,
+ 0x49, 0x0c, 0x80, 0xf2, 0x01, 0x13, 0xa1, 0x55,
+ 0xb3, 0xff, 0x44, 0x30, 0x3c, 0x1c, 0xd0, 0xef,
+ 0xbc, 0x18, 0x74, 0x26, 0xad, 0x41, 0x5b, 0x5b,
+ 0x3e, 0x9a, 0x7a, 0x46, 0x4f, 0x16, 0xd6, 0x74,
+ 0x5a, 0xb7, 0x3a, 0x28, 0x31, 0xd8, 0xae, 0x26,
+ 0xac, 0x50, 0x53, 0x86, 0xf2, 0x56, 0xd7, 0x3f,
+ 0x29, 0xbc, 0x45, 0x68, 0x8e, 0xcb, 0x98, 0x64,
+ 0xdd, 0xc9, 0xba, 0xb8, 0x4b, 0x7b, 0x82, 0xdd,
+ 0x14, 0xa7, 0xcb, 0x71, 0x72, 0x00, 0x5c, 0xad,
+ 0x7b, 0x6a, 0x89, 0xa4, 0x3d, 0xbf, 0xb5, 0x4b,
+ 0x3e, 0x7c, 0x5a, 0xcf, 0xb8, 0xa1, 0xc5, 0x6e,
+ 0xc8, 0xb6, 0x31, 0x57, 0x7b, 0xdf, 0xa5, 0x7e,
+ 0xb1, 0xd6, 0x42, 0x2a, 0x31, 0x36, 0xd1, 0xd0,
+ 0x3f, 0x7a, 0xe5, 0x94, 0xd6, 0x36, 0xa0, 0x6f,
+ 0xb7, 0x40, 0x7d, 0x37, 0xc6, 0x55, 0x7c, 0x50,
+ 0x40, 0x6d, 0x29, 0x89, 0xe3, 0x5a, 0xae, 0x97,
+ 0xe7, 0x44, 0x49, 0x6e, 0xbd, 0x81, 0x3d, 0x03,
+ 0x93, 0x06, 0x12, 0x06, 0xe2, 0x41, 0x12, 0x4a,
+ 0xf1, 0x6a, 0xa4, 0x58, 0xa2, 0xfb, 0xd2, 0x15,
+ 0xba, 0xc9, 0x79, 0xc9, 0xce, 0x5e, 0x13, 0xbb,
+ 0xf1, 0x09, 0x04, 0xcc, 0xfd, 0xe8, 0x51, 0x34,
+ 0x6a, 0xe8, 0x61, 0x88, 0xda, 0xed, 0x01, 0x47,
+ 0x84, 0xf5, 0x73, 0x25, 0xf9, 0x1c, 0x42, 0x86,
+ 0x07, 0xf3, 0x5b, 0x1a, 0x01, 0xb3, 0xeb, 0x24,
+ 0x32, 0x8d, 0xf6, 0xed, 0x7c, 0x4b, 0xeb, 0x3c,
+ 0x36, 0x42, 0x28, 0xdf, 0xdf, 0xb6, 0xbe, 0xd9,
+ 0x8c, 0x52, 0xd3, 0x2b, 0x08, 0x90, 0x8c, 0xe7,
+ 0x98, 0x31, 0xe2, 0x32, 0x8e, 0xfc, 0x11, 0x48,
+ 0x00, 0xa8, 0x6a, 0x42, 0x4a, 0x02, 0xc6, 0x4b,
+ 0x09, 0xf1, 0xe3, 0x49, 0xf3, 0x45, 0x1f, 0x0e,
+ 0xbc, 0x56, 0xe2, 0xe4, 0xdf, 0xfb, 0xeb, 0x61,
+ 0xfa, 0x24, 0xc1, 0x63, 0x75, 0xbb, 0x47, 0x75,
+ 0xaf, 0xe1, 0x53, 0x16, 0x96, 0x21, 0x85, 0x26,
+ 0x11, 0xb3, 0x76, 0xe3, 0x23, 0xa1, 0x6b, 0x74,
+ 0x37, 0xd0, 0xde, 0x06, 0x90, 0x71, 0x5d, 0x43,
+ 0x88, 0x9b, 0x00, 0x54, 0xa6, 0x75, 0x2f, 0xa1,
+ 0xc2, 0x0b, 0x73, 0x20, 0x1d, 0xb6, 0x21, 0x79,
+ 0x57, 0x3f, 0xfa, 0x09, 0xbe, 0x8a, 0x33, 0xc3,
+ 0x52, 0xf0, 0x1d, 0x82, 0x31, 0xd1, 0x55, 0xb5,
+ 0x6c, 0x99, 0x25, 0xcf, 0x5c, 0x32, 0xce, 0xe9,
+ 0x0d, 0xfa, 0x69, 0x2c, 0xd5, 0x0d, 0xc5, 0x6d,
+ 0x86, 0xd0, 0x0c, 0x3b, 0x06, 0x50, 0x79, 0xe8,
+ 0xc3, 0xae, 0x04, 0xe6, 0xcd, 0x51, 0xe4, 0x26,
+ 0x9b, 0x4f, 0x7e, 0xa6, 0x0f, 0xab, 0xd8, 0xe5,
+ 0xde, 0xa9, 0x00, 0x95, 0xbe, 0xa3, 0x9d, 0x5d,
+ 0xb2, 0x09, 0x70, 0x18, 0x1c, 0xf0, 0xac, 0x29,
+ 0x23, 0x02, 0x29, 0x28, 0xd2, 0x74, 0x35, 0x57,
+ 0x62, 0x0f, 0x24, 0xea, 0x5e, 0x33, 0xc2, 0x92,
+ 0xf3, 0x78, 0x4d, 0x30, 0x1e, 0xa1, 0x99, 0xa9,
+ 0x82, 0xb0, 0x42, 0x31, 0x8d, 0xad, 0x8a, 0xbc,
+ 0xfc, 0xd4, 0x57, 0x47, 0x3e, 0xb4, 0x50, 0xdd,
+ 0x6e, 0x2c, 0x80, 0x4d, 0x22, 0xf1, 0xfb, 0x57,
+ 0xc4, 0xdd, 0x17, 0xe1, 0x8a, 0x36, 0x4a, 0xb3,
+ 0x37, 0xca, 0xc9, 0x4e, 0xab, 0xd5, 0x69, 0xc4,
+ 0xf4, 0xbc, 0x0b, 0x3b, 0x44, 0x4b, 0x29, 0x9c,
+ 0xee, 0xd4, 0x35, 0x22, 0x21, 0xb0, 0x1f, 0x27,
+ 0x64, 0xa8, 0x51, 0x1b, 0xf0, 0x9f, 0x19, 0x5c,
+ 0xfb, 0x5a, 0x64, 0x74, 0x70, 0x45, 0x09, 0xf5,
+ 0x64, 0xfe, 0x1a, 0x2d, 0xc9, 0x14, 0x04, 0x14,
+ 0xcf, 0xd5, 0x7d, 0x60, 0xaf, 0x94, 0x39, 0x94,
+ 0xe2, 0x7d, 0x79, 0x82, 0xd0, 0x65, 0x3b, 0x6b,
+ 0x9c, 0x19, 0x84, 0xb4, 0x6d, 0xb3, 0x0c, 0x99,
+ 0xc0, 0x56, 0xa8, 0xbd, 0x73, 0xce, 0x05, 0x84,
+ 0x3e, 0x30, 0xaa, 0xc4, 0x9b, 0x1b, 0x04, 0x2a,
+ 0x9f, 0xd7, 0x43, 0x2b, 0x23, 0xdf, 0xbf, 0xaa,
+ 0xd5, 0xc2, 0x43, 0x2d, 0x70, 0xab, 0xdc, 0x75,
+ 0xad, 0xac, 0xf7, 0xc0, 0xbe, 0x67, 0xb2, 0x74,
+ 0xed, 0x67, 0x10, 0x4a, 0x92, 0x60, 0xc1, 0x40,
+ 0x50, 0x19, 0x8a, 0x8a, 0x8c, 0x09, 0x0e, 0x72,
+ 0xe1, 0x73, 0x5e, 0xe8, 0x41, 0x85, 0x63, 0x9f,
+ 0x3f, 0xd7, 0x7d, 0xc4, 0xfb, 0x22, 0x5d, 0x92,
+ 0x6c, 0xb3, 0x1e, 0xe2, 0x50, 0x2f, 0x82, 0xa8,
+ 0x28, 0xc0, 0xb5, 0xd7, 0x5f, 0x68, 0x0d, 0x2c,
+ 0x2d, 0xaf, 0x7e, 0xfa, 0x2e, 0x08, 0x0f, 0x1f,
+ 0x70, 0x9f, 0xe9, 0x19, 0x72, 0x55, 0xf8, 0xfb,
+ 0x51, 0xd2, 0x33, 0x5d, 0xa0, 0xd3, 0x2b, 0x0a,
+ 0x6c, 0xbc, 0x4e, 0xcf, 0x36, 0x4d, 0xdc, 0x3b,
+ 0xe9, 0x3e, 0x81, 0x7c, 0x61, 0xdb, 0x20, 0x2d,
+ 0x3a, 0xc3, 0xb3, 0x0c, 0x1e, 0x00, 0xb9, 0x7c,
+ 0xf5, 0xca, 0x10, 0x5f, 0x3a, 0x71, 0xb3, 0xe4,
+ 0x20, 0xdb, 0x0c, 0x2a, 0x98, 0x63, 0x45, 0x00,
+ 0x58, 0xf6, 0x68, 0xe4, 0x0b, 0xda, 0x13, 0x3b,
+ 0x60, 0x5c, 0x76, 0xdb, 0xb9, 0x97, 0x71, 0xe4,
+ 0xd9, 0xb7, 0xdb, 0xbd, 0x68, 0xc7, 0x84, 0x84,
+ 0xaa, 0x7c, 0x68, 0x62, 0x5e, 0x16, 0xfc, 0xba,
+ 0x72, 0xaa, 0x9a, 0xa9, 0xeb, 0x7c, 0x75, 0x47,
+ 0x97, 0x7e, 0xad, 0xe2, 0xd9, 0x91, 0xe8, 0xe4,
+ 0xa5, 0x31, 0xd7, 0x01, 0x8e, 0xa2, 0x11, 0x88,
+ 0x95, 0xb9, 0xf2, 0x9b, 0xd3, 0x7f, 0x1b, 0x81,
+ 0x22, 0xf7, 0x98, 0x60, 0x0a, 0x64, 0xa6, 0xc1,
+ 0xf6, 0x49, 0xc7, 0xe3, 0x07, 0x4d, 0x94, 0x7a,
+ 0xcf, 0x6e, 0x68, 0x0c, 0x1b, 0x3f, 0x6e, 0x2e,
+ 0xee, 0x92, 0xfa, 0x52, 0xb3, 0x59, 0xf8, 0xf1,
+ 0x8f, 0x6a, 0x66, 0xa3, 0x82, 0x76, 0x4a, 0x07,
+ 0x1a, 0xc7, 0xdd, 0xf5, 0xda, 0x9c, 0x3c, 0x24,
+ 0xbf, 0xfd, 0x42, 0xa1, 0x10, 0x64, 0x6a, 0x0f,
+ 0x89, 0xee, 0x36, 0xa5, 0xce, 0x99, 0x48, 0x6a,
+ 0xf0, 0x9f, 0x9e, 0x69, 0xa4, 0x40, 0x20, 0xe9,
+ 0x16, 0x15, 0xf7, 0xdb, 0x75, 0x02, 0xcb, 0xe9,
+ 0x73, 0x8b, 0x3b, 0x49, 0x2f, 0xf0, 0xaf, 0x51,
+ 0x06, 0x5c, 0xdf, 0x27, 0x27, 0x49, 0x6a, 0xd1,
+ 0xcc, 0xc7, 0xb5, 0x63, 0xb5, 0xfc, 0xb8, 0x5c,
+ 0x87, 0x7f, 0x84, 0xb4, 0xcc, 0x14, 0xa9, 0x53,
+ 0xda, 0xa4, 0x56, 0xf8, 0xb6, 0x1b, 0xcc, 0x40,
+ 0x27, 0x52, 0x06, 0x5a, 0x13, 0x81, 0xd7, 0x3a,
+ 0xd4, 0x3b, 0xfb, 0x49, 0x65, 0x31, 0x33, 0xb2,
+ 0xfa, 0xcd, 0xad, 0x58, 0x4e, 0x2b, 0xae, 0xd2,
+ 0x20, 0xfb, 0x1a, 0x48, 0xb4, 0x3f, 0x9a, 0xd8,
+ 0x7a, 0x35, 0x4a, 0xc8, 0xee, 0x88, 0x5e, 0x07,
+ 0x66, 0x54, 0xb9, 0xec, 0x9f, 0xa3, 0xe3, 0xb9,
+ 0x37, 0xaa, 0x49, 0x76, 0x31, 0xda, 0x74, 0x2d,
+ 0x3c, 0xa4, 0x65, 0x10, 0x32, 0x38, 0xf0, 0xde,
+ 0xd3, 0x99, 0x17, 0xaa, 0x71, 0xaa, 0x8f, 0x0f,
+ 0x8c, 0xaf, 0xa2, 0xf8, 0x5d, 0x64, 0xba, 0x1d,
+ 0xa3, 0xef, 0x96, 0x73, 0xe8, 0xa1, 0x02, 0x8d,
+ 0x0c, 0x6d, 0xb8, 0x06, 0x90, 0xb8, 0x08, 0x56,
+ 0x2c, 0xa7, 0x06, 0xc9, 0xc2, 0x38, 0xdb, 0x7c,
+ 0x63, 0xb1, 0x57, 0x8e, 0xea, 0x7c, 0x79, 0xf3,
+ 0x49, 0x1d, 0xfe, 0x9f, 0xf3, 0x6e, 0xb1, 0x1d,
+ 0xba, 0x19, 0x80, 0x1a, 0x0a, 0xd3, 0xb0, 0x26,
+ 0x21, 0x40, 0xb1, 0x7c, 0xf9, 0x4d, 0x8d, 0x10,
+ 0xc1, 0x7e, 0xf4, 0xf6, 0x3c, 0xa8, 0xfd, 0x7c,
+ 0xa3, 0x92, 0xb2, 0x0f, 0xaa, 0xcc, 0xa6, 0x11,
+ 0xfe, 0x04, 0xe3, 0xd1, 0x7a, 0x32, 0x89, 0xdf,
+ 0x0d, 0xc4, 0x8f, 0x79, 0x6b, 0xca, 0x16, 0x7c,
+ 0x6e, 0xf9, 0xad, 0x0f, 0xf6, 0xfe, 0x27, 0xdb,
+ 0xc4, 0x13, 0x70, 0xf1, 0x62, 0x1a, 0x4f, 0x79,
+ 0x40, 0xc9, 0x9b, 0x8b, 0x21, 0xea, 0x84, 0xfa,
+ 0xf5, 0xf1, 0x89, 0xce, 0xb7, 0x55, 0x0a, 0x80,
+ 0x39, 0x2f, 0x55, 0x36, 0x16, 0x9c, 0x7b, 0x08,
+ 0xbd, 0x87, 0x0d, 0xa5, 0x32, 0xf1, 0x52, 0x7c,
+ 0xe8, 0x55, 0x60, 0x5b, 0xd7, 0x69, 0xe4, 0xfc,
+ 0xfa, 0x12, 0x85, 0x96, 0xea, 0x50, 0x28, 0xab,
+ 0x8a, 0xf7, 0xbb, 0x0e, 0x53, 0x74, 0xca, 0xa6,
+ 0x27, 0x09, 0xc2, 0xb5, 0xde, 0x18, 0x14, 0xd9,
+ 0xea, 0xe5, 0x29, 0x1c, 0x40, 0x56, 0xcf, 0xd7,
+ 0xae, 0x05, 0x3f, 0x65, 0xaf, 0x05, 0x73, 0xe2,
+ 0x35, 0x96, 0x27, 0x07, 0x14, 0xc0, 0xad, 0x33,
+ 0xf1, 0xdc, 0x44, 0x7a, 0x89, 0x17, 0x77, 0xd2,
+ 0x9c, 0x58, 0x60, 0xf0, 0x3f, 0x7b, 0x2d, 0x2e,
+ 0x57, 0x95, 0x54, 0x87, 0xed, 0xf2, 0xc7, 0x4c,
+ 0xf0, 0xae, 0x56, 0x29, 0x19, 0x7d, 0x66, 0x4b,
+ 0x9b, 0x83, 0x84, 0x42, 0x3b, 0x01, 0x25, 0x66,
+ 0x8e, 0x02, 0xde, 0xb9, 0x83, 0x54, 0x19, 0xf6,
+ 0x9f, 0x79, 0x0d, 0x67, 0xc5, 0x1d, 0x7a, 0x44,
+ 0x02, 0x98, 0xa7, 0x16, 0x1c, 0x29, 0x0d, 0x74,
+ 0xff, 0x85, 0x40, 0x06, 0xef, 0x2c, 0xa9, 0xc6,
+ 0xf5, 0x53, 0x07, 0x06, 0xae, 0xe4, 0xfa, 0x5f,
+ 0xd8, 0x39, 0x4d, 0xf1, 0x9b, 0x6b, 0xd9, 0x24,
+ 0x84, 0xfe, 0x03, 0x4c, 0xb2, 0x3f, 0xdf, 0xa1,
+ 0x05, 0x9e, 0x50, 0x14, 0x5a, 0xd9, 0x1a, 0xa2,
+ 0xa7, 0xfa, 0xfa, 0x17, 0xf7, 0x78, 0xd6, 0xb5,
+ 0x92, 0x61, 0x91, 0xac, 0x36, 0xfa, 0x56, 0x0d,
+ 0x38, 0x32, 0x18, 0x85, 0x08, 0x58, 0x37, 0xf0,
+ 0x4b, 0xdb, 0x59, 0xe7, 0xa4, 0x34, 0xc0, 0x1b,
+ 0x01, 0xaf, 0x2d, 0xde, 0xa1, 0xaa, 0x5d, 0xd3,
+ 0xec, 0xe1, 0xd4, 0xf7, 0xe6, 0x54, 0x68, 0xf0,
+ 0x51, 0x97, 0xa7, 0x89, 0xea, 0x24, 0xad, 0xd3,
+ 0x6e, 0x47, 0x93, 0x8b, 0x4b, 0xb4, 0xf7, 0x1c,
+ 0x42, 0x06, 0x67, 0xe8, 0x99, 0xf6, 0xf5, 0x7b,
+ 0x85, 0xb5, 0x65, 0xb5, 0xb5, 0xd2, 0x37, 0xf5,
+ 0xf3, 0x02, 0xa6, 0x4d, 0x11, 0xa7, 0xdc, 0x51,
+ 0x09, 0x7f, 0xa0, 0xd8, 0x88, 0x1c, 0x13, 0x71,
+ 0xae, 0x9c, 0xb7, 0x7b, 0x34, 0xd6, 0x4e, 0x68,
+ 0x26, 0x83, 0x51, 0xaf, 0x1d, 0xee, 0x8b, 0xbb,
+ 0x69, 0x43, 0x2b, 0x9e, 0x8a, 0xbc, 0x02, 0x0e,
+ 0xa0, 0x1b, 0xe0, 0xa8, 0x5f, 0x6f, 0xaf, 0x1b,
+ 0x8f, 0xe7, 0x64, 0x71, 0x74, 0x11, 0x7e, 0xa8,
+ 0xd8, 0xf9, 0x97, 0x06, 0xc3, 0xb6, 0xfb, 0xfb,
+ 0xb7, 0x3d, 0x35, 0x9d, 0x3b, 0x52, 0xed, 0x54,
+ 0xca, 0xf4, 0x81, 0x01, 0x2d, 0x1b, 0xc3, 0xa7,
+ 0x00, 0x3d, 0x1a, 0x39, 0x54, 0xe1, 0xf6, 0xff,
+ 0xed, 0x6f, 0x0b, 0x5a, 0x68, 0xda, 0x58, 0xdd,
+ 0xa9, 0xcf, 0x5c, 0x4a, 0xe5, 0x09, 0x4e, 0xde,
+ 0x9d, 0xbc, 0x3e, 0xee, 0x5a, 0x00, 0x3b, 0x2c,
+ 0x87, 0x10, 0x65, 0x60, 0xdd, 0xd7, 0x56, 0xd1,
+ 0x4c, 0x64, 0x45, 0xe4, 0x21, 0xec, 0x78, 0xf8,
+ 0x25, 0x7a, 0x3e, 0x16, 0x5d, 0x09, 0x53, 0x14,
+ 0xbe, 0x4f, 0xae, 0x87, 0xd8, 0xd1, 0xaa, 0x3c,
+ 0xf6, 0x3e, 0xa4, 0x70, 0x8c, 0x5e, 0x70, 0xa4,
+ 0xb3, 0x6b, 0x66, 0x73, 0xd3, 0xbf, 0x31, 0x06,
+ 0x19, 0x62, 0x93, 0x15, 0xf2, 0x86, 0xe4, 0x52,
+ 0x7e, 0x53, 0x4c, 0x12, 0x38, 0xcc, 0x34, 0x7d,
+ 0x57, 0xf6, 0x42, 0x93, 0x8a, 0xc4, 0xee, 0x5c,
+ 0x8a, 0xe1, 0x52, 0x8f, 0x56, 0x64, 0xf6, 0xa6,
+ 0xd1, 0x91, 0x57, 0x70, 0xcd, 0x11, 0x76, 0xf5,
+ 0x59, 0x60, 0x60, 0x3c, 0xc1, 0xc3, 0x0b, 0x7f,
+ 0x58, 0x1a, 0x50, 0x91, 0xf1, 0x68, 0x8f, 0x6e,
+ 0x74, 0x74, 0xa8, 0x51, 0x0b, 0xf7, 0x7a, 0x98,
+ 0x37, 0xf2, 0x0a, 0x0e, 0xa4, 0x97, 0x04, 0xb8,
+ 0x9b, 0xfd, 0xa0, 0xea, 0xf7, 0x0d, 0xe1, 0xdb,
+ 0x03, 0xf0, 0x31, 0x29, 0xf8, 0xdd, 0x6b, 0x8b,
+ 0x5d, 0xd8, 0x59, 0xa9, 0x29, 0xcf, 0x9a, 0x79,
+ 0x89, 0x19, 0x63, 0x46, 0x09, 0x79, 0x6a, 0x11,
+ 0xda, 0x63, 0x68, 0x48, 0x77, 0x23, 0xfb, 0x7d,
+ 0x3a, 0x43, 0xcb, 0x02, 0x3b, 0x7a, 0x6d, 0x10,
+ 0x2a, 0x9e, 0xac, 0xf1, 0xd4, 0x19, 0xf8, 0x23,
+ 0x64, 0x1d, 0x2c, 0x5f, 0xf2, 0xb0, 0x5c, 0x23,
+ 0x27, 0xf7, 0x27, 0x30, 0x16, 0x37, 0xb1, 0x90,
+ 0xab, 0x38, 0xfb, 0x55, 0xcd, 0x78, 0x58, 0xd4,
+ 0x7d, 0x43, 0xf6, 0x45, 0x5e, 0x55, 0x8d, 0xb1,
+ 0x02, 0x65, 0x58, 0xb4, 0x13, 0x4b, 0x36, 0xf7,
+ 0xcc, 0xfe, 0x3d, 0x0b, 0x82, 0xe2, 0x12, 0x11,
+ 0xbb, 0xe6, 0xb8, 0x3a, 0x48, 0x71, 0xc7, 0x50,
+ 0x06, 0x16, 0x3a, 0xe6, 0x7c, 0x05, 0xc7, 0xc8,
+ 0x4d, 0x2f, 0x08, 0x6a, 0x17, 0x9a, 0x95, 0x97,
+ 0x50, 0x68, 0xdc, 0x28, 0x18, 0xc4, 0x61, 0x38,
+ 0xb9, 0xe0, 0x3e, 0x78, 0xdb, 0x29, 0xe0, 0x9f,
+ 0x52, 0xdd, 0xf8, 0x4f, 0x91, 0xc1, 0xd0, 0x33,
+ 0xa1, 0x7a, 0x8e, 0x30, 0x13, 0x82, 0x07, 0x9f,
+ 0xd3, 0x31, 0x0f, 0x23, 0xbe, 0x32, 0x5a, 0x75,
+ 0xcf, 0x96, 0xb2, 0xec, 0xb5, 0x32, 0xac, 0x21,
+ 0xd1, 0x82, 0x33, 0xd3, 0x15, 0x74, 0xbd, 0x90,
+ 0xf1, 0x2c, 0xe6, 0x5f, 0x8d, 0xe3, 0x02, 0xe8,
+ 0xe9, 0xc4, 0xca, 0x96, 0xeb, 0x0e, 0xbc, 0x91,
+ 0xf4, 0xb9, 0xea, 0xd9, 0x1b, 0x75, 0xbd, 0xe1,
+ 0xac, 0x2a, 0x05, 0x37, 0x52, 0x9b, 0x1b, 0x3f,
+ 0x5a, 0xdc, 0x21, 0xc3, 0x98, 0xbb, 0xaf, 0xa3,
+ 0xf2, 0x00, 0xbf, 0x0d, 0x30, 0x89, 0x05, 0xcc,
+ 0xa5, 0x76, 0xf5, 0x06, 0xf0, 0xc6, 0x54, 0x8a,
+ 0x5d, 0xd4, 0x1e, 0xc1, 0xf2, 0xce, 0xb0, 0x62,
+ 0xc8, 0xfc, 0x59, 0x42, 0x9a, 0x90, 0x60, 0x55,
+ 0xfe, 0x88, 0xa5, 0x8b, 0xb8, 0x33, 0x0c, 0x23,
+ 0x24, 0x0d, 0x15, 0x70, 0x37, 0x1e, 0x3d, 0xf6,
+ 0xd2, 0xea, 0x92, 0x10, 0xb2, 0xc4, 0x51, 0xac,
+ 0xf2, 0xac, 0xf3, 0x6b, 0x6c, 0xaa, 0xcf, 0x12,
+ 0xc5, 0x6c, 0x90, 0x50, 0xb5, 0x0c, 0xfc, 0x1a,
+ 0x15, 0x52, 0xe9, 0x26, 0xc6, 0x52, 0xa4, 0xe7,
+ 0x81, 0x69, 0xe1, 0xe7, 0x9e, 0x30, 0x01, 0xec,
+ 0x84, 0x89, 0xb2, 0x0d, 0x66, 0xdd, 0xce, 0x28,
+ 0x5c, 0xec, 0x98, 0x46, 0x68, 0x21, 0x9f, 0x88,
+ 0x3f, 0x1f, 0x42, 0x77, 0xce, 0xd0, 0x61, 0xd4,
+ 0x20, 0xa7, 0xff, 0x53, 0xad, 0x37, 0xd0, 0x17,
+ 0x35, 0xc9, 0xfc, 0xba, 0x0a, 0x78, 0x3f, 0xf2,
+ 0xcc, 0x86, 0x89, 0xe8, 0x4b, 0x3c, 0x48, 0x33,
+ 0x09, 0x7f, 0xc6, 0xc0, 0xdd, 0xb8, 0xfd, 0x7a,
+ 0x66, 0x66, 0x65, 0xeb, 0x47, 0xa7, 0x04, 0x28,
+ 0xa3, 0x19, 0x8e, 0xa9, 0xb1, 0x13, 0x67, 0x62,
+ 0x70, 0xcf, 0xd6
+};
+static const u8 dec_output012[] __initconst = {
+ 0x74, 0xa6, 0x3e, 0xe4, 0xb1, 0xcb, 0xaf, 0xb0,
+ 0x40, 0xe5, 0x0f, 0x9e, 0xf1, 0xf2, 0x89, 0xb5,
+ 0x42, 0x34, 0x8a, 0xa1, 0x03, 0xb7, 0xe9, 0x57,
+ 0x46, 0xbe, 0x20, 0xe4, 0x6e, 0xb0, 0xeb, 0xff,
+ 0xea, 0x07, 0x7e, 0xef, 0xe2, 0x55, 0x9f, 0xe5,
+ 0x78, 0x3a, 0xb7, 0x83, 0xc2, 0x18, 0x40, 0x7b,
+ 0xeb, 0xcd, 0x81, 0xfb, 0x90, 0x12, 0x9e, 0x46,
+ 0xa9, 0xd6, 0x4a, 0xba, 0xb0, 0x62, 0xdb, 0x6b,
+ 0x99, 0xc4, 0xdb, 0x54, 0x4b, 0xb8, 0xa5, 0x71,
+ 0xcb, 0xcd, 0x63, 0x32, 0x55, 0xfb, 0x31, 0xf0,
+ 0x38, 0xf5, 0xbe, 0x78, 0xe4, 0x45, 0xce, 0x1b,
+ 0x6a, 0x5b, 0x0e, 0xf4, 0x16, 0xe4, 0xb1, 0x3d,
+ 0xf6, 0x63, 0x7b, 0xa7, 0x0c, 0xde, 0x6f, 0x8f,
+ 0x74, 0xdf, 0xe0, 0x1e, 0x9d, 0xce, 0x8f, 0x24,
+ 0xef, 0x23, 0x35, 0x33, 0x7b, 0x83, 0x34, 0x23,
+ 0x58, 0x74, 0x14, 0x77, 0x1f, 0xc2, 0x4f, 0x4e,
+ 0xc6, 0x89, 0xf9, 0x52, 0x09, 0x37, 0x64, 0x14,
+ 0xc4, 0x01, 0x6b, 0x9d, 0x77, 0xe8, 0x90, 0x5d,
+ 0xa8, 0x4a, 0x2a, 0xef, 0x5c, 0x7f, 0xeb, 0xbb,
+ 0xb2, 0xc6, 0x93, 0x99, 0x66, 0xdc, 0x7f, 0xd4,
+ 0x9e, 0x2a, 0xca, 0x8d, 0xdb, 0xe7, 0x20, 0xcf,
+ 0xe4, 0x73, 0xae, 0x49, 0x7d, 0x64, 0x0f, 0x0e,
+ 0x28, 0x46, 0xa9, 0xa8, 0x32, 0xe4, 0x0e, 0xf6,
+ 0x51, 0x53, 0xb8, 0x3c, 0xb1, 0xff, 0xa3, 0x33,
+ 0x41, 0x75, 0xff, 0xf1, 0x6f, 0xf1, 0xfb, 0xbb,
+ 0x83, 0x7f, 0x06, 0x9b, 0xe7, 0x1b, 0x0a, 0xe0,
+ 0x5c, 0x33, 0x60, 0x5b, 0xdb, 0x5b, 0xed, 0xfe,
+ 0xa5, 0x16, 0x19, 0x72, 0xa3, 0x64, 0x23, 0x00,
+ 0x02, 0xc7, 0xf3, 0x6a, 0x81, 0x3e, 0x44, 0x1d,
+ 0x79, 0x15, 0x5f, 0x9a, 0xde, 0xe2, 0xfd, 0x1b,
+ 0x73, 0xc1, 0xbc, 0x23, 0xba, 0x31, 0xd2, 0x50,
+ 0xd5, 0xad, 0x7f, 0x74, 0xa7, 0xc9, 0xf8, 0x3e,
+ 0x2b, 0x26, 0x10, 0xf6, 0x03, 0x36, 0x74, 0xe4,
+ 0x0e, 0x6a, 0x72, 0xb7, 0x73, 0x0a, 0x42, 0x28,
+ 0xc2, 0xad, 0x5e, 0x03, 0xbe, 0xb8, 0x0b, 0xa8,
+ 0x5b, 0xd4, 0xb8, 0xba, 0x52, 0x89, 0xb1, 0x9b,
+ 0xc1, 0xc3, 0x65, 0x87, 0xed, 0xa5, 0xf4, 0x86,
+ 0xfd, 0x41, 0x80, 0x91, 0x27, 0x59, 0x53, 0x67,
+ 0x15, 0x78, 0x54, 0x8b, 0x2d, 0x3d, 0xc7, 0xff,
+ 0x02, 0x92, 0x07, 0x5f, 0x7a, 0x4b, 0x60, 0x59,
+ 0x3c, 0x6f, 0x5c, 0xd8, 0xec, 0x95, 0xd2, 0xfe,
+ 0xa0, 0x3b, 0xd8, 0x3f, 0xd1, 0x69, 0xa6, 0xd6,
+ 0x41, 0xb2, 0xf4, 0x4d, 0x12, 0xf4, 0x58, 0x3e,
+ 0x66, 0x64, 0x80, 0x31, 0x9b, 0xa8, 0x4c, 0x8b,
+ 0x07, 0xb2, 0xec, 0x66, 0x94, 0x66, 0x47, 0x50,
+ 0x50, 0x5f, 0x18, 0x0b, 0x0e, 0xd6, 0xc0, 0x39,
+ 0x21, 0x13, 0x9e, 0x33, 0xbc, 0x79, 0x36, 0x02,
+ 0x96, 0x70, 0xf0, 0x48, 0x67, 0x2f, 0x26, 0xe9,
+ 0x6d, 0x10, 0xbb, 0xd6, 0x3f, 0xd1, 0x64, 0x7a,
+ 0x2e, 0xbe, 0x0c, 0x61, 0xf0, 0x75, 0x42, 0x38,
+ 0x23, 0xb1, 0x9e, 0x9f, 0x7c, 0x67, 0x66, 0xd9,
+ 0x58, 0x9a, 0xf1, 0xbb, 0x41, 0x2a, 0x8d, 0x65,
+ 0x84, 0x94, 0xfc, 0xdc, 0x6a, 0x50, 0x64, 0xdb,
+ 0x56, 0x33, 0x76, 0x00, 0x10, 0xed, 0xbe, 0xd2,
+ 0x12, 0xf6, 0xf6, 0x1b, 0xa2, 0x16, 0xde, 0xae,
+ 0x31, 0x95, 0xdd, 0xb1, 0x08, 0x7e, 0x4e, 0xee,
+ 0xe7, 0xf9, 0xa5, 0xfb, 0x5b, 0x61, 0x43, 0x00,
+ 0x40, 0xf6, 0x7e, 0x02, 0x04, 0x32, 0x4e, 0x0c,
+ 0xe2, 0x66, 0x0d, 0xd7, 0x07, 0x98, 0x0e, 0xf8,
+ 0x72, 0x34, 0x6d, 0x95, 0x86, 0xd7, 0xcb, 0x31,
+ 0x54, 0x47, 0xd0, 0x38, 0x29, 0x9c, 0x5a, 0x68,
+ 0xd4, 0x87, 0x76, 0xc9, 0xe7, 0x7e, 0xe3, 0xf4,
+ 0x81, 0x6d, 0x18, 0xcb, 0xc9, 0x05, 0xaf, 0xa0,
+ 0xfb, 0x66, 0xf7, 0xf1, 0x1c, 0xc6, 0x14, 0x11,
+ 0x4f, 0x2b, 0x79, 0x42, 0x8b, 0xbc, 0xac, 0xe7,
+ 0x6c, 0xfe, 0x0f, 0x58, 0xe7, 0x7c, 0x78, 0x39,
+ 0x30, 0xb0, 0x66, 0x2c, 0x9b, 0x6d, 0x3a, 0xe1,
+ 0xcf, 0xc9, 0xa4, 0x0e, 0x6d, 0x6d, 0x8a, 0xa1,
+ 0x3a, 0xe7, 0x28, 0xd4, 0x78, 0x4c, 0xa6, 0xa2,
+ 0x2a, 0xa6, 0x03, 0x30, 0xd7, 0xa8, 0x25, 0x66,
+ 0x87, 0x2f, 0x69, 0x5c, 0x4e, 0xdd, 0xa5, 0x49,
+ 0x5d, 0x37, 0x4a, 0x59, 0xc4, 0xaf, 0x1f, 0xa2,
+ 0xe4, 0xf8, 0xa6, 0x12, 0x97, 0xd5, 0x79, 0xf5,
+ 0xe2, 0x4a, 0x2b, 0x5f, 0x61, 0xe4, 0x9e, 0xe3,
+ 0xee, 0xb8, 0xa7, 0x5b, 0x2f, 0xf4, 0x9e, 0x6c,
+ 0xfb, 0xd1, 0xc6, 0x56, 0x77, 0xba, 0x75, 0xaa,
+ 0x3d, 0x1a, 0xa8, 0x0b, 0xb3, 0x68, 0x24, 0x00,
+ 0x10, 0x7f, 0xfd, 0xd7, 0xa1, 0x8d, 0x83, 0x54,
+ 0x4f, 0x1f, 0xd8, 0x2a, 0xbe, 0x8a, 0x0c, 0x87,
+ 0xab, 0xa2, 0xde, 0xc3, 0x39, 0xbf, 0x09, 0x03,
+ 0xa5, 0xf3, 0x05, 0x28, 0xe1, 0xe1, 0xee, 0x39,
+ 0x70, 0x9c, 0xd8, 0x81, 0x12, 0x1e, 0x02, 0x40,
+ 0xd2, 0x6e, 0xf0, 0xeb, 0x1b, 0x3d, 0x22, 0xc6,
+ 0xe5, 0xe3, 0xb4, 0x5a, 0x98, 0xbb, 0xf0, 0x22,
+ 0x28, 0x8d, 0xe5, 0xd3, 0x16, 0x48, 0x24, 0xa5,
+ 0xe6, 0x66, 0x0c, 0xf9, 0x08, 0xf9, 0x7e, 0x1e,
+ 0xe1, 0x28, 0x26, 0x22, 0xc7, 0xc7, 0x0a, 0x32,
+ 0x47, 0xfa, 0xa3, 0xbe, 0x3c, 0xc4, 0xc5, 0x53,
+ 0x0a, 0xd5, 0x94, 0x4a, 0xd7, 0x93, 0xd8, 0x42,
+ 0x99, 0xb9, 0x0a, 0xdb, 0x56, 0xf7, 0xb9, 0x1c,
+ 0x53, 0x4f, 0xfa, 0xd3, 0x74, 0xad, 0xd9, 0x68,
+ 0xf1, 0x1b, 0xdf, 0x61, 0xc6, 0x5e, 0xa8, 0x48,
+ 0xfc, 0xd4, 0x4a, 0x4c, 0x3c, 0x32, 0xf7, 0x1c,
+ 0x96, 0x21, 0x9b, 0xf9, 0xa3, 0xcc, 0x5a, 0xce,
+ 0xd5, 0xd7, 0x08, 0x24, 0xf6, 0x1c, 0xfd, 0xdd,
+ 0x38, 0xc2, 0x32, 0xe9, 0xb8, 0xe7, 0xb6, 0xfa,
+ 0x9d, 0x45, 0x13, 0x2c, 0x83, 0xfd, 0x4a, 0x69,
+ 0x82, 0xcd, 0xdc, 0xb3, 0x76, 0x0c, 0x9e, 0xd8,
+ 0xf4, 0x1b, 0x45, 0x15, 0xb4, 0x97, 0xe7, 0x58,
+ 0x34, 0xe2, 0x03, 0x29, 0x5a, 0xbf, 0xb6, 0xe0,
+ 0x5d, 0x13, 0xd9, 0x2b, 0xb4, 0x80, 0xb2, 0x45,
+ 0x81, 0x6a, 0x2e, 0x6c, 0x89, 0x7d, 0xee, 0xbb,
+ 0x52, 0xdd, 0x1f, 0x18, 0xe7, 0x13, 0x6b, 0x33,
+ 0x0e, 0xea, 0x36, 0x92, 0x77, 0x7b, 0x6d, 0x9c,
+ 0x5a, 0x5f, 0x45, 0x7b, 0x7b, 0x35, 0x62, 0x23,
+ 0xd1, 0xbf, 0x0f, 0xd0, 0x08, 0x1b, 0x2b, 0x80,
+ 0x6b, 0x7e, 0xf1, 0x21, 0x47, 0xb0, 0x57, 0xd1,
+ 0x98, 0x72, 0x90, 0x34, 0x1c, 0x20, 0x04, 0xff,
+ 0x3d, 0x5c, 0xee, 0x0e, 0x57, 0x5f, 0x6f, 0x24,
+ 0x4e, 0x3c, 0xea, 0xfc, 0xa5, 0xa9, 0x83, 0xc9,
+ 0x61, 0xb4, 0x51, 0x24, 0xf8, 0x27, 0x5e, 0x46,
+ 0x8c, 0xb1, 0x53, 0x02, 0x96, 0x35, 0xba, 0xb8,
+ 0x4c, 0x71, 0xd3, 0x15, 0x59, 0x35, 0x22, 0x20,
+ 0xad, 0x03, 0x9f, 0x66, 0x44, 0x3b, 0x9c, 0x35,
+ 0x37, 0x1f, 0x9b, 0xbb, 0xf3, 0xdb, 0x35, 0x63,
+ 0x30, 0x64, 0xaa, 0xa2, 0x06, 0xa8, 0x5d, 0xbb,
+ 0xe1, 0x9f, 0x70, 0xec, 0x82, 0x11, 0x06, 0x36,
+ 0xec, 0x8b, 0x69, 0x66, 0x24, 0x44, 0xc9, 0x4a,
+ 0x57, 0xbb, 0x9b, 0x78, 0x13, 0xce, 0x9c, 0x0c,
+ 0xba, 0x92, 0x93, 0x63, 0xb8, 0xe2, 0x95, 0x0f,
+ 0x0f, 0x16, 0x39, 0x52, 0xfd, 0x3a, 0x6d, 0x02,
+ 0x4b, 0xdf, 0x13, 0xd3, 0x2a, 0x22, 0xb4, 0x03,
+ 0x7c, 0x54, 0x49, 0x96, 0x68, 0x54, 0x10, 0xfa,
+ 0xef, 0xaa, 0x6c, 0xe8, 0x22, 0xdc, 0x71, 0x16,
+ 0x13, 0x1a, 0xf6, 0x28, 0xe5, 0x6d, 0x77, 0x3d,
+ 0xcd, 0x30, 0x63, 0xb1, 0x70, 0x52, 0xa1, 0xc5,
+ 0x94, 0x5f, 0xcf, 0xe8, 0xb8, 0x26, 0x98, 0xf7,
+ 0x06, 0xa0, 0x0a, 0x70, 0xfa, 0x03, 0x80, 0xac,
+ 0xc1, 0xec, 0xd6, 0x4c, 0x54, 0xd7, 0xfe, 0x47,
+ 0xb6, 0x88, 0x4a, 0xf7, 0x71, 0x24, 0xee, 0xf3,
+ 0xd2, 0xc2, 0x4a, 0x7f, 0xfe, 0x61, 0xc7, 0x35,
+ 0xc9, 0x37, 0x67, 0xcb, 0x24, 0x35, 0xda, 0x7e,
+ 0xca, 0x5f, 0xf3, 0x8d, 0xd4, 0x13, 0x8e, 0xd6,
+ 0xcb, 0x4d, 0x53, 0x8f, 0x53, 0x1f, 0xc0, 0x74,
+ 0xf7, 0x53, 0xb9, 0x5e, 0x23, 0x37, 0xba, 0x6e,
+ 0xe3, 0x9d, 0x07, 0x55, 0x25, 0x7b, 0xe6, 0x2a,
+ 0x64, 0xd1, 0x32, 0xdd, 0x54, 0x1b, 0x4b, 0xc0,
+ 0xe1, 0xd7, 0x69, 0x58, 0xf8, 0x93, 0x29, 0xc4,
+ 0xdd, 0x23, 0x2f, 0xa5, 0xfc, 0x9d, 0x7e, 0xf8,
+ 0xd4, 0x90, 0xcd, 0x82, 0x55, 0xdc, 0x16, 0x16,
+ 0x9f, 0x07, 0x52, 0x9b, 0x9d, 0x25, 0xed, 0x32,
+ 0xc5, 0x7b, 0xdf, 0xf6, 0x83, 0x46, 0x3d, 0x65,
+ 0xb7, 0xef, 0x87, 0x7a, 0x12, 0x69, 0x8f, 0x06,
+ 0x7c, 0x51, 0x15, 0x4a, 0x08, 0xe8, 0xac, 0x9a,
+ 0x0c, 0x24, 0xa7, 0x27, 0xd8, 0x46, 0x2f, 0xe7,
+ 0x01, 0x0e, 0x1c, 0xc6, 0x91, 0xb0, 0x6e, 0x85,
+ 0x65, 0xf0, 0x29, 0x0d, 0x2e, 0x6b, 0x3b, 0xfb,
+ 0x4b, 0xdf, 0xe4, 0x80, 0x93, 0x03, 0x66, 0x46,
+ 0x3e, 0x8a, 0x6e, 0xf3, 0x5e, 0x4d, 0x62, 0x0e,
+ 0x49, 0x05, 0xaf, 0xd4, 0xf8, 0x21, 0x20, 0x61,
+ 0x1d, 0x39, 0x17, 0xf4, 0x61, 0x47, 0x95, 0xfb,
+ 0x15, 0x2e, 0xb3, 0x4f, 0xd0, 0x5d, 0xf5, 0x7d,
+ 0x40, 0xda, 0x90, 0x3c, 0x6b, 0xcb, 0x17, 0x00,
+ 0x13, 0x3b, 0x64, 0x34, 0x1b, 0xf0, 0xf2, 0xe5,
+ 0x3b, 0xb2, 0xc7, 0xd3, 0x5f, 0x3a, 0x44, 0xa6,
+ 0x9b, 0xb7, 0x78, 0x0e, 0x42, 0x5d, 0x4c, 0xc1,
+ 0xe9, 0xd2, 0xcb, 0xb7, 0x78, 0xd1, 0xfe, 0x9a,
+ 0xb5, 0x07, 0xe9, 0xe0, 0xbe, 0xe2, 0x8a, 0xa7,
+ 0x01, 0x83, 0x00, 0x8c, 0x5c, 0x08, 0xe6, 0x63,
+ 0x12, 0x92, 0xb7, 0xb7, 0xa6, 0x19, 0x7d, 0x38,
+ 0x13, 0x38, 0x92, 0x87, 0x24, 0xf9, 0x48, 0xb3,
+ 0x5e, 0x87, 0x6a, 0x40, 0x39, 0x5c, 0x3f, 0xed,
+ 0x8f, 0xee, 0xdb, 0x15, 0x82, 0x06, 0xda, 0x49,
+ 0x21, 0x2b, 0xb5, 0xbf, 0x32, 0x7c, 0x9f, 0x42,
+ 0x28, 0x63, 0xcf, 0xaf, 0x1e, 0xf8, 0xc6, 0xa0,
+ 0xd1, 0x02, 0x43, 0x57, 0x62, 0xec, 0x9b, 0x0f,
+ 0x01, 0x9e, 0x71, 0xd8, 0x87, 0x9d, 0x01, 0xc1,
+ 0x58, 0x77, 0xd9, 0xaf, 0xb1, 0x10, 0x7e, 0xdd,
+ 0xa6, 0x50, 0x96, 0xe5, 0xf0, 0x72, 0x00, 0x6d,
+ 0x4b, 0xf8, 0x2a, 0x8f, 0x19, 0xf3, 0x22, 0x88,
+ 0x11, 0x4a, 0x8b, 0x7c, 0xfd, 0xb7, 0xed, 0xe1,
+ 0xf6, 0x40, 0x39, 0xe0, 0xe9, 0xf6, 0x3d, 0x25,
+ 0xe6, 0x74, 0x3c, 0x58, 0x57, 0x7f, 0xe1, 0x22,
+ 0x96, 0x47, 0x31, 0x91, 0xba, 0x70, 0x85, 0x28,
+ 0x6b, 0x9f, 0x6e, 0x25, 0xac, 0x23, 0x66, 0x2f,
+ 0x29, 0x88, 0x28, 0xce, 0x8c, 0x5c, 0x88, 0x53,
+ 0xd1, 0x3b, 0xcc, 0x6a, 0x51, 0xb2, 0xe1, 0x28,
+ 0x3f, 0x91, 0xb4, 0x0d, 0x00, 0x3a, 0xe3, 0xf8,
+ 0xc3, 0x8f, 0xd7, 0x96, 0x62, 0x0e, 0x2e, 0xfc,
+ 0xc8, 0x6c, 0x77, 0xa6, 0x1d, 0x22, 0xc1, 0xb8,
+ 0xe6, 0x61, 0xd7, 0x67, 0x36, 0x13, 0x7b, 0xbb,
+ 0x9b, 0x59, 0x09, 0xa6, 0xdf, 0xf7, 0x6b, 0xa3,
+ 0x40, 0x1a, 0xf5, 0x4f, 0xb4, 0xda, 0xd3, 0xf3,
+ 0x81, 0x93, 0xc6, 0x18, 0xd9, 0x26, 0xee, 0xac,
+ 0xf0, 0xaa, 0xdf, 0xc5, 0x9c, 0xca, 0xc2, 0xa2,
+ 0xcc, 0x7b, 0x5c, 0x24, 0xb0, 0xbc, 0xd0, 0x6a,
+ 0x4d, 0x89, 0x09, 0xb8, 0x07, 0xfe, 0x87, 0xad,
+ 0x0a, 0xea, 0xb8, 0x42, 0xf9, 0x5e, 0xb3, 0x3e,
+ 0x36, 0x4c, 0xaf, 0x75, 0x9e, 0x1c, 0xeb, 0xbd,
+ 0xbc, 0xbb, 0x80, 0x40, 0xa7, 0x3a, 0x30, 0xbf,
+ 0xa8, 0x44, 0xf4, 0xeb, 0x38, 0xad, 0x29, 0xba,
+ 0x23, 0xed, 0x41, 0x0c, 0xea, 0xd2, 0xbb, 0x41,
+ 0x18, 0xd6, 0xb9, 0xba, 0x65, 0x2b, 0xa3, 0x91,
+ 0x6d, 0x1f, 0xa9, 0xf4, 0xd1, 0x25, 0x8d, 0x4d,
+ 0x38, 0xff, 0x64, 0xa0, 0xec, 0xde, 0xa6, 0xb6,
+ 0x79, 0xab, 0x8e, 0x33, 0x6c, 0x47, 0xde, 0xaf,
+ 0x94, 0xa4, 0xa5, 0x86, 0x77, 0x55, 0x09, 0x92,
+ 0x81, 0x31, 0x76, 0xc7, 0x34, 0x22, 0x89, 0x8e,
+ 0x3d, 0x26, 0x26, 0xd7, 0xfc, 0x1e, 0x16, 0x72,
+ 0x13, 0x33, 0x63, 0xd5, 0x22, 0xbe, 0xb8, 0x04,
+ 0x34, 0x84, 0x41, 0xbb, 0x80, 0xd0, 0x9f, 0x46,
+ 0x48, 0x07, 0xa7, 0xfc, 0x2b, 0x3a, 0x75, 0x55,
+ 0x8c, 0xc7, 0x6a, 0xbd, 0x7e, 0x46, 0x08, 0x84,
+ 0x0f, 0xd5, 0x74, 0xc0, 0x82, 0x8e, 0xaa, 0x61,
+ 0x05, 0x01, 0xb2, 0x47, 0x6e, 0x20, 0x6a, 0x2d,
+ 0x58, 0x70, 0x48, 0x32, 0xa7, 0x37, 0xd2, 0xb8,
+ 0x82, 0x1a, 0x51, 0xb9, 0x61, 0xdd, 0xfd, 0x9d,
+ 0x6b, 0x0e, 0x18, 0x97, 0xf8, 0x45, 0x5f, 0x87,
+ 0x10, 0xcf, 0x34, 0x72, 0x45, 0x26, 0x49, 0x70,
+ 0xe7, 0xa3, 0x78, 0xe0, 0x52, 0x89, 0x84, 0x94,
+ 0x83, 0x82, 0xc2, 0x69, 0x8f, 0xe3, 0xe1, 0x3f,
+ 0x60, 0x74, 0x88, 0xc4, 0xf7, 0x75, 0x2c, 0xfb,
+ 0xbd, 0xb6, 0xc4, 0x7e, 0x10, 0x0a, 0x6c, 0x90,
+ 0x04, 0x9e, 0xc3, 0x3f, 0x59, 0x7c, 0xce, 0x31,
+ 0x18, 0x60, 0x57, 0x73, 0x46, 0x94, 0x7d, 0x06,
+ 0xa0, 0x6d, 0x44, 0xec, 0xa2, 0x0a, 0x9e, 0x05,
+ 0x15, 0xef, 0xca, 0x5c, 0xbf, 0x00, 0xeb, 0xf7,
+ 0x3d, 0x32, 0xd4, 0xa5, 0xef, 0x49, 0x89, 0x5e,
+ 0x46, 0xb0, 0xa6, 0x63, 0x5b, 0x8a, 0x73, 0xae,
+ 0x6f, 0xd5, 0x9d, 0xf8, 0x4f, 0x40, 0xb5, 0xb2,
+ 0x6e, 0xd3, 0xb6, 0x01, 0xa9, 0x26, 0xa2, 0x21,
+ 0xcf, 0x33, 0x7a, 0x3a, 0xa4, 0x23, 0x13, 0xb0,
+ 0x69, 0x6a, 0xee, 0xce, 0xd8, 0x9d, 0x01, 0x1d,
+ 0x50, 0xc1, 0x30, 0x6c, 0xb1, 0xcd, 0xa0, 0xf0,
+ 0xf0, 0xa2, 0x64, 0x6f, 0xbb, 0xbf, 0x5e, 0xe6,
+ 0xab, 0x87, 0xb4, 0x0f, 0x4f, 0x15, 0xaf, 0xb5,
+ 0x25, 0xa1, 0xb2, 0xd0, 0x80, 0x2c, 0xfb, 0xf9,
+ 0xfe, 0xd2, 0x33, 0xbb, 0x76, 0xfe, 0x7c, 0xa8,
+ 0x66, 0xf7, 0xe7, 0x85, 0x9f, 0x1f, 0x85, 0x57,
+ 0x88, 0xe1, 0xe9, 0x63, 0xe4, 0xd8, 0x1c, 0xa1,
+ 0xfb, 0xda, 0x44, 0x05, 0x2e, 0x1d, 0x3a, 0x1c,
+ 0xff, 0xc8, 0x3b, 0xc0, 0xfe, 0xda, 0x22, 0x0b,
+ 0x43, 0xd6, 0x88, 0x39, 0x4c, 0x4a, 0xa6, 0x69,
+ 0x18, 0x93, 0x42, 0x4e, 0xb5, 0xcc, 0x66, 0x0d,
+ 0x09, 0xf8, 0x1e, 0x7c, 0xd3, 0x3c, 0x99, 0x0d,
+ 0x50, 0x1d, 0x62, 0xe9, 0x57, 0x06, 0xbf, 0x19,
+ 0x88, 0xdd, 0xad, 0x7b, 0x4f, 0xf9, 0xc7, 0x82,
+ 0x6d, 0x8d, 0xc8, 0xc4, 0xc5, 0x78, 0x17, 0x20,
+ 0x15, 0xc5, 0x52, 0x41, 0xcf, 0x5b, 0xd6, 0x7f,
+ 0x94, 0x02, 0x41, 0xe0, 0x40, 0x22, 0x03, 0x5e,
+ 0xd1, 0x53, 0xd4, 0x86, 0xd3, 0x2c, 0x9f, 0x0f,
+ 0x96, 0xe3, 0x6b, 0x9a, 0x76, 0x32, 0x06, 0x47,
+ 0x4b, 0x11, 0xb3, 0xdd, 0x03, 0x65, 0xbd, 0x9b,
+ 0x01, 0xda, 0x9c, 0xb9, 0x7e, 0x3f, 0x6a, 0xc4,
+ 0x7b, 0xea, 0xd4, 0x3c, 0xb9, 0xfb, 0x5c, 0x6b,
+ 0x64, 0x33, 0x52, 0xba, 0x64, 0x78, 0x8f, 0xa4,
+ 0xaf, 0x7a, 0x61, 0x8d, 0xbc, 0xc5, 0x73, 0xe9,
+ 0x6b, 0x58, 0x97, 0x4b, 0xbf, 0x63, 0x22, 0xd3,
+ 0x37, 0x02, 0x54, 0xc5, 0xb9, 0x16, 0x4a, 0xf0,
+ 0x19, 0xd8, 0x94, 0x57, 0xb8, 0x8a, 0xb3, 0x16,
+ 0x3b, 0xd0, 0x84, 0x8e, 0x67, 0xa6, 0xa3, 0x7d,
+ 0x78, 0xec, 0x00
+};
+static const u8 dec_assoc012[] __initconst = {
+ 0xb1, 0x69, 0x83, 0x87, 0x30, 0xaa, 0x5d, 0xb8,
+ 0x77, 0xe8, 0x21, 0xff, 0x06, 0x59, 0x35, 0xce,
+ 0x75, 0xfe, 0x38, 0xef, 0xb8, 0x91, 0x43, 0x8c,
+ 0xcf, 0x70, 0xdd, 0x0a, 0x68, 0xbf, 0xd4, 0xbc,
+ 0x16, 0x76, 0x99, 0x36, 0x1e, 0x58, 0x79, 0x5e,
+ 0xd4, 0x29, 0xf7, 0x33, 0x93, 0x48, 0xdb, 0x5f,
+ 0x01, 0xae, 0x9c, 0xb6, 0xe4, 0x88, 0x6d, 0x2b,
+ 0x76, 0x75, 0xe0, 0xf3, 0x74, 0xe2, 0xc9
+};
+static const u8 dec_nonce012[] __initconst = {
+ 0x05, 0xa3, 0x93, 0xed, 0x30, 0xc5, 0xa2, 0x06
+};
+static const u8 dec_key012[] __initconst = {
+ 0xb3, 0x35, 0x50, 0x03, 0x54, 0x2e, 0x40, 0x5e,
+ 0x8f, 0x59, 0x8e, 0xc5, 0x90, 0xd5, 0x27, 0x2d,
+ 0xba, 0x29, 0x2e, 0xcb, 0x1b, 0x70, 0x44, 0x1e,
+ 0x65, 0x91, 0x6e, 0x2a, 0x79, 0x22, 0xda, 0x64
+};
+
+static const u8 dec_input013[] __initconst = {
+ 0x52, 0x34, 0xb3, 0x65, 0x3b, 0xb7, 0xe5, 0xd3,
+ 0xab, 0x49, 0x17, 0x60, 0xd2, 0x52, 0x56, 0xdf,
+ 0xdf, 0x34, 0x56, 0x82, 0xe2, 0xbe, 0xe5, 0xe1,
+ 0x28, 0xd1, 0x4e, 0x5f, 0x4f, 0x01, 0x7d, 0x3f,
+ 0x99, 0x6b, 0x30, 0x6e, 0x1a, 0x7c, 0x4c, 0x8e,
+ 0x62, 0x81, 0xae, 0x86, 0x3f, 0x6b, 0xd0, 0xb5,
+ 0xa9, 0xcf, 0x50, 0xf1, 0x02, 0x12, 0xa0, 0x0b,
+ 0x24, 0xe9, 0xe6, 0x72, 0x89, 0x2c, 0x52, 0x1b,
+ 0x34, 0x38, 0xf8, 0x75, 0x5f, 0xa0, 0x74, 0xe2,
+ 0x99, 0xdd, 0xa6, 0x4b, 0x14, 0x50, 0x4e, 0xf1,
+ 0xbe, 0xd6, 0x9e, 0xdb, 0xb2, 0x24, 0x27, 0x74,
+ 0x12, 0x4a, 0x78, 0x78, 0x17, 0xa5, 0x58, 0x8e,
+ 0x2f, 0xf9, 0xf4, 0x8d, 0xee, 0x03, 0x88, 0xae,
+ 0xb8, 0x29, 0xa1, 0x2f, 0x4b, 0xee, 0x92, 0xbd,
+ 0x87, 0xb3, 0xce, 0x34, 0x21, 0x57, 0x46, 0x04,
+ 0x49, 0x0c, 0x80, 0xf2, 0x01, 0x13, 0xa1, 0x55,
+ 0xb3, 0xff, 0x44, 0x30, 0x3c, 0x1c, 0xd0, 0xef,
+ 0xbc, 0x18, 0x74, 0x26, 0xad, 0x41, 0x5b, 0x5b,
+ 0x3e, 0x9a, 0x7a, 0x46, 0x4f, 0x16, 0xd6, 0x74,
+ 0x5a, 0xb7, 0x3a, 0x28, 0x31, 0xd8, 0xae, 0x26,
+ 0xac, 0x50, 0x53, 0x86, 0xf2, 0x56, 0xd7, 0x3f,
+ 0x29, 0xbc, 0x45, 0x68, 0x8e, 0xcb, 0x98, 0x64,
+ 0xdd, 0xc9, 0xba, 0xb8, 0x4b, 0x7b, 0x82, 0xdd,
+ 0x14, 0xa7, 0xcb, 0x71, 0x72, 0x00, 0x5c, 0xad,
+ 0x7b, 0x6a, 0x89, 0xa4, 0x3d, 0xbf, 0xb5, 0x4b,
+ 0x3e, 0x7c, 0x5a, 0xcf, 0xb8, 0xa1, 0xc5, 0x6e,
+ 0xc8, 0xb6, 0x31, 0x57, 0x7b, 0xdf, 0xa5, 0x7e,
+ 0xb1, 0xd6, 0x42, 0x2a, 0x31, 0x36, 0xd1, 0xd0,
+ 0x3f, 0x7a, 0xe5, 0x94, 0xd6, 0x36, 0xa0, 0x6f,
+ 0xb7, 0x40, 0x7d, 0x37, 0xc6, 0x55, 0x7c, 0x50,
+ 0x40, 0x6d, 0x29, 0x89, 0xe3, 0x5a, 0xae, 0x97,
+ 0xe7, 0x44, 0x49, 0x6e, 0xbd, 0x81, 0x3d, 0x03,
+ 0x93, 0x06, 0x12, 0x06, 0xe2, 0x41, 0x12, 0x4a,
+ 0xf1, 0x6a, 0xa4, 0x58, 0xa2, 0xfb, 0xd2, 0x15,
+ 0xba, 0xc9, 0x79, 0xc9, 0xce, 0x5e, 0x13, 0xbb,
+ 0xf1, 0x09, 0x04, 0xcc, 0xfd, 0xe8, 0x51, 0x34,
+ 0x6a, 0xe8, 0x61, 0x88, 0xda, 0xed, 0x01, 0x47,
+ 0x84, 0xf5, 0x73, 0x25, 0xf9, 0x1c, 0x42, 0x86,
+ 0x07, 0xf3, 0x5b, 0x1a, 0x01, 0xb3, 0xeb, 0x24,
+ 0x32, 0x8d, 0xf6, 0xed, 0x7c, 0x4b, 0xeb, 0x3c,
+ 0x36, 0x42, 0x28, 0xdf, 0xdf, 0xb6, 0xbe, 0xd9,
+ 0x8c, 0x52, 0xd3, 0x2b, 0x08, 0x90, 0x8c, 0xe7,
+ 0x98, 0x31, 0xe2, 0x32, 0x8e, 0xfc, 0x11, 0x48,
+ 0x00, 0xa8, 0x6a, 0x42, 0x4a, 0x02, 0xc6, 0x4b,
+ 0x09, 0xf1, 0xe3, 0x49, 0xf3, 0x45, 0x1f, 0x0e,
+ 0xbc, 0x56, 0xe2, 0xe4, 0xdf, 0xfb, 0xeb, 0x61,
+ 0xfa, 0x24, 0xc1, 0x63, 0x75, 0xbb, 0x47, 0x75,
+ 0xaf, 0xe1, 0x53, 0x16, 0x96, 0x21, 0x85, 0x26,
+ 0x11, 0xb3, 0x76, 0xe3, 0x23, 0xa1, 0x6b, 0x74,
+ 0x37, 0xd0, 0xde, 0x06, 0x90, 0x71, 0x5d, 0x43,
+ 0x88, 0x9b, 0x00, 0x54, 0xa6, 0x75, 0x2f, 0xa1,
+ 0xc2, 0x0b, 0x73, 0x20, 0x1d, 0xb6, 0x21, 0x79,
+ 0x57, 0x3f, 0xfa, 0x09, 0xbe, 0x8a, 0x33, 0xc3,
+ 0x52, 0xf0, 0x1d, 0x82, 0x31, 0xd1, 0x55, 0xb5,
+ 0x6c, 0x99, 0x25, 0xcf, 0x5c, 0x32, 0xce, 0xe9,
+ 0x0d, 0xfa, 0x69, 0x2c, 0xd5, 0x0d, 0xc5, 0x6d,
+ 0x86, 0xd0, 0x0c, 0x3b, 0x06, 0x50, 0x79, 0xe8,
+ 0xc3, 0xae, 0x04, 0xe6, 0xcd, 0x51, 0xe4, 0x26,
+ 0x9b, 0x4f, 0x7e, 0xa6, 0x0f, 0xab, 0xd8, 0xe5,
+ 0xde, 0xa9, 0x00, 0x95, 0xbe, 0xa3, 0x9d, 0x5d,
+ 0xb2, 0x09, 0x70, 0x18, 0x1c, 0xf0, 0xac, 0x29,
+ 0x23, 0x02, 0x29, 0x28, 0xd2, 0x74, 0x35, 0x57,
+ 0x62, 0x0f, 0x24, 0xea, 0x5e, 0x33, 0xc2, 0x92,
+ 0xf3, 0x78, 0x4d, 0x30, 0x1e, 0xa1, 0x99, 0xa9,
+ 0x82, 0xb0, 0x42, 0x31, 0x8d, 0xad, 0x8a, 0xbc,
+ 0xfc, 0xd4, 0x57, 0x47, 0x3e, 0xb4, 0x50, 0xdd,
+ 0x6e, 0x2c, 0x80, 0x4d, 0x22, 0xf1, 0xfb, 0x57,
+ 0xc4, 0xdd, 0x17, 0xe1, 0x8a, 0x36, 0x4a, 0xb3,
+ 0x37, 0xca, 0xc9, 0x4e, 0xab, 0xd5, 0x69, 0xc4,
+ 0xf4, 0xbc, 0x0b, 0x3b, 0x44, 0x4b, 0x29, 0x9c,
+ 0xee, 0xd4, 0x35, 0x22, 0x21, 0xb0, 0x1f, 0x27,
+ 0x64, 0xa8, 0x51, 0x1b, 0xf0, 0x9f, 0x19, 0x5c,
+ 0xfb, 0x5a, 0x64, 0x74, 0x70, 0x45, 0x09, 0xf5,
+ 0x64, 0xfe, 0x1a, 0x2d, 0xc9, 0x14, 0x04, 0x14,
+ 0xcf, 0xd5, 0x7d, 0x60, 0xaf, 0x94, 0x39, 0x94,
+ 0xe2, 0x7d, 0x79, 0x82, 0xd0, 0x65, 0x3b, 0x6b,
+ 0x9c, 0x19, 0x84, 0xb4, 0x6d, 0xb3, 0x0c, 0x99,
+ 0xc0, 0x56, 0xa8, 0xbd, 0x73, 0xce, 0x05, 0x84,
+ 0x3e, 0x30, 0xaa, 0xc4, 0x9b, 0x1b, 0x04, 0x2a,
+ 0x9f, 0xd7, 0x43, 0x2b, 0x23, 0xdf, 0xbf, 0xaa,
+ 0xd5, 0xc2, 0x43, 0x2d, 0x70, 0xab, 0xdc, 0x75,
+ 0xad, 0xac, 0xf7, 0xc0, 0xbe, 0x67, 0xb2, 0x74,
+ 0xed, 0x67, 0x10, 0x4a, 0x92, 0x60, 0xc1, 0x40,
+ 0x50, 0x19, 0x8a, 0x8a, 0x8c, 0x09, 0x0e, 0x72,
+ 0xe1, 0x73, 0x5e, 0xe8, 0x41, 0x85, 0x63, 0x9f,
+ 0x3f, 0xd7, 0x7d, 0xc4, 0xfb, 0x22, 0x5d, 0x92,
+ 0x6c, 0xb3, 0x1e, 0xe2, 0x50, 0x2f, 0x82, 0xa8,
+ 0x28, 0xc0, 0xb5, 0xd7, 0x5f, 0x68, 0x0d, 0x2c,
+ 0x2d, 0xaf, 0x7e, 0xfa, 0x2e, 0x08, 0x0f, 0x1f,
+ 0x70, 0x9f, 0xe9, 0x19, 0x72, 0x55, 0xf8, 0xfb,
+ 0x51, 0xd2, 0x33, 0x5d, 0xa0, 0xd3, 0x2b, 0x0a,
+ 0x6c, 0xbc, 0x4e, 0xcf, 0x36, 0x4d, 0xdc, 0x3b,
+ 0xe9, 0x3e, 0x81, 0x7c, 0x61, 0xdb, 0x20, 0x2d,
+ 0x3a, 0xc3, 0xb3, 0x0c, 0x1e, 0x00, 0xb9, 0x7c,
+ 0xf5, 0xca, 0x10, 0x5f, 0x3a, 0x71, 0xb3, 0xe4,
+ 0x20, 0xdb, 0x0c, 0x2a, 0x98, 0x63, 0x45, 0x00,
+ 0x58, 0xf6, 0x68, 0xe4, 0x0b, 0xda, 0x13, 0x3b,
+ 0x60, 0x5c, 0x76, 0xdb, 0xb9, 0x97, 0x71, 0xe4,
+ 0xd9, 0xb7, 0xdb, 0xbd, 0x68, 0xc7, 0x84, 0x84,
+ 0xaa, 0x7c, 0x68, 0x62, 0x5e, 0x16, 0xfc, 0xba,
+ 0x72, 0xaa, 0x9a, 0xa9, 0xeb, 0x7c, 0x75, 0x47,
+ 0x97, 0x7e, 0xad, 0xe2, 0xd9, 0x91, 0xe8, 0xe4,
+ 0xa5, 0x31, 0xd7, 0x01, 0x8e, 0xa2, 0x11, 0x88,
+ 0x95, 0xb9, 0xf2, 0x9b, 0xd3, 0x7f, 0x1b, 0x81,
+ 0x22, 0xf7, 0x98, 0x60, 0x0a, 0x64, 0xa6, 0xc1,
+ 0xf6, 0x49, 0xc7, 0xe3, 0x07, 0x4d, 0x94, 0x7a,
+ 0xcf, 0x6e, 0x68, 0x0c, 0x1b, 0x3f, 0x6e, 0x2e,
+ 0xee, 0x92, 0xfa, 0x52, 0xb3, 0x59, 0xf8, 0xf1,
+ 0x8f, 0x6a, 0x66, 0xa3, 0x82, 0x76, 0x4a, 0x07,
+ 0x1a, 0xc7, 0xdd, 0xf5, 0xda, 0x9c, 0x3c, 0x24,
+ 0xbf, 0xfd, 0x42, 0xa1, 0x10, 0x64, 0x6a, 0x0f,
+ 0x89, 0xee, 0x36, 0xa5, 0xce, 0x99, 0x48, 0x6a,
+ 0xf0, 0x9f, 0x9e, 0x69, 0xa4, 0x40, 0x20, 0xe9,
+ 0x16, 0x15, 0xf7, 0xdb, 0x75, 0x02, 0xcb, 0xe9,
+ 0x73, 0x8b, 0x3b, 0x49, 0x2f, 0xf0, 0xaf, 0x51,
+ 0x06, 0x5c, 0xdf, 0x27, 0x27, 0x49, 0x6a, 0xd1,
+ 0xcc, 0xc7, 0xb5, 0x63, 0xb5, 0xfc, 0xb8, 0x5c,
+ 0x87, 0x7f, 0x84, 0xb4, 0xcc, 0x14, 0xa9, 0x53,
+ 0xda, 0xa4, 0x56, 0xf8, 0xb6, 0x1b, 0xcc, 0x40,
+ 0x27, 0x52, 0x06, 0x5a, 0x13, 0x81, 0xd7, 0x3a,
+ 0xd4, 0x3b, 0xfb, 0x49, 0x65, 0x31, 0x33, 0xb2,
+ 0xfa, 0xcd, 0xad, 0x58, 0x4e, 0x2b, 0xae, 0xd2,
+ 0x20, 0xfb, 0x1a, 0x48, 0xb4, 0x3f, 0x9a, 0xd8,
+ 0x7a, 0x35, 0x4a, 0xc8, 0xee, 0x88, 0x5e, 0x07,
+ 0x66, 0x54, 0xb9, 0xec, 0x9f, 0xa3, 0xe3, 0xb9,
+ 0x37, 0xaa, 0x49, 0x76, 0x31, 0xda, 0x74, 0x2d,
+ 0x3c, 0xa4, 0x65, 0x10, 0x32, 0x38, 0xf0, 0xde,
+ 0xd3, 0x99, 0x17, 0xaa, 0x71, 0xaa, 0x8f, 0x0f,
+ 0x8c, 0xaf, 0xa2, 0xf8, 0x5d, 0x64, 0xba, 0x1d,
+ 0xa3, 0xef, 0x96, 0x73, 0xe8, 0xa1, 0x02, 0x8d,
+ 0x0c, 0x6d, 0xb8, 0x06, 0x90, 0xb8, 0x08, 0x56,
+ 0x2c, 0xa7, 0x06, 0xc9, 0xc2, 0x38, 0xdb, 0x7c,
+ 0x63, 0xb1, 0x57, 0x8e, 0xea, 0x7c, 0x79, 0xf3,
+ 0x49, 0x1d, 0xfe, 0x9f, 0xf3, 0x6e, 0xb1, 0x1d,
+ 0xba, 0x19, 0x80, 0x1a, 0x0a, 0xd3, 0xb0, 0x26,
+ 0x21, 0x40, 0xb1, 0x7c, 0xf9, 0x4d, 0x8d, 0x10,
+ 0xc1, 0x7e, 0xf4, 0xf6, 0x3c, 0xa8, 0xfd, 0x7c,
+ 0xa3, 0x92, 0xb2, 0x0f, 0xaa, 0xcc, 0xa6, 0x11,
+ 0xfe, 0x04, 0xe3, 0xd1, 0x7a, 0x32, 0x89, 0xdf,
+ 0x0d, 0xc4, 0x8f, 0x79, 0x6b, 0xca, 0x16, 0x7c,
+ 0x6e, 0xf9, 0xad, 0x0f, 0xf6, 0xfe, 0x27, 0xdb,
+ 0xc4, 0x13, 0x70, 0xf1, 0x62, 0x1a, 0x4f, 0x79,
+ 0x40, 0xc9, 0x9b, 0x8b, 0x21, 0xea, 0x84, 0xfa,
+ 0xf5, 0xf1, 0x89, 0xce, 0xb7, 0x55, 0x0a, 0x80,
+ 0x39, 0x2f, 0x55, 0x36, 0x16, 0x9c, 0x7b, 0x08,
+ 0xbd, 0x87, 0x0d, 0xa5, 0x32, 0xf1, 0x52, 0x7c,
+ 0xe8, 0x55, 0x60, 0x5b, 0xd7, 0x69, 0xe4, 0xfc,
+ 0xfa, 0x12, 0x85, 0x96, 0xea, 0x50, 0x28, 0xab,
+ 0x8a, 0xf7, 0xbb, 0x0e, 0x53, 0x74, 0xca, 0xa6,
+ 0x27, 0x09, 0xc2, 0xb5, 0xde, 0x18, 0x14, 0xd9,
+ 0xea, 0xe5, 0x29, 0x1c, 0x40, 0x56, 0xcf, 0xd7,
+ 0xae, 0x05, 0x3f, 0x65, 0xaf, 0x05, 0x73, 0xe2,
+ 0x35, 0x96, 0x27, 0x07, 0x14, 0xc0, 0xad, 0x33,
+ 0xf1, 0xdc, 0x44, 0x7a, 0x89, 0x17, 0x77, 0xd2,
+ 0x9c, 0x58, 0x60, 0xf0, 0x3f, 0x7b, 0x2d, 0x2e,
+ 0x57, 0x95, 0x54, 0x87, 0xed, 0xf2, 0xc7, 0x4c,
+ 0xf0, 0xae, 0x56, 0x29, 0x19, 0x7d, 0x66, 0x4b,
+ 0x9b, 0x83, 0x84, 0x42, 0x3b, 0x01, 0x25, 0x66,
+ 0x8e, 0x02, 0xde, 0xb9, 0x83, 0x54, 0x19, 0xf6,
+ 0x9f, 0x79, 0x0d, 0x67, 0xc5, 0x1d, 0x7a, 0x44,
+ 0x02, 0x98, 0xa7, 0x16, 0x1c, 0x29, 0x0d, 0x74,
+ 0xff, 0x85, 0x40, 0x06, 0xef, 0x2c, 0xa9, 0xc6,
+ 0xf5, 0x53, 0x07, 0x06, 0xae, 0xe4, 0xfa, 0x5f,
+ 0xd8, 0x39, 0x4d, 0xf1, 0x9b, 0x6b, 0xd9, 0x24,
+ 0x84, 0xfe, 0x03, 0x4c, 0xb2, 0x3f, 0xdf, 0xa1,
+ 0x05, 0x9e, 0x50, 0x14, 0x5a, 0xd9, 0x1a, 0xa2,
+ 0xa7, 0xfa, 0xfa, 0x17, 0xf7, 0x78, 0xd6, 0xb5,
+ 0x92, 0x61, 0x91, 0xac, 0x36, 0xfa, 0x56, 0x0d,
+ 0x38, 0x32, 0x18, 0x85, 0x08, 0x58, 0x37, 0xf0,
+ 0x4b, 0xdb, 0x59, 0xe7, 0xa4, 0x34, 0xc0, 0x1b,
+ 0x01, 0xaf, 0x2d, 0xde, 0xa1, 0xaa, 0x5d, 0xd3,
+ 0xec, 0xe1, 0xd4, 0xf7, 0xe6, 0x54, 0x68, 0xf0,
+ 0x51, 0x97, 0xa7, 0x89, 0xea, 0x24, 0xad, 0xd3,
+ 0x6e, 0x47, 0x93, 0x8b, 0x4b, 0xb4, 0xf7, 0x1c,
+ 0x42, 0x06, 0x67, 0xe8, 0x99, 0xf6, 0xf5, 0x7b,
+ 0x85, 0xb5, 0x65, 0xb5, 0xb5, 0xd2, 0x37, 0xf5,
+ 0xf3, 0x02, 0xa6, 0x4d, 0x11, 0xa7, 0xdc, 0x51,
+ 0x09, 0x7f, 0xa0, 0xd8, 0x88, 0x1c, 0x13, 0x71,
+ 0xae, 0x9c, 0xb7, 0x7b, 0x34, 0xd6, 0x4e, 0x68,
+ 0x26, 0x83, 0x51, 0xaf, 0x1d, 0xee, 0x8b, 0xbb,
+ 0x69, 0x43, 0x2b, 0x9e, 0x8a, 0xbc, 0x02, 0x0e,
+ 0xa0, 0x1b, 0xe0, 0xa8, 0x5f, 0x6f, 0xaf, 0x1b,
+ 0x8f, 0xe7, 0x64, 0x71, 0x74, 0x11, 0x7e, 0xa8,
+ 0xd8, 0xf9, 0x97, 0x06, 0xc3, 0xb6, 0xfb, 0xfb,
+ 0xb7, 0x3d, 0x35, 0x9d, 0x3b, 0x52, 0xed, 0x54,
+ 0xca, 0xf4, 0x81, 0x01, 0x2d, 0x1b, 0xc3, 0xa7,
+ 0x00, 0x3d, 0x1a, 0x39, 0x54, 0xe1, 0xf6, 0xff,
+ 0xed, 0x6f, 0x0b, 0x5a, 0x68, 0xda, 0x58, 0xdd,
+ 0xa9, 0xcf, 0x5c, 0x4a, 0xe5, 0x09, 0x4e, 0xde,
+ 0x9d, 0xbc, 0x3e, 0xee, 0x5a, 0x00, 0x3b, 0x2c,
+ 0x87, 0x10, 0x65, 0x60, 0xdd, 0xd7, 0x56, 0xd1,
+ 0x4c, 0x64, 0x45, 0xe4, 0x21, 0xec, 0x78, 0xf8,
+ 0x25, 0x7a, 0x3e, 0x16, 0x5d, 0x09, 0x53, 0x14,
+ 0xbe, 0x4f, 0xae, 0x87, 0xd8, 0xd1, 0xaa, 0x3c,
+ 0xf6, 0x3e, 0xa4, 0x70, 0x8c, 0x5e, 0x70, 0xa4,
+ 0xb3, 0x6b, 0x66, 0x73, 0xd3, 0xbf, 0x31, 0x06,
+ 0x19, 0x62, 0x93, 0x15, 0xf2, 0x86, 0xe4, 0x52,
+ 0x7e, 0x53, 0x4c, 0x12, 0x38, 0xcc, 0x34, 0x7d,
+ 0x57, 0xf6, 0x42, 0x93, 0x8a, 0xc4, 0xee, 0x5c,
+ 0x8a, 0xe1, 0x52, 0x8f, 0x56, 0x64, 0xf6, 0xa6,
+ 0xd1, 0x91, 0x57, 0x70, 0xcd, 0x11, 0x76, 0xf5,
+ 0x59, 0x60, 0x60, 0x3c, 0xc1, 0xc3, 0x0b, 0x7f,
+ 0x58, 0x1a, 0x50, 0x91, 0xf1, 0x68, 0x8f, 0x6e,
+ 0x74, 0x74, 0xa8, 0x51, 0x0b, 0xf7, 0x7a, 0x98,
+ 0x37, 0xf2, 0x0a, 0x0e, 0xa4, 0x97, 0x04, 0xb8,
+ 0x9b, 0xfd, 0xa0, 0xea, 0xf7, 0x0d, 0xe1, 0xdb,
+ 0x03, 0xf0, 0x31, 0x29, 0xf8, 0xdd, 0x6b, 0x8b,
+ 0x5d, 0xd8, 0x59, 0xa9, 0x29, 0xcf, 0x9a, 0x79,
+ 0x89, 0x19, 0x63, 0x46, 0x09, 0x79, 0x6a, 0x11,
+ 0xda, 0x63, 0x68, 0x48, 0x77, 0x23, 0xfb, 0x7d,
+ 0x3a, 0x43, 0xcb, 0x02, 0x3b, 0x7a, 0x6d, 0x10,
+ 0x2a, 0x9e, 0xac, 0xf1, 0xd4, 0x19, 0xf8, 0x23,
+ 0x64, 0x1d, 0x2c, 0x5f, 0xf2, 0xb0, 0x5c, 0x23,
+ 0x27, 0xf7, 0x27, 0x30, 0x16, 0x37, 0xb1, 0x90,
+ 0xab, 0x38, 0xfb, 0x55, 0xcd, 0x78, 0x58, 0xd4,
+ 0x7d, 0x43, 0xf6, 0x45, 0x5e, 0x55, 0x8d, 0xb1,
+ 0x02, 0x65, 0x58, 0xb4, 0x13, 0x4b, 0x36, 0xf7,
+ 0xcc, 0xfe, 0x3d, 0x0b, 0x82, 0xe2, 0x12, 0x11,
+ 0xbb, 0xe6, 0xb8, 0x3a, 0x48, 0x71, 0xc7, 0x50,
+ 0x06, 0x16, 0x3a, 0xe6, 0x7c, 0x05, 0xc7, 0xc8,
+ 0x4d, 0x2f, 0x08, 0x6a, 0x17, 0x9a, 0x95, 0x97,
+ 0x50, 0x68, 0xdc, 0x28, 0x18, 0xc4, 0x61, 0x38,
+ 0xb9, 0xe0, 0x3e, 0x78, 0xdb, 0x29, 0xe0, 0x9f,
+ 0x52, 0xdd, 0xf8, 0x4f, 0x91, 0xc1, 0xd0, 0x33,
+ 0xa1, 0x7a, 0x8e, 0x30, 0x13, 0x82, 0x07, 0x9f,
+ 0xd3, 0x31, 0x0f, 0x23, 0xbe, 0x32, 0x5a, 0x75,
+ 0xcf, 0x96, 0xb2, 0xec, 0xb5, 0x32, 0xac, 0x21,
+ 0xd1, 0x82, 0x33, 0xd3, 0x15, 0x74, 0xbd, 0x90,
+ 0xf1, 0x2c, 0xe6, 0x5f, 0x8d, 0xe3, 0x02, 0xe8,
+ 0xe9, 0xc4, 0xca, 0x96, 0xeb, 0x0e, 0xbc, 0x91,
+ 0xf4, 0xb9, 0xea, 0xd9, 0x1b, 0x75, 0xbd, 0xe1,
+ 0xac, 0x2a, 0x05, 0x37, 0x52, 0x9b, 0x1b, 0x3f,
+ 0x5a, 0xdc, 0x21, 0xc3, 0x98, 0xbb, 0xaf, 0xa3,
+ 0xf2, 0x00, 0xbf, 0x0d, 0x30, 0x89, 0x05, 0xcc,
+ 0xa5, 0x76, 0xf5, 0x06, 0xf0, 0xc6, 0x54, 0x8a,
+ 0x5d, 0xd4, 0x1e, 0xc1, 0xf2, 0xce, 0xb0, 0x62,
+ 0xc8, 0xfc, 0x59, 0x42, 0x9a, 0x90, 0x60, 0x55,
+ 0xfe, 0x88, 0xa5, 0x8b, 0xb8, 0x33, 0x0c, 0x23,
+ 0x24, 0x0d, 0x15, 0x70, 0x37, 0x1e, 0x3d, 0xf6,
+ 0xd2, 0xea, 0x92, 0x10, 0xb2, 0xc4, 0x51, 0xac,
+ 0xf2, 0xac, 0xf3, 0x6b, 0x6c, 0xaa, 0xcf, 0x12,
+ 0xc5, 0x6c, 0x90, 0x50, 0xb5, 0x0c, 0xfc, 0x1a,
+ 0x15, 0x52, 0xe9, 0x26, 0xc6, 0x52, 0xa4, 0xe7,
+ 0x81, 0x69, 0xe1, 0xe7, 0x9e, 0x30, 0x01, 0xec,
+ 0x84, 0x89, 0xb2, 0x0d, 0x66, 0xdd, 0xce, 0x28,
+ 0x5c, 0xec, 0x98, 0x46, 0x68, 0x21, 0x9f, 0x88,
+ 0x3f, 0x1f, 0x42, 0x77, 0xce, 0xd0, 0x61, 0xd4,
+ 0x20, 0xa7, 0xff, 0x53, 0xad, 0x37, 0xd0, 0x17,
+ 0x35, 0xc9, 0xfc, 0xba, 0x0a, 0x78, 0x3f, 0xf2,
+ 0xcc, 0x86, 0x89, 0xe8, 0x4b, 0x3c, 0x48, 0x33,
+ 0x09, 0x7f, 0xc6, 0xc0, 0xdd, 0xb8, 0xfd, 0x7a,
+ 0x66, 0x66, 0x65, 0xeb, 0x47, 0xa7, 0x04, 0x28,
+ 0xa3, 0x19, 0x8e, 0xa9, 0xb1, 0x13, 0x67, 0x62,
+ 0x70, 0xcf, 0xd7
+};
+static const u8 dec_output013[] __initconst = {
+ 0x74, 0xa6, 0x3e, 0xe4, 0xb1, 0xcb, 0xaf, 0xb0,
+ 0x40, 0xe5, 0x0f, 0x9e, 0xf1, 0xf2, 0x89, 0xb5,
+ 0x42, 0x34, 0x8a, 0xa1, 0x03, 0xb7, 0xe9, 0x57,
+ 0x46, 0xbe, 0x20, 0xe4, 0x6e, 0xb0, 0xeb, 0xff,
+ 0xea, 0x07, 0x7e, 0xef, 0xe2, 0x55, 0x9f, 0xe5,
+ 0x78, 0x3a, 0xb7, 0x83, 0xc2, 0x18, 0x40, 0x7b,
+ 0xeb, 0xcd, 0x81, 0xfb, 0x90, 0x12, 0x9e, 0x46,
+ 0xa9, 0xd6, 0x4a, 0xba, 0xb0, 0x62, 0xdb, 0x6b,
+ 0x99, 0xc4, 0xdb, 0x54, 0x4b, 0xb8, 0xa5, 0x71,
+ 0xcb, 0xcd, 0x63, 0x32, 0x55, 0xfb, 0x31, 0xf0,
+ 0x38, 0xf5, 0xbe, 0x78, 0xe4, 0x45, 0xce, 0x1b,
+ 0x6a, 0x5b, 0x0e, 0xf4, 0x16, 0xe4, 0xb1, 0x3d,
+ 0xf6, 0x63, 0x7b, 0xa7, 0x0c, 0xde, 0x6f, 0x8f,
+ 0x74, 0xdf, 0xe0, 0x1e, 0x9d, 0xce, 0x8f, 0x24,
+ 0xef, 0x23, 0x35, 0x33, 0x7b, 0x83, 0x34, 0x23,
+ 0x58, 0x74, 0x14, 0x77, 0x1f, 0xc2, 0x4f, 0x4e,
+ 0xc6, 0x89, 0xf9, 0x52, 0x09, 0x37, 0x64, 0x14,
+ 0xc4, 0x01, 0x6b, 0x9d, 0x77, 0xe8, 0x90, 0x5d,
+ 0xa8, 0x4a, 0x2a, 0xef, 0x5c, 0x7f, 0xeb, 0xbb,
+ 0xb2, 0xc6, 0x93, 0x99, 0x66, 0xdc, 0x7f, 0xd4,
+ 0x9e, 0x2a, 0xca, 0x8d, 0xdb, 0xe7, 0x20, 0xcf,
+ 0xe4, 0x73, 0xae, 0x49, 0x7d, 0x64, 0x0f, 0x0e,
+ 0x28, 0x46, 0xa9, 0xa8, 0x32, 0xe4, 0x0e, 0xf6,
+ 0x51, 0x53, 0xb8, 0x3c, 0xb1, 0xff, 0xa3, 0x33,
+ 0x41, 0x75, 0xff, 0xf1, 0x6f, 0xf1, 0xfb, 0xbb,
+ 0x83, 0x7f, 0x06, 0x9b, 0xe7, 0x1b, 0x0a, 0xe0,
+ 0x5c, 0x33, 0x60, 0x5b, 0xdb, 0x5b, 0xed, 0xfe,
+ 0xa5, 0x16, 0x19, 0x72, 0xa3, 0x64, 0x23, 0x00,
+ 0x02, 0xc7, 0xf3, 0x6a, 0x81, 0x3e, 0x44, 0x1d,
+ 0x79, 0x15, 0x5f, 0x9a, 0xde, 0xe2, 0xfd, 0x1b,
+ 0x73, 0xc1, 0xbc, 0x23, 0xba, 0x31, 0xd2, 0x50,
+ 0xd5, 0xad, 0x7f, 0x74, 0xa7, 0xc9, 0xf8, 0x3e,
+ 0x2b, 0x26, 0x10, 0xf6, 0x03, 0x36, 0x74, 0xe4,
+ 0x0e, 0x6a, 0x72, 0xb7, 0x73, 0x0a, 0x42, 0x28,
+ 0xc2, 0xad, 0x5e, 0x03, 0xbe, 0xb8, 0x0b, 0xa8,
+ 0x5b, 0xd4, 0xb8, 0xba, 0x52, 0x89, 0xb1, 0x9b,
+ 0xc1, 0xc3, 0x65, 0x87, 0xed, 0xa5, 0xf4, 0x86,
+ 0xfd, 0x41, 0x80, 0x91, 0x27, 0x59, 0x53, 0x67,
+ 0x15, 0x78, 0x54, 0x8b, 0x2d, 0x3d, 0xc7, 0xff,
+ 0x02, 0x92, 0x07, 0x5f, 0x7a, 0x4b, 0x60, 0x59,
+ 0x3c, 0x6f, 0x5c, 0xd8, 0xec, 0x95, 0xd2, 0xfe,
+ 0xa0, 0x3b, 0xd8, 0x3f, 0xd1, 0x69, 0xa6, 0xd6,
+ 0x41, 0xb2, 0xf4, 0x4d, 0x12, 0xf4, 0x58, 0x3e,
+ 0x66, 0x64, 0x80, 0x31, 0x9b, 0xa8, 0x4c, 0x8b,
+ 0x07, 0xb2, 0xec, 0x66, 0x94, 0x66, 0x47, 0x50,
+ 0x50, 0x5f, 0x18, 0x0b, 0x0e, 0xd6, 0xc0, 0x39,
+ 0x21, 0x13, 0x9e, 0x33, 0xbc, 0x79, 0x36, 0x02,
+ 0x96, 0x70, 0xf0, 0x48, 0x67, 0x2f, 0x26, 0xe9,
+ 0x6d, 0x10, 0xbb, 0xd6, 0x3f, 0xd1, 0x64, 0x7a,
+ 0x2e, 0xbe, 0x0c, 0x61, 0xf0, 0x75, 0x42, 0x38,
+ 0x23, 0xb1, 0x9e, 0x9f, 0x7c, 0x67, 0x66, 0xd9,
+ 0x58, 0x9a, 0xf1, 0xbb, 0x41, 0x2a, 0x8d, 0x65,
+ 0x84, 0x94, 0xfc, 0xdc, 0x6a, 0x50, 0x64, 0xdb,
+ 0x56, 0x33, 0x76, 0x00, 0x10, 0xed, 0xbe, 0xd2,
+ 0x12, 0xf6, 0xf6, 0x1b, 0xa2, 0x16, 0xde, 0xae,
+ 0x31, 0x95, 0xdd, 0xb1, 0x08, 0x7e, 0x4e, 0xee,
+ 0xe7, 0xf9, 0xa5, 0xfb, 0x5b, 0x61, 0x43, 0x00,
+ 0x40, 0xf6, 0x7e, 0x02, 0x04, 0x32, 0x4e, 0x0c,
+ 0xe2, 0x66, 0x0d, 0xd7, 0x07, 0x98, 0x0e, 0xf8,
+ 0x72, 0x34, 0x6d, 0x95, 0x86, 0xd7, 0xcb, 0x31,
+ 0x54, 0x47, 0xd0, 0x38, 0x29, 0x9c, 0x5a, 0x68,
+ 0xd4, 0x87, 0x76, 0xc9, 0xe7, 0x7e, 0xe3, 0xf4,
+ 0x81, 0x6d, 0x18, 0xcb, 0xc9, 0x05, 0xaf, 0xa0,
+ 0xfb, 0x66, 0xf7, 0xf1, 0x1c, 0xc6, 0x14, 0x11,
+ 0x4f, 0x2b, 0x79, 0x42, 0x8b, 0xbc, 0xac, 0xe7,
+ 0x6c, 0xfe, 0x0f, 0x58, 0xe7, 0x7c, 0x78, 0x39,
+ 0x30, 0xb0, 0x66, 0x2c, 0x9b, 0x6d, 0x3a, 0xe1,
+ 0xcf, 0xc9, 0xa4, 0x0e, 0x6d, 0x6d, 0x8a, 0xa1,
+ 0x3a, 0xe7, 0x28, 0xd4, 0x78, 0x4c, 0xa6, 0xa2,
+ 0x2a, 0xa6, 0x03, 0x30, 0xd7, 0xa8, 0x25, 0x66,
+ 0x87, 0x2f, 0x69, 0x5c, 0x4e, 0xdd, 0xa5, 0x49,
+ 0x5d, 0x37, 0x4a, 0x59, 0xc4, 0xaf, 0x1f, 0xa2,
+ 0xe4, 0xf8, 0xa6, 0x12, 0x97, 0xd5, 0x79, 0xf5,
+ 0xe2, 0x4a, 0x2b, 0x5f, 0x61, 0xe4, 0x9e, 0xe3,
+ 0xee, 0xb8, 0xa7, 0x5b, 0x2f, 0xf4, 0x9e, 0x6c,
+ 0xfb, 0xd1, 0xc6, 0x56, 0x77, 0xba, 0x75, 0xaa,
+ 0x3d, 0x1a, 0xa8, 0x0b, 0xb3, 0x68, 0x24, 0x00,
+ 0x10, 0x7f, 0xfd, 0xd7, 0xa1, 0x8d, 0x83, 0x54,
+ 0x4f, 0x1f, 0xd8, 0x2a, 0xbe, 0x8a, 0x0c, 0x87,
+ 0xab, 0xa2, 0xde, 0xc3, 0x39, 0xbf, 0x09, 0x03,
+ 0xa5, 0xf3, 0x05, 0x28, 0xe1, 0xe1, 0xee, 0x39,
+ 0x70, 0x9c, 0xd8, 0x81, 0x12, 0x1e, 0x02, 0x40,
+ 0xd2, 0x6e, 0xf0, 0xeb, 0x1b, 0x3d, 0x22, 0xc6,
+ 0xe5, 0xe3, 0xb4, 0x5a, 0x98, 0xbb, 0xf0, 0x22,
+ 0x28, 0x8d, 0xe5, 0xd3, 0x16, 0x48, 0x24, 0xa5,
+ 0xe6, 0x66, 0x0c, 0xf9, 0x08, 0xf9, 0x7e, 0x1e,
+ 0xe1, 0x28, 0x26, 0x22, 0xc7, 0xc7, 0x0a, 0x32,
+ 0x47, 0xfa, 0xa3, 0xbe, 0x3c, 0xc4, 0xc5, 0x53,
+ 0x0a, 0xd5, 0x94, 0x4a, 0xd7, 0x93, 0xd8, 0x42,
+ 0x99, 0xb9, 0x0a, 0xdb, 0x56, 0xf7, 0xb9, 0x1c,
+ 0x53, 0x4f, 0xfa, 0xd3, 0x74, 0xad, 0xd9, 0x68,
+ 0xf1, 0x1b, 0xdf, 0x61, 0xc6, 0x5e, 0xa8, 0x48,
+ 0xfc, 0xd4, 0x4a, 0x4c, 0x3c, 0x32, 0xf7, 0x1c,
+ 0x96, 0x21, 0x9b, 0xf9, 0xa3, 0xcc, 0x5a, 0xce,
+ 0xd5, 0xd7, 0x08, 0x24, 0xf6, 0x1c, 0xfd, 0xdd,
+ 0x38, 0xc2, 0x32, 0xe9, 0xb8, 0xe7, 0xb6, 0xfa,
+ 0x9d, 0x45, 0x13, 0x2c, 0x83, 0xfd, 0x4a, 0x69,
+ 0x82, 0xcd, 0xdc, 0xb3, 0x76, 0x0c, 0x9e, 0xd8,
+ 0xf4, 0x1b, 0x45, 0x15, 0xb4, 0x97, 0xe7, 0x58,
+ 0x34, 0xe2, 0x03, 0x29, 0x5a, 0xbf, 0xb6, 0xe0,
+ 0x5d, 0x13, 0xd9, 0x2b, 0xb4, 0x80, 0xb2, 0x45,
+ 0x81, 0x6a, 0x2e, 0x6c, 0x89, 0x7d, 0xee, 0xbb,
+ 0x52, 0xdd, 0x1f, 0x18, 0xe7, 0x13, 0x6b, 0x33,
+ 0x0e, 0xea, 0x36, 0x92, 0x77, 0x7b, 0x6d, 0x9c,
+ 0x5a, 0x5f, 0x45, 0x7b, 0x7b, 0x35, 0x62, 0x23,
+ 0xd1, 0xbf, 0x0f, 0xd0, 0x08, 0x1b, 0x2b, 0x80,
+ 0x6b, 0x7e, 0xf1, 0x21, 0x47, 0xb0, 0x57, 0xd1,
+ 0x98, 0x72, 0x90, 0x34, 0x1c, 0x20, 0x04, 0xff,
+ 0x3d, 0x5c, 0xee, 0x0e, 0x57, 0x5f, 0x6f, 0x24,
+ 0x4e, 0x3c, 0xea, 0xfc, 0xa5, 0xa9, 0x83, 0xc9,
+ 0x61, 0xb4, 0x51, 0x24, 0xf8, 0x27, 0x5e, 0x46,
+ 0x8c, 0xb1, 0x53, 0x02, 0x96, 0x35, 0xba, 0xb8,
+ 0x4c, 0x71, 0xd3, 0x15, 0x59, 0x35, 0x22, 0x20,
+ 0xad, 0x03, 0x9f, 0x66, 0x44, 0x3b, 0x9c, 0x35,
+ 0x37, 0x1f, 0x9b, 0xbb, 0xf3, 0xdb, 0x35, 0x63,
+ 0x30, 0x64, 0xaa, 0xa2, 0x06, 0xa8, 0x5d, 0xbb,
+ 0xe1, 0x9f, 0x70, 0xec, 0x82, 0x11, 0x06, 0x36,
+ 0xec, 0x8b, 0x69, 0x66, 0x24, 0x44, 0xc9, 0x4a,
+ 0x57, 0xbb, 0x9b, 0x78, 0x13, 0xce, 0x9c, 0x0c,
+ 0xba, 0x92, 0x93, 0x63, 0xb8, 0xe2, 0x95, 0x0f,
+ 0x0f, 0x16, 0x39, 0x52, 0xfd, 0x3a, 0x6d, 0x02,
+ 0x4b, 0xdf, 0x13, 0xd3, 0x2a, 0x22, 0xb4, 0x03,
+ 0x7c, 0x54, 0x49, 0x96, 0x68, 0x54, 0x10, 0xfa,
+ 0xef, 0xaa, 0x6c, 0xe8, 0x22, 0xdc, 0x71, 0x16,
+ 0x13, 0x1a, 0xf6, 0x28, 0xe5, 0x6d, 0x77, 0x3d,
+ 0xcd, 0x30, 0x63, 0xb1, 0x70, 0x52, 0xa1, 0xc5,
+ 0x94, 0x5f, 0xcf, 0xe8, 0xb8, 0x26, 0x98, 0xf7,
+ 0x06, 0xa0, 0x0a, 0x70, 0xfa, 0x03, 0x80, 0xac,
+ 0xc1, 0xec, 0xd6, 0x4c, 0x54, 0xd7, 0xfe, 0x47,
+ 0xb6, 0x88, 0x4a, 0xf7, 0x71, 0x24, 0xee, 0xf3,
+ 0xd2, 0xc2, 0x4a, 0x7f, 0xfe, 0x61, 0xc7, 0x35,
+ 0xc9, 0x37, 0x67, 0xcb, 0x24, 0x35, 0xda, 0x7e,
+ 0xca, 0x5f, 0xf3, 0x8d, 0xd4, 0x13, 0x8e, 0xd6,
+ 0xcb, 0x4d, 0x53, 0x8f, 0x53, 0x1f, 0xc0, 0x74,
+ 0xf7, 0x53, 0xb9, 0x5e, 0x23, 0x37, 0xba, 0x6e,
+ 0xe3, 0x9d, 0x07, 0x55, 0x25, 0x7b, 0xe6, 0x2a,
+ 0x64, 0xd1, 0x32, 0xdd, 0x54, 0x1b, 0x4b, 0xc0,
+ 0xe1, 0xd7, 0x69, 0x58, 0xf8, 0x93, 0x29, 0xc4,
+ 0xdd, 0x23, 0x2f, 0xa5, 0xfc, 0x9d, 0x7e, 0xf8,
+ 0xd4, 0x90, 0xcd, 0x82, 0x55, 0xdc, 0x16, 0x16,
+ 0x9f, 0x07, 0x52, 0x9b, 0x9d, 0x25, 0xed, 0x32,
+ 0xc5, 0x7b, 0xdf, 0xf6, 0x83, 0x46, 0x3d, 0x65,
+ 0xb7, 0xef, 0x87, 0x7a, 0x12, 0x69, 0x8f, 0x06,
+ 0x7c, 0x51, 0x15, 0x4a, 0x08, 0xe8, 0xac, 0x9a,
+ 0x0c, 0x24, 0xa7, 0x27, 0xd8, 0x46, 0x2f, 0xe7,
+ 0x01, 0x0e, 0x1c, 0xc6, 0x91, 0xb0, 0x6e, 0x85,
+ 0x65, 0xf0, 0x29, 0x0d, 0x2e, 0x6b, 0x3b, 0xfb,
+ 0x4b, 0xdf, 0xe4, 0x80, 0x93, 0x03, 0x66, 0x46,
+ 0x3e, 0x8a, 0x6e, 0xf3, 0x5e, 0x4d, 0x62, 0x0e,
+ 0x49, 0x05, 0xaf, 0xd4, 0xf8, 0x21, 0x20, 0x61,
+ 0x1d, 0x39, 0x17, 0xf4, 0x61, 0x47, 0x95, 0xfb,
+ 0x15, 0x2e, 0xb3, 0x4f, 0xd0, 0x5d, 0xf5, 0x7d,
+ 0x40, 0xda, 0x90, 0x3c, 0x6b, 0xcb, 0x17, 0x00,
+ 0x13, 0x3b, 0x64, 0x34, 0x1b, 0xf0, 0xf2, 0xe5,
+ 0x3b, 0xb2, 0xc7, 0xd3, 0x5f, 0x3a, 0x44, 0xa6,
+ 0x9b, 0xb7, 0x78, 0x0e, 0x42, 0x5d, 0x4c, 0xc1,
+ 0xe9, 0xd2, 0xcb, 0xb7, 0x78, 0xd1, 0xfe, 0x9a,
+ 0xb5, 0x07, 0xe9, 0xe0, 0xbe, 0xe2, 0x8a, 0xa7,
+ 0x01, 0x83, 0x00, 0x8c, 0x5c, 0x08, 0xe6, 0x63,
+ 0x12, 0x92, 0xb7, 0xb7, 0xa6, 0x19, 0x7d, 0x38,
+ 0x13, 0x38, 0x92, 0x87, 0x24, 0xf9, 0x48, 0xb3,
+ 0x5e, 0x87, 0x6a, 0x40, 0x39, 0x5c, 0x3f, 0xed,
+ 0x8f, 0xee, 0xdb, 0x15, 0x82, 0x06, 0xda, 0x49,
+ 0x21, 0x2b, 0xb5, 0xbf, 0x32, 0x7c, 0x9f, 0x42,
+ 0x28, 0x63, 0xcf, 0xaf, 0x1e, 0xf8, 0xc6, 0xa0,
+ 0xd1, 0x02, 0x43, 0x57, 0x62, 0xec, 0x9b, 0x0f,
+ 0x01, 0x9e, 0x71, 0xd8, 0x87, 0x9d, 0x01, 0xc1,
+ 0x58, 0x77, 0xd9, 0xaf, 0xb1, 0x10, 0x7e, 0xdd,
+ 0xa6, 0x50, 0x96, 0xe5, 0xf0, 0x72, 0x00, 0x6d,
+ 0x4b, 0xf8, 0x2a, 0x8f, 0x19, 0xf3, 0x22, 0x88,
+ 0x11, 0x4a, 0x8b, 0x7c, 0xfd, 0xb7, 0xed, 0xe1,
+ 0xf6, 0x40, 0x39, 0xe0, 0xe9, 0xf6, 0x3d, 0x25,
+ 0xe6, 0x74, 0x3c, 0x58, 0x57, 0x7f, 0xe1, 0x22,
+ 0x96, 0x47, 0x31, 0x91, 0xba, 0x70, 0x85, 0x28,
+ 0x6b, 0x9f, 0x6e, 0x25, 0xac, 0x23, 0x66, 0x2f,
+ 0x29, 0x88, 0x28, 0xce, 0x8c, 0x5c, 0x88, 0x53,
+ 0xd1, 0x3b, 0xcc, 0x6a, 0x51, 0xb2, 0xe1, 0x28,
+ 0x3f, 0x91, 0xb4, 0x0d, 0x00, 0x3a, 0xe3, 0xf8,
+ 0xc3, 0x8f, 0xd7, 0x96, 0x62, 0x0e, 0x2e, 0xfc,
+ 0xc8, 0x6c, 0x77, 0xa6, 0x1d, 0x22, 0xc1, 0xb8,
+ 0xe6, 0x61, 0xd7, 0x67, 0x36, 0x13, 0x7b, 0xbb,
+ 0x9b, 0x59, 0x09, 0xa6, 0xdf, 0xf7, 0x6b, 0xa3,
+ 0x40, 0x1a, 0xf5, 0x4f, 0xb4, 0xda, 0xd3, 0xf3,
+ 0x81, 0x93, 0xc6, 0x18, 0xd9, 0x26, 0xee, 0xac,
+ 0xf0, 0xaa, 0xdf, 0xc5, 0x9c, 0xca, 0xc2, 0xa2,
+ 0xcc, 0x7b, 0x5c, 0x24, 0xb0, 0xbc, 0xd0, 0x6a,
+ 0x4d, 0x89, 0x09, 0xb8, 0x07, 0xfe, 0x87, 0xad,
+ 0x0a, 0xea, 0xb8, 0x42, 0xf9, 0x5e, 0xb3, 0x3e,
+ 0x36, 0x4c, 0xaf, 0x75, 0x9e, 0x1c, 0xeb, 0xbd,
+ 0xbc, 0xbb, 0x80, 0x40, 0xa7, 0x3a, 0x30, 0xbf,
+ 0xa8, 0x44, 0xf4, 0xeb, 0x38, 0xad, 0x29, 0xba,
+ 0x23, 0xed, 0x41, 0x0c, 0xea, 0xd2, 0xbb, 0x41,
+ 0x18, 0xd6, 0xb9, 0xba, 0x65, 0x2b, 0xa3, 0x91,
+ 0x6d, 0x1f, 0xa9, 0xf4, 0xd1, 0x25, 0x8d, 0x4d,
+ 0x38, 0xff, 0x64, 0xa0, 0xec, 0xde, 0xa6, 0xb6,
+ 0x79, 0xab, 0x8e, 0x33, 0x6c, 0x47, 0xde, 0xaf,
+ 0x94, 0xa4, 0xa5, 0x86, 0x77, 0x55, 0x09, 0x92,
+ 0x81, 0x31, 0x76, 0xc7, 0x34, 0x22, 0x89, 0x8e,
+ 0x3d, 0x26, 0x26, 0xd7, 0xfc, 0x1e, 0x16, 0x72,
+ 0x13, 0x33, 0x63, 0xd5, 0x22, 0xbe, 0xb8, 0x04,
+ 0x34, 0x84, 0x41, 0xbb, 0x80, 0xd0, 0x9f, 0x46,
+ 0x48, 0x07, 0xa7, 0xfc, 0x2b, 0x3a, 0x75, 0x55,
+ 0x8c, 0xc7, 0x6a, 0xbd, 0x7e, 0x46, 0x08, 0x84,
+ 0x0f, 0xd5, 0x74, 0xc0, 0x82, 0x8e, 0xaa, 0x61,
+ 0x05, 0x01, 0xb2, 0x47, 0x6e, 0x20, 0x6a, 0x2d,
+ 0x58, 0x70, 0x48, 0x32, 0xa7, 0x37, 0xd2, 0xb8,
+ 0x82, 0x1a, 0x51, 0xb9, 0x61, 0xdd, 0xfd, 0x9d,
+ 0x6b, 0x0e, 0x18, 0x97, 0xf8, 0x45, 0x5f, 0x87,
+ 0x10, 0xcf, 0x34, 0x72, 0x45, 0x26, 0x49, 0x70,
+ 0xe7, 0xa3, 0x78, 0xe0, 0x52, 0x89, 0x84, 0x94,
+ 0x83, 0x82, 0xc2, 0x69, 0x8f, 0xe3, 0xe1, 0x3f,
+ 0x60, 0x74, 0x88, 0xc4, 0xf7, 0x75, 0x2c, 0xfb,
+ 0xbd, 0xb6, 0xc4, 0x7e, 0x10, 0x0a, 0x6c, 0x90,
+ 0x04, 0x9e, 0xc3, 0x3f, 0x59, 0x7c, 0xce, 0x31,
+ 0x18, 0x60, 0x57, 0x73, 0x46, 0x94, 0x7d, 0x06,
+ 0xa0, 0x6d, 0x44, 0xec, 0xa2, 0x0a, 0x9e, 0x05,
+ 0x15, 0xef, 0xca, 0x5c, 0xbf, 0x00, 0xeb, 0xf7,
+ 0x3d, 0x32, 0xd4, 0xa5, 0xef, 0x49, 0x89, 0x5e,
+ 0x46, 0xb0, 0xa6, 0x63, 0x5b, 0x8a, 0x73, 0xae,
+ 0x6f, 0xd5, 0x9d, 0xf8, 0x4f, 0x40, 0xb5, 0xb2,
+ 0x6e, 0xd3, 0xb6, 0x01, 0xa9, 0x26, 0xa2, 0x21,
+ 0xcf, 0x33, 0x7a, 0x3a, 0xa4, 0x23, 0x13, 0xb0,
+ 0x69, 0x6a, 0xee, 0xce, 0xd8, 0x9d, 0x01, 0x1d,
+ 0x50, 0xc1, 0x30, 0x6c, 0xb1, 0xcd, 0xa0, 0xf0,
+ 0xf0, 0xa2, 0x64, 0x6f, 0xbb, 0xbf, 0x5e, 0xe6,
+ 0xab, 0x87, 0xb4, 0x0f, 0x4f, 0x15, 0xaf, 0xb5,
+ 0x25, 0xa1, 0xb2, 0xd0, 0x80, 0x2c, 0xfb, 0xf9,
+ 0xfe, 0xd2, 0x33, 0xbb, 0x76, 0xfe, 0x7c, 0xa8,
+ 0x66, 0xf7, 0xe7, 0x85, 0x9f, 0x1f, 0x85, 0x57,
+ 0x88, 0xe1, 0xe9, 0x63, 0xe4, 0xd8, 0x1c, 0xa1,
+ 0xfb, 0xda, 0x44, 0x05, 0x2e, 0x1d, 0x3a, 0x1c,
+ 0xff, 0xc8, 0x3b, 0xc0, 0xfe, 0xda, 0x22, 0x0b,
+ 0x43, 0xd6, 0x88, 0x39, 0x4c, 0x4a, 0xa6, 0x69,
+ 0x18, 0x93, 0x42, 0x4e, 0xb5, 0xcc, 0x66, 0x0d,
+ 0x09, 0xf8, 0x1e, 0x7c, 0xd3, 0x3c, 0x99, 0x0d,
+ 0x50, 0x1d, 0x62, 0xe9, 0x57, 0x06, 0xbf, 0x19,
+ 0x88, 0xdd, 0xad, 0x7b, 0x4f, 0xf9, 0xc7, 0x82,
+ 0x6d, 0x8d, 0xc8, 0xc4, 0xc5, 0x78, 0x17, 0x20,
+ 0x15, 0xc5, 0x52, 0x41, 0xcf, 0x5b, 0xd6, 0x7f,
+ 0x94, 0x02, 0x41, 0xe0, 0x40, 0x22, 0x03, 0x5e,
+ 0xd1, 0x53, 0xd4, 0x86, 0xd3, 0x2c, 0x9f, 0x0f,
+ 0x96, 0xe3, 0x6b, 0x9a, 0x76, 0x32, 0x06, 0x47,
+ 0x4b, 0x11, 0xb3, 0xdd, 0x03, 0x65, 0xbd, 0x9b,
+ 0x01, 0xda, 0x9c, 0xb9, 0x7e, 0x3f, 0x6a, 0xc4,
+ 0x7b, 0xea, 0xd4, 0x3c, 0xb9, 0xfb, 0x5c, 0x6b,
+ 0x64, 0x33, 0x52, 0xba, 0x64, 0x78, 0x8f, 0xa4,
+ 0xaf, 0x7a, 0x61, 0x8d, 0xbc, 0xc5, 0x73, 0xe9,
+ 0x6b, 0x58, 0x97, 0x4b, 0xbf, 0x63, 0x22, 0xd3,
+ 0x37, 0x02, 0x54, 0xc5, 0xb9, 0x16, 0x4a, 0xf0,
+ 0x19, 0xd8, 0x94, 0x57, 0xb8, 0x8a, 0xb3, 0x16,
+ 0x3b, 0xd0, 0x84, 0x8e, 0x67, 0xa6, 0xa3, 0x7d,
+ 0x78, 0xec, 0x00
+};
+static const u8 dec_assoc013[] __initconst = {
+ 0xb1, 0x69, 0x83, 0x87, 0x30, 0xaa, 0x5d, 0xb8,
+ 0x77, 0xe8, 0x21, 0xff, 0x06, 0x59, 0x35, 0xce,
+ 0x75, 0xfe, 0x38, 0xef, 0xb8, 0x91, 0x43, 0x8c,
+ 0xcf, 0x70, 0xdd, 0x0a, 0x68, 0xbf, 0xd4, 0xbc,
+ 0x16, 0x76, 0x99, 0x36, 0x1e, 0x58, 0x79, 0x5e,
+ 0xd4, 0x29, 0xf7, 0x33, 0x93, 0x48, 0xdb, 0x5f,
+ 0x01, 0xae, 0x9c, 0xb6, 0xe4, 0x88, 0x6d, 0x2b,
+ 0x76, 0x75, 0xe0, 0xf3, 0x74, 0xe2, 0xc9
+};
+static const u8 dec_nonce013[] __initconst = {
+ 0x05, 0xa3, 0x93, 0xed, 0x30, 0xc5, 0xa2, 0x06
+};
+static const u8 dec_key013[] __initconst = {
+ 0xb3, 0x35, 0x50, 0x03, 0x54, 0x2e, 0x40, 0x5e,
+ 0x8f, 0x59, 0x8e, 0xc5, 0x90, 0xd5, 0x27, 0x2d,
+ 0xba, 0x29, 0x2e, 0xcb, 0x1b, 0x70, 0x44, 0x1e,
+ 0x65, 0x91, 0x6e, 0x2a, 0x79, 0x22, 0xda, 0x64
+};
+
+static const struct chacha20poly1305_testvec
+chacha20poly1305_dec_vectors[] __initconst = {
+ { dec_input001, dec_output001, dec_assoc001, dec_nonce001, dec_key001,
+ sizeof(dec_input001), sizeof(dec_assoc001), sizeof(dec_nonce001) },
+ { dec_input002, dec_output002, dec_assoc002, dec_nonce002, dec_key002,
+ sizeof(dec_input002), sizeof(dec_assoc002), sizeof(dec_nonce002) },
+ { dec_input003, dec_output003, dec_assoc003, dec_nonce003, dec_key003,
+ sizeof(dec_input003), sizeof(dec_assoc003), sizeof(dec_nonce003) },
+ { dec_input004, dec_output004, dec_assoc004, dec_nonce004, dec_key004,
+ sizeof(dec_input004), sizeof(dec_assoc004), sizeof(dec_nonce004) },
+ { dec_input005, dec_output005, dec_assoc005, dec_nonce005, dec_key005,
+ sizeof(dec_input005), sizeof(dec_assoc005), sizeof(dec_nonce005) },
+ { dec_input006, dec_output006, dec_assoc006, dec_nonce006, dec_key006,
+ sizeof(dec_input006), sizeof(dec_assoc006), sizeof(dec_nonce006) },
+ { dec_input007, dec_output007, dec_assoc007, dec_nonce007, dec_key007,
+ sizeof(dec_input007), sizeof(dec_assoc007), sizeof(dec_nonce007) },
+ { dec_input008, dec_output008, dec_assoc008, dec_nonce008, dec_key008,
+ sizeof(dec_input008), sizeof(dec_assoc008), sizeof(dec_nonce008) },
+ { dec_input009, dec_output009, dec_assoc009, dec_nonce009, dec_key009,
+ sizeof(dec_input009), sizeof(dec_assoc009), sizeof(dec_nonce009) },
+ { dec_input010, dec_output010, dec_assoc010, dec_nonce010, dec_key010,
+ sizeof(dec_input010), sizeof(dec_assoc010), sizeof(dec_nonce010) },
+ { dec_input011, dec_output011, dec_assoc011, dec_nonce011, dec_key011,
+ sizeof(dec_input011), sizeof(dec_assoc011), sizeof(dec_nonce011) },
+ { dec_input012, dec_output012, dec_assoc012, dec_nonce012, dec_key012,
+ sizeof(dec_input012), sizeof(dec_assoc012), sizeof(dec_nonce012) },
+ { dec_input013, dec_output013, dec_assoc013, dec_nonce013, dec_key013,
+ sizeof(dec_input013), sizeof(dec_assoc013), sizeof(dec_nonce013),
+ true }
+};
+
+static const u8 xenc_input001[] __initconst = {
+ 0x49, 0x6e, 0x74, 0x65, 0x72, 0x6e, 0x65, 0x74,
+ 0x2d, 0x44, 0x72, 0x61, 0x66, 0x74, 0x73, 0x20,
+ 0x61, 0x72, 0x65, 0x20, 0x64, 0x72, 0x61, 0x66,
+ 0x74, 0x20, 0x64, 0x6f, 0x63, 0x75, 0x6d, 0x65,
+ 0x6e, 0x74, 0x73, 0x20, 0x76, 0x61, 0x6c, 0x69,
+ 0x64, 0x20, 0x66, 0x6f, 0x72, 0x20, 0x61, 0x20,
+ 0x6d, 0x61, 0x78, 0x69, 0x6d, 0x75, 0x6d, 0x20,
+ 0x6f, 0x66, 0x20, 0x73, 0x69, 0x78, 0x20, 0x6d,
+ 0x6f, 0x6e, 0x74, 0x68, 0x73, 0x20, 0x61, 0x6e,
+ 0x64, 0x20, 0x6d, 0x61, 0x79, 0x20, 0x62, 0x65,
+ 0x20, 0x75, 0x70, 0x64, 0x61, 0x74, 0x65, 0x64,
+ 0x2c, 0x20, 0x72, 0x65, 0x70, 0x6c, 0x61, 0x63,
+ 0x65, 0x64, 0x2c, 0x20, 0x6f, 0x72, 0x20, 0x6f,
+ 0x62, 0x73, 0x6f, 0x6c, 0x65, 0x74, 0x65, 0x64,
+ 0x20, 0x62, 0x79, 0x20, 0x6f, 0x74, 0x68, 0x65,
+ 0x72, 0x20, 0x64, 0x6f, 0x63, 0x75, 0x6d, 0x65,
+ 0x6e, 0x74, 0x73, 0x20, 0x61, 0x74, 0x20, 0x61,
+ 0x6e, 0x79, 0x20, 0x74, 0x69, 0x6d, 0x65, 0x2e,
+ 0x20, 0x49, 0x74, 0x20, 0x69, 0x73, 0x20, 0x69,
+ 0x6e, 0x61, 0x70, 0x70, 0x72, 0x6f, 0x70, 0x72,
+ 0x69, 0x61, 0x74, 0x65, 0x20, 0x74, 0x6f, 0x20,
+ 0x75, 0x73, 0x65, 0x20, 0x49, 0x6e, 0x74, 0x65,
+ 0x72, 0x6e, 0x65, 0x74, 0x2d, 0x44, 0x72, 0x61,
+ 0x66, 0x74, 0x73, 0x20, 0x61, 0x73, 0x20, 0x72,
+ 0x65, 0x66, 0x65, 0x72, 0x65, 0x6e, 0x63, 0x65,
+ 0x20, 0x6d, 0x61, 0x74, 0x65, 0x72, 0x69, 0x61,
+ 0x6c, 0x20, 0x6f, 0x72, 0x20, 0x74, 0x6f, 0x20,
+ 0x63, 0x69, 0x74, 0x65, 0x20, 0x74, 0x68, 0x65,
+ 0x6d, 0x20, 0x6f, 0x74, 0x68, 0x65, 0x72, 0x20,
+ 0x74, 0x68, 0x61, 0x6e, 0x20, 0x61, 0x73, 0x20,
+ 0x2f, 0xe2, 0x80, 0x9c, 0x77, 0x6f, 0x72, 0x6b,
+ 0x20, 0x69, 0x6e, 0x20, 0x70, 0x72, 0x6f, 0x67,
+ 0x72, 0x65, 0x73, 0x73, 0x2e, 0x2f, 0xe2, 0x80,
+ 0x9d
+};
+static const u8 xenc_output001[] __initconst = {
+ 0x1a, 0x6e, 0x3a, 0xd9, 0xfd, 0x41, 0x3f, 0x77,
+ 0x54, 0x72, 0x0a, 0x70, 0x9a, 0xa0, 0x29, 0x92,
+ 0x2e, 0xed, 0x93, 0xcf, 0x0f, 0x71, 0x88, 0x18,
+ 0x7a, 0x9d, 0x2d, 0x24, 0xe0, 0xf5, 0xea, 0x3d,
+ 0x55, 0x64, 0xd7, 0xad, 0x2a, 0x1a, 0x1f, 0x7e,
+ 0x86, 0x6d, 0xb0, 0xce, 0x80, 0x41, 0x72, 0x86,
+ 0x26, 0xee, 0x84, 0xd7, 0xef, 0x82, 0x9e, 0xe2,
+ 0x60, 0x9d, 0x5a, 0xfc, 0xf0, 0xe4, 0x19, 0x85,
+ 0xea, 0x09, 0xc6, 0xfb, 0xb3, 0xa9, 0x50, 0x09,
+ 0xec, 0x5e, 0x11, 0x90, 0xa1, 0xc5, 0x4e, 0x49,
+ 0xef, 0x50, 0xd8, 0x8f, 0xe0, 0x78, 0xd7, 0xfd,
+ 0xb9, 0x3b, 0xc9, 0xf2, 0x91, 0xc8, 0x25, 0xc8,
+ 0xa7, 0x63, 0x60, 0xce, 0x10, 0xcd, 0xc6, 0x7f,
+ 0xf8, 0x16, 0xf8, 0xe1, 0x0a, 0xd9, 0xde, 0x79,
+ 0x50, 0x33, 0xf2, 0x16, 0x0f, 0x17, 0xba, 0xb8,
+ 0x5d, 0xd8, 0xdf, 0x4e, 0x51, 0xa8, 0x39, 0xd0,
+ 0x85, 0xca, 0x46, 0x6a, 0x10, 0xa7, 0xa3, 0x88,
+ 0xef, 0x79, 0xb9, 0xf8, 0x24, 0xf3, 0xe0, 0x71,
+ 0x7b, 0x76, 0x28, 0x46, 0x3a, 0x3a, 0x1b, 0x91,
+ 0xb6, 0xd4, 0x3e, 0x23, 0xe5, 0x44, 0x15, 0xbf,
+ 0x60, 0x43, 0x9d, 0xa4, 0xbb, 0xd5, 0x5f, 0x89,
+ 0xeb, 0xef, 0x8e, 0xfd, 0xdd, 0xb4, 0x0d, 0x46,
+ 0xf0, 0x69, 0x23, 0x63, 0xae, 0x94, 0xf5, 0x5e,
+ 0xa5, 0xad, 0x13, 0x1c, 0x41, 0x76, 0xe6, 0x90,
+ 0xd6, 0x6d, 0xa2, 0x8f, 0x97, 0x4c, 0xa8, 0x0b,
+ 0xcf, 0x8d, 0x43, 0x2b, 0x9c, 0x9b, 0xc5, 0x58,
+ 0xa5, 0xb6, 0x95, 0x9a, 0xbf, 0x81, 0xc6, 0x54,
+ 0xc9, 0x66, 0x0c, 0xe5, 0x4f, 0x6a, 0x53, 0xa1,
+ 0xe5, 0x0c, 0xba, 0x31, 0xde, 0x34, 0x64, 0x73,
+ 0x8a, 0x3b, 0xbd, 0x92, 0x01, 0xdb, 0x71, 0x69,
+ 0xf3, 0x58, 0x99, 0xbc, 0xd1, 0xcb, 0x4a, 0x05,
+ 0xe2, 0x58, 0x9c, 0x25, 0x17, 0xcd, 0xdc, 0x83,
+ 0xb7, 0xff, 0xfb, 0x09, 0x61, 0xad, 0xbf, 0x13,
+ 0x5b, 0x5e, 0xed, 0x46, 0x82, 0x6f, 0x22, 0xd8,
+ 0x93, 0xa6, 0x85, 0x5b, 0x40, 0x39, 0x5c, 0xc5,
+ 0x9c
+};
+static const u8 xenc_assoc001[] __initconst = {
+ 0xf3, 0x33, 0x88, 0x86, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x4e, 0x91
+};
+static const u8 xenc_nonce001[] __initconst = {
+ 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
+ 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f,
+ 0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17
+};
+static const u8 xenc_key001[] __initconst = {
+ 0x1c, 0x92, 0x40, 0xa5, 0xeb, 0x55, 0xd3, 0x8a,
+ 0xf3, 0x33, 0x88, 0x86, 0x04, 0xf6, 0xb5, 0xf0,
+ 0x47, 0x39, 0x17, 0xc1, 0x40, 0x2b, 0x80, 0x09,
+ 0x9d, 0xca, 0x5c, 0xbc, 0x20, 0x70, 0x75, 0xc0
+};
+
+static const struct chacha20poly1305_testvec
+xchacha20poly1305_enc_vectors[] __initconst = {
+ { xenc_input001, xenc_output001, xenc_assoc001, xenc_nonce001, xenc_key001,
+ sizeof(xenc_input001), sizeof(xenc_assoc001), sizeof(xenc_nonce001) }
+};
+
+static const u8 xdec_input001[] __initconst = {
+ 0x1a, 0x6e, 0x3a, 0xd9, 0xfd, 0x41, 0x3f, 0x77,
+ 0x54, 0x72, 0x0a, 0x70, 0x9a, 0xa0, 0x29, 0x92,
+ 0x2e, 0xed, 0x93, 0xcf, 0x0f, 0x71, 0x88, 0x18,
+ 0x7a, 0x9d, 0x2d, 0x24, 0xe0, 0xf5, 0xea, 0x3d,
+ 0x55, 0x64, 0xd7, 0xad, 0x2a, 0x1a, 0x1f, 0x7e,
+ 0x86, 0x6d, 0xb0, 0xce, 0x80, 0x41, 0x72, 0x86,
+ 0x26, 0xee, 0x84, 0xd7, 0xef, 0x82, 0x9e, 0xe2,
+ 0x60, 0x9d, 0x5a, 0xfc, 0xf0, 0xe4, 0x19, 0x85,
+ 0xea, 0x09, 0xc6, 0xfb, 0xb3, 0xa9, 0x50, 0x09,
+ 0xec, 0x5e, 0x11, 0x90, 0xa1, 0xc5, 0x4e, 0x49,
+ 0xef, 0x50, 0xd8, 0x8f, 0xe0, 0x78, 0xd7, 0xfd,
+ 0xb9, 0x3b, 0xc9, 0xf2, 0x91, 0xc8, 0x25, 0xc8,
+ 0xa7, 0x63, 0x60, 0xce, 0x10, 0xcd, 0xc6, 0x7f,
+ 0xf8, 0x16, 0xf8, 0xe1, 0x0a, 0xd9, 0xde, 0x79,
+ 0x50, 0x33, 0xf2, 0x16, 0x0f, 0x17, 0xba, 0xb8,
+ 0x5d, 0xd8, 0xdf, 0x4e, 0x51, 0xa8, 0x39, 0xd0,
+ 0x85, 0xca, 0x46, 0x6a, 0x10, 0xa7, 0xa3, 0x88,
+ 0xef, 0x79, 0xb9, 0xf8, 0x24, 0xf3, 0xe0, 0x71,
+ 0x7b, 0x76, 0x28, 0x46, 0x3a, 0x3a, 0x1b, 0x91,
+ 0xb6, 0xd4, 0x3e, 0x23, 0xe5, 0x44, 0x15, 0xbf,
+ 0x60, 0x43, 0x9d, 0xa4, 0xbb, 0xd5, 0x5f, 0x89,
+ 0xeb, 0xef, 0x8e, 0xfd, 0xdd, 0xb4, 0x0d, 0x46,
+ 0xf0, 0x69, 0x23, 0x63, 0xae, 0x94, 0xf5, 0x5e,
+ 0xa5, 0xad, 0x13, 0x1c, 0x41, 0x76, 0xe6, 0x90,
+ 0xd6, 0x6d, 0xa2, 0x8f, 0x97, 0x4c, 0xa8, 0x0b,
+ 0xcf, 0x8d, 0x43, 0x2b, 0x9c, 0x9b, 0xc5, 0x58,
+ 0xa5, 0xb6, 0x95, 0x9a, 0xbf, 0x81, 0xc6, 0x54,
+ 0xc9, 0x66, 0x0c, 0xe5, 0x4f, 0x6a, 0x53, 0xa1,
+ 0xe5, 0x0c, 0xba, 0x31, 0xde, 0x34, 0x64, 0x73,
+ 0x8a, 0x3b, 0xbd, 0x92, 0x01, 0xdb, 0x71, 0x69,
+ 0xf3, 0x58, 0x99, 0xbc, 0xd1, 0xcb, 0x4a, 0x05,
+ 0xe2, 0x58, 0x9c, 0x25, 0x17, 0xcd, 0xdc, 0x83,
+ 0xb7, 0xff, 0xfb, 0x09, 0x61, 0xad, 0xbf, 0x13,
+ 0x5b, 0x5e, 0xed, 0x46, 0x82, 0x6f, 0x22, 0xd8,
+ 0x93, 0xa6, 0x85, 0x5b, 0x40, 0x39, 0x5c, 0xc5,
+ 0x9c
+};
+static const u8 xdec_output001[] __initconst = {
+ 0x49, 0x6e, 0x74, 0x65, 0x72, 0x6e, 0x65, 0x74,
+ 0x2d, 0x44, 0x72, 0x61, 0x66, 0x74, 0x73, 0x20,
+ 0x61, 0x72, 0x65, 0x20, 0x64, 0x72, 0x61, 0x66,
+ 0x74, 0x20, 0x64, 0x6f, 0x63, 0x75, 0x6d, 0x65,
+ 0x6e, 0x74, 0x73, 0x20, 0x76, 0x61, 0x6c, 0x69,
+ 0x64, 0x20, 0x66, 0x6f, 0x72, 0x20, 0x61, 0x20,
+ 0x6d, 0x61, 0x78, 0x69, 0x6d, 0x75, 0x6d, 0x20,
+ 0x6f, 0x66, 0x20, 0x73, 0x69, 0x78, 0x20, 0x6d,
+ 0x6f, 0x6e, 0x74, 0x68, 0x73, 0x20, 0x61, 0x6e,
+ 0x64, 0x20, 0x6d, 0x61, 0x79, 0x20, 0x62, 0x65,
+ 0x20, 0x75, 0x70, 0x64, 0x61, 0x74, 0x65, 0x64,
+ 0x2c, 0x20, 0x72, 0x65, 0x70, 0x6c, 0x61, 0x63,
+ 0x65, 0x64, 0x2c, 0x20, 0x6f, 0x72, 0x20, 0x6f,
+ 0x62, 0x73, 0x6f, 0x6c, 0x65, 0x74, 0x65, 0x64,
+ 0x20, 0x62, 0x79, 0x20, 0x6f, 0x74, 0x68, 0x65,
+ 0x72, 0x20, 0x64, 0x6f, 0x63, 0x75, 0x6d, 0x65,
+ 0x6e, 0x74, 0x73, 0x20, 0x61, 0x74, 0x20, 0x61,
+ 0x6e, 0x79, 0x20, 0x74, 0x69, 0x6d, 0x65, 0x2e,
+ 0x20, 0x49, 0x74, 0x20, 0x69, 0x73, 0x20, 0x69,
+ 0x6e, 0x61, 0x70, 0x70, 0x72, 0x6f, 0x70, 0x72,
+ 0x69, 0x61, 0x74, 0x65, 0x20, 0x74, 0x6f, 0x20,
+ 0x75, 0x73, 0x65, 0x20, 0x49, 0x6e, 0x74, 0x65,
+ 0x72, 0x6e, 0x65, 0x74, 0x2d, 0x44, 0x72, 0x61,
+ 0x66, 0x74, 0x73, 0x20, 0x61, 0x73, 0x20, 0x72,
+ 0x65, 0x66, 0x65, 0x72, 0x65, 0x6e, 0x63, 0x65,
+ 0x20, 0x6d, 0x61, 0x74, 0x65, 0x72, 0x69, 0x61,
+ 0x6c, 0x20, 0x6f, 0x72, 0x20, 0x74, 0x6f, 0x20,
+ 0x63, 0x69, 0x74, 0x65, 0x20, 0x74, 0x68, 0x65,
+ 0x6d, 0x20, 0x6f, 0x74, 0x68, 0x65, 0x72, 0x20,
+ 0x74, 0x68, 0x61, 0x6e, 0x20, 0x61, 0x73, 0x20,
+ 0x2f, 0xe2, 0x80, 0x9c, 0x77, 0x6f, 0x72, 0x6b,
+ 0x20, 0x69, 0x6e, 0x20, 0x70, 0x72, 0x6f, 0x67,
+ 0x72, 0x65, 0x73, 0x73, 0x2e, 0x2f, 0xe2, 0x80,
+ 0x9d
+};
+static const u8 xdec_assoc001[] __initconst = {
+ 0xf3, 0x33, 0x88, 0x86, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x4e, 0x91
+};
+static const u8 xdec_nonce001[] __initconst = {
+ 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
+ 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f,
+ 0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17
+};
+static const u8 xdec_key001[] __initconst = {
+ 0x1c, 0x92, 0x40, 0xa5, 0xeb, 0x55, 0xd3, 0x8a,
+ 0xf3, 0x33, 0x88, 0x86, 0x04, 0xf6, 0xb5, 0xf0,
+ 0x47, 0x39, 0x17, 0xc1, 0x40, 0x2b, 0x80, 0x09,
+ 0x9d, 0xca, 0x5c, 0xbc, 0x20, 0x70, 0x75, 0xc0
+};
+
+static const struct chacha20poly1305_testvec
+xchacha20poly1305_dec_vectors[] __initconst = {
+ { xdec_input001, xdec_output001, xdec_assoc001, xdec_nonce001, xdec_key001,
+ sizeof(xdec_input001), sizeof(xdec_assoc001), sizeof(xdec_nonce001) }
+};
+
+static void __init
+chacha20poly1305_selftest_encrypt_bignonce(u8 *dst, const u8 *src,
+ const size_t src_len, const u8 *ad,
+ const size_t ad_len,
+ const u8 nonce[12],
+ const u8 key[CHACHA20POLY1305_KEY_SIZE])
+{
+ simd_context_t simd_context;
+ struct poly1305_ctx poly1305_state;
+ struct chacha20_ctx chacha20_state;
+ union {
+ u8 block0[POLY1305_KEY_SIZE];
+ __le64 lens[2];
+ } b = {{ 0 }};
+
+ simd_get(&simd_context);
+ chacha20_init(&chacha20_state, key, 0);
+ chacha20_state.counter[1] = get_unaligned_le32(nonce + 0);
+ chacha20_state.counter[2] = get_unaligned_le32(nonce + 4);
+ chacha20_state.counter[3] = get_unaligned_le32(nonce + 8);
+ chacha20(&chacha20_state, b.block0, b.block0, sizeof(b.block0),
+ &simd_context);
+ poly1305_init(&poly1305_state, b.block0);
+ poly1305_update(&poly1305_state, ad, ad_len, &simd_context);
+ poly1305_update(&poly1305_state, pad0, (0x10 - ad_len) & 0xf,
+ &simd_context);
+ chacha20(&chacha20_state, dst, src, src_len, &simd_context);
+ poly1305_update(&poly1305_state, dst, src_len, &simd_context);
+ poly1305_update(&poly1305_state, pad0, (0x10 - src_len) & 0xf,
+ &simd_context);
+ b.lens[0] = cpu_to_le64(ad_len);
+ b.lens[1] = cpu_to_le64(src_len);
+ poly1305_update(&poly1305_state, (u8 *)b.lens, sizeof(b.lens),
+ &simd_context);
+ poly1305_final(&poly1305_state, dst + src_len, &simd_context);
+ simd_put(&simd_context);
+ memzero_explicit(&chacha20_state, sizeof(chacha20_state));
+ memzero_explicit(&b, sizeof(b));
+}
+
+static void __init
+chacha20poly1305_selftest_encrypt(u8 *dst, const u8 *src, const size_t src_len,
+ const u8 *ad, const size_t ad_len,
+ const u8 *nonce, const size_t nonce_len,
+ const u8 key[CHACHA20POLY1305_KEY_SIZE])
+{
+ if (nonce_len == 8)
+ chacha20poly1305_encrypt(dst, src, src_len, ad, ad_len,
+ get_unaligned_le64(nonce), key);
+ else if (nonce_len == 12)
+ chacha20poly1305_selftest_encrypt_bignonce(dst, src, src_len,
+ ad, ad_len, nonce,
+ key);
+ else
+ BUG();
+}
+
+static bool __init
+decryption_success(bool func_ret, bool expect_failure, int memcmp_result)
+{
+ if (expect_failure)
+ return !func_ret;
+ return func_ret && !memcmp_result;
+}
+
+static bool __init chacha20poly1305_selftest(void)
+{
+ enum { MAXIMUM_TEST_BUFFER_LEN = 1UL << 12 };
+ size_t i;
+ u8 *computed_output = NULL, *heap_src = NULL;
+ bool success = true, ret;
+ simd_context_t simd_context;
+ struct scatterlist sg_src, sg_dst;
+
+ heap_src = kmalloc(MAXIMUM_TEST_BUFFER_LEN, GFP_KERNEL);
+ computed_output = kmalloc(MAXIMUM_TEST_BUFFER_LEN, GFP_KERNEL);
+ if (!heap_src || !computed_output) {
+ pr_err("chacha20poly1305 self-test malloc: FAIL\n");
+ success = false;
+ goto out;
+ }
+
+ for (i = 0; i < ARRAY_SIZE(chacha20poly1305_enc_vectors); ++i) {
+ memset(computed_output, 0, MAXIMUM_TEST_BUFFER_LEN);
+ chacha20poly1305_selftest_encrypt(computed_output,
+ chacha20poly1305_enc_vectors[i].input,
+ chacha20poly1305_enc_vectors[i].ilen,
+ chacha20poly1305_enc_vectors[i].assoc,
+ chacha20poly1305_enc_vectors[i].alen,
+ chacha20poly1305_enc_vectors[i].nonce,
+ chacha20poly1305_enc_vectors[i].nlen,
+ chacha20poly1305_enc_vectors[i].key);
+ if (memcmp(computed_output,
+ chacha20poly1305_enc_vectors[i].output,
+ chacha20poly1305_enc_vectors[i].ilen +
+ POLY1305_MAC_SIZE)) {
+ pr_err("chacha20poly1305 encryption self-test %zu: FAIL\n",
+ i + 1);
+ success = false;
+ }
+ }
+ simd_get(&simd_context);
+ for (i = 0; i < ARRAY_SIZE(chacha20poly1305_enc_vectors); ++i) {
+ if (chacha20poly1305_enc_vectors[i].nlen != 8)
+ continue;
+ memset(computed_output, 0, MAXIMUM_TEST_BUFFER_LEN);
+ memcpy(heap_src, chacha20poly1305_enc_vectors[i].input,
+ chacha20poly1305_enc_vectors[i].ilen);
+ sg_init_one(&sg_src, heap_src,
+ chacha20poly1305_enc_vectors[i].ilen);
+ sg_init_one(&sg_dst, computed_output,
+ chacha20poly1305_enc_vectors[i].ilen +
+ POLY1305_MAC_SIZE);
+ ret = chacha20poly1305_encrypt_sg(&sg_dst, &sg_src,
+ chacha20poly1305_enc_vectors[i].ilen,
+ chacha20poly1305_enc_vectors[i].assoc,
+ chacha20poly1305_enc_vectors[i].alen,
+ get_unaligned_le64(chacha20poly1305_enc_vectors[i].nonce),
+ chacha20poly1305_enc_vectors[i].key,
+ &simd_context);
+ if (!ret || memcmp(computed_output,
+ chacha20poly1305_enc_vectors[i].output,
+ chacha20poly1305_enc_vectors[i].ilen +
+ POLY1305_MAC_SIZE)) {
+ pr_err("chacha20poly1305 sg encryption self-test %zu: FAIL\n",
+ i + 1);
+ success = false;
+ }
+ }
+ simd_put(&simd_context);
+ for (i = 0; i < ARRAY_SIZE(chacha20poly1305_dec_vectors); ++i) {
+ memset(computed_output, 0, MAXIMUM_TEST_BUFFER_LEN);
+ ret = chacha20poly1305_decrypt(computed_output,
+ chacha20poly1305_dec_vectors[i].input,
+ chacha20poly1305_dec_vectors[i].ilen,
+ chacha20poly1305_dec_vectors[i].assoc,
+ chacha20poly1305_dec_vectors[i].alen,
+ get_unaligned_le64(chacha20poly1305_dec_vectors[i].nonce),
+ chacha20poly1305_dec_vectors[i].key);
+ if (!decryption_success(ret,
+ chacha20poly1305_dec_vectors[i].failure,
+ memcmp(computed_output,
+ chacha20poly1305_dec_vectors[i].output,
+ chacha20poly1305_dec_vectors[i].ilen -
+ POLY1305_MAC_SIZE))) {
+ pr_err("chacha20poly1305 decryption self-test %zu: FAIL\n",
+ i + 1);
+ success = false;
+ }
+ }
+ simd_get(&simd_context);
+ for (i = 0; i < ARRAY_SIZE(chacha20poly1305_dec_vectors); ++i) {
+ memset(computed_output, 0, MAXIMUM_TEST_BUFFER_LEN);
+ memcpy(heap_src, chacha20poly1305_dec_vectors[i].input,
+ chacha20poly1305_dec_vectors[i].ilen);
+ sg_init_one(&sg_src, heap_src,
+ chacha20poly1305_dec_vectors[i].ilen);
+ sg_init_one(&sg_dst, computed_output,
+ chacha20poly1305_dec_vectors[i].ilen -
+ POLY1305_MAC_SIZE);
+ ret = chacha20poly1305_decrypt_sg(&sg_dst, &sg_src,
+ chacha20poly1305_dec_vectors[i].ilen,
+ chacha20poly1305_dec_vectors[i].assoc,
+ chacha20poly1305_dec_vectors[i].alen,
+ get_unaligned_le64(chacha20poly1305_dec_vectors[i].nonce),
+ chacha20poly1305_dec_vectors[i].key, &simd_context);
+ if (!decryption_success(ret,
+ chacha20poly1305_dec_vectors[i].failure,
+ memcmp(computed_output, chacha20poly1305_dec_vectors[i].output,
+ chacha20poly1305_dec_vectors[i].ilen -
+ POLY1305_MAC_SIZE))) {
+ pr_err("chacha20poly1305 sg decryption self-test %zu: FAIL\n",
+ i + 1);
+ success = false;
+ }
+ }
+ simd_put(&simd_context);
+ for (i = 0; i < ARRAY_SIZE(xchacha20poly1305_enc_vectors); ++i) {
+ memset(computed_output, 0, MAXIMUM_TEST_BUFFER_LEN);
+ xchacha20poly1305_encrypt(computed_output,
+ xchacha20poly1305_enc_vectors[i].input,
+ xchacha20poly1305_enc_vectors[i].ilen,
+ xchacha20poly1305_enc_vectors[i].assoc,
+ xchacha20poly1305_enc_vectors[i].alen,
+ xchacha20poly1305_enc_vectors[i].nonce,
+ xchacha20poly1305_enc_vectors[i].key);
+ if (memcmp(computed_output,
+ xchacha20poly1305_enc_vectors[i].output,
+ xchacha20poly1305_enc_vectors[i].ilen +
+ POLY1305_MAC_SIZE)) {
+ pr_err("xchacha20poly1305 encryption self-test %zu: FAIL\n",
+ i + 1);
+ success = false;
+ }
+ }
+ for (i = 0; i < ARRAY_SIZE(xchacha20poly1305_dec_vectors); ++i) {
+ memset(computed_output, 0, MAXIMUM_TEST_BUFFER_LEN);
+ ret = xchacha20poly1305_decrypt(computed_output,
+ xchacha20poly1305_dec_vectors[i].input,
+ xchacha20poly1305_dec_vectors[i].ilen,
+ xchacha20poly1305_dec_vectors[i].assoc,
+ xchacha20poly1305_dec_vectors[i].alen,
+ xchacha20poly1305_dec_vectors[i].nonce,
+ xchacha20poly1305_dec_vectors[i].key);
+ if (!decryption_success(ret,
+ xchacha20poly1305_dec_vectors[i].failure,
+ memcmp(computed_output,
+ xchacha20poly1305_dec_vectors[i].output,
+ xchacha20poly1305_dec_vectors[i].ilen -
+ POLY1305_MAC_SIZE))) {
+ pr_err("xchacha20poly1305 decryption self-test %zu: FAIL\n",
+ i + 1);
+ success = false;
+ }
+ }
+
+out:
+ kfree(heap_src);
+ kfree(computed_output);
+ return success;
+}
--
2.19.0
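Note on the MAC layout exercised by
chacha20poly1305_selftest_encrypt_bignonce() above: the associated data
and the ciphertext are each zero-padded to a 16-byte boundary using
(0x10 - len) & 0xf bytes of pad0, and two little-endian 64-bit lengths
are appended last. The following is only a userspace sketch of that
length arithmetic, not code from the patch; pad16(), mac_input_len() and
mac_layout_self_check() are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Zero bytes needed to reach the next 16-byte boundary (always 0..15). */
static size_t pad16(size_t len)
{
	return (0x10 - len) & 0xf;
}

/*
 * Total number of bytes fed to Poly1305 for one message:
 * ad || pad || ciphertext || pad || le64(ad_len) || le64(src_len)
 */
static size_t mac_input_len(size_t ad_len, size_t src_len)
{
	return ad_len + pad16(ad_len) + src_len + pad16(src_len) +
	       2 * sizeof(uint64_t);
}

/* Small self-check in the spirit of the selftests in this series. */
void mac_layout_self_check(void)
{
	assert(pad16(12) == 4);		/* 12 bytes of AD need 4 pad bytes */
	assert(pad16(16) == 0);		/* already aligned: no padding */
	assert(mac_input_len(12, 114) == 160);
}
```

Because the subtraction is masked with 0xf, the expression stays correct
for lengths above 16 even though the unsigned subtraction wraps.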
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH net-next v7 19/28] zinc: BLAKE2s generic C implementation and selftest
[not found] <20181006025709.4019-1-Jason@zx2c4.com>
` (15 preceding siblings ...)
2018-10-06 2:56 ` [PATCH net-next v7 18/28] zinc: ChaCha20Poly1305 construction and selftest Jason A. Donenfeld
@ 2018-10-06 2:57 ` Jason A. Donenfeld
2018-10-06 2:57 ` [PATCH net-next v7 20/28] zinc: BLAKE2s x86_64 implementation Jason A. Donenfeld
` (6 subsequent siblings)
23 siblings, 0 replies; 28+ messages in thread
From: Jason A. Donenfeld @ 2018-10-06 2:57 UTC (permalink / raw)
To: linux-kernel, netdev, davem, gregkh
Cc: Jason A. Donenfeld, Samuel Neves, Jean-Philippe Aumasson,
Andy Lutomirski, Andrew Morton, Linus Torvalds, kernel-hardening,
linux-crypto
The C implementation was originally based on Samuel Neves' public
domain reference implementation but has since been heavily modified
for the kernel. We're able to do compile-time optimizations by moving
some scaffolding around the final function into the header file.
Information: https://blake2.net/
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Samuel Neves <sneves@dei.uc.pt>
Co-developed-by: Samuel Neves <sneves@dei.uc.pt>
Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: kernel-hardening@lists.openwall.com
Cc: linux-crypto@vger.kernel.org
---
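Note on the buffering rule in blake2s_update() ("Hash one less (full)
block than strictly possible"): blake2s_final() compresses whatever is
left in state->buf with the last-block flag set, so update must never
consume the final full block of a message. The sketch below models only
the length arithmetic of blake2s_update() in userspace, with no hashing;
struct update_split, split_update() and B2S_BLOCK are hypothetical names
that do not appear in the patch.

```c
#include <assert.h>
#include <stddef.h>

enum { B2S_BLOCK = 64 };	/* mirrors BLAKE2S_BLOCK_SIZE */

/* Blocks compressed by one update call, and bytes left in the buffer. */
struct update_split {
	size_t compressed;
	size_t buffered;
};

static struct update_split split_update(size_t buflen, size_t inlen)
{
	struct update_split s = { 0, buflen };
	const size_t fill = B2S_BLOCK - buflen;

	if (!inlen)
		return s;
	if (inlen > fill) {
		/* The buffer is topped up to a full block and compressed. */
		s.compressed += 1;
		s.buffered = 0;
		inlen -= fill;
	}
	if (inlen > B2S_BLOCK) {
		const size_t nblocks = (inlen + B2S_BLOCK - 1) / B2S_BLOCK;

		/* One block (full or partial) is deliberately held back. */
		s.compressed += nblocks - 1;
		inlen -= B2S_BLOCK * (nblocks - 1);
	}
	s.buffered += inlen;
	return s;
}

void split_update_self_check(void)
{
	/* A message of exactly one block stays entirely in the buffer. */
	assert(split_update(0, 64).compressed == 0);
	assert(split_update(0, 64).buffered == 64);
	/* Two full blocks: only the first is compressed right away. */
	assert(split_update(0, 128).compressed == 1);
	assert(split_update(0, 128).buffered == 64);
}
```

Under this model, any non-empty update leaves between 1 and 64 bytes
buffered, which is what lets blake2s_final() always finalize a real
block of input.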
include/zinc/blake2s.h | 56 +
lib/zinc/Kconfig | 3 +
lib/zinc/Makefile | 3 +
lib/zinc/blake2s/blake2s.c | 295 +++++
lib/zinc/selftest/blake2s.c | 2090 +++++++++++++++++++++++++++++++++++
5 files changed, 2447 insertions(+)
create mode 100644 include/zinc/blake2s.h
create mode 100644 lib/zinc/blake2s/blake2s.c
create mode 100644 lib/zinc/selftest/blake2s.c
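Note on blake2s_increment_counter(): the byte counter is 64 bits wide
but is stored as two u32 words, t[0] (low) and t[1] (high). After the
low word is incremented, it is smaller than the increment exactly when
it wrapped, so the comparison result is the carry. A minimal userspace
sketch of that idea follows; struct counter and add_bytes() are
hypothetical stand-ins, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the t[2] member of struct blake2s_state. */
struct counter {
	uint32_t t[2];
};

static void increment_counter(struct counter *c, uint32_t inc)
{
	c->t[0] += inc;
	c->t[1] += (c->t[0] < inc);	/* 1 exactly when the low word wrapped */
}

/* Load a 64-bit start value, add inc, and read the result back. */
static uint64_t add_bytes(uint64_t start, uint32_t inc)
{
	struct counter c = { { (uint32_t)start, (uint32_t)(start >> 32) } };

	increment_counter(&c, inc);
	return ((uint64_t)c.t[1] << 32) | c.t[0];
}

void counter_self_check(void)
{
	assert(add_bytes(0, 64) == 64);
	/* Crossing the 32-bit boundary carries into the high word. */
	assert(add_bytes(0xffffffc0ull, 64) == 0x100000000ull);
}
```

The result matches plain 64-bit addition of inc to the combined value,
as long as the true sum fits in 64 bits.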
diff --git a/include/zinc/blake2s.h b/include/zinc/blake2s.h
new file mode 100644
index 000000000000..701a08ba47c1
--- /dev/null
+++ b/include/zinc/blake2s.h
@@ -0,0 +1,56 @@
+/* SPDX-License-Identifier: GPL-2.0 OR MIT */
+/*
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#ifndef _ZINC_BLAKE2S_H
+#define _ZINC_BLAKE2S_H
+
+#include <linux/types.h>
+#include <linux/kernel.h>
+#include <asm/bug.h>
+
+enum blake2s_lengths {
+ BLAKE2S_BLOCK_SIZE = 64,
+ BLAKE2S_HASH_SIZE = 32,
+ BLAKE2S_KEY_SIZE = 32
+};
+
+struct blake2s_state {
+ u32 h[8];
+ u32 t[2];
+ u32 f[2];
+ u8 buf[BLAKE2S_BLOCK_SIZE];
+ size_t buflen;
+ u8 last_node;
+};
+
+void blake2s_init(struct blake2s_state *state, const size_t outlen);
+void blake2s_init_key(struct blake2s_state *state, const size_t outlen,
+ const void *key, const size_t keylen);
+void blake2s_update(struct blake2s_state *state, const u8 *in, size_t inlen);
+void blake2s_final(struct blake2s_state *state, u8 *out, const size_t outlen);
+
+static inline void blake2s(u8 *out, const u8 *in, const u8 *key,
+ const size_t outlen, const size_t inlen,
+ const size_t keylen)
+{
+ struct blake2s_state state;
+
+ WARN_ON(IS_ENABLED(DEBUG) && ((!in && inlen > 0) || !out || !outlen ||
+ outlen > BLAKE2S_HASH_SIZE || keylen > BLAKE2S_KEY_SIZE ||
+ (!key && keylen)));
+
+ if (keylen)
+ blake2s_init_key(&state, outlen, key, keylen);
+ else
+ blake2s_init(&state, outlen);
+
+ blake2s_update(&state, in, inlen);
+ blake2s_final(&state, out, outlen);
+}
+
+void blake2s_hmac(u8 *out, const u8 *in, const u8 *key, const size_t outlen,
+ const size_t inlen, const size_t keylen);
+
+#endif /* _ZINC_BLAKE2S_H */
diff --git a/lib/zinc/Kconfig b/lib/zinc/Kconfig
index 765eba3267c9..9fc21f93ee9f 100644
--- a/lib/zinc/Kconfig
+++ b/lib/zinc/Kconfig
@@ -11,6 +11,9 @@ config ZINC_CHACHA20POLY1305
select ZINC_POLY1305
select CRYPTO_BLKCIPHER
+config ZINC_BLAKE2S
+ tristate
+
config ZINC_SELFTEST
bool "Zinc cryptography library self-tests"
help
diff --git a/lib/zinc/Makefile b/lib/zinc/Makefile
index c31186b491e8..d2ec55c33ef0 100644
--- a/lib/zinc/Makefile
+++ b/lib/zinc/Makefile
@@ -21,3 +21,6 @@ obj-$(CONFIG_ZINC_POLY1305) += zinc_poly1305.o
zinc_chacha20poly1305-y := chacha20poly1305.o
obj-$(CONFIG_ZINC_CHACHA20POLY1305) += zinc_chacha20poly1305.o
+
+zinc_blake2s-y := blake2s/blake2s.o
+obj-$(CONFIG_ZINC_BLAKE2S) += zinc_blake2s.o
diff --git a/lib/zinc/blake2s/blake2s.c b/lib/zinc/blake2s/blake2s.c
new file mode 100644
index 000000000000..58d7e9378bd4
--- /dev/null
+++ b/lib/zinc/blake2s/blake2s.c
@@ -0,0 +1,295 @@
+// SPDX-License-Identifier: GPL-2.0 OR MIT
+/*
+ * Copyright (C) 2012 Samuel Neves <sneves@dei.uc.pt>. All Rights Reserved.
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ *
+ * This is an implementation of the BLAKE2s hash and PRF functions.
+ *
+ * Information: https://blake2.net/
+ *
+ */
+
+#include <zinc/blake2s.h>
+#include "../selftest/run.h"
+
+#include <linux/types.h>
+#include <linux/string.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/bug.h>
+#include <asm/unaligned.h>
+
+typedef union {
+ struct {
+ u8 digest_length;
+ u8 key_length;
+ u8 fanout;
+ u8 depth;
+ u32 leaf_length;
+ u32 node_offset;
+ u16 xof_length;
+ u8 node_depth;
+ u8 inner_length;
+ u8 salt[8];
+ u8 personal[8];
+ };
+ __le32 words[8];
+} __packed blake2s_param;
+
+static const u32 blake2s_iv[8] = {
+ 0x6A09E667UL, 0xBB67AE85UL, 0x3C6EF372UL, 0xA54FF53AUL,
+ 0x510E527FUL, 0x9B05688CUL, 0x1F83D9ABUL, 0x5BE0CD19UL
+};
+
+static const u8 blake2s_sigma[10][16] = {
+ { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 },
+ { 14, 10, 4, 8, 9, 15, 13, 6, 1, 12, 0, 2, 11, 7, 5, 3 },
+ { 11, 8, 12, 0, 5, 2, 15, 13, 10, 14, 3, 6, 7, 1, 9, 4 },
+ { 7, 9, 3, 1, 13, 12, 11, 14, 2, 6, 5, 10, 4, 0, 15, 8 },
+ { 9, 0, 5, 7, 2, 4, 10, 15, 14, 1, 11, 12, 6, 8, 3, 13 },
+ { 2, 12, 6, 10, 0, 11, 8, 3, 4, 13, 7, 5, 15, 14, 1, 9 },
+ { 12, 5, 1, 15, 14, 13, 4, 10, 0, 7, 6, 3, 9, 2, 8, 11 },
+ { 13, 11, 7, 14, 12, 1, 3, 9, 5, 0, 15, 4, 8, 6, 2, 10 },
+ { 6, 15, 14, 9, 11, 3, 0, 8, 12, 2, 13, 7, 1, 4, 10, 5 },
+ { 10, 2, 8, 4, 7, 6, 1, 5, 15, 11, 9, 14, 3, 12, 13, 0 },
+};
+
+static inline void blake2s_set_lastblock(struct blake2s_state *state)
+{
+ if (state->last_node)
+ state->f[1] = -1;
+ state->f[0] = -1;
+}
+
+static inline void blake2s_increment_counter(struct blake2s_state *state,
+ const u32 inc)
+{
+ state->t[0] += inc;
+ state->t[1] += (state->t[0] < inc);
+}
+
+static inline void blake2s_init_param(struct blake2s_state *state,
+ const blake2s_param *param)
+{
+ int i;
+
+ memset(state, 0, sizeof(*state));
+ for (i = 0; i < 8; ++i)
+ state->h[i] = blake2s_iv[i] ^ le32_to_cpu(param->words[i]);
+}
+
+void blake2s_init(struct blake2s_state *state, const size_t outlen)
+{
+ blake2s_param param __aligned(__alignof__(u32)) = {
+ .digest_length = outlen,
+ .fanout = 1,
+ .depth = 1
+ };
+
+ WARN_ON(IS_ENABLED(DEBUG) && (!outlen || outlen > BLAKE2S_HASH_SIZE));
+ blake2s_init_param(state, &param);
+}
+EXPORT_SYMBOL(blake2s_init);
+
+void blake2s_init_key(struct blake2s_state *state, const size_t outlen,
+ const void *key, const size_t keylen)
+{
+ blake2s_param param = { .digest_length = outlen,
+ .key_length = keylen,
+ .fanout = 1,
+ .depth = 1 };
+ u8 block[BLAKE2S_BLOCK_SIZE] = { 0 };
+
+ WARN_ON(IS_ENABLED(DEBUG) && (!outlen || outlen > BLAKE2S_HASH_SIZE ||
+ !key || !keylen || keylen > BLAKE2S_KEY_SIZE));
+ blake2s_init_param(state, &param);
+ memcpy(block, key, keylen);
+ blake2s_update(state, block, BLAKE2S_BLOCK_SIZE);
+ memzero_explicit(block, BLAKE2S_BLOCK_SIZE);
+}
+EXPORT_SYMBOL(blake2s_init_key);
+
+static bool *const blake2s_nobs[] __initconst = { };
+static void __init blake2s_fpu_init(void)
+{
+}
+static inline bool blake2s_compress_arch(struct blake2s_state *state,
+ const u8 *block, size_t nblocks,
+ const u32 inc)
+{
+ return false;
+}
+
+static inline void blake2s_compress(struct blake2s_state *state,
+ const u8 *block, size_t nblocks,
+ const u32 inc)
+{
+ u32 m[16];
+ u32 v[16];
+ int i;
+
+ WARN_ON(IS_ENABLED(DEBUG) &&
+ (nblocks > 1 && inc != BLAKE2S_BLOCK_SIZE));
+
+ if (blake2s_compress_arch(state, block, nblocks, inc))
+ return;
+
+ while (nblocks > 0) {
+ blake2s_increment_counter(state, inc);
+ memcpy(m, block, BLAKE2S_BLOCK_SIZE);
+ le32_to_cpu_array(m, ARRAY_SIZE(m));
+ memcpy(v, state->h, 32);
+ v[ 8] = blake2s_iv[0];
+ v[ 9] = blake2s_iv[1];
+ v[10] = blake2s_iv[2];
+ v[11] = blake2s_iv[3];
+ v[12] = blake2s_iv[4] ^ state->t[0];
+ v[13] = blake2s_iv[5] ^ state->t[1];
+ v[14] = blake2s_iv[6] ^ state->f[0];
+ v[15] = blake2s_iv[7] ^ state->f[1];
+
+#define G(r, i, a, b, c, d) do { \
+ a += b + m[blake2s_sigma[r][2 * i + 0]]; \
+ d = ror32(d ^ a, 16); \
+ c += d; \
+ b = ror32(b ^ c, 12); \
+ a += b + m[blake2s_sigma[r][2 * i + 1]]; \
+ d = ror32(d ^ a, 8); \
+ c += d; \
+ b = ror32(b ^ c, 7); \
+} while (0)
+
+#define ROUND(r) do { \
+ G(r, 0, v[0], v[ 4], v[ 8], v[12]); \
+ G(r, 1, v[1], v[ 5], v[ 9], v[13]); \
+ G(r, 2, v[2], v[ 6], v[10], v[14]); \
+ G(r, 3, v[3], v[ 7], v[11], v[15]); \
+ G(r, 4, v[0], v[ 5], v[10], v[15]); \
+ G(r, 5, v[1], v[ 6], v[11], v[12]); \
+ G(r, 6, v[2], v[ 7], v[ 8], v[13]); \
+ G(r, 7, v[3], v[ 4], v[ 9], v[14]); \
+} while (0)
+ ROUND(0);
+ ROUND(1);
+ ROUND(2);
+ ROUND(3);
+ ROUND(4);
+ ROUND(5);
+ ROUND(6);
+ ROUND(7);
+ ROUND(8);
+ ROUND(9);
+
+#undef G
+#undef ROUND
+
+ for (i = 0; i < 8; ++i)
+ state->h[i] ^= v[i] ^ v[i + 8];
+
+ block += BLAKE2S_BLOCK_SIZE;
+ --nblocks;
+ }
+}
+
+void blake2s_update(struct blake2s_state *state, const u8 *in, size_t inlen)
+{
+ const size_t fill = BLAKE2S_BLOCK_SIZE - state->buflen;
+
+ if (unlikely(!inlen))
+ return;
+ if (inlen > fill) {
+ memcpy(state->buf + state->buflen, in, fill);
+ blake2s_compress(state, state->buf, 1, BLAKE2S_BLOCK_SIZE);
+ state->buflen = 0;
+ in += fill;
+ inlen -= fill;
+ }
+ if (inlen > BLAKE2S_BLOCK_SIZE) {
+ const size_t nblocks =
+ (inlen + BLAKE2S_BLOCK_SIZE - 1) / BLAKE2S_BLOCK_SIZE;
+ /* Hash one less (full) block than strictly possible */
+ blake2s_compress(state, in, nblocks - 1, BLAKE2S_BLOCK_SIZE);
+ in += BLAKE2S_BLOCK_SIZE * (nblocks - 1);
+ inlen -= BLAKE2S_BLOCK_SIZE * (nblocks - 1);
+ }
+ memcpy(state->buf + state->buflen, in, inlen);
+ state->buflen += inlen;
+}
+EXPORT_SYMBOL(blake2s_update);
+
+void blake2s_final(struct blake2s_state *state, u8 *out, const size_t outlen)
+{
+ WARN_ON(IS_ENABLED(DEBUG) &&
+ (!out || !outlen || outlen > BLAKE2S_HASH_SIZE));
+ blake2s_set_lastblock(state);
+ memset(state->buf + state->buflen, 0,
+ BLAKE2S_BLOCK_SIZE - state->buflen); /* Padding */
+ blake2s_compress(state, state->buf, 1, state->buflen);
+ cpu_to_le32_array(state->h, ARRAY_SIZE(state->h));
+ memcpy(out, state->h, outlen);
+ memzero_explicit(state, sizeof(*state));
+}
+EXPORT_SYMBOL(blake2s_final);
+
+void blake2s_hmac(u8 *out, const u8 *in, const u8 *key, const size_t outlen,
+ const size_t inlen, const size_t keylen)
+{
+ struct blake2s_state state;
+ u8 x_key[BLAKE2S_BLOCK_SIZE] __aligned(__alignof__(u32)) = { 0 };
+ u8 i_hash[BLAKE2S_HASH_SIZE] __aligned(__alignof__(u32));
+ int i;
+
+ if (keylen > BLAKE2S_BLOCK_SIZE) {
+ blake2s_init(&state, BLAKE2S_HASH_SIZE);
+ blake2s_update(&state, key, keylen);
+ blake2s_final(&state, x_key, BLAKE2S_HASH_SIZE);
+ } else
+ memcpy(x_key, key, keylen);
+
+ for (i = 0; i < BLAKE2S_BLOCK_SIZE; ++i)
+ x_key[i] ^= 0x36;
+
+ blake2s_init(&state, BLAKE2S_HASH_SIZE);
+ blake2s_update(&state, x_key, BLAKE2S_BLOCK_SIZE);
+ blake2s_update(&state, in, inlen);
+ blake2s_final(&state, i_hash, BLAKE2S_HASH_SIZE);
+
+ for (i = 0; i < BLAKE2S_BLOCK_SIZE; ++i)
+ x_key[i] ^= 0x5c ^ 0x36;
+
+ blake2s_init(&state, BLAKE2S_HASH_SIZE);
+ blake2s_update(&state, x_key, BLAKE2S_BLOCK_SIZE);
+ blake2s_update(&state, i_hash, BLAKE2S_HASH_SIZE);
+ blake2s_final(&state, i_hash, BLAKE2S_HASH_SIZE);
+
+ memcpy(out, i_hash, outlen);
+ memzero_explicit(x_key, BLAKE2S_BLOCK_SIZE);
+ memzero_explicit(i_hash, BLAKE2S_HASH_SIZE);
+}
+EXPORT_SYMBOL(blake2s_hmac);
+
+#include "../selftest/blake2s.c"
+
+static bool nosimd __initdata = false;
+
+static int __init mod_init(void)
+{
+ if (!nosimd)
+ blake2s_fpu_init();
+ if (!selftest_run("blake2s", blake2s_selftest, blake2s_nobs,
+ ARRAY_SIZE(blake2s_nobs)))
+ return -ENOTRECOVERABLE;
+ return 0;
+}
+
+static void __exit mod_exit(void)
+{
+}
+
+module_param(nosimd, bool, 0);
+module_init(mod_init);
+module_exit(mod_exit);
+MODULE_LICENSE("GPL v2");
+MODULE_DESCRIPTION("BLAKE2s hash function");
+MODULE_AUTHOR("Jason A. Donenfeld <Jason@zx2c4.com>");
diff --git a/lib/zinc/selftest/blake2s.c b/lib/zinc/selftest/blake2s.c
new file mode 100644
index 000000000000..7325a42334aa
--- /dev/null
+++ b/lib/zinc/selftest/blake2s.c
@@ -0,0 +1,2090 @@
+// SPDX-License-Identifier: GPL-2.0 OR MIT
+/*
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+static const u8 blake2s_testvecs[][BLAKE2S_HASH_SIZE] __initconst = {
+ { 0x69, 0x21, 0x7a, 0x30, 0x79, 0x90, 0x80, 0x94,
+ 0xe1, 0x11, 0x21, 0xd0, 0x42, 0x35, 0x4a, 0x7c,
+ 0x1f, 0x55, 0xb6, 0x48, 0x2c, 0xa1, 0xa5, 0x1e,
+ 0x1b, 0x25, 0x0d, 0xfd, 0x1e, 0xd0, 0xee, 0xf9 },
+ { 0xe3, 0x4d, 0x74, 0xdb, 0xaf, 0x4f, 0xf4, 0xc6,
+ 0xab, 0xd8, 0x71, 0xcc, 0x22, 0x04, 0x51, 0xd2,
+ 0xea, 0x26, 0x48, 0x84, 0x6c, 0x77, 0x57, 0xfb,
+ 0xaa, 0xc8, 0x2f, 0xe5, 0x1a, 0xd6, 0x4b, 0xea },
+ { 0xdd, 0xad, 0x9a, 0xb1, 0x5d, 0xac, 0x45, 0x49,
+ 0xba, 0x42, 0xf4, 0x9d, 0x26, 0x24, 0x96, 0xbe,
+ 0xf6, 0xc0, 0xba, 0xe1, 0xdd, 0x34, 0x2a, 0x88,
+ 0x08, 0xf8, 0xea, 0x26, 0x7c, 0x6e, 0x21, 0x0c },
+ { 0xe8, 0xf9, 0x1c, 0x6e, 0xf2, 0x32, 0xa0, 0x41,
+ 0x45, 0x2a, 0xb0, 0xe1, 0x49, 0x07, 0x0c, 0xdd,
+ 0x7d, 0xd1, 0x76, 0x9e, 0x75, 0xb3, 0xa5, 0x92,
+ 0x1b, 0xe3, 0x78, 0x76, 0xc4, 0x5c, 0x99, 0x00 },
+ { 0x0c, 0xc7, 0x0e, 0x00, 0x34, 0x8b, 0x86, 0xba,
+ 0x29, 0x44, 0xd0, 0xc3, 0x20, 0x38, 0xb2, 0x5c,
+ 0x55, 0x58, 0x4f, 0x90, 0xdf, 0x23, 0x04, 0xf5,
+ 0x5f, 0xa3, 0x32, 0xaf, 0x5f, 0xb0, 0x1e, 0x20 },
+ { 0xec, 0x19, 0x64, 0x19, 0x10, 0x87, 0xa4, 0xfe,
+ 0x9d, 0xf1, 0xc7, 0x95, 0x34, 0x2a, 0x02, 0xff,
+ 0xc1, 0x91, 0xa5, 0xb2, 0x51, 0x76, 0x48, 0x56,
+ 0xae, 0x5b, 0x8b, 0x57, 0x69, 0xf0, 0xc6, 0xcd },
+ { 0xe1, 0xfa, 0x51, 0x61, 0x8d, 0x7d, 0xf4, 0xeb,
+ 0x70, 0xcf, 0x0d, 0x5a, 0x9e, 0x90, 0x6f, 0x80,
+ 0x6e, 0x9d, 0x19, 0xf7, 0xf4, 0xf0, 0x1e, 0x3b,
+ 0x62, 0x12, 0x88, 0xe4, 0x12, 0x04, 0x05, 0xd6 },
+ { 0x59, 0x80, 0x01, 0xfa, 0xfb, 0xe8, 0xf9, 0x4e,
+ 0xc6, 0x6d, 0xc8, 0x27, 0xd0, 0x12, 0xcf, 0xcb,
+ 0xba, 0x22, 0x28, 0x56, 0x9f, 0x44, 0x8e, 0x89,
+ 0xea, 0x22, 0x08, 0xc8, 0xbf, 0x76, 0x92, 0x93 },
+ { 0xc7, 0xe8, 0x87, 0xb5, 0x46, 0x62, 0x36, 0x35,
+ 0xe9, 0x3e, 0x04, 0x95, 0x59, 0x8f, 0x17, 0x26,
+ 0x82, 0x19, 0x96, 0xc2, 0x37, 0x77, 0x05, 0xb9,
+ 0x3a, 0x1f, 0x63, 0x6f, 0x87, 0x2b, 0xfa, 0x2d },
+ { 0xc3, 0x15, 0xa4, 0x37, 0xdd, 0x28, 0x06, 0x2a,
+ 0x77, 0x0d, 0x48, 0x19, 0x67, 0x13, 0x6b, 0x1b,
+ 0x5e, 0xb8, 0x8b, 0x21, 0xee, 0x53, 0xd0, 0x32,
+ 0x9c, 0x58, 0x97, 0x12, 0x6e, 0x9d, 0xb0, 0x2c },
+ { 0xbb, 0x47, 0x3d, 0xed, 0xdc, 0x05, 0x5f, 0xea,
+ 0x62, 0x28, 0xf2, 0x07, 0xda, 0x57, 0x53, 0x47,
+ 0xbb, 0x00, 0x40, 0x4c, 0xd3, 0x49, 0xd3, 0x8c,
+ 0x18, 0x02, 0x63, 0x07, 0xa2, 0x24, 0xcb, 0xff },
+ { 0x68, 0x7e, 0x18, 0x73, 0xa8, 0x27, 0x75, 0x91,
+ 0xbb, 0x33, 0xd9, 0xad, 0xf9, 0xa1, 0x39, 0x12,
+ 0xef, 0xef, 0xe5, 0x57, 0xca, 0xfc, 0x39, 0xa7,
+ 0x95, 0x26, 0x23, 0xe4, 0x72, 0x55, 0xf1, 0x6d },
+ { 0x1a, 0xc7, 0xba, 0x75, 0x4d, 0x6e, 0x2f, 0x94,
+ 0xe0, 0xe8, 0x6c, 0x46, 0xbf, 0xb2, 0x62, 0xab,
+ 0xbb, 0x74, 0xf4, 0x50, 0xef, 0x45, 0x6d, 0x6b,
+ 0x4d, 0x97, 0xaa, 0x80, 0xce, 0x6d, 0xa7, 0x67 },
+ { 0x01, 0x2c, 0x97, 0x80, 0x96, 0x14, 0x81, 0x6b,
+ 0x5d, 0x94, 0x94, 0x47, 0x7d, 0x4b, 0x68, 0x7d,
+ 0x15, 0xb9, 0x6e, 0xb6, 0x9c, 0x0e, 0x80, 0x74,
+ 0xa8, 0x51, 0x6f, 0x31, 0x22, 0x4b, 0x5c, 0x98 },
+ { 0x91, 0xff, 0xd2, 0x6c, 0xfa, 0x4d, 0xa5, 0x13,
+ 0x4c, 0x7e, 0xa2, 0x62, 0xf7, 0x88, 0x9c, 0x32,
+ 0x9f, 0x61, 0xf6, 0xa6, 0x57, 0x22, 0x5c, 0xc2,
+ 0x12, 0xf4, 0x00, 0x56, 0xd9, 0x86, 0xb3, 0xf4 },
+ { 0xd9, 0x7c, 0x82, 0x8d, 0x81, 0x82, 0xa7, 0x21,
+ 0x80, 0xa0, 0x6a, 0x78, 0x26, 0x83, 0x30, 0x67,
+ 0x3f, 0x7c, 0x4e, 0x06, 0x35, 0x94, 0x7c, 0x04,
+ 0xc0, 0x23, 0x23, 0xfd, 0x45, 0xc0, 0xa5, 0x2d },
+ { 0xef, 0xc0, 0x4c, 0xdc, 0x39, 0x1c, 0x7e, 0x91,
+ 0x19, 0xbd, 0x38, 0x66, 0x8a, 0x53, 0x4e, 0x65,
+ 0xfe, 0x31, 0x03, 0x6d, 0x6a, 0x62, 0x11, 0x2e,
+ 0x44, 0xeb, 0xeb, 0x11, 0xf9, 0xc5, 0x70, 0x80 },
+ { 0x99, 0x2c, 0xf5, 0xc0, 0x53, 0x44, 0x2a, 0x5f,
+ 0xbc, 0x4f, 0xaf, 0x58, 0x3e, 0x04, 0xe5, 0x0b,
+ 0xb7, 0x0d, 0x2f, 0x39, 0xfb, 0xb6, 0xa5, 0x03,
+ 0xf8, 0x9e, 0x56, 0xa6, 0x3e, 0x18, 0x57, 0x8a },
+ { 0x38, 0x64, 0x0e, 0x9f, 0x21, 0x98, 0x3e, 0x67,
+ 0xb5, 0x39, 0xca, 0xcc, 0xae, 0x5e, 0xcf, 0x61,
+ 0x5a, 0xe2, 0x76, 0x4f, 0x75, 0xa0, 0x9c, 0x9c,
+ 0x59, 0xb7, 0x64, 0x83, 0xc1, 0xfb, 0xc7, 0x35 },
+ { 0x21, 0x3d, 0xd3, 0x4c, 0x7e, 0xfe, 0x4f, 0xb2,
+ 0x7a, 0x6b, 0x35, 0xf6, 0xb4, 0x00, 0x0d, 0x1f,
+ 0xe0, 0x32, 0x81, 0xaf, 0x3c, 0x72, 0x3e, 0x5c,
+ 0x9f, 0x94, 0x74, 0x7a, 0x5f, 0x31, 0xcd, 0x3b },
+ { 0xec, 0x24, 0x6e, 0xee, 0xb9, 0xce, 0xd3, 0xf7,
+ 0xad, 0x33, 0xed, 0x28, 0x66, 0x0d, 0xd9, 0xbb,
+ 0x07, 0x32, 0x51, 0x3d, 0xb4, 0xe2, 0xfa, 0x27,
+ 0x8b, 0x60, 0xcd, 0xe3, 0x68, 0x2a, 0x4c, 0xcd },
+ { 0xac, 0x9b, 0x61, 0xd4, 0x46, 0x64, 0x8c, 0x30,
+ 0x05, 0xd7, 0x89, 0x2b, 0xf3, 0xa8, 0x71, 0x9f,
+ 0x4c, 0x81, 0x81, 0xcf, 0xdc, 0xbc, 0x2b, 0x79,
+ 0xfe, 0xf1, 0x0a, 0x27, 0x9b, 0x91, 0x10, 0x95 },
+ { 0x7b, 0xf8, 0xb2, 0x29, 0x59, 0xe3, 0x4e, 0x3a,
+ 0x43, 0xf7, 0x07, 0x92, 0x23, 0xe8, 0x3a, 0x97,
+ 0x54, 0x61, 0x7d, 0x39, 0x1e, 0x21, 0x3d, 0xfd,
+ 0x80, 0x8e, 0x41, 0xb9, 0xbe, 0xad, 0x4c, 0xe7 },
+ { 0x68, 0xd4, 0xb5, 0xd4, 0xfa, 0x0e, 0x30, 0x2b,
+ 0x64, 0xcc, 0xc5, 0xaf, 0x79, 0x29, 0x13, 0xac,
+ 0x4c, 0x88, 0xec, 0x95, 0xc0, 0x7d, 0xdf, 0x40,
+ 0x69, 0x42, 0x56, 0xeb, 0x88, 0xce, 0x9f, 0x3d },
+ { 0xb2, 0xc2, 0x42, 0x0f, 0x05, 0xf9, 0xab, 0xe3,
+ 0x63, 0x15, 0x91, 0x93, 0x36, 0xb3, 0x7e, 0x4e,
+ 0x0f, 0xa3, 0x3f, 0xf7, 0xe7, 0x6a, 0x49, 0x27,
+ 0x67, 0x00, 0x6f, 0xdb, 0x5d, 0x93, 0x54, 0x62 },
+ { 0x13, 0x4f, 0x61, 0xbb, 0xd0, 0xbb, 0xb6, 0x9a,
+ 0xed, 0x53, 0x43, 0x90, 0x45, 0x51, 0xa3, 0xe6,
+ 0xc1, 0xaa, 0x7d, 0xcd, 0xd7, 0x7e, 0x90, 0x3e,
+ 0x70, 0x23, 0xeb, 0x7c, 0x60, 0x32, 0x0a, 0xa7 },
+ { 0x46, 0x93, 0xf9, 0xbf, 0xf7, 0xd4, 0xf3, 0x98,
+ 0x6a, 0x7d, 0x17, 0x6e, 0x6e, 0x06, 0xf7, 0x2a,
+ 0xd1, 0x49, 0x0d, 0x80, 0x5c, 0x99, 0xe2, 0x53,
+ 0x47, 0xb8, 0xde, 0x77, 0xb4, 0xdb, 0x6d, 0x9b },
+ { 0x85, 0x3e, 0x26, 0xf7, 0x41, 0x95, 0x3b, 0x0f,
+ 0xd5, 0xbd, 0xb4, 0x24, 0xe8, 0xab, 0x9e, 0x8b,
+ 0x37, 0x50, 0xea, 0xa8, 0xef, 0x61, 0xe4, 0x79,
+ 0x02, 0xc9, 0x1e, 0x55, 0x4e, 0x9c, 0x73, 0xb9 },
+ { 0xf7, 0xde, 0x53, 0x63, 0x61, 0xab, 0xaa, 0x0e,
+ 0x15, 0x81, 0x56, 0xcf, 0x0e, 0xa4, 0xf6, 0x3a,
+ 0x99, 0xb5, 0xe4, 0x05, 0x4f, 0x8f, 0xa4, 0xc9,
+ 0xd4, 0x5f, 0x62, 0x85, 0xca, 0xd5, 0x56, 0x94 },
+ { 0x4c, 0x23, 0x06, 0x08, 0x86, 0x0a, 0x99, 0xae,
+ 0x8d, 0x7b, 0xd5, 0xc2, 0xcc, 0x17, 0xfa, 0x52,
+ 0x09, 0x6b, 0x9a, 0x61, 0xbe, 0xdb, 0x17, 0xcb,
+ 0x76, 0x17, 0x86, 0x4a, 0xd2, 0x9c, 0xa7, 0xa6 },
+ { 0xae, 0xb9, 0x20, 0xea, 0x87, 0x95, 0x2d, 0xad,
+ 0xb1, 0xfb, 0x75, 0x92, 0x91, 0xe3, 0x38, 0x81,
+ 0x39, 0xa8, 0x72, 0x86, 0x50, 0x01, 0x88, 0x6e,
+ 0xd8, 0x47, 0x52, 0xe9, 0x3c, 0x25, 0x0c, 0x2a },
+ { 0xab, 0xa4, 0xad, 0x9b, 0x48, 0x0b, 0x9d, 0xf3,
+ 0xd0, 0x8c, 0xa5, 0xe8, 0x7b, 0x0c, 0x24, 0x40,
+ 0xd4, 0xe4, 0xea, 0x21, 0x22, 0x4c, 0x2e, 0xb4,
+ 0x2c, 0xba, 0xe4, 0x69, 0xd0, 0x89, 0xb9, 0x31 },
+ { 0x05, 0x82, 0x56, 0x07, 0xd7, 0xfd, 0xf2, 0xd8,
+ 0x2e, 0xf4, 0xc3, 0xc8, 0xc2, 0xae, 0xa9, 0x61,
+ 0xad, 0x98, 0xd6, 0x0e, 0xdf, 0xf7, 0xd0, 0x18,
+ 0x98, 0x3e, 0x21, 0x20, 0x4c, 0x0d, 0x93, 0xd1 },
+ { 0xa7, 0x42, 0xf8, 0xb6, 0xaf, 0x82, 0xd8, 0xa6,
+ 0xca, 0x23, 0x57, 0xc5, 0xf1, 0xcf, 0x91, 0xde,
+ 0xfb, 0xd0, 0x66, 0x26, 0x7d, 0x75, 0xc0, 0x48,
+ 0xb3, 0x52, 0x36, 0x65, 0x85, 0x02, 0x59, 0x62 },
+ { 0x2b, 0xca, 0xc8, 0x95, 0x99, 0x00, 0x0b, 0x42,
+ 0xc9, 0x5a, 0xe2, 0x38, 0x35, 0xa7, 0x13, 0x70,
+ 0x4e, 0xd7, 0x97, 0x89, 0xc8, 0x4f, 0xef, 0x14,
+ 0x9a, 0x87, 0x4f, 0xf7, 0x33, 0xf0, 0x17, 0xa2 },
+ { 0xac, 0x1e, 0xd0, 0x7d, 0x04, 0x8f, 0x10, 0x5a,
+ 0x9e, 0x5b, 0x7a, 0xb8, 0x5b, 0x09, 0xa4, 0x92,
+ 0xd5, 0xba, 0xff, 0x14, 0xb8, 0xbf, 0xb0, 0xe9,
+ 0xfd, 0x78, 0x94, 0x86, 0xee, 0xa2, 0xb9, 0x74 },
+ { 0xe4, 0x8d, 0x0e, 0xcf, 0xaf, 0x49, 0x7d, 0x5b,
+ 0x27, 0xc2, 0x5d, 0x99, 0xe1, 0x56, 0xcb, 0x05,
+ 0x79, 0xd4, 0x40, 0xd6, 0xe3, 0x1f, 0xb6, 0x24,
+ 0x73, 0x69, 0x6d, 0xbf, 0x95, 0xe0, 0x10, 0xe4 },
+ { 0x12, 0xa9, 0x1f, 0xad, 0xf8, 0xb2, 0x16, 0x44,
+ 0xfd, 0x0f, 0x93, 0x4f, 0x3c, 0x4a, 0x8f, 0x62,
+ 0xba, 0x86, 0x2f, 0xfd, 0x20, 0xe8, 0xe9, 0x61,
+ 0x15, 0x4c, 0x15, 0xc1, 0x38, 0x84, 0xed, 0x3d },
+ { 0x7c, 0xbe, 0xe9, 0x6e, 0x13, 0x98, 0x97, 0xdc,
+ 0x98, 0xfb, 0xef, 0x3b, 0xe8, 0x1a, 0xd4, 0xd9,
+ 0x64, 0xd2, 0x35, 0xcb, 0x12, 0x14, 0x1f, 0xb6,
+ 0x67, 0x27, 0xe6, 0xe5, 0xdf, 0x73, 0xa8, 0x78 },
+ { 0xeb, 0xf6, 0x6a, 0xbb, 0x59, 0x7a, 0xe5, 0x72,
+ 0xa7, 0x29, 0x7c, 0xb0, 0x87, 0x1e, 0x35, 0x5a,
+ 0xcc, 0xaf, 0xad, 0x83, 0x77, 0xb8, 0xe7, 0x8b,
+ 0xf1, 0x64, 0xce, 0x2a, 0x18, 0xde, 0x4b, 0xaf },
+ { 0x71, 0xb9, 0x33, 0xb0, 0x7e, 0x4f, 0xf7, 0x81,
+ 0x8c, 0xe0, 0x59, 0xd0, 0x08, 0x82, 0x9e, 0x45,
+ 0x3c, 0x6f, 0xf0, 0x2e, 0xc0, 0xa7, 0xdb, 0x39,
+ 0x3f, 0xc2, 0xd8, 0x70, 0xf3, 0x7a, 0x72, 0x86 },
+ { 0x7c, 0xf7, 0xc5, 0x13, 0x31, 0x22, 0x0b, 0x8d,
+ 0x3e, 0xba, 0xed, 0x9c, 0x29, 0x39, 0x8a, 0x16,
+ 0xd9, 0x81, 0x56, 0xe2, 0x61, 0x3c, 0xb0, 0x88,
+ 0xf2, 0xb0, 0xe0, 0x8a, 0x1b, 0xe4, 0xcf, 0x4f },
+ { 0x3e, 0x41, 0xa1, 0x08, 0xe0, 0xf6, 0x4a, 0xd2,
+ 0x76, 0xb9, 0x79, 0xe1, 0xce, 0x06, 0x82, 0x79,
+ 0xe1, 0x6f, 0x7b, 0xc7, 0xe4, 0xaa, 0x1d, 0x21,
+ 0x1e, 0x17, 0xb8, 0x11, 0x61, 0xdf, 0x16, 0x02 },
+ { 0x88, 0x65, 0x02, 0xa8, 0x2a, 0xb4, 0x7b, 0xa8,
+ 0xd8, 0x67, 0x10, 0xaa, 0x9d, 0xe3, 0xd4, 0x6e,
+ 0xa6, 0x5c, 0x47, 0xaf, 0x6e, 0xe8, 0xde, 0x45,
+ 0x0c, 0xce, 0xb8, 0xb1, 0x1b, 0x04, 0x5f, 0x50 },
+ { 0xc0, 0x21, 0xbc, 0x5f, 0x09, 0x54, 0xfe, 0xe9,
+ 0x4f, 0x46, 0xea, 0x09, 0x48, 0x7e, 0x10, 0xa8,
+ 0x48, 0x40, 0xd0, 0x2f, 0x64, 0x81, 0x0b, 0xc0,
+ 0x8d, 0x9e, 0x55, 0x1f, 0x7d, 0x41, 0x68, 0x14 },
+ { 0x20, 0x30, 0x51, 0x6e, 0x8a, 0x5f, 0xe1, 0x9a,
+ 0xe7, 0x9c, 0x33, 0x6f, 0xce, 0x26, 0x38, 0x2a,
+ 0x74, 0x9d, 0x3f, 0xd0, 0xec, 0x91, 0xe5, 0x37,
+ 0xd4, 0xbd, 0x23, 0x58, 0xc1, 0x2d, 0xfb, 0x22 },
+ { 0x55, 0x66, 0x98, 0xda, 0xc8, 0x31, 0x7f, 0xd3,
+ 0x6d, 0xfb, 0xdf, 0x25, 0xa7, 0x9c, 0xb1, 0x12,
+ 0xd5, 0x42, 0x58, 0x60, 0x60, 0x5c, 0xba, 0xf5,
+ 0x07, 0xf2, 0x3b, 0xf7, 0xe9, 0xf4, 0x2a, 0xfe },
+ { 0x2f, 0x86, 0x7b, 0xa6, 0x77, 0x73, 0xfd, 0xc3,
+ 0xe9, 0x2f, 0xce, 0xd9, 0x9a, 0x64, 0x09, 0xad,
+ 0x39, 0xd0, 0xb8, 0x80, 0xfd, 0xe8, 0xf1, 0x09,
+ 0xa8, 0x17, 0x30, 0xc4, 0x45, 0x1d, 0x01, 0x78 },
+ { 0x17, 0x2e, 0xc2, 0x18, 0xf1, 0x19, 0xdf, 0xae,
+ 0x98, 0x89, 0x6d, 0xff, 0x29, 0xdd, 0x98, 0x76,
+ 0xc9, 0x4a, 0xf8, 0x74, 0x17, 0xf9, 0xae, 0x4c,
+ 0x70, 0x14, 0xbb, 0x4e, 0x4b, 0x96, 0xaf, 0xc7 },
+ { 0x3f, 0x85, 0x81, 0x4a, 0x18, 0x19, 0x5f, 0x87,
+ 0x9a, 0xa9, 0x62, 0xf9, 0x5d, 0x26, 0xbd, 0x82,
+ 0xa2, 0x78, 0xf2, 0xb8, 0x23, 0x20, 0x21, 0x8f,
+ 0x6b, 0x3b, 0xd6, 0xf7, 0xf6, 0x67, 0xa6, 0xd9 },
+ { 0x1b, 0x61, 0x8f, 0xba, 0xa5, 0x66, 0xb3, 0xd4,
+ 0x98, 0xc1, 0x2e, 0x98, 0x2c, 0x9e, 0xc5, 0x2e,
+ 0x4d, 0xa8, 0x5a, 0x8c, 0x54, 0xf3, 0x8f, 0x34,
+ 0xc0, 0x90, 0x39, 0x4f, 0x23, 0xc1, 0x84, 0xc1 },
+ { 0x0c, 0x75, 0x8f, 0xb5, 0x69, 0x2f, 0xfd, 0x41,
+ 0xa3, 0x57, 0x5d, 0x0a, 0xf0, 0x0c, 0xc7, 0xfb,
+ 0xf2, 0xcb, 0xe5, 0x90, 0x5a, 0x58, 0x32, 0x3a,
+ 0x88, 0xae, 0x42, 0x44, 0xf6, 0xe4, 0xc9, 0x93 },
+ { 0xa9, 0x31, 0x36, 0x0c, 0xad, 0x62, 0x8c, 0x7f,
+ 0x12, 0xa6, 0xc1, 0xc4, 0xb7, 0x53, 0xb0, 0xf4,
+ 0x06, 0x2a, 0xef, 0x3c, 0xe6, 0x5a, 0x1a, 0xe3,
+ 0xf1, 0x93, 0x69, 0xda, 0xdf, 0x3a, 0xe2, 0x3d },
+ { 0xcb, 0xac, 0x7d, 0x77, 0x3b, 0x1e, 0x3b, 0x3c,
+ 0x66, 0x91, 0xd7, 0xab, 0xb7, 0xe9, 0xdf, 0x04,
+ 0x5c, 0x8b, 0xa1, 0x92, 0x68, 0xde, 0xd1, 0x53,
+ 0x20, 0x7f, 0x5e, 0x80, 0x43, 0x52, 0xec, 0x5d },
+ { 0x23, 0xa1, 0x96, 0xd3, 0x80, 0x2e, 0xd3, 0xc1,
+ 0xb3, 0x84, 0x01, 0x9a, 0x82, 0x32, 0x58, 0x40,
+ 0xd3, 0x2f, 0x71, 0x95, 0x0c, 0x45, 0x80, 0xb0,
+ 0x34, 0x45, 0xe0, 0x89, 0x8e, 0x14, 0x05, 0x3c },
+ { 0xf4, 0x49, 0x54, 0x70, 0xf2, 0x26, 0xc8, 0xc2,
+ 0x14, 0xbe, 0x08, 0xfd, 0xfa, 0xd4, 0xbc, 0x4a,
+ 0x2a, 0x9d, 0xbe, 0xa9, 0x13, 0x6a, 0x21, 0x0d,
+ 0xf0, 0xd4, 0xb6, 0x49, 0x29, 0xe6, 0xfc, 0x14 },
+ { 0xe2, 0x90, 0xdd, 0x27, 0x0b, 0x46, 0x7f, 0x34,
+ 0xab, 0x1c, 0x00, 0x2d, 0x34, 0x0f, 0xa0, 0x16,
+ 0x25, 0x7f, 0xf1, 0x9e, 0x58, 0x33, 0xfd, 0xbb,
+ 0xf2, 0xcb, 0x40, 0x1c, 0x3b, 0x28, 0x17, 0xde },
+ { 0x9f, 0xc7, 0xb5, 0xde, 0xd3, 0xc1, 0x50, 0x42,
+ 0xb2, 0xa6, 0x58, 0x2d, 0xc3, 0x9b, 0xe0, 0x16,
+ 0xd2, 0x4a, 0x68, 0x2d, 0x5e, 0x61, 0xad, 0x1e,
+ 0xff, 0x9c, 0x63, 0x30, 0x98, 0x48, 0xf7, 0x06 },
+ { 0x8c, 0xca, 0x67, 0xa3, 0x6d, 0x17, 0xd5, 0xe6,
+ 0x34, 0x1c, 0xb5, 0x92, 0xfd, 0x7b, 0xef, 0x99,
+ 0x26, 0xc9, 0xe3, 0xaa, 0x10, 0x27, 0xea, 0x11,
+ 0xa7, 0xd8, 0xbd, 0x26, 0x0b, 0x57, 0x6e, 0x04 },
+ { 0x40, 0x93, 0x92, 0xf5, 0x60, 0xf8, 0x68, 0x31,
+ 0xda, 0x43, 0x73, 0xee, 0x5e, 0x00, 0x74, 0x26,
+ 0x05, 0x95, 0xd7, 0xbc, 0x24, 0x18, 0x3b, 0x60,
+ 0xed, 0x70, 0x0d, 0x45, 0x83, 0xd3, 0xf6, 0xf0 },
+ { 0x28, 0x02, 0x16, 0x5d, 0xe0, 0x90, 0x91, 0x55,
+ 0x46, 0xf3, 0x39, 0x8c, 0xd8, 0x49, 0x16, 0x4a,
+ 0x19, 0xf9, 0x2a, 0xdb, 0xc3, 0x61, 0xad, 0xc9,
+ 0x9b, 0x0f, 0x20, 0xc8, 0xea, 0x07, 0x10, 0x54 },
+ { 0xad, 0x83, 0x91, 0x68, 0xd9, 0xf8, 0xa4, 0xbe,
+ 0x95, 0xba, 0x9e, 0xf9, 0xa6, 0x92, 0xf0, 0x72,
+ 0x56, 0xae, 0x43, 0xfe, 0x6f, 0x98, 0x64, 0xe2,
+ 0x90, 0x69, 0x1b, 0x02, 0x56, 0xce, 0x50, 0xa9 },
+ { 0x75, 0xfd, 0xaa, 0x50, 0x38, 0xc2, 0x84, 0xb8,
+ 0x6d, 0x6e, 0x8a, 0xff, 0xe8, 0xb2, 0x80, 0x7e,
+ 0x46, 0x7b, 0x86, 0x60, 0x0e, 0x79, 0xaf, 0x36,
+ 0x89, 0xfb, 0xc0, 0x63, 0x28, 0xcb, 0xf8, 0x94 },
+ { 0xe5, 0x7c, 0xb7, 0x94, 0x87, 0xdd, 0x57, 0x90,
+ 0x24, 0x32, 0xb2, 0x50, 0x73, 0x38, 0x13, 0xbd,
+ 0x96, 0xa8, 0x4e, 0xfc, 0xe5, 0x9f, 0x65, 0x0f,
+ 0xac, 0x26, 0xe6, 0x69, 0x6a, 0xef, 0xaf, 0xc3 },
+ { 0x56, 0xf3, 0x4e, 0x8b, 0x96, 0x55, 0x7e, 0x90,
+ 0xc1, 0xf2, 0x4b, 0x52, 0xd0, 0xc8, 0x9d, 0x51,
+ 0x08, 0x6a, 0xcf, 0x1b, 0x00, 0xf6, 0x34, 0xcf,
+ 0x1d, 0xde, 0x92, 0x33, 0xb8, 0xea, 0xaa, 0x3e },
+ { 0x1b, 0x53, 0xee, 0x94, 0xaa, 0xf3, 0x4e, 0x4b,
+ 0x15, 0x9d, 0x48, 0xde, 0x35, 0x2c, 0x7f, 0x06,
+ 0x61, 0xd0, 0xa4, 0x0e, 0xdf, 0xf9, 0x5a, 0x0b,
+ 0x16, 0x39, 0xb4, 0x09, 0x0e, 0x97, 0x44, 0x72 },
+ { 0x05, 0x70, 0x5e, 0x2a, 0x81, 0x75, 0x7c, 0x14,
+ 0xbd, 0x38, 0x3e, 0xa9, 0x8d, 0xda, 0x54, 0x4e,
+ 0xb1, 0x0e, 0x6b, 0xc0, 0x7b, 0xae, 0x43, 0x5e,
+ 0x25, 0x18, 0xdb, 0xe1, 0x33, 0x52, 0x53, 0x75 },
+ { 0xd8, 0xb2, 0x86, 0x6e, 0x8a, 0x30, 0x9d, 0xb5,
+ 0x3e, 0x52, 0x9e, 0xc3, 0x29, 0x11, 0xd8, 0x2f,
+ 0x5c, 0xa1, 0x6c, 0xff, 0x76, 0x21, 0x68, 0x91,
+ 0xa9, 0x67, 0x6a, 0xa3, 0x1a, 0xaa, 0x6c, 0x42 },
+ { 0xf5, 0x04, 0x1c, 0x24, 0x12, 0x70, 0xeb, 0x04,
+ 0xc7, 0x1e, 0xc2, 0xc9, 0x5d, 0x4c, 0x38, 0xd8,
+ 0x03, 0xb1, 0x23, 0x7b, 0x0f, 0x29, 0xfd, 0x4d,
+ 0xb3, 0xeb, 0x39, 0x76, 0x69, 0xe8, 0x86, 0x99 },
+ { 0x9a, 0x4c, 0xe0, 0x77, 0xc3, 0x49, 0x32, 0x2f,
+ 0x59, 0x5e, 0x0e, 0xe7, 0x9e, 0xd0, 0xda, 0x5f,
+ 0xab, 0x66, 0x75, 0x2c, 0xbf, 0xef, 0x8f, 0x87,
+ 0xd0, 0xe9, 0xd0, 0x72, 0x3c, 0x75, 0x30, 0xdd },
+ { 0x65, 0x7b, 0x09, 0xf3, 0xd0, 0xf5, 0x2b, 0x5b,
+ 0x8f, 0x2f, 0x97, 0x16, 0x3a, 0x0e, 0xdf, 0x0c,
+ 0x04, 0xf0, 0x75, 0x40, 0x8a, 0x07, 0xbb, 0xeb,
+ 0x3a, 0x41, 0x01, 0xa8, 0x91, 0x99, 0x0d, 0x62 },
+ { 0x1e, 0x3f, 0x7b, 0xd5, 0xa5, 0x8f, 0xa5, 0x33,
+ 0x34, 0x4a, 0xa8, 0xed, 0x3a, 0xc1, 0x22, 0xbb,
+ 0x9e, 0x70, 0xd4, 0xef, 0x50, 0xd0, 0x04, 0x53,
+ 0x08, 0x21, 0x94, 0x8f, 0x5f, 0xe6, 0x31, 0x5a },
+ { 0x80, 0xdc, 0xcf, 0x3f, 0xd8, 0x3d, 0xfd, 0x0d,
+ 0x35, 0xaa, 0x28, 0x58, 0x59, 0x22, 0xab, 0x89,
+ 0xd5, 0x31, 0x39, 0x97, 0x67, 0x3e, 0xaf, 0x90,
+ 0x5c, 0xea, 0x9c, 0x0b, 0x22, 0x5c, 0x7b, 0x5f },
+ { 0x8a, 0x0d, 0x0f, 0xbf, 0x63, 0x77, 0xd8, 0x3b,
+ 0xb0, 0x8b, 0x51, 0x4b, 0x4b, 0x1c, 0x43, 0xac,
+ 0xc9, 0x5d, 0x75, 0x17, 0x14, 0xf8, 0x92, 0x56,
+ 0x45, 0xcb, 0x6b, 0xc8, 0x56, 0xca, 0x15, 0x0a },
+ { 0x9f, 0xa5, 0xb4, 0x87, 0x73, 0x8a, 0xd2, 0x84,
+ 0x4c, 0xc6, 0x34, 0x8a, 0x90, 0x19, 0x18, 0xf6,
+ 0x59, 0xa3, 0xb8, 0x9e, 0x9c, 0x0d, 0xfe, 0xea,
+ 0xd3, 0x0d, 0xd9, 0x4b, 0xcf, 0x42, 0xef, 0x8e },
+ { 0x80, 0x83, 0x2c, 0x4a, 0x16, 0x77, 0xf5, 0xea,
+ 0x25, 0x60, 0xf6, 0x68, 0xe9, 0x35, 0x4d, 0xd3,
+ 0x69, 0x97, 0xf0, 0x37, 0x28, 0xcf, 0xa5, 0x5e,
+ 0x1b, 0x38, 0x33, 0x7c, 0x0c, 0x9e, 0xf8, 0x18 },
+ { 0xab, 0x37, 0xdd, 0xb6, 0x83, 0x13, 0x7e, 0x74,
+ 0x08, 0x0d, 0x02, 0x6b, 0x59, 0x0b, 0x96, 0xae,
+ 0x9b, 0xb4, 0x47, 0x72, 0x2f, 0x30, 0x5a, 0x5a,
+ 0xc5, 0x70, 0xec, 0x1d, 0xf9, 0xb1, 0x74, 0x3c },
+ { 0x3e, 0xe7, 0x35, 0xa6, 0x94, 0xc2, 0x55, 0x9b,
+ 0x69, 0x3a, 0xa6, 0x86, 0x29, 0x36, 0x1e, 0x15,
+ 0xd1, 0x22, 0x65, 0xad, 0x6a, 0x3d, 0xed, 0xf4,
+ 0x88, 0xb0, 0xb0, 0x0f, 0xac, 0x97, 0x54, 0xba },
+ { 0xd6, 0xfc, 0xd2, 0x32, 0x19, 0xb6, 0x47, 0xe4,
+ 0xcb, 0xd5, 0xeb, 0x2d, 0x0a, 0xd0, 0x1e, 0xc8,
+ 0x83, 0x8a, 0x4b, 0x29, 0x01, 0xfc, 0x32, 0x5c,
+ 0xc3, 0x70, 0x19, 0x81, 0xca, 0x6c, 0x88, 0x8b },
+ { 0x05, 0x20, 0xec, 0x2f, 0x5b, 0xf7, 0xa7, 0x55,
+ 0xda, 0xcb, 0x50, 0xc6, 0xbf, 0x23, 0x3e, 0x35,
+ 0x15, 0x43, 0x47, 0x63, 0xdb, 0x01, 0x39, 0xcc,
+ 0xd9, 0xfa, 0xef, 0xbb, 0x82, 0x07, 0x61, 0x2d },
+ { 0xaf, 0xf3, 0xb7, 0x5f, 0x3f, 0x58, 0x12, 0x64,
+ 0xd7, 0x66, 0x16, 0x62, 0xb9, 0x2f, 0x5a, 0xd3,
+ 0x7c, 0x1d, 0x32, 0xbd, 0x45, 0xff, 0x81, 0xa4,
+ 0xed, 0x8a, 0xdc, 0x9e, 0xf3, 0x0d, 0xd9, 0x89 },
+ { 0xd0, 0xdd, 0x65, 0x0b, 0xef, 0xd3, 0xba, 0x63,
+ 0xdc, 0x25, 0x10, 0x2c, 0x62, 0x7c, 0x92, 0x1b,
+ 0x9c, 0xbe, 0xb0, 0xb1, 0x30, 0x68, 0x69, 0x35,
+ 0xb5, 0xc9, 0x27, 0xcb, 0x7c, 0xcd, 0x5e, 0x3b },
+ { 0xe1, 0x14, 0x98, 0x16, 0xb1, 0x0a, 0x85, 0x14,
+ 0xfb, 0x3e, 0x2c, 0xab, 0x2c, 0x08, 0xbe, 0xe9,
+ 0xf7, 0x3c, 0xe7, 0x62, 0x21, 0x70, 0x12, 0x46,
+ 0xa5, 0x89, 0xbb, 0xb6, 0x73, 0x02, 0xd8, 0xa9 },
+ { 0x7d, 0xa3, 0xf4, 0x41, 0xde, 0x90, 0x54, 0x31,
+ 0x7e, 0x72, 0xb5, 0xdb, 0xf9, 0x79, 0xda, 0x01,
+ 0xe6, 0xbc, 0xee, 0xbb, 0x84, 0x78, 0xea, 0xe6,
+ 0xa2, 0x28, 0x49, 0xd9, 0x02, 0x92, 0x63, 0x5c },
+ { 0x12, 0x30, 0xb1, 0xfc, 0x8a, 0x7d, 0x92, 0x15,
+ 0xed, 0xc2, 0xd4, 0xa2, 0xde, 0xcb, 0xdd, 0x0a,
+ 0x6e, 0x21, 0x6c, 0x92, 0x42, 0x78, 0xc9, 0x1f,
+ 0xc5, 0xd1, 0x0e, 0x7d, 0x60, 0x19, 0x2d, 0x94 },
+ { 0x57, 0x50, 0xd7, 0x16, 0xb4, 0x80, 0x8f, 0x75,
+ 0x1f, 0xeb, 0xc3, 0x88, 0x06, 0xba, 0x17, 0x0b,
+ 0xf6, 0xd5, 0x19, 0x9a, 0x78, 0x16, 0xbe, 0x51,
+ 0x4e, 0x3f, 0x93, 0x2f, 0xbe, 0x0c, 0xb8, 0x71 },
+ { 0x6f, 0xc5, 0x9b, 0x2f, 0x10, 0xfe, 0xba, 0x95,
+ 0x4a, 0xa6, 0x82, 0x0b, 0x3c, 0xa9, 0x87, 0xee,
+ 0x81, 0xd5, 0xcc, 0x1d, 0xa3, 0xc6, 0x3c, 0xe8,
+ 0x27, 0x30, 0x1c, 0x56, 0x9d, 0xfb, 0x39, 0xce },
+ { 0xc7, 0xc3, 0xfe, 0x1e, 0xeb, 0xdc, 0x7b, 0x5a,
+ 0x93, 0x93, 0x26, 0xe8, 0xdd, 0xb8, 0x3e, 0x8b,
+ 0xf2, 0xb7, 0x80, 0xb6, 0x56, 0x78, 0xcb, 0x62,
+ 0xf2, 0x08, 0xb0, 0x40, 0xab, 0xdd, 0x35, 0xe2 },
+ { 0x0c, 0x75, 0xc1, 0xa1, 0x5c, 0xf3, 0x4a, 0x31,
+ 0x4e, 0xe4, 0x78, 0xf4, 0xa5, 0xce, 0x0b, 0x8a,
+ 0x6b, 0x36, 0x52, 0x8e, 0xf7, 0xa8, 0x20, 0x69,
+ 0x6c, 0x3e, 0x42, 0x46, 0xc5, 0xa1, 0x58, 0x64 },
+ { 0x21, 0x6d, 0xc1, 0x2a, 0x10, 0x85, 0x69, 0xa3,
+ 0xc7, 0xcd, 0xde, 0x4a, 0xed, 0x43, 0xa6, 0xc3,
+ 0x30, 0x13, 0x9d, 0xda, 0x3c, 0xcc, 0x4a, 0x10,
+ 0x89, 0x05, 0xdb, 0x38, 0x61, 0x89, 0x90, 0x50 },
+ { 0xa5, 0x7b, 0xe6, 0xae, 0x67, 0x56, 0xf2, 0x8b,
+ 0x02, 0xf5, 0x9d, 0xad, 0xf7, 0xe0, 0xd7, 0xd8,
+ 0x80, 0x7f, 0x10, 0xfa, 0x15, 0xce, 0xd1, 0xad,
+ 0x35, 0x85, 0x52, 0x1a, 0x1d, 0x99, 0x5a, 0x89 },
+ { 0x81, 0x6a, 0xef, 0x87, 0x59, 0x53, 0x71, 0x6c,
+ 0xd7, 0xa5, 0x81, 0xf7, 0x32, 0xf5, 0x3d, 0xd4,
+ 0x35, 0xda, 0xb6, 0x6d, 0x09, 0xc3, 0x61, 0xd2,
+ 0xd6, 0x59, 0x2d, 0xe1, 0x77, 0x55, 0xd8, 0xa8 },
+ { 0x9a, 0x76, 0x89, 0x32, 0x26, 0x69, 0x3b, 0x6e,
+ 0xa9, 0x7e, 0x6a, 0x73, 0x8f, 0x9d, 0x10, 0xfb,
+ 0x3d, 0x0b, 0x43, 0xae, 0x0e, 0x8b, 0x7d, 0x81,
+ 0x23, 0xea, 0x76, 0xce, 0x97, 0x98, 0x9c, 0x7e },
+ { 0x8d, 0xae, 0xdb, 0x9a, 0x27, 0x15, 0x29, 0xdb,
+ 0xb7, 0xdc, 0x3b, 0x60, 0x7f, 0xe5, 0xeb, 0x2d,
+ 0x32, 0x11, 0x77, 0x07, 0x58, 0xdd, 0x3b, 0x0a,
+ 0x35, 0x93, 0xd2, 0xd7, 0x95, 0x4e, 0x2d, 0x5b },
+ { 0x16, 0xdb, 0xc0, 0xaa, 0x5d, 0xd2, 0xc7, 0x74,
+ 0xf5, 0x05, 0x10, 0x0f, 0x73, 0x37, 0x86, 0xd8,
+ 0xa1, 0x75, 0xfc, 0xbb, 0xb5, 0x9c, 0x43, 0xe1,
+ 0xfb, 0xff, 0x3e, 0x1e, 0xaf, 0x31, 0xcb, 0x4a },
+ { 0x86, 0x06, 0xcb, 0x89, 0x9c, 0x6a, 0xea, 0xf5,
+ 0x1b, 0x9d, 0xb0, 0xfe, 0x49, 0x24, 0xa9, 0xfd,
+ 0x5d, 0xab, 0xc1, 0x9f, 0x88, 0x26, 0xf2, 0xbc,
+ 0x1c, 0x1d, 0x7d, 0xa1, 0x4d, 0x2c, 0x2c, 0x99 },
+ { 0x84, 0x79, 0x73, 0x1a, 0xed, 0xa5, 0x7b, 0xd3,
+ 0x7e, 0xad, 0xb5, 0x1a, 0x50, 0x7e, 0x30, 0x7f,
+ 0x3b, 0xd9, 0x5e, 0x69, 0xdb, 0xca, 0x94, 0xf3,
+ 0xbc, 0x21, 0x72, 0x60, 0x66, 0xad, 0x6d, 0xfd },
+ { 0x58, 0x47, 0x3a, 0x9e, 0xa8, 0x2e, 0xfa, 0x3f,
+ 0x3b, 0x3d, 0x8f, 0xc8, 0x3e, 0xd8, 0x86, 0x31,
+ 0x27, 0xb3, 0x3a, 0xe8, 0xde, 0xae, 0x63, 0x07,
+ 0x20, 0x1e, 0xdb, 0x6d, 0xde, 0x61, 0xde, 0x29 },
+ { 0x9a, 0x92, 0x55, 0xd5, 0x3a, 0xf1, 0x16, 0xde,
+ 0x8b, 0xa2, 0x7c, 0xe3, 0x5b, 0x4c, 0x7e, 0x15,
+ 0x64, 0x06, 0x57, 0xa0, 0xfc, 0xb8, 0x88, 0xc7,
+ 0x0d, 0x95, 0x43, 0x1d, 0xac, 0xd8, 0xf8, 0x30 },
+ { 0x9e, 0xb0, 0x5f, 0xfb, 0xa3, 0x9f, 0xd8, 0x59,
+ 0x6a, 0x45, 0x49, 0x3e, 0x18, 0xd2, 0x51, 0x0b,
+ 0xf3, 0xef, 0x06, 0x5c, 0x51, 0xd6, 0xe1, 0x3a,
+ 0xbe, 0x66, 0xaa, 0x57, 0xe0, 0x5c, 0xfd, 0xb7 },
+ { 0x81, 0xdc, 0xc3, 0xa5, 0x05, 0xea, 0xce, 0x3f,
+ 0x87, 0x9d, 0x8f, 0x70, 0x27, 0x76, 0x77, 0x0f,
+ 0x9d, 0xf5, 0x0e, 0x52, 0x1d, 0x14, 0x28, 0xa8,
+ 0x5d, 0xaf, 0x04, 0xf9, 0xad, 0x21, 0x50, 0xe0 },
+ { 0xe3, 0xe3, 0xc4, 0xaa, 0x3a, 0xcb, 0xbc, 0x85,
+ 0x33, 0x2a, 0xf9, 0xd5, 0x64, 0xbc, 0x24, 0x16,
+ 0x5e, 0x16, 0x87, 0xf6, 0xb1, 0xad, 0xcb, 0xfa,
+ 0xe7, 0x7a, 0x8f, 0x03, 0xc7, 0x2a, 0xc2, 0x8c },
+ { 0x67, 0x46, 0xc8, 0x0b, 0x4e, 0xb5, 0x6a, 0xea,
+ 0x45, 0xe6, 0x4e, 0x72, 0x89, 0xbb, 0xa3, 0xed,
+ 0xbf, 0x45, 0xec, 0xf8, 0x20, 0x64, 0x81, 0xff,
+ 0x63, 0x02, 0x12, 0x29, 0x84, 0xcd, 0x52, 0x6a },
+ { 0x2b, 0x62, 0x8e, 0x52, 0x76, 0x4d, 0x7d, 0x62,
+ 0xc0, 0x86, 0x8b, 0x21, 0x23, 0x57, 0xcd, 0xd1,
+ 0x2d, 0x91, 0x49, 0x82, 0x2f, 0x4e, 0x98, 0x45,
+ 0xd9, 0x18, 0xa0, 0x8d, 0x1a, 0xe9, 0x90, 0xc0 },
+ { 0xe4, 0xbf, 0xe8, 0x0d, 0x58, 0xc9, 0x19, 0x94,
+ 0x61, 0x39, 0x09, 0xdc, 0x4b, 0x1a, 0x12, 0x49,
+ 0x68, 0x96, 0xc0, 0x04, 0xaf, 0x7b, 0x57, 0x01,
+ 0x48, 0x3d, 0xe4, 0x5d, 0x28, 0x23, 0xd7, 0x8e },
+ { 0xeb, 0xb4, 0xba, 0x15, 0x0c, 0xef, 0x27, 0x34,
+ 0x34, 0x5b, 0x5d, 0x64, 0x1b, 0xbe, 0xd0, 0x3a,
+ 0x21, 0xea, 0xfa, 0xe9, 0x33, 0xc9, 0x9e, 0x00,
+ 0x92, 0x12, 0xef, 0x04, 0x57, 0x4a, 0x85, 0x30 },
+ { 0x39, 0x66, 0xec, 0x73, 0xb1, 0x54, 0xac, 0xc6,
+ 0x97, 0xac, 0x5c, 0xf5, 0xb2, 0x4b, 0x40, 0xbd,
+ 0xb0, 0xdb, 0x9e, 0x39, 0x88, 0x36, 0xd7, 0x6d,
+ 0x4b, 0x88, 0x0e, 0x3b, 0x2a, 0xf1, 0xaa, 0x27 },
+ { 0xef, 0x7e, 0x48, 0x31, 0xb3, 0xa8, 0x46, 0x36,
+ 0x51, 0x8d, 0x6e, 0x4b, 0xfc, 0xe6, 0x4a, 0x43,
+ 0xdb, 0x2a, 0x5d, 0xda, 0x9c, 0xca, 0x2b, 0x44,
+ 0xf3, 0x90, 0x33, 0xbd, 0xc4, 0x0d, 0x62, 0x43 },
+ { 0x7a, 0xbf, 0x6a, 0xcf, 0x5c, 0x8e, 0x54, 0x9d,
+ 0xdb, 0xb1, 0x5a, 0xe8, 0xd8, 0xb3, 0x88, 0xc1,
+ 0xc1, 0x97, 0xe6, 0x98, 0x73, 0x7c, 0x97, 0x85,
+ 0x50, 0x1e, 0xd1, 0xf9, 0x49, 0x30, 0xb7, 0xd9 },
+ { 0x88, 0x01, 0x8d, 0xed, 0x66, 0x81, 0x3f, 0x0c,
+ 0xa9, 0x5d, 0xef, 0x47, 0x4c, 0x63, 0x06, 0x92,
+ 0x01, 0x99, 0x67, 0xb9, 0xe3, 0x68, 0x88, 0xda,
+ 0xdd, 0x94, 0x12, 0x47, 0x19, 0xb6, 0x82, 0xf6 },
+ { 0x39, 0x30, 0x87, 0x6b, 0x9f, 0xc7, 0x52, 0x90,
+ 0x36, 0xb0, 0x08, 0xb1, 0xb8, 0xbb, 0x99, 0x75,
+ 0x22, 0xa4, 0x41, 0x63, 0x5a, 0x0c, 0x25, 0xec,
+ 0x02, 0xfb, 0x6d, 0x90, 0x26, 0xe5, 0x5a, 0x97 },
+ { 0x0a, 0x40, 0x49, 0xd5, 0x7e, 0x83, 0x3b, 0x56,
+ 0x95, 0xfa, 0xc9, 0x3d, 0xd1, 0xfb, 0xef, 0x31,
+ 0x66, 0xb4, 0x4b, 0x12, 0xad, 0x11, 0x24, 0x86,
+ 0x62, 0x38, 0x3a, 0xe0, 0x51, 0xe1, 0x58, 0x27 },
+ { 0x81, 0xdc, 0xc0, 0x67, 0x8b, 0xb6, 0xa7, 0x65,
+ 0xe4, 0x8c, 0x32, 0x09, 0x65, 0x4f, 0xe9, 0x00,
+ 0x89, 0xce, 0x44, 0xff, 0x56, 0x18, 0x47, 0x7e,
+ 0x39, 0xab, 0x28, 0x64, 0x76, 0xdf, 0x05, 0x2b },
+ { 0xe6, 0x9b, 0x3a, 0x36, 0xa4, 0x46, 0x19, 0x12,
+ 0xdc, 0x08, 0x34, 0x6b, 0x11, 0xdd, 0xcb, 0x9d,
+ 0xb7, 0x96, 0xf8, 0x85, 0xfd, 0x01, 0x93, 0x6e,
+ 0x66, 0x2f, 0xe2, 0x92, 0x97, 0xb0, 0x99, 0xa4 },
+ { 0x5a, 0xc6, 0x50, 0x3b, 0x0d, 0x8d, 0xa6, 0x91,
+ 0x76, 0x46, 0xe6, 0xdc, 0xc8, 0x7e, 0xdc, 0x58,
+ 0xe9, 0x42, 0x45, 0x32, 0x4c, 0xc2, 0x04, 0xf4,
+ 0xdd, 0x4a, 0xf0, 0x15, 0x63, 0xac, 0xd4, 0x27 },
+ { 0xdf, 0x6d, 0xda, 0x21, 0x35, 0x9a, 0x30, 0xbc,
+ 0x27, 0x17, 0x80, 0x97, 0x1c, 0x1a, 0xbd, 0x56,
+ 0xa6, 0xef, 0x16, 0x7e, 0x48, 0x08, 0x87, 0x88,
+ 0x8e, 0x73, 0xa8, 0x6d, 0x3b, 0xf6, 0x05, 0xe9 },
+ { 0xe8, 0xe6, 0xe4, 0x70, 0x71, 0xe7, 0xb7, 0xdf,
+ 0x25, 0x80, 0xf2, 0x25, 0xcf, 0xbb, 0xed, 0xf8,
+ 0x4c, 0xe6, 0x77, 0x46, 0x62, 0x66, 0x28, 0xd3,
+ 0x30, 0x97, 0xe4, 0xb7, 0xdc, 0x57, 0x11, 0x07 },
+ { 0x53, 0xe4, 0x0e, 0xad, 0x62, 0x05, 0x1e, 0x19,
+ 0xcb, 0x9b, 0xa8, 0x13, 0x3e, 0x3e, 0x5c, 0x1c,
+ 0xe0, 0x0d, 0xdc, 0xad, 0x8a, 0xcf, 0x34, 0x2a,
+ 0x22, 0x43, 0x60, 0xb0, 0xac, 0xc1, 0x47, 0x77 },
+ { 0x9c, 0xcd, 0x53, 0xfe, 0x80, 0xbe, 0x78, 0x6a,
+ 0xa9, 0x84, 0x63, 0x84, 0x62, 0xfb, 0x28, 0xaf,
+ 0xdf, 0x12, 0x2b, 0x34, 0xd7, 0x8f, 0x46, 0x87,
+ 0xec, 0x63, 0x2b, 0xb1, 0x9d, 0xe2, 0x37, 0x1a },
+ { 0xcb, 0xd4, 0x80, 0x52, 0xc4, 0x8d, 0x78, 0x84,
+ 0x66, 0xa3, 0xe8, 0x11, 0x8c, 0x56, 0xc9, 0x7f,
+ 0xe1, 0x46, 0xe5, 0x54, 0x6f, 0xaa, 0xf9, 0x3e,
+ 0x2b, 0xc3, 0xc4, 0x7e, 0x45, 0x93, 0x97, 0x53 },
+ { 0x25, 0x68, 0x83, 0xb1, 0x4e, 0x2a, 0xf4, 0x4d,
+ 0xad, 0xb2, 0x8e, 0x1b, 0x34, 0xb2, 0xac, 0x0f,
+ 0x0f, 0x4c, 0x91, 0xc3, 0x4e, 0xc9, 0x16, 0x9e,
+ 0x29, 0x03, 0x61, 0x58, 0xac, 0xaa, 0x95, 0xb9 },
+ { 0x44, 0x71, 0xb9, 0x1a, 0xb4, 0x2d, 0xb7, 0xc4,
+ 0xdd, 0x84, 0x90, 0xab, 0x95, 0xa2, 0xee, 0x8d,
+ 0x04, 0xe3, 0xef, 0x5c, 0x3d, 0x6f, 0xc7, 0x1a,
+ 0xc7, 0x4b, 0x2b, 0x26, 0x91, 0x4d, 0x16, 0x41 },
+ { 0xa5, 0xeb, 0x08, 0x03, 0x8f, 0x8f, 0x11, 0x55,
+ 0xed, 0x86, 0xe6, 0x31, 0x90, 0x6f, 0xc1, 0x30,
+ 0x95, 0xf6, 0xbb, 0xa4, 0x1d, 0xe5, 0xd4, 0xe7,
+ 0x95, 0x75, 0x8e, 0xc8, 0xc8, 0xdf, 0x8a, 0xf1 },
+ { 0xdc, 0x1d, 0xb6, 0x4e, 0xd8, 0xb4, 0x8a, 0x91,
+ 0x0e, 0x06, 0x0a, 0x6b, 0x86, 0x63, 0x74, 0xc5,
+ 0x78, 0x78, 0x4e, 0x9a, 0xc4, 0x9a, 0xb2, 0x77,
+ 0x40, 0x92, 0xac, 0x71, 0x50, 0x19, 0x34, 0xac },
+ { 0x28, 0x54, 0x13, 0xb2, 0xf2, 0xee, 0x87, 0x3d,
+ 0x34, 0x31, 0x9e, 0xe0, 0xbb, 0xfb, 0xb9, 0x0f,
+ 0x32, 0xda, 0x43, 0x4c, 0xc8, 0x7e, 0x3d, 0xb5,
+ 0xed, 0x12, 0x1b, 0xb3, 0x98, 0xed, 0x96, 0x4b },
+ { 0x02, 0x16, 0xe0, 0xf8, 0x1f, 0x75, 0x0f, 0x26,
+ 0xf1, 0x99, 0x8b, 0xc3, 0x93, 0x4e, 0x3e, 0x12,
+ 0x4c, 0x99, 0x45, 0xe6, 0x85, 0xa6, 0x0b, 0x25,
+ 0xe8, 0xfb, 0xd9, 0x62, 0x5a, 0xb6, 0xb5, 0x99 },
+ { 0x38, 0xc4, 0x10, 0xf5, 0xb9, 0xd4, 0x07, 0x20,
+ 0x50, 0x75, 0x5b, 0x31, 0xdc, 0xa8, 0x9f, 0xd5,
+ 0x39, 0x5c, 0x67, 0x85, 0xee, 0xb3, 0xd7, 0x90,
+ 0xf3, 0x20, 0xff, 0x94, 0x1c, 0x5a, 0x93, 0xbf },
+ { 0xf1, 0x84, 0x17, 0xb3, 0x9d, 0x61, 0x7a, 0xb1,
+ 0xc1, 0x8f, 0xdf, 0x91, 0xeb, 0xd0, 0xfc, 0x6d,
+ 0x55, 0x16, 0xbb, 0x34, 0xcf, 0x39, 0x36, 0x40,
+ 0x37, 0xbc, 0xe8, 0x1f, 0xa0, 0x4c, 0xec, 0xb1 },
+ { 0x1f, 0xa8, 0x77, 0xde, 0x67, 0x25, 0x9d, 0x19,
+ 0x86, 0x3a, 0x2a, 0x34, 0xbc, 0xc6, 0x96, 0x2a,
+ 0x2b, 0x25, 0xfc, 0xbf, 0x5c, 0xbe, 0xcd, 0x7e,
+ 0xde, 0x8f, 0x1f, 0xa3, 0x66, 0x88, 0xa7, 0x96 },
+ { 0x5b, 0xd1, 0x69, 0xe6, 0x7c, 0x82, 0xc2, 0xc2,
+ 0xe9, 0x8e, 0xf7, 0x00, 0x8b, 0xdf, 0x26, 0x1f,
+ 0x2d, 0xdf, 0x30, 0xb1, 0xc0, 0x0f, 0x9e, 0x7f,
+ 0x27, 0x5b, 0xb3, 0xe8, 0xa2, 0x8d, 0xc9, 0xa2 },
+ { 0xc8, 0x0a, 0xbe, 0xeb, 0xb6, 0x69, 0xad, 0x5d,
+ 0xee, 0xb5, 0xf5, 0xec, 0x8e, 0xa6, 0xb7, 0xa0,
+ 0x5d, 0xdf, 0x7d, 0x31, 0xec, 0x4c, 0x0a, 0x2e,
+ 0xe2, 0x0b, 0x0b, 0x98, 0xca, 0xec, 0x67, 0x46 },
+ { 0xe7, 0x6d, 0x3f, 0xbd, 0xa5, 0xba, 0x37, 0x4e,
+ 0x6b, 0xf8, 0xe5, 0x0f, 0xad, 0xc3, 0xbb, 0xb9,
+ 0xba, 0x5c, 0x20, 0x6e, 0xbd, 0xec, 0x89, 0xa3,
+ 0xa5, 0x4c, 0xf3, 0xdd, 0x84, 0xa0, 0x70, 0x16 },
+ { 0x7b, 0xba, 0x9d, 0xc5, 0xb5, 0xdb, 0x20, 0x71,
+ 0xd1, 0x77, 0x52, 0xb1, 0x04, 0x4c, 0x1e, 0xce,
+ 0xd9, 0x6a, 0xaf, 0x2d, 0xd4, 0x6e, 0x9b, 0x43,
+ 0x37, 0x50, 0xe8, 0xea, 0x0d, 0xcc, 0x18, 0x70 },
+ { 0xf2, 0x9b, 0x1b, 0x1a, 0xb9, 0xba, 0xb1, 0x63,
+ 0x01, 0x8e, 0xe3, 0xda, 0x15, 0x23, 0x2c, 0xca,
+ 0x78, 0xec, 0x52, 0xdb, 0xc3, 0x4e, 0xda, 0x5b,
+ 0x82, 0x2e, 0xc1, 0xd8, 0x0f, 0xc2, 0x1b, 0xd0 },
+ { 0x9e, 0xe3, 0xe3, 0xe7, 0xe9, 0x00, 0xf1, 0xe1,
+ 0x1d, 0x30, 0x8c, 0x4b, 0x2b, 0x30, 0x76, 0xd2,
+ 0x72, 0xcf, 0x70, 0x12, 0x4f, 0x9f, 0x51, 0xe1,
+ 0xda, 0x60, 0xf3, 0x78, 0x46, 0xcd, 0xd2, 0xf4 },
+ { 0x70, 0xea, 0x3b, 0x01, 0x76, 0x92, 0x7d, 0x90,
+ 0x96, 0xa1, 0x85, 0x08, 0xcd, 0x12, 0x3a, 0x29,
+ 0x03, 0x25, 0x92, 0x0a, 0x9d, 0x00, 0xa8, 0x9b,
+ 0x5d, 0xe0, 0x42, 0x73, 0xfb, 0xc7, 0x6b, 0x85 },
+ { 0x67, 0xde, 0x25, 0xc0, 0x2a, 0x4a, 0xab, 0xa2,
+ 0x3b, 0xdc, 0x97, 0x3c, 0x8b, 0xb0, 0xb5, 0x79,
+ 0x6d, 0x47, 0xcc, 0x06, 0x59, 0xd4, 0x3d, 0xff,
+ 0x1f, 0x97, 0xde, 0x17, 0x49, 0x63, 0xb6, 0x8e },
+ { 0xb2, 0x16, 0x8e, 0x4e, 0x0f, 0x18, 0xb0, 0xe6,
+ 0x41, 0x00, 0xb5, 0x17, 0xed, 0x95, 0x25, 0x7d,
+ 0x73, 0xf0, 0x62, 0x0d, 0xf8, 0x85, 0xc1, 0x3d,
+ 0x2e, 0xcf, 0x79, 0x36, 0x7b, 0x38, 0x4c, 0xee },
+ { 0x2e, 0x7d, 0xec, 0x24, 0x28, 0x85, 0x3b, 0x2c,
+ 0x71, 0x76, 0x07, 0x45, 0x54, 0x1f, 0x7a, 0xfe,
+ 0x98, 0x25, 0xb5, 0xdd, 0x77, 0xdf, 0x06, 0x51,
+ 0x1d, 0x84, 0x41, 0xa9, 0x4b, 0xac, 0xc9, 0x27 },
+ { 0xca, 0x9f, 0xfa, 0xc4, 0xc4, 0x3f, 0x0b, 0x48,
+ 0x46, 0x1d, 0xc5, 0xc2, 0x63, 0xbe, 0xa3, 0xf6,
+ 0xf0, 0x06, 0x11, 0xce, 0xac, 0xab, 0xf6, 0xf8,
+ 0x95, 0xba, 0x2b, 0x01, 0x01, 0xdb, 0xb6, 0x8d },
+ { 0x74, 0x10, 0xd4, 0x2d, 0x8f, 0xd1, 0xd5, 0xe9,
+ 0xd2, 0xf5, 0x81, 0x5c, 0xb9, 0x34, 0x17, 0x99,
+ 0x88, 0x28, 0xef, 0x3c, 0x42, 0x30, 0xbf, 0xbd,
+ 0x41, 0x2d, 0xf0, 0xa4, 0xa7, 0xa2, 0x50, 0x7a },
+ { 0x50, 0x10, 0xf6, 0x84, 0x51, 0x6d, 0xcc, 0xd0,
+ 0xb6, 0xee, 0x08, 0x52, 0xc2, 0x51, 0x2b, 0x4d,
+ 0xc0, 0x06, 0x6c, 0xf0, 0xd5, 0x6f, 0x35, 0x30,
+ 0x29, 0x78, 0xdb, 0x8a, 0xe3, 0x2c, 0x6a, 0x81 },
+ { 0xac, 0xaa, 0xb5, 0x85, 0xf7, 0xb7, 0x9b, 0x71,
+ 0x99, 0x35, 0xce, 0xb8, 0x95, 0x23, 0xdd, 0xc5,
+ 0x48, 0x27, 0xf7, 0x5c, 0x56, 0x88, 0x38, 0x56,
+ 0x15, 0x4a, 0x56, 0xcd, 0xcd, 0x5e, 0xe9, 0x88 },
+ { 0x66, 0x6d, 0xe5, 0xd1, 0x44, 0x0f, 0xee, 0x73,
+ 0x31, 0xaa, 0xf0, 0x12, 0x3a, 0x62, 0xef, 0x2d,
+ 0x8b, 0xa5, 0x74, 0x53, 0xa0, 0x76, 0x96, 0x35,
+ 0xac, 0x6c, 0xd0, 0x1e, 0x63, 0x3f, 0x77, 0x12 },
+ { 0xa6, 0xf9, 0x86, 0x58, 0xf6, 0xea, 0xba, 0xf9,
+ 0x02, 0xd8, 0xb3, 0x87, 0x1a, 0x4b, 0x10, 0x1d,
+ 0x16, 0x19, 0x6e, 0x8a, 0x4b, 0x24, 0x1e, 0x15,
+ 0x58, 0xfe, 0x29, 0x96, 0x6e, 0x10, 0x3e, 0x8d },
+ { 0x89, 0x15, 0x46, 0xa8, 0xb2, 0x9f, 0x30, 0x47,
+ 0xdd, 0xcf, 0xe5, 0xb0, 0x0e, 0x45, 0xfd, 0x55,
+ 0x75, 0x63, 0x73, 0x10, 0x5e, 0xa8, 0x63, 0x7d,
+ 0xfc, 0xff, 0x54, 0x7b, 0x6e, 0xa9, 0x53, 0x5f },
+ { 0x18, 0xdf, 0xbc, 0x1a, 0xc5, 0xd2, 0x5b, 0x07,
+ 0x61, 0x13, 0x7d, 0xbd, 0x22, 0xc1, 0x7c, 0x82,
+ 0x9d, 0x0f, 0x0e, 0xf1, 0xd8, 0x23, 0x44, 0xe9,
+ 0xc8, 0x9c, 0x28, 0x66, 0x94, 0xda, 0x24, 0xe8 },
+ { 0xb5, 0x4b, 0x9b, 0x67, 0xf8, 0xfe, 0xd5, 0x4b,
+ 0xbf, 0x5a, 0x26, 0x66, 0xdb, 0xdf, 0x4b, 0x23,
+ 0xcf, 0xf1, 0xd1, 0xb6, 0xf4, 0xaf, 0xc9, 0x85,
+ 0xb2, 0xe6, 0xd3, 0x30, 0x5a, 0x9f, 0xf8, 0x0f },
+ { 0x7d, 0xb4, 0x42, 0xe1, 0x32, 0xba, 0x59, 0xbc,
+ 0x12, 0x89, 0xaa, 0x98, 0xb0, 0xd3, 0xe8, 0x06,
+ 0x00, 0x4f, 0x8e, 0xc1, 0x28, 0x11, 0xaf, 0x1e,
+ 0x2e, 0x33, 0xc6, 0x9b, 0xfd, 0xe7, 0x29, 0xe1 },
+ { 0x25, 0x0f, 0x37, 0xcd, 0xc1, 0x5e, 0x81, 0x7d,
+ 0x2f, 0x16, 0x0d, 0x99, 0x56, 0xc7, 0x1f, 0xe3,
+ 0xeb, 0x5d, 0xb7, 0x45, 0x56, 0xe4, 0xad, 0xf9,
+ 0xa4, 0xff, 0xaf, 0xba, 0x74, 0x01, 0x03, 0x96 },
+ { 0x4a, 0xb8, 0xa3, 0xdd, 0x1d, 0xdf, 0x8a, 0xd4,
+ 0x3d, 0xab, 0x13, 0xa2, 0x7f, 0x66, 0xa6, 0x54,
+ 0x4f, 0x29, 0x05, 0x97, 0xfa, 0x96, 0x04, 0x0e,
+ 0x0e, 0x1d, 0xb9, 0x26, 0x3a, 0xa4, 0x79, 0xf8 },
+ { 0xee, 0x61, 0x72, 0x7a, 0x07, 0x66, 0xdf, 0x93,
+ 0x9c, 0xcd, 0xc8, 0x60, 0x33, 0x40, 0x44, 0xc7,
+ 0x9a, 0x3c, 0x9b, 0x15, 0x62, 0x00, 0xbc, 0x3a,
+ 0xa3, 0x29, 0x73, 0x48, 0x3d, 0x83, 0x41, 0xae },
+ { 0x3f, 0x68, 0xc7, 0xec, 0x63, 0xac, 0x11, 0xeb,
+ 0xb9, 0x8f, 0x94, 0xb3, 0x39, 0xb0, 0x5c, 0x10,
+ 0x49, 0x84, 0xfd, 0xa5, 0x01, 0x03, 0x06, 0x01,
+ 0x44, 0xe5, 0xa2, 0xbf, 0xcc, 0xc9, 0xda, 0x95 },
+ { 0x05, 0x6f, 0x29, 0x81, 0x6b, 0x8a, 0xf8, 0xf5,
+ 0x66, 0x82, 0xbc, 0x4d, 0x7c, 0xf0, 0x94, 0x11,
+ 0x1d, 0xa7, 0x73, 0x3e, 0x72, 0x6c, 0xd1, 0x3d,
+ 0x6b, 0x3e, 0x8e, 0xa0, 0x3e, 0x92, 0xa0, 0xd5 },
+ { 0xf5, 0xec, 0x43, 0xa2, 0x8a, 0xcb, 0xef, 0xf1,
+ 0xf3, 0x31, 0x8a, 0x5b, 0xca, 0xc7, 0xc6, 0x6d,
+ 0xdb, 0x52, 0x30, 0xb7, 0x9d, 0xb2, 0xd1, 0x05,
+ 0xbc, 0xbe, 0x15, 0xf3, 0xc1, 0x14, 0x8d, 0x69 },
+ { 0x2a, 0x69, 0x60, 0xad, 0x1d, 0x8d, 0xd5, 0x47,
+ 0x55, 0x5c, 0xfb, 0xd5, 0xe4, 0x60, 0x0f, 0x1e,
+ 0xaa, 0x1c, 0x8e, 0xda, 0x34, 0xde, 0x03, 0x74,
+ 0xec, 0x4a, 0x26, 0xea, 0xaa, 0xa3, 0x3b, 0x4e },
+ { 0xdc, 0xc1, 0xea, 0x7b, 0xaa, 0xb9, 0x33, 0x84,
+ 0xf7, 0x6b, 0x79, 0x68, 0x66, 0x19, 0x97, 0x54,
+ 0x74, 0x2f, 0x7b, 0x96, 0xd6, 0xb4, 0xc1, 0x20,
+ 0x16, 0x5c, 0x04, 0xa6, 0xc4, 0xf5, 0xce, 0x10 },
+ { 0x13, 0xd5, 0xdf, 0x17, 0x92, 0x21, 0x37, 0x9c,
+ 0x6a, 0x78, 0xc0, 0x7c, 0x79, 0x3f, 0xf5, 0x34,
+ 0x87, 0xca, 0xe6, 0xbf, 0x9f, 0xe8, 0x82, 0x54,
+ 0x1a, 0xb0, 0xe7, 0x35, 0xe3, 0xea, 0xda, 0x3b },
+ { 0x8c, 0x59, 0xe4, 0x40, 0x76, 0x41, 0xa0, 0x1e,
+ 0x8f, 0xf9, 0x1f, 0x99, 0x80, 0xdc, 0x23, 0x6f,
+ 0x4e, 0xcd, 0x6f, 0xcf, 0x52, 0x58, 0x9a, 0x09,
+ 0x9a, 0x96, 0x16, 0x33, 0x96, 0x77, 0x14, 0xe1 },
+ { 0x83, 0x3b, 0x1a, 0xc6, 0xa2, 0x51, 0xfd, 0x08,
+ 0xfd, 0x6d, 0x90, 0x8f, 0xea, 0x2a, 0x4e, 0xe1,
+ 0xe0, 0x40, 0xbc, 0xa9, 0x3f, 0xc1, 0xa3, 0x8e,
+ 0xc3, 0x82, 0x0e, 0x0c, 0x10, 0xbd, 0x82, 0xea },
+ { 0xa2, 0x44, 0xf9, 0x27, 0xf3, 0xb4, 0x0b, 0x8f,
+ 0x6c, 0x39, 0x15, 0x70, 0xc7, 0x65, 0x41, 0x8f,
+ 0x2f, 0x6e, 0x70, 0x8e, 0xac, 0x90, 0x06, 0xc5,
+ 0x1a, 0x7f, 0xef, 0xf4, 0xaf, 0x3b, 0x2b, 0x9e },
+ { 0x3d, 0x99, 0xed, 0x95, 0x50, 0xcf, 0x11, 0x96,
+ 0xe6, 0xc4, 0xd2, 0x0c, 0x25, 0x96, 0x20, 0xf8,
+ 0x58, 0xc3, 0xd7, 0x03, 0x37, 0x4c, 0x12, 0x8c,
+ 0xe7, 0xb5, 0x90, 0x31, 0x0c, 0x83, 0x04, 0x6d },
+ { 0x2b, 0x35, 0xc4, 0x7d, 0x7b, 0x87, 0x76, 0x1f,
+ 0x0a, 0xe4, 0x3a, 0xc5, 0x6a, 0xc2, 0x7b, 0x9f,
+ 0x25, 0x83, 0x03, 0x67, 0xb5, 0x95, 0xbe, 0x8c,
+ 0x24, 0x0e, 0x94, 0x60, 0x0c, 0x6e, 0x33, 0x12 },
+ { 0x5d, 0x11, 0xed, 0x37, 0xd2, 0x4d, 0xc7, 0x67,
+ 0x30, 0x5c, 0xb7, 0xe1, 0x46, 0x7d, 0x87, 0xc0,
+ 0x65, 0xac, 0x4b, 0xc8, 0xa4, 0x26, 0xde, 0x38,
+ 0x99, 0x1f, 0xf5, 0x9a, 0xa8, 0x73, 0x5d, 0x02 },
+ { 0xb8, 0x36, 0x47, 0x8e, 0x1c, 0xa0, 0x64, 0x0d,
+ 0xce, 0x6f, 0xd9, 0x10, 0xa5, 0x09, 0x62, 0x72,
+ 0xc8, 0x33, 0x09, 0x90, 0xcd, 0x97, 0x86, 0x4a,
+ 0xc2, 0xbf, 0x14, 0xef, 0x6b, 0x23, 0x91, 0x4a },
+ { 0x91, 0x00, 0xf9, 0x46, 0xd6, 0xcc, 0xde, 0x3a,
+ 0x59, 0x7f, 0x90, 0xd3, 0x9f, 0xc1, 0x21, 0x5b,
+ 0xad, 0xdc, 0x74, 0x13, 0x64, 0x3d, 0x85, 0xc2,
+ 0x1c, 0x3e, 0xee, 0x5d, 0x2d, 0xd3, 0x28, 0x94 },
+ { 0xda, 0x70, 0xee, 0xdd, 0x23, 0xe6, 0x63, 0xaa,
+ 0x1a, 0x74, 0xb9, 0x76, 0x69, 0x35, 0xb4, 0x79,
+ 0x22, 0x2a, 0x72, 0xaf, 0xba, 0x5c, 0x79, 0x51,
+ 0x58, 0xda, 0xd4, 0x1a, 0x3b, 0xd7, 0x7e, 0x40 },
+ { 0xf0, 0x67, 0xed, 0x6a, 0x0d, 0xbd, 0x43, 0xaa,
+ 0x0a, 0x92, 0x54, 0xe6, 0x9f, 0xd6, 0x6b, 0xdd,
+ 0x8a, 0xcb, 0x87, 0xde, 0x93, 0x6c, 0x25, 0x8c,
+ 0xfb, 0x02, 0x28, 0x5f, 0x2c, 0x11, 0xfa, 0x79 },
+ { 0x71, 0x5c, 0x99, 0xc7, 0xd5, 0x75, 0x80, 0xcf,
+ 0x97, 0x53, 0xb4, 0xc1, 0xd7, 0x95, 0xe4, 0x5a,
+ 0x83, 0xfb, 0xb2, 0x28, 0xc0, 0xd3, 0x6f, 0xbe,
+ 0x20, 0xfa, 0xf3, 0x9b, 0xdd, 0x6d, 0x4e, 0x85 },
+ { 0xe4, 0x57, 0xd6, 0xad, 0x1e, 0x67, 0xcb, 0x9b,
+ 0xbd, 0x17, 0xcb, 0xd6, 0x98, 0xfa, 0x6d, 0x7d,
+ 0xae, 0x0c, 0x9b, 0x7a, 0xd6, 0xcb, 0xd6, 0x53,
+ 0x96, 0x34, 0xe3, 0x2a, 0x71, 0x9c, 0x84, 0x92 },
+ { 0xec, 0xe3, 0xea, 0x81, 0x03, 0xe0, 0x24, 0x83,
+ 0xc6, 0x4a, 0x70, 0xa4, 0xbd, 0xce, 0xe8, 0xce,
+ 0xb6, 0x27, 0x8f, 0x25, 0x33, 0xf3, 0xf4, 0x8d,
+ 0xbe, 0xed, 0xfb, 0xa9, 0x45, 0x31, 0xd4, 0xae },
+ { 0x38, 0x8a, 0xa5, 0xd3, 0x66, 0x7a, 0x97, 0xc6,
+ 0x8d, 0x3d, 0x56, 0xf8, 0xf3, 0xee, 0x8d, 0x3d,
+ 0x36, 0x09, 0x1f, 0x17, 0xfe, 0x5d, 0x1b, 0x0d,
+ 0x5d, 0x84, 0xc9, 0x3b, 0x2f, 0xfe, 0x40, 0xbd },
+ { 0x8b, 0x6b, 0x31, 0xb9, 0xad, 0x7c, 0x3d, 0x5c,
+ 0xd8, 0x4b, 0xf9, 0x89, 0x47, 0xb9, 0xcd, 0xb5,
+ 0x9d, 0xf8, 0xa2, 0x5f, 0xf7, 0x38, 0x10, 0x10,
+ 0x13, 0xbe, 0x4f, 0xd6, 0x5e, 0x1d, 0xd1, 0xa3 },
+ { 0x06, 0x62, 0x91, 0xf6, 0xbb, 0xd2, 0x5f, 0x3c,
+ 0x85, 0x3d, 0xb7, 0xd8, 0xb9, 0x5c, 0x9a, 0x1c,
+ 0xfb, 0x9b, 0xf1, 0xc1, 0xc9, 0x9f, 0xb9, 0x5a,
+ 0x9b, 0x78, 0x69, 0xd9, 0x0f, 0x1c, 0x29, 0x03 },
+ { 0xa7, 0x07, 0xef, 0xbc, 0xcd, 0xce, 0xed, 0x42,
+ 0x96, 0x7a, 0x66, 0xf5, 0x53, 0x9b, 0x93, 0xed,
+ 0x75, 0x60, 0xd4, 0x67, 0x30, 0x40, 0x16, 0xc4,
+ 0x78, 0x0d, 0x77, 0x55, 0xa5, 0x65, 0xd4, 0xc4 },
+ { 0x38, 0xc5, 0x3d, 0xfb, 0x70, 0xbe, 0x7e, 0x79,
+ 0x2b, 0x07, 0xa6, 0xa3, 0x5b, 0x8a, 0x6a, 0x0a,
+ 0xba, 0x02, 0xc5, 0xc5, 0xf3, 0x8b, 0xaf, 0x5c,
+ 0x82, 0x3f, 0xdf, 0xd9, 0xe4, 0x2d, 0x65, 0x7e },
+ { 0xf2, 0x91, 0x13, 0x86, 0x50, 0x1d, 0x9a, 0xb9,
+ 0xd7, 0x20, 0xcf, 0x8a, 0xd1, 0x05, 0x03, 0xd5,
+ 0x63, 0x4b, 0xf4, 0xb7, 0xd1, 0x2b, 0x56, 0xdf,
+ 0xb7, 0x4f, 0xec, 0xc6, 0xe4, 0x09, 0x3f, 0x68 },
+ { 0xc6, 0xf2, 0xbd, 0xd5, 0x2b, 0x81, 0xe6, 0xe4,
+ 0xf6, 0x59, 0x5a, 0xbd, 0x4d, 0x7f, 0xb3, 0x1f,
+ 0x65, 0x11, 0x69, 0xd0, 0x0f, 0xf3, 0x26, 0x92,
+ 0x6b, 0x34, 0x94, 0x7b, 0x28, 0xa8, 0x39, 0x59 },
+ { 0x29, 0x3d, 0x94, 0xb1, 0x8c, 0x98, 0xbb, 0x32,
+ 0x23, 0x36, 0x6b, 0x8c, 0xe7, 0x4c, 0x28, 0xfb,
+ 0xdf, 0x28, 0xe1, 0xf8, 0x4a, 0x33, 0x50, 0xb0,
+ 0xeb, 0x2d, 0x18, 0x04, 0xa5, 0x77, 0x57, 0x9b },
+ { 0x2c, 0x2f, 0xa5, 0xc0, 0xb5, 0x15, 0x33, 0x16,
+ 0x5b, 0xc3, 0x75, 0xc2, 0x2e, 0x27, 0x81, 0x76,
+ 0x82, 0x70, 0xa3, 0x83, 0x98, 0x5d, 0x13, 0xbd,
+ 0x6b, 0x67, 0xb6, 0xfd, 0x67, 0xf8, 0x89, 0xeb },
+ { 0xca, 0xa0, 0x9b, 0x82, 0xb7, 0x25, 0x62, 0xe4,
+ 0x3f, 0x4b, 0x22, 0x75, 0xc0, 0x91, 0x91, 0x8e,
+ 0x62, 0x4d, 0x91, 0x16, 0x61, 0xcc, 0x81, 0x1b,
+ 0xb5, 0xfa, 0xec, 0x51, 0xf6, 0x08, 0x8e, 0xf7 },
+ { 0x24, 0x76, 0x1e, 0x45, 0xe6, 0x74, 0x39, 0x53,
+ 0x79, 0xfb, 0x17, 0x72, 0x9c, 0x78, 0xcb, 0x93,
+ 0x9e, 0x6f, 0x74, 0xc5, 0xdf, 0xfb, 0x9c, 0x96,
+ 0x1f, 0x49, 0x59, 0x82, 0xc3, 0xed, 0x1f, 0xe3 },
+ { 0x55, 0xb7, 0x0a, 0x82, 0x13, 0x1e, 0xc9, 0x48,
+ 0x88, 0xd7, 0xab, 0x54, 0xa7, 0xc5, 0x15, 0x25,
+ 0x5c, 0x39, 0x38, 0xbb, 0x10, 0xbc, 0x78, 0x4d,
+ 0xc9, 0xb6, 0x7f, 0x07, 0x6e, 0x34, 0x1a, 0x73 },
+ { 0x6a, 0xb9, 0x05, 0x7b, 0x97, 0x7e, 0xbc, 0x3c,
+ 0xa4, 0xd4, 0xce, 0x74, 0x50, 0x6c, 0x25, 0xcc,
+ 0xcd, 0xc5, 0x66, 0x49, 0x7c, 0x45, 0x0b, 0x54,
+ 0x15, 0xa3, 0x94, 0x86, 0xf8, 0x65, 0x7a, 0x03 },
+ { 0x24, 0x06, 0x6d, 0xee, 0xe0, 0xec, 0xee, 0x15,
+ 0xa4, 0x5f, 0x0a, 0x32, 0x6d, 0x0f, 0x8d, 0xbc,
+ 0x79, 0x76, 0x1e, 0xbb, 0x93, 0xcf, 0x8c, 0x03,
+ 0x77, 0xaf, 0x44, 0x09, 0x78, 0xfc, 0xf9, 0x94 },
+ { 0x20, 0x00, 0x0d, 0x3f, 0x66, 0xba, 0x76, 0x86,
+ 0x0d, 0x5a, 0x95, 0x06, 0x88, 0xb9, 0xaa, 0x0d,
+ 0x76, 0xcf, 0xea, 0x59, 0xb0, 0x05, 0xd8, 0x59,
+ 0x91, 0x4b, 0x1a, 0x46, 0x65, 0x3a, 0x93, 0x9b },
+ { 0xb9, 0x2d, 0xaa, 0x79, 0x60, 0x3e, 0x3b, 0xdb,
+ 0xc3, 0xbf, 0xe0, 0xf4, 0x19, 0xe4, 0x09, 0xb2,
+ 0xea, 0x10, 0xdc, 0x43, 0x5b, 0xee, 0xfe, 0x29,
+ 0x59, 0xda, 0x16, 0x89, 0x5d, 0x5d, 0xca, 0x1c },
+ { 0xe9, 0x47, 0x94, 0x87, 0x05, 0xb2, 0x06, 0xd5,
+ 0x72, 0xb0, 0xe8, 0xf6, 0x2f, 0x66, 0xa6, 0x55,
+ 0x1c, 0xbd, 0x6b, 0xc3, 0x05, 0xd2, 0x6c, 0xe7,
+ 0x53, 0x9a, 0x12, 0xf9, 0xaa, 0xdf, 0x75, 0x71 },
+ { 0x3d, 0x67, 0xc1, 0xb3, 0xf9, 0xb2, 0x39, 0x10,
+ 0xe3, 0xd3, 0x5e, 0x6b, 0x0f, 0x2c, 0xcf, 0x44,
+ 0xa0, 0xb5, 0x40, 0xa4, 0x5c, 0x18, 0xba, 0x3c,
+ 0x36, 0x26, 0x4d, 0xd4, 0x8e, 0x96, 0xaf, 0x6a },
+ { 0xc7, 0x55, 0x8b, 0xab, 0xda, 0x04, 0xbc, 0xcb,
+ 0x76, 0x4d, 0x0b, 0xbf, 0x33, 0x58, 0x42, 0x51,
+ 0x41, 0x90, 0x2d, 0x22, 0x39, 0x1d, 0x9f, 0x8c,
+ 0x59, 0x15, 0x9f, 0xec, 0x9e, 0x49, 0xb1, 0x51 },
+ { 0x0b, 0x73, 0x2b, 0xb0, 0x35, 0x67, 0x5a, 0x50,
+ 0xff, 0x58, 0xf2, 0xc2, 0x42, 0xe4, 0x71, 0x0a,
+ 0xec, 0xe6, 0x46, 0x70, 0x07, 0x9c, 0x13, 0x04,
+ 0x4c, 0x79, 0xc9, 0xb7, 0x49, 0x1f, 0x70, 0x00 },
+ { 0xd1, 0x20, 0xb5, 0xef, 0x6d, 0x57, 0xeb, 0xf0,
+ 0x6e, 0xaf, 0x96, 0xbc, 0x93, 0x3c, 0x96, 0x7b,
+ 0x16, 0xcb, 0xe6, 0xe2, 0xbf, 0x00, 0x74, 0x1c,
+ 0x30, 0xaa, 0x1c, 0x54, 0xba, 0x64, 0x80, 0x1f },
+ { 0x58, 0xd2, 0x12, 0xad, 0x6f, 0x58, 0xae, 0xf0,
+ 0xf8, 0x01, 0x16, 0xb4, 0x41, 0xe5, 0x7f, 0x61,
+ 0x95, 0xbf, 0xef, 0x26, 0xb6, 0x14, 0x63, 0xed,
+ 0xec, 0x11, 0x83, 0xcd, 0xb0, 0x4f, 0xe7, 0x6d },
+ { 0xb8, 0x83, 0x6f, 0x51, 0xd1, 0xe2, 0x9b, 0xdf,
+ 0xdb, 0xa3, 0x25, 0x56, 0x53, 0x60, 0x26, 0x8b,
+ 0x8f, 0xad, 0x62, 0x74, 0x73, 0xed, 0xec, 0xef,
+ 0x7e, 0xae, 0xfe, 0xe8, 0x37, 0xc7, 0x40, 0x03 },
+ { 0xc5, 0x47, 0xa3, 0xc1, 0x24, 0xae, 0x56, 0x85,
+ 0xff, 0xa7, 0xb8, 0xed, 0xaf, 0x96, 0xec, 0x86,
+ 0xf8, 0xb2, 0xd0, 0xd5, 0x0c, 0xee, 0x8b, 0xe3,
+ 0xb1, 0xf0, 0xc7, 0x67, 0x63, 0x06, 0x9d, 0x9c },
+ { 0x5d, 0x16, 0x8b, 0x76, 0x9a, 0x2f, 0x67, 0x85,
+ 0x3d, 0x62, 0x95, 0xf7, 0x56, 0x8b, 0xe4, 0x0b,
+ 0xb7, 0xa1, 0x6b, 0x8d, 0x65, 0xba, 0x87, 0x63,
+ 0x5d, 0x19, 0x78, 0xd2, 0xab, 0x11, 0xba, 0x2a },
+ { 0xa2, 0xf6, 0x75, 0xdc, 0x73, 0x02, 0x63, 0x8c,
+ 0xb6, 0x02, 0x01, 0x06, 0x4c, 0xa5, 0x50, 0x77,
+ 0x71, 0x4d, 0x71, 0xfe, 0x09, 0x6a, 0x31, 0x5f,
+ 0x2f, 0xe7, 0x40, 0x12, 0x77, 0xca, 0xa5, 0xaf },
+ { 0xc8, 0xaa, 0xb5, 0xcd, 0x01, 0x60, 0xae, 0x78,
+ 0xcd, 0x2e, 0x8a, 0xc5, 0xfb, 0x0e, 0x09, 0x3c,
+ 0xdb, 0x5c, 0x4b, 0x60, 0x52, 0xa0, 0xa9, 0x7b,
+ 0xb0, 0x42, 0x16, 0x82, 0x6f, 0xa7, 0xa4, 0x37 },
+ { 0xff, 0x68, 0xca, 0x40, 0x35, 0xbf, 0xeb, 0x43,
+ 0xfb, 0xf1, 0x45, 0xfd, 0xdd, 0x5e, 0x43, 0xf1,
+ 0xce, 0xa5, 0x4f, 0x11, 0xf7, 0xbe, 0xe1, 0x30,
+ 0x58, 0xf0, 0x27, 0x32, 0x9a, 0x4a, 0x5f, 0xa4 },
+ { 0x1d, 0x4e, 0x54, 0x87, 0xae, 0x3c, 0x74, 0x0f,
+ 0x2b, 0xa6, 0xe5, 0x41, 0xac, 0x91, 0xbc, 0x2b,
+ 0xfc, 0xd2, 0x99, 0x9c, 0x51, 0x8d, 0x80, 0x7b,
+ 0x42, 0x67, 0x48, 0x80, 0x3a, 0x35, 0x0f, 0xd4 },
+ { 0x6d, 0x24, 0x4e, 0x1a, 0x06, 0xce, 0x4e, 0xf5,
+ 0x78, 0xdd, 0x0f, 0x63, 0xaf, 0xf0, 0x93, 0x67,
+ 0x06, 0x73, 0x51, 0x19, 0xca, 0x9c, 0x8d, 0x22,
+ 0xd8, 0x6c, 0x80, 0x14, 0x14, 0xab, 0x97, 0x41 },
+ { 0xde, 0xcf, 0x73, 0x29, 0xdb, 0xcc, 0x82, 0x7b,
+ 0x8f, 0xc5, 0x24, 0xc9, 0x43, 0x1e, 0x89, 0x98,
+ 0x02, 0x9e, 0xce, 0x12, 0xce, 0x93, 0xb7, 0xb2,
+ 0xf3, 0xe7, 0x69, 0xa9, 0x41, 0xfb, 0x8c, 0xea },
+ { 0x2f, 0xaf, 0xcc, 0x0f, 0x2e, 0x63, 0xcb, 0xd0,
+ 0x77, 0x55, 0xbe, 0x7b, 0x75, 0xec, 0xea, 0x0a,
+ 0xdf, 0xf9, 0xaa, 0x5e, 0xde, 0x2a, 0x52, 0xfd,
+ 0xab, 0x4d, 0xfd, 0x03, 0x74, 0xcd, 0x48, 0x3f },
+ { 0xaa, 0x85, 0x01, 0x0d, 0xd4, 0x6a, 0x54, 0x6b,
+ 0x53, 0x5e, 0xf4, 0xcf, 0x5f, 0x07, 0xd6, 0x51,
+ 0x61, 0xe8, 0x98, 0x28, 0xf3, 0xa7, 0x7d, 0xb7,
+ 0xb9, 0xb5, 0x6f, 0x0d, 0xf5, 0x9a, 0xae, 0x45 },
+ { 0x07, 0xe8, 0xe1, 0xee, 0x73, 0x2c, 0xb0, 0xd3,
+ 0x56, 0xc9, 0xc0, 0xd1, 0x06, 0x9c, 0x89, 0xd1,
+ 0x7a, 0xdf, 0x6a, 0x9a, 0x33, 0x4f, 0x74, 0x5e,
+ 0xc7, 0x86, 0x73, 0x32, 0x54, 0x8c, 0xa8, 0xe9 },
+ { 0x0e, 0x01, 0xe8, 0x1c, 0xad, 0xa8, 0x16, 0x2b,
+ 0xfd, 0x5f, 0x8a, 0x8c, 0x81, 0x8a, 0x6c, 0x69,
+ 0xfe, 0xdf, 0x02, 0xce, 0xb5, 0x20, 0x85, 0x23,
+ 0xcb, 0xe5, 0x31, 0x3b, 0x89, 0xca, 0x10, 0x53 },
+ { 0x6b, 0xb6, 0xc6, 0x47, 0x26, 0x55, 0x08, 0x43,
+ 0x99, 0x85, 0x2e, 0x00, 0x24, 0x9f, 0x8c, 0xb2,
+ 0x47, 0x89, 0x6d, 0x39, 0x2b, 0x02, 0xd7, 0x3b,
+ 0x7f, 0x0d, 0xd8, 0x18, 0xe1, 0xe2, 0x9b, 0x07 },
+ { 0x42, 0xd4, 0x63, 0x6e, 0x20, 0x60, 0xf0, 0x8f,
+ 0x41, 0xc8, 0x82, 0xe7, 0x6b, 0x39, 0x6b, 0x11,
+ 0x2e, 0xf6, 0x27, 0xcc, 0x24, 0xc4, 0x3d, 0xd5,
+ 0xf8, 0x3a, 0x1d, 0x1a, 0x7e, 0xad, 0x71, 0x1a },
+ { 0x48, 0x58, 0xc9, 0xa1, 0x88, 0xb0, 0x23, 0x4f,
+ 0xb9, 0xa8, 0xd4, 0x7d, 0x0b, 0x41, 0x33, 0x65,
+ 0x0a, 0x03, 0x0b, 0xd0, 0x61, 0x1b, 0x87, 0xc3,
+ 0x89, 0x2e, 0x94, 0x95, 0x1f, 0x8d, 0xf8, 0x52 },
+ { 0x3f, 0xab, 0x3e, 0x36, 0x98, 0x8d, 0x44, 0x5a,
+ 0x51, 0xc8, 0x78, 0x3e, 0x53, 0x1b, 0xe3, 0xa0,
+ 0x2b, 0xe4, 0x0c, 0xd0, 0x47, 0x96, 0xcf, 0xb6,
+ 0x1d, 0x40, 0x34, 0x74, 0x42, 0xd3, 0xf7, 0x94 },
+ { 0xeb, 0xab, 0xc4, 0x96, 0x36, 0xbd, 0x43, 0x3d,
+ 0x2e, 0xc8, 0xf0, 0xe5, 0x18, 0x73, 0x2e, 0xf8,
+ 0xfa, 0x21, 0xd4, 0xd0, 0x71, 0xcc, 0x3b, 0xc4,
+ 0x6c, 0xd7, 0x9f, 0xa3, 0x8a, 0x28, 0xb8, 0x10 },
+ { 0xa1, 0xd0, 0x34, 0x35, 0x23, 0xb8, 0x93, 0xfc,
+ 0xa8, 0x4f, 0x47, 0xfe, 0xb4, 0xa6, 0x4d, 0x35,
+ 0x0a, 0x17, 0xd8, 0xee, 0xf5, 0x49, 0x7e, 0xce,
+ 0x69, 0x7d, 0x02, 0xd7, 0x91, 0x78, 0xb5, 0x91 },
+ { 0x26, 0x2e, 0xbf, 0xd9, 0x13, 0x0b, 0x7d, 0x28,
+ 0x76, 0x0d, 0x08, 0xef, 0x8b, 0xfd, 0x3b, 0x86,
+ 0xcd, 0xd3, 0xb2, 0x11, 0x3d, 0x2c, 0xae, 0xf7,
+ 0xea, 0x95, 0x1a, 0x30, 0x3d, 0xfa, 0x38, 0x46 },
+ { 0xf7, 0x61, 0x58, 0xed, 0xd5, 0x0a, 0x15, 0x4f,
+ 0xa7, 0x82, 0x03, 0xed, 0x23, 0x62, 0x93, 0x2f,
+ 0xcb, 0x82, 0x53, 0xaa, 0xe3, 0x78, 0x90, 0x3e,
+ 0xde, 0xd1, 0xe0, 0x3f, 0x70, 0x21, 0xa2, 0x57 },
+ { 0x26, 0x17, 0x8e, 0x95, 0x0a, 0xc7, 0x22, 0xf6,
+ 0x7a, 0xe5, 0x6e, 0x57, 0x1b, 0x28, 0x4c, 0x02,
+ 0x07, 0x68, 0x4a, 0x63, 0x34, 0xa1, 0x77, 0x48,
+ 0xa9, 0x4d, 0x26, 0x0b, 0xc5, 0xf5, 0x52, 0x74 },
+ { 0xc3, 0x78, 0xd1, 0xe4, 0x93, 0xb4, 0x0e, 0xf1,
+ 0x1f, 0xe6, 0xa1, 0x5d, 0x9c, 0x27, 0x37, 0xa3,
+ 0x78, 0x09, 0x63, 0x4c, 0x5a, 0xba, 0xd5, 0xb3,
+ 0x3d, 0x7e, 0x39, 0x3b, 0x4a, 0xe0, 0x5d, 0x03 },
+ { 0x98, 0x4b, 0xd8, 0x37, 0x91, 0x01, 0xbe, 0x8f,
+ 0xd8, 0x06, 0x12, 0xd8, 0xea, 0x29, 0x59, 0xa7,
+ 0x86, 0x5e, 0xc9, 0x71, 0x85, 0x23, 0x55, 0x01,
+ 0x07, 0xae, 0x39, 0x38, 0xdf, 0x32, 0x01, 0x1b },
+ { 0xc6, 0xf2, 0x5a, 0x81, 0x2a, 0x14, 0x48, 0x58,
+ 0xac, 0x5c, 0xed, 0x37, 0xa9, 0x3a, 0x9f, 0x47,
+ 0x59, 0xba, 0x0b, 0x1c, 0x0f, 0xdc, 0x43, 0x1d,
+ 0xce, 0x35, 0xf9, 0xec, 0x1f, 0x1f, 0x4a, 0x99 },
+ { 0x92, 0x4c, 0x75, 0xc9, 0x44, 0x24, 0xff, 0x75,
+ 0xe7, 0x4b, 0x8b, 0x4e, 0x94, 0x35, 0x89, 0x58,
+ 0xb0, 0x27, 0xb1, 0x71, 0xdf, 0x5e, 0x57, 0x89,
+ 0x9a, 0xd0, 0xd4, 0xda, 0xc3, 0x73, 0x53, 0xb6 },
+ { 0x0a, 0xf3, 0x58, 0x92, 0xa6, 0x3f, 0x45, 0x93,
+ 0x1f, 0x68, 0x46, 0xed, 0x19, 0x03, 0x61, 0xcd,
+ 0x07, 0x30, 0x89, 0xe0, 0x77, 0x16, 0x57, 0x14,
+ 0xb5, 0x0b, 0x81, 0xa2, 0xe3, 0xdd, 0x9b, 0xa1 },
+ { 0xcc, 0x80, 0xce, 0xfb, 0x26, 0xc3, 0xb2, 0xb0,
+ 0xda, 0xef, 0x23, 0x3e, 0x60, 0x6d, 0x5f, 0xfc,
+ 0x80, 0xfa, 0x17, 0x42, 0x7d, 0x18, 0xe3, 0x04,
+ 0x89, 0x67, 0x3e, 0x06, 0xef, 0x4b, 0x87, 0xf7 },
+ { 0xc2, 0xf8, 0xc8, 0x11, 0x74, 0x47, 0xf3, 0x97,
+ 0x8b, 0x08, 0x18, 0xdc, 0xf6, 0xf7, 0x01, 0x16,
+ 0xac, 0x56, 0xfd, 0x18, 0x4d, 0xd1, 0x27, 0x84,
+ 0x94, 0xe1, 0x03, 0xfc, 0x6d, 0x74, 0xa8, 0x87 },
+ { 0xbd, 0xec, 0xf6, 0xbf, 0xc1, 0xba, 0x0d, 0xf6,
+ 0xe8, 0x62, 0xc8, 0x31, 0x99, 0x22, 0x07, 0x79,
+ 0x6a, 0xcc, 0x79, 0x79, 0x68, 0x35, 0x88, 0x28,
+ 0xc0, 0x6e, 0x7a, 0x51, 0xe0, 0x90, 0x09, 0x8f },
+ { 0x24, 0xd1, 0xa2, 0x6e, 0x3d, 0xab, 0x02, 0xfe,
+ 0x45, 0x72, 0xd2, 0xaa, 0x7d, 0xbd, 0x3e, 0xc3,
+ 0x0f, 0x06, 0x93, 0xdb, 0x26, 0xf2, 0x73, 0xd0,
+ 0xab, 0x2c, 0xb0, 0xc1, 0x3b, 0x5e, 0x64, 0x51 },
+ { 0xec, 0x56, 0xf5, 0x8b, 0x09, 0x29, 0x9a, 0x30,
+ 0x0b, 0x14, 0x05, 0x65, 0xd7, 0xd3, 0xe6, 0x87,
+ 0x82, 0xb6, 0xe2, 0xfb, 0xeb, 0x4b, 0x7e, 0xa9,
+ 0x7a, 0xc0, 0x57, 0x98, 0x90, 0x61, 0xdd, 0x3f },
+ { 0x11, 0xa4, 0x37, 0xc1, 0xab, 0xa3, 0xc1, 0x19,
+ 0xdd, 0xfa, 0xb3, 0x1b, 0x3e, 0x8c, 0x84, 0x1d,
+ 0xee, 0xeb, 0x91, 0x3e, 0xf5, 0x7f, 0x7e, 0x48,
+ 0xf2, 0xc9, 0xcf, 0x5a, 0x28, 0xfa, 0x42, 0xbc },
+ { 0x53, 0xc7, 0xe6, 0x11, 0x4b, 0x85, 0x0a, 0x2c,
+ 0xb4, 0x96, 0xc9, 0xb3, 0xc6, 0x9a, 0x62, 0x3e,
+ 0xae, 0xa2, 0xcb, 0x1d, 0x33, 0xdd, 0x81, 0x7e,
+ 0x47, 0x65, 0xed, 0xaa, 0x68, 0x23, 0xc2, 0x28 },
+ { 0x15, 0x4c, 0x3e, 0x96, 0xfe, 0xe5, 0xdb, 0x14,
+ 0xf8, 0x77, 0x3e, 0x18, 0xaf, 0x14, 0x85, 0x79,
+ 0x13, 0x50, 0x9d, 0xa9, 0x99, 0xb4, 0x6c, 0xdd,
+ 0x3d, 0x4c, 0x16, 0x97, 0x60, 0xc8, 0x3a, 0xd2 },
+ { 0x40, 0xb9, 0x91, 0x6f, 0x09, 0x3e, 0x02, 0x7a,
+ 0x87, 0x86, 0x64, 0x18, 0x18, 0x92, 0x06, 0x20,
+ 0x47, 0x2f, 0xbc, 0xf6, 0x8f, 0x70, 0x1d, 0x1b,
+ 0x68, 0x06, 0x32, 0xe6, 0x99, 0x6b, 0xde, 0xd3 },
+ { 0x24, 0xc4, 0xcb, 0xba, 0x07, 0x11, 0x98, 0x31,
+ 0xa7, 0x26, 0xb0, 0x53, 0x05, 0xd9, 0x6d, 0xa0,
+ 0x2f, 0xf8, 0xb1, 0x48, 0xf0, 0xda, 0x44, 0x0f,
+ 0xe2, 0x33, 0xbc, 0xaa, 0x32, 0xc7, 0x2f, 0x6f },
+ { 0x5d, 0x20, 0x15, 0x10, 0x25, 0x00, 0x20, 0xb7,
+ 0x83, 0x68, 0x96, 0x88, 0xab, 0xbf, 0x8e, 0xcf,
+ 0x25, 0x94, 0xa9, 0x6a, 0x08, 0xf2, 0xbf, 0xec,
+ 0x6c, 0xe0, 0x57, 0x44, 0x65, 0xdd, 0xed, 0x71 },
+ { 0x04, 0x3b, 0x97, 0xe3, 0x36, 0xee, 0x6f, 0xdb,
+ 0xbe, 0x2b, 0x50, 0xf2, 0x2a, 0xf8, 0x32, 0x75,
+ 0xa4, 0x08, 0x48, 0x05, 0xd2, 0xd5, 0x64, 0x59,
+ 0x62, 0x45, 0x4b, 0x6c, 0x9b, 0x80, 0x53, 0xa0 },
+ { 0x56, 0x48, 0x35, 0xcb, 0xae, 0xa7, 0x74, 0x94,
+ 0x85, 0x68, 0xbe, 0x36, 0xcf, 0x52, 0xfc, 0xdd,
+ 0x83, 0x93, 0x4e, 0xb0, 0xa2, 0x75, 0x12, 0xdb,
+ 0xe3, 0xe2, 0xdb, 0x47, 0xb9, 0xe6, 0x63, 0x5a },
+ { 0xf2, 0x1c, 0x33, 0xf4, 0x7b, 0xde, 0x40, 0xa2,
+ 0xa1, 0x01, 0xc9, 0xcd, 0xe8, 0x02, 0x7a, 0xaf,
+ 0x61, 0xa3, 0x13, 0x7d, 0xe2, 0x42, 0x2b, 0x30,
+ 0x03, 0x5a, 0x04, 0xc2, 0x70, 0x89, 0x41, 0x83 },
+ { 0x9d, 0xb0, 0xef, 0x74, 0xe6, 0x6c, 0xbb, 0x84,
+ 0x2e, 0xb0, 0xe0, 0x73, 0x43, 0xa0, 0x3c, 0x5c,
+ 0x56, 0x7e, 0x37, 0x2b, 0x3f, 0x23, 0xb9, 0x43,
+ 0xc7, 0x88, 0xa4, 0xf2, 0x50, 0xf6, 0x78, 0x91 },
+ { 0xab, 0x8d, 0x08, 0x65, 0x5f, 0xf1, 0xd3, 0xfe,
+ 0x87, 0x58, 0xd5, 0x62, 0x23, 0x5f, 0xd2, 0x3e,
+ 0x7c, 0xf9, 0xdc, 0xaa, 0xd6, 0x58, 0x87, 0x2a,
+ 0x49, 0xe5, 0xd3, 0x18, 0x3b, 0x6c, 0xce, 0xbd },
+ { 0x6f, 0x27, 0xf7, 0x7e, 0x7b, 0xcf, 0x46, 0xa1,
+ 0xe9, 0x63, 0xad, 0xe0, 0x30, 0x97, 0x33, 0x54,
+ 0x30, 0x31, 0xdc, 0xcd, 0xd4, 0x7c, 0xaa, 0xc1,
+ 0x74, 0xd7, 0xd2, 0x7c, 0xe8, 0x07, 0x7e, 0x8b },
+ { 0xe3, 0xcd, 0x54, 0xda, 0x7e, 0x44, 0x4c, 0xaa,
+ 0x62, 0x07, 0x56, 0x95, 0x25, 0xa6, 0x70, 0xeb,
+ 0xae, 0x12, 0x78, 0xde, 0x4e, 0x3f, 0xe2, 0x68,
+ 0x4b, 0x3e, 0x33, 0xf5, 0xef, 0x90, 0xcc, 0x1b },
+ { 0xb2, 0xc3, 0xe3, 0x3a, 0x51, 0xd2, 0x2c, 0x4c,
+ 0x08, 0xfc, 0x09, 0x89, 0xc8, 0x73, 0xc9, 0xcc,
+ 0x41, 0x50, 0x57, 0x9b, 0x1e, 0x61, 0x63, 0xfa,
+ 0x69, 0x4a, 0xd5, 0x1d, 0x53, 0xd7, 0x12, 0xdc },
+ { 0xbe, 0x7f, 0xda, 0x98, 0x3e, 0x13, 0x18, 0x9b,
+ 0x4c, 0x77, 0xe0, 0xa8, 0x09, 0x20, 0xb6, 0xe0,
+ 0xe0, 0xea, 0x80, 0xc3, 0xb8, 0x4d, 0xbe, 0x7e,
+ 0x71, 0x17, 0xd2, 0x53, 0xf4, 0x81, 0x12, 0xf4 },
+ { 0xb6, 0x00, 0x8c, 0x28, 0xfa, 0xe0, 0x8a, 0xa4,
+ 0x27, 0xe5, 0xbd, 0x3a, 0xad, 0x36, 0xf1, 0x00,
+ 0x21, 0xf1, 0x6c, 0x77, 0xcf, 0xea, 0xbe, 0xd0,
+ 0x7f, 0x97, 0xcc, 0x7d, 0xc1, 0xf1, 0x28, 0x4a },
+ { 0x6e, 0x4e, 0x67, 0x60, 0xc5, 0x38, 0xf2, 0xe9,
+ 0x7b, 0x3a, 0xdb, 0xfb, 0xbc, 0xde, 0x57, 0xf8,
+ 0x96, 0x6b, 0x7e, 0xa8, 0xfc, 0xb5, 0xbf, 0x7e,
+ 0xfe, 0xc9, 0x13, 0xfd, 0x2a, 0x2b, 0x0c, 0x55 },
+ { 0x4a, 0xe5, 0x1f, 0xd1, 0x83, 0x4a, 0xa5, 0xbd,
+ 0x9a, 0x6f, 0x7e, 0xc3, 0x9f, 0xc6, 0x63, 0x33,
+ 0x8d, 0xc5, 0xd2, 0xe2, 0x07, 0x61, 0x56, 0x6d,
+ 0x90, 0xcc, 0x68, 0xb1, 0xcb, 0x87, 0x5e, 0xd8 },
+ { 0xb6, 0x73, 0xaa, 0xd7, 0x5a, 0xb1, 0xfd, 0xb5,
+ 0x40, 0x1a, 0xbf, 0xa1, 0xbf, 0x89, 0xf3, 0xad,
+ 0xd2, 0xeb, 0xc4, 0x68, 0xdf, 0x36, 0x24, 0xa4,
+ 0x78, 0xf4, 0xfe, 0x85, 0x9d, 0x8d, 0x55, 0xe2 },
+ { 0x13, 0xc9, 0x47, 0x1a, 0x98, 0x55, 0x91, 0x35,
+ 0x39, 0x83, 0x66, 0x60, 0x39, 0x8d, 0xa0, 0xf3,
+ 0xf9, 0x9a, 0xda, 0x08, 0x47, 0x9c, 0x69, 0xd1,
+ 0xb7, 0xfc, 0xaa, 0x34, 0x61, 0xdd, 0x7e, 0x59 },
+ { 0x2c, 0x11, 0xf4, 0xa7, 0xf9, 0x9a, 0x1d, 0x23,
+ 0xa5, 0x8b, 0xb6, 0x36, 0x35, 0x0f, 0xe8, 0x49,
+ 0xf2, 0x9c, 0xba, 0xc1, 0xb2, 0xa1, 0x11, 0x2d,
+ 0x9f, 0x1e, 0xd5, 0xbc, 0x5b, 0x31, 0x3c, 0xcd },
+ { 0xc7, 0xd3, 0xc0, 0x70, 0x6b, 0x11, 0xae, 0x74,
+ 0x1c, 0x05, 0xa1, 0xef, 0x15, 0x0d, 0xd6, 0x5b,
+ 0x54, 0x94, 0xd6, 0xd5, 0x4c, 0x9a, 0x86, 0xe2,
+ 0x61, 0x78, 0x54, 0xe6, 0xae, 0xee, 0xbb, 0xd9 },
+ { 0x19, 0x4e, 0x10, 0xc9, 0x38, 0x93, 0xaf, 0xa0,
+ 0x64, 0xc3, 0xac, 0x04, 0xc0, 0xdd, 0x80, 0x8d,
+ 0x79, 0x1c, 0x3d, 0x4b, 0x75, 0x56, 0xe8, 0x9d,
+ 0x8d, 0x9c, 0xb2, 0x25, 0xc4, 0xb3, 0x33, 0x39 },
+ { 0x6f, 0xc4, 0x98, 0x8b, 0x8f, 0x78, 0x54, 0x6b,
+ 0x16, 0x88, 0x99, 0x18, 0x45, 0x90, 0x8f, 0x13,
+ 0x4b, 0x6a, 0x48, 0x2e, 0x69, 0x94, 0xb3, 0xd4,
+ 0x83, 0x17, 0xbf, 0x08, 0xdb, 0x29, 0x21, 0x85 },
+ { 0x56, 0x65, 0xbe, 0xb8, 0xb0, 0x95, 0x55, 0x25,
+ 0x81, 0x3b, 0x59, 0x81, 0xcd, 0x14, 0x2e, 0xd4,
+ 0xd0, 0x3f, 0xba, 0x38, 0xa6, 0xf3, 0xe5, 0xad,
+ 0x26, 0x8e, 0x0c, 0xc2, 0x70, 0xd1, 0xcd, 0x11 },
+ { 0xb8, 0x83, 0xd6, 0x8f, 0x5f, 0xe5, 0x19, 0x36,
+ 0x43, 0x1b, 0xa4, 0x25, 0x67, 0x38, 0x05, 0x3b,
+ 0x1d, 0x04, 0x26, 0xd4, 0xcb, 0x64, 0xb1, 0x6e,
+ 0x83, 0xba, 0xdc, 0x5e, 0x9f, 0xbe, 0x3b, 0x81 },
+ { 0x53, 0xe7, 0xb2, 0x7e, 0xa5, 0x9c, 0x2f, 0x6d,
+ 0xbb, 0x50, 0x76, 0x9e, 0x43, 0x55, 0x4d, 0xf3,
+ 0x5a, 0xf8, 0x9f, 0x48, 0x22, 0xd0, 0x46, 0x6b,
+ 0x00, 0x7d, 0xd6, 0xf6, 0xde, 0xaf, 0xff, 0x02 },
+ { 0x1f, 0x1a, 0x02, 0x29, 0xd4, 0x64, 0x0f, 0x01,
+ 0x90, 0x15, 0x88, 0xd9, 0xde, 0xc2, 0x2d, 0x13,
+ 0xfc, 0x3e, 0xb3, 0x4a, 0x61, 0xb3, 0x29, 0x38,
+ 0xef, 0xbf, 0x53, 0x34, 0xb2, 0x80, 0x0a, 0xfa },
+ { 0xc2, 0xb4, 0x05, 0xaf, 0xa0, 0xfa, 0x66, 0x68,
+ 0x85, 0x2a, 0xee, 0x4d, 0x88, 0x04, 0x08, 0x53,
+ 0xfa, 0xb8, 0x00, 0xe7, 0x2b, 0x57, 0x58, 0x14,
+ 0x18, 0xe5, 0x50, 0x6f, 0x21, 0x4c, 0x7d, 0x1f },
+ { 0xc0, 0x8a, 0xa1, 0xc2, 0x86, 0xd7, 0x09, 0xfd,
+ 0xc7, 0x47, 0x37, 0x44, 0x97, 0x71, 0x88, 0xc8,
+ 0x95, 0xba, 0x01, 0x10, 0x14, 0x24, 0x7e, 0x4e,
+ 0xfa, 0x8d, 0x07, 0xe7, 0x8f, 0xec, 0x69, 0x5c },
+ { 0xf0, 0x3f, 0x57, 0x89, 0xd3, 0x33, 0x6b, 0x80,
+ 0xd0, 0x02, 0xd5, 0x9f, 0xdf, 0x91, 0x8b, 0xdb,
+ 0x77, 0x5b, 0x00, 0x95, 0x6e, 0xd5, 0x52, 0x8e,
+ 0x86, 0xaa, 0x99, 0x4a, 0xcb, 0x38, 0xfe, 0x2d }
+};
+
+static const u8 blake2s_keyed_testvecs[][BLAKE2S_HASH_SIZE] __initconst = {
+ { 0x48, 0xa8, 0x99, 0x7d, 0xa4, 0x07, 0x87, 0x6b,
+ 0x3d, 0x79, 0xc0, 0xd9, 0x23, 0x25, 0xad, 0x3b,
+ 0x89, 0xcb, 0xb7, 0x54, 0xd8, 0x6a, 0xb7, 0x1a,
+ 0xee, 0x04, 0x7a, 0xd3, 0x45, 0xfd, 0x2c, 0x49 },
+ { 0x40, 0xd1, 0x5f, 0xee, 0x7c, 0x32, 0x88, 0x30,
+ 0x16, 0x6a, 0xc3, 0xf9, 0x18, 0x65, 0x0f, 0x80,
+ 0x7e, 0x7e, 0x01, 0xe1, 0x77, 0x25, 0x8c, 0xdc,
+ 0x0a, 0x39, 0xb1, 0x1f, 0x59, 0x80, 0x66, 0xf1 },
+ { 0x6b, 0xb7, 0x13, 0x00, 0x64, 0x4c, 0xd3, 0x99,
+ 0x1b, 0x26, 0xcc, 0xd4, 0xd2, 0x74, 0xac, 0xd1,
+ 0xad, 0xea, 0xb8, 0xb1, 0xd7, 0x91, 0x45, 0x46,
+ 0xc1, 0x19, 0x8b, 0xbe, 0x9f, 0xc9, 0xd8, 0x03 },
+ { 0x1d, 0x22, 0x0d, 0xbe, 0x2e, 0xe1, 0x34, 0x66,
+ 0x1f, 0xdf, 0x6d, 0x9e, 0x74, 0xb4, 0x17, 0x04,
+ 0x71, 0x05, 0x56, 0xf2, 0xf6, 0xe5, 0xa0, 0x91,
+ 0xb2, 0x27, 0x69, 0x74, 0x45, 0xdb, 0xea, 0x6b },
+ { 0xf6, 0xc3, 0xfb, 0xad, 0xb4, 0xcc, 0x68, 0x7a,
+ 0x00, 0x64, 0xa5, 0xbe, 0x6e, 0x79, 0x1b, 0xec,
+ 0x63, 0xb8, 0x68, 0xad, 0x62, 0xfb, 0xa6, 0x1b,
+ 0x37, 0x57, 0xef, 0x9c, 0xa5, 0x2e, 0x05, 0xb2 },
+ { 0x49, 0xc1, 0xf2, 0x11, 0x88, 0xdf, 0xd7, 0x69,
+ 0xae, 0xa0, 0xe9, 0x11, 0xdd, 0x6b, 0x41, 0xf1,
+ 0x4d, 0xab, 0x10, 0x9d, 0x2b, 0x85, 0x97, 0x7a,
+ 0xa3, 0x08, 0x8b, 0x5c, 0x70, 0x7e, 0x85, 0x98 },
+ { 0xfd, 0xd8, 0x99, 0x3d, 0xcd, 0x43, 0xf6, 0x96,
+ 0xd4, 0x4f, 0x3c, 0xea, 0x0f, 0xf3, 0x53, 0x45,
+ 0x23, 0x4e, 0xc8, 0xee, 0x08, 0x3e, 0xb3, 0xca,
+ 0xda, 0x01, 0x7c, 0x7f, 0x78, 0xc1, 0x71, 0x43 },
+ { 0xe6, 0xc8, 0x12, 0x56, 0x37, 0x43, 0x8d, 0x09,
+ 0x05, 0xb7, 0x49, 0xf4, 0x65, 0x60, 0xac, 0x89,
+ 0xfd, 0x47, 0x1c, 0xf8, 0x69, 0x2e, 0x28, 0xfa,
+ 0xb9, 0x82, 0xf7, 0x3f, 0x01, 0x9b, 0x83, 0xa9 },
+ { 0x19, 0xfc, 0x8c, 0xa6, 0x97, 0x9d, 0x60, 0xe6,
+ 0xed, 0xd3, 0xb4, 0x54, 0x1e, 0x2f, 0x96, 0x7c,
+ 0xed, 0x74, 0x0d, 0xf6, 0xec, 0x1e, 0xae, 0xbb,
+ 0xfe, 0x81, 0x38, 0x32, 0xe9, 0x6b, 0x29, 0x74 },
+ { 0xa6, 0xad, 0x77, 0x7c, 0xe8, 0x81, 0xb5, 0x2b,
+ 0xb5, 0xa4, 0x42, 0x1a, 0xb6, 0xcd, 0xd2, 0xdf,
+ 0xba, 0x13, 0xe9, 0x63, 0x65, 0x2d, 0x4d, 0x6d,
+ 0x12, 0x2a, 0xee, 0x46, 0x54, 0x8c, 0x14, 0xa7 },
+ { 0xf5, 0xc4, 0xb2, 0xba, 0x1a, 0x00, 0x78, 0x1b,
+ 0x13, 0xab, 0xa0, 0x42, 0x52, 0x42, 0xc6, 0x9c,
+ 0xb1, 0x55, 0x2f, 0x3f, 0x71, 0xa9, 0xa3, 0xbb,
+ 0x22, 0xb4, 0xa6, 0xb4, 0x27, 0x7b, 0x46, 0xdd },
+ { 0xe3, 0x3c, 0x4c, 0x9b, 0xd0, 0xcc, 0x7e, 0x45,
+ 0xc8, 0x0e, 0x65, 0xc7, 0x7f, 0xa5, 0x99, 0x7f,
+ 0xec, 0x70, 0x02, 0x73, 0x85, 0x41, 0x50, 0x9e,
+ 0x68, 0xa9, 0x42, 0x38, 0x91, 0xe8, 0x22, 0xa3 },
+ { 0xfb, 0xa1, 0x61, 0x69, 0xb2, 0xc3, 0xee, 0x10,
+ 0x5b, 0xe6, 0xe1, 0xe6, 0x50, 0xe5, 0xcb, 0xf4,
+ 0x07, 0x46, 0xb6, 0x75, 0x3d, 0x03, 0x6a, 0xb5,
+ 0x51, 0x79, 0x01, 0x4a, 0xd7, 0xef, 0x66, 0x51 },
+ { 0xf5, 0xc4, 0xbe, 0xc6, 0xd6, 0x2f, 0xc6, 0x08,
+ 0xbf, 0x41, 0xcc, 0x11, 0x5f, 0x16, 0xd6, 0x1c,
+ 0x7e, 0xfd, 0x3f, 0xf6, 0xc6, 0x56, 0x92, 0xbb,
+ 0xe0, 0xaf, 0xff, 0xb1, 0xfe, 0xde, 0x74, 0x75 },
+ { 0xa4, 0x86, 0x2e, 0x76, 0xdb, 0x84, 0x7f, 0x05,
+ 0xba, 0x17, 0xed, 0xe5, 0xda, 0x4e, 0x7f, 0x91,
+ 0xb5, 0x92, 0x5c, 0xf1, 0xad, 0x4b, 0xa1, 0x27,
+ 0x32, 0xc3, 0x99, 0x57, 0x42, 0xa5, 0xcd, 0x6e },
+ { 0x65, 0xf4, 0xb8, 0x60, 0xcd, 0x15, 0xb3, 0x8e,
+ 0xf8, 0x14, 0xa1, 0xa8, 0x04, 0x31, 0x4a, 0x55,
+ 0xbe, 0x95, 0x3c, 0xaa, 0x65, 0xfd, 0x75, 0x8a,
+ 0xd9, 0x89, 0xff, 0x34, 0xa4, 0x1c, 0x1e, 0xea },
+ { 0x19, 0xba, 0x23, 0x4f, 0x0a, 0x4f, 0x38, 0x63,
+ 0x7d, 0x18, 0x39, 0xf9, 0xd9, 0xf7, 0x6a, 0xd9,
+ 0x1c, 0x85, 0x22, 0x30, 0x71, 0x43, 0xc9, 0x7d,
+ 0x5f, 0x93, 0xf6, 0x92, 0x74, 0xce, 0xc9, 0xa7 },
+ { 0x1a, 0x67, 0x18, 0x6c, 0xa4, 0xa5, 0xcb, 0x8e,
+ 0x65, 0xfc, 0xa0, 0xe2, 0xec, 0xbc, 0x5d, 0xdc,
+ 0x14, 0xae, 0x38, 0x1b, 0xb8, 0xbf, 0xfe, 0xb9,
+ 0xe0, 0xa1, 0x03, 0x44, 0x9e, 0x3e, 0xf0, 0x3c },
+ { 0xaf, 0xbe, 0xa3, 0x17, 0xb5, 0xa2, 0xe8, 0x9c,
+ 0x0b, 0xd9, 0x0c, 0xcf, 0x5d, 0x7f, 0xd0, 0xed,
+ 0x57, 0xfe, 0x58, 0x5e, 0x4b, 0xe3, 0x27, 0x1b,
+ 0x0a, 0x6b, 0xf0, 0xf5, 0x78, 0x6b, 0x0f, 0x26 },
+ { 0xf1, 0xb0, 0x15, 0x58, 0xce, 0x54, 0x12, 0x62,
+ 0xf5, 0xec, 0x34, 0x29, 0x9d, 0x6f, 0xb4, 0x09,
+ 0x00, 0x09, 0xe3, 0x43, 0x4b, 0xe2, 0xf4, 0x91,
+ 0x05, 0xcf, 0x46, 0xaf, 0x4d, 0x2d, 0x41, 0x24 },
+ { 0x13, 0xa0, 0xa0, 0xc8, 0x63, 0x35, 0x63, 0x5e,
+ 0xaa, 0x74, 0xca, 0x2d, 0x5d, 0x48, 0x8c, 0x79,
+ 0x7b, 0xbb, 0x4f, 0x47, 0xdc, 0x07, 0x10, 0x50,
+ 0x15, 0xed, 0x6a, 0x1f, 0x33, 0x09, 0xef, 0xce },
+ { 0x15, 0x80, 0xaf, 0xee, 0xbe, 0xbb, 0x34, 0x6f,
+ 0x94, 0xd5, 0x9f, 0xe6, 0x2d, 0xa0, 0xb7, 0x92,
+ 0x37, 0xea, 0xd7, 0xb1, 0x49, 0x1f, 0x56, 0x67,
+ 0xa9, 0x0e, 0x45, 0xed, 0xf6, 0xca, 0x8b, 0x03 },
+ { 0x20, 0xbe, 0x1a, 0x87, 0x5b, 0x38, 0xc5, 0x73,
+ 0xdd, 0x7f, 0xaa, 0xa0, 0xde, 0x48, 0x9d, 0x65,
+ 0x5c, 0x11, 0xef, 0xb6, 0xa5, 0x52, 0x69, 0x8e,
+ 0x07, 0xa2, 0xd3, 0x31, 0xb5, 0xf6, 0x55, 0xc3 },
+ { 0xbe, 0x1f, 0xe3, 0xc4, 0xc0, 0x40, 0x18, 0xc5,
+ 0x4c, 0x4a, 0x0f, 0x6b, 0x9a, 0x2e, 0xd3, 0xc5,
+ 0x3a, 0xbe, 0x3a, 0x9f, 0x76, 0xb4, 0xd2, 0x6d,
+ 0xe5, 0x6f, 0xc9, 0xae, 0x95, 0x05, 0x9a, 0x99 },
+ { 0xe3, 0xe3, 0xac, 0xe5, 0x37, 0xeb, 0x3e, 0xdd,
+ 0x84, 0x63, 0xd9, 0xad, 0x35, 0x82, 0xe1, 0x3c,
+ 0xf8, 0x65, 0x33, 0xff, 0xde, 0x43, 0xd6, 0x68,
+ 0xdd, 0x2e, 0x93, 0xbb, 0xdb, 0xd7, 0x19, 0x5a },
+ { 0x11, 0x0c, 0x50, 0xc0, 0xbf, 0x2c, 0x6e, 0x7a,
+ 0xeb, 0x7e, 0x43, 0x5d, 0x92, 0xd1, 0x32, 0xab,
+ 0x66, 0x55, 0x16, 0x8e, 0x78, 0xa2, 0xde, 0xcd,
+ 0xec, 0x33, 0x30, 0x77, 0x76, 0x84, 0xd9, 0xc1 },
+ { 0xe9, 0xba, 0x8f, 0x50, 0x5c, 0x9c, 0x80, 0xc0,
+ 0x86, 0x66, 0xa7, 0x01, 0xf3, 0x36, 0x7e, 0x6c,
+ 0xc6, 0x65, 0xf3, 0x4b, 0x22, 0xe7, 0x3c, 0x3c,
+ 0x04, 0x17, 0xeb, 0x1c, 0x22, 0x06, 0x08, 0x2f },
+ { 0x26, 0xcd, 0x66, 0xfc, 0xa0, 0x23, 0x79, 0xc7,
+ 0x6d, 0xf1, 0x23, 0x17, 0x05, 0x2b, 0xca, 0xfd,
+ 0x6c, 0xd8, 0xc3, 0xa7, 0xb8, 0x90, 0xd8, 0x05,
+ 0xf3, 0x6c, 0x49, 0x98, 0x97, 0x82, 0x43, 0x3a },
+ { 0x21, 0x3f, 0x35, 0x96, 0xd6, 0xe3, 0xa5, 0xd0,
+ 0xe9, 0x93, 0x2c, 0xd2, 0x15, 0x91, 0x46, 0x01,
+ 0x5e, 0x2a, 0xbc, 0x94, 0x9f, 0x47, 0x29, 0xee,
+ 0x26, 0x32, 0xfe, 0x1e, 0xdb, 0x78, 0xd3, 0x37 },
+ { 0x10, 0x15, 0xd7, 0x01, 0x08, 0xe0, 0x3b, 0xe1,
+ 0xc7, 0x02, 0xfe, 0x97, 0x25, 0x36, 0x07, 0xd1,
+ 0x4a, 0xee, 0x59, 0x1f, 0x24, 0x13, 0xea, 0x67,
+ 0x87, 0x42, 0x7b, 0x64, 0x59, 0xff, 0x21, 0x9a },
+ { 0x3c, 0xa9, 0x89, 0xde, 0x10, 0xcf, 0xe6, 0x09,
+ 0x90, 0x94, 0x72, 0xc8, 0xd3, 0x56, 0x10, 0x80,
+ 0x5b, 0x2f, 0x97, 0x77, 0x34, 0xcf, 0x65, 0x2c,
+ 0xc6, 0x4b, 0x3b, 0xfc, 0x88, 0x2d, 0x5d, 0x89 },
+ { 0xb6, 0x15, 0x6f, 0x72, 0xd3, 0x80, 0xee, 0x9e,
+ 0xa6, 0xac, 0xd1, 0x90, 0x46, 0x4f, 0x23, 0x07,
+ 0xa5, 0xc1, 0x79, 0xef, 0x01, 0xfd, 0x71, 0xf9,
+ 0x9f, 0x2d, 0x0f, 0x7a, 0x57, 0x36, 0x0a, 0xea },
+ { 0xc0, 0x3b, 0xc6, 0x42, 0xb2, 0x09, 0x59, 0xcb,
+ 0xe1, 0x33, 0xa0, 0x30, 0x3e, 0x0c, 0x1a, 0xbf,
+ 0xf3, 0xe3, 0x1e, 0xc8, 0xe1, 0xa3, 0x28, 0xec,
+ 0x85, 0x65, 0xc3, 0x6d, 0xec, 0xff, 0x52, 0x65 },
+ { 0x2c, 0x3e, 0x08, 0x17, 0x6f, 0x76, 0x0c, 0x62,
+ 0x64, 0xc3, 0xa2, 0xcd, 0x66, 0xfe, 0xc6, 0xc3,
+ 0xd7, 0x8d, 0xe4, 0x3f, 0xc1, 0x92, 0x45, 0x7b,
+ 0x2a, 0x4a, 0x66, 0x0a, 0x1e, 0x0e, 0xb2, 0x2b },
+ { 0xf7, 0x38, 0xc0, 0x2f, 0x3c, 0x1b, 0x19, 0x0c,
+ 0x51, 0x2b, 0x1a, 0x32, 0xde, 0xab, 0xf3, 0x53,
+ 0x72, 0x8e, 0x0e, 0x9a, 0xb0, 0x34, 0x49, 0x0e,
+ 0x3c, 0x34, 0x09, 0x94, 0x6a, 0x97, 0xae, 0xec },
+ { 0x8b, 0x18, 0x80, 0xdf, 0x30, 0x1c, 0xc9, 0x63,
+ 0x41, 0x88, 0x11, 0x08, 0x89, 0x64, 0x83, 0x92,
+ 0x87, 0xff, 0x7f, 0xe3, 0x1c, 0x49, 0xea, 0x6e,
+ 0xbd, 0x9e, 0x48, 0xbd, 0xee, 0xe4, 0x97, 0xc5 },
+ { 0x1e, 0x75, 0xcb, 0x21, 0xc6, 0x09, 0x89, 0x02,
+ 0x03, 0x75, 0xf1, 0xa7, 0xa2, 0x42, 0x83, 0x9f,
+ 0x0b, 0x0b, 0x68, 0x97, 0x3a, 0x4c, 0x2a, 0x05,
+ 0xcf, 0x75, 0x55, 0xed, 0x5a, 0xae, 0xc4, 0xc1 },
+ { 0x62, 0xbf, 0x8a, 0x9c, 0x32, 0xa5, 0xbc, 0xcf,
+ 0x29, 0x0b, 0x6c, 0x47, 0x4d, 0x75, 0xb2, 0xa2,
+ 0xa4, 0x09, 0x3f, 0x1a, 0x9e, 0x27, 0x13, 0x94,
+ 0x33, 0xa8, 0xf2, 0xb3, 0xbc, 0xe7, 0xb8, 0xd7 },
+ { 0x16, 0x6c, 0x83, 0x50, 0xd3, 0x17, 0x3b, 0x5e,
+ 0x70, 0x2b, 0x78, 0x3d, 0xfd, 0x33, 0xc6, 0x6e,
+ 0xe0, 0x43, 0x27, 0x42, 0xe9, 0xb9, 0x2b, 0x99,
+ 0x7f, 0xd2, 0x3c, 0x60, 0xdc, 0x67, 0x56, 0xca },
+ { 0x04, 0x4a, 0x14, 0xd8, 0x22, 0xa9, 0x0c, 0xac,
+ 0xf2, 0xf5, 0xa1, 0x01, 0x42, 0x8a, 0xdc, 0x8f,
+ 0x41, 0x09, 0x38, 0x6c, 0xcb, 0x15, 0x8b, 0xf9,
+ 0x05, 0xc8, 0x61, 0x8b, 0x8e, 0xe2, 0x4e, 0xc3 },
+ { 0x38, 0x7d, 0x39, 0x7e, 0xa4, 0x3a, 0x99, 0x4b,
+ 0xe8, 0x4d, 0x2d, 0x54, 0x4a, 0xfb, 0xe4, 0x81,
+ 0xa2, 0x00, 0x0f, 0x55, 0x25, 0x26, 0x96, 0xbb,
+ 0xa2, 0xc5, 0x0c, 0x8e, 0xbd, 0x10, 0x13, 0x47 },
+ { 0x56, 0xf8, 0xcc, 0xf1, 0xf8, 0x64, 0x09, 0xb4,
+ 0x6c, 0xe3, 0x61, 0x66, 0xae, 0x91, 0x65, 0x13,
+ 0x84, 0x41, 0x57, 0x75, 0x89, 0xdb, 0x08, 0xcb,
+ 0xc5, 0xf6, 0x6c, 0xa2, 0x97, 0x43, 0xb9, 0xfd },
+ { 0x97, 0x06, 0xc0, 0x92, 0xb0, 0x4d, 0x91, 0xf5,
+ 0x3d, 0xff, 0x91, 0xfa, 0x37, 0xb7, 0x49, 0x3d,
+ 0x28, 0xb5, 0x76, 0xb5, 0xd7, 0x10, 0x46, 0x9d,
+ 0xf7, 0x94, 0x01, 0x66, 0x22, 0x36, 0xfc, 0x03 },
+ { 0x87, 0x79, 0x68, 0x68, 0x6c, 0x06, 0x8c, 0xe2,
+ 0xf7, 0xe2, 0xad, 0xcf, 0xf6, 0x8b, 0xf8, 0x74,
+ 0x8e, 0xdf, 0x3c, 0xf8, 0x62, 0xcf, 0xb4, 0xd3,
+ 0x94, 0x7a, 0x31, 0x06, 0x95, 0x80, 0x54, 0xe3 },
+ { 0x88, 0x17, 0xe5, 0x71, 0x98, 0x79, 0xac, 0xf7,
+ 0x02, 0x47, 0x87, 0xec, 0xcd, 0xb2, 0x71, 0x03,
+ 0x55, 0x66, 0xcf, 0xa3, 0x33, 0xe0, 0x49, 0x40,
+ 0x7c, 0x01, 0x78, 0xcc, 0xc5, 0x7a, 0x5b, 0x9f },
+ { 0x89, 0x38, 0x24, 0x9e, 0x4b, 0x50, 0xca, 0xda,
+ 0xcc, 0xdf, 0x5b, 0x18, 0x62, 0x13, 0x26, 0xcb,
+ 0xb1, 0x52, 0x53, 0xe3, 0x3a, 0x20, 0xf5, 0x63,
+ 0x6e, 0x99, 0x5d, 0x72, 0x47, 0x8d, 0xe4, 0x72 },
+ { 0xf1, 0x64, 0xab, 0xba, 0x49, 0x63, 0xa4, 0x4d,
+ 0x10, 0x72, 0x57, 0xe3, 0x23, 0x2d, 0x90, 0xac,
+ 0xa5, 0xe6, 0x6a, 0x14, 0x08, 0x24, 0x8c, 0x51,
+ 0x74, 0x1e, 0x99, 0x1d, 0xb5, 0x22, 0x77, 0x56 },
+ { 0xd0, 0x55, 0x63, 0xe2, 0xb1, 0xcb, 0xa0, 0xc4,
+ 0xa2, 0xa1, 0xe8, 0xbd, 0xe3, 0xa1, 0xa0, 0xd9,
+ 0xf5, 0xb4, 0x0c, 0x85, 0xa0, 0x70, 0xd6, 0xf5,
+ 0xfb, 0x21, 0x06, 0x6e, 0xad, 0x5d, 0x06, 0x01 },
+ { 0x03, 0xfb, 0xb1, 0x63, 0x84, 0xf0, 0xa3, 0x86,
+ 0x6f, 0x4c, 0x31, 0x17, 0x87, 0x76, 0x66, 0xef,
+ 0xbf, 0x12, 0x45, 0x97, 0x56, 0x4b, 0x29, 0x3d,
+ 0x4a, 0xab, 0x0d, 0x26, 0x9f, 0xab, 0xdd, 0xfa },
+ { 0x5f, 0xa8, 0x48, 0x6a, 0xc0, 0xe5, 0x29, 0x64,
+ 0xd1, 0x88, 0x1b, 0xbe, 0x33, 0x8e, 0xb5, 0x4b,
+ 0xe2, 0xf7, 0x19, 0x54, 0x92, 0x24, 0x89, 0x20,
+ 0x57, 0xb4, 0xda, 0x04, 0xba, 0x8b, 0x34, 0x75 },
+ { 0xcd, 0xfa, 0xbc, 0xee, 0x46, 0x91, 0x11, 0x11,
+ 0x23, 0x6a, 0x31, 0x70, 0x8b, 0x25, 0x39, 0xd7,
+ 0x1f, 0xc2, 0x11, 0xd9, 0xb0, 0x9c, 0x0d, 0x85,
+ 0x30, 0xa1, 0x1e, 0x1d, 0xbf, 0x6e, 0xed, 0x01 },
+ { 0x4f, 0x82, 0xde, 0x03, 0xb9, 0x50, 0x47, 0x93,
+ 0xb8, 0x2a, 0x07, 0xa0, 0xbd, 0xcd, 0xff, 0x31,
+ 0x4d, 0x75, 0x9e, 0x7b, 0x62, 0xd2, 0x6b, 0x78,
+ 0x49, 0x46, 0xb0, 0xd3, 0x6f, 0x91, 0x6f, 0x52 },
+ { 0x25, 0x9e, 0xc7, 0xf1, 0x73, 0xbc, 0xc7, 0x6a,
+ 0x09, 0x94, 0xc9, 0x67, 0xb4, 0xf5, 0xf0, 0x24,
+ 0xc5, 0x60, 0x57, 0xfb, 0x79, 0xc9, 0x65, 0xc4,
+ 0xfa, 0xe4, 0x18, 0x75, 0xf0, 0x6a, 0x0e, 0x4c },
+ { 0x19, 0x3c, 0xc8, 0xe7, 0xc3, 0xe0, 0x8b, 0xb3,
+ 0x0f, 0x54, 0x37, 0xaa, 0x27, 0xad, 0xe1, 0xf1,
+ 0x42, 0x36, 0x9b, 0x24, 0x6a, 0x67, 0x5b, 0x23,
+ 0x83, 0xe6, 0xda, 0x9b, 0x49, 0xa9, 0x80, 0x9e },
+ { 0x5c, 0x10, 0x89, 0x6f, 0x0e, 0x28, 0x56, 0xb2,
+ 0xa2, 0xee, 0xe0, 0xfe, 0x4a, 0x2c, 0x16, 0x33,
+ 0x56, 0x5d, 0x18, 0xf0, 0xe9, 0x3e, 0x1f, 0xab,
+ 0x26, 0xc3, 0x73, 0xe8, 0xf8, 0x29, 0x65, 0x4d },
+ { 0xf1, 0x60, 0x12, 0xd9, 0x3f, 0x28, 0x85, 0x1a,
+ 0x1e, 0xb9, 0x89, 0xf5, 0xd0, 0xb4, 0x3f, 0x3f,
+ 0x39, 0xca, 0x73, 0xc9, 0xa6, 0x2d, 0x51, 0x81,
+ 0xbf, 0xf2, 0x37, 0x53, 0x6b, 0xd3, 0x48, 0xc3 },
+ { 0x29, 0x66, 0xb3, 0xcf, 0xae, 0x1e, 0x44, 0xea,
+ 0x99, 0x6d, 0xc5, 0xd6, 0x86, 0xcf, 0x25, 0xfa,
+ 0x05, 0x3f, 0xb6, 0xf6, 0x72, 0x01, 0xb9, 0xe4,
+ 0x6e, 0xad, 0xe8, 0x5d, 0x0a, 0xd6, 0xb8, 0x06 },
+ { 0xdd, 0xb8, 0x78, 0x24, 0x85, 0xe9, 0x00, 0xbc,
+ 0x60, 0xbc, 0xf4, 0xc3, 0x3a, 0x6f, 0xd5, 0x85,
+ 0x68, 0x0c, 0xc6, 0x83, 0xd5, 0x16, 0xef, 0xa0,
+ 0x3e, 0xb9, 0x98, 0x5f, 0xad, 0x87, 0x15, 0xfb },
+ { 0x4c, 0x4d, 0x6e, 0x71, 0xae, 0xa0, 0x57, 0x86,
+ 0x41, 0x31, 0x48, 0xfc, 0x7a, 0x78, 0x6b, 0x0e,
+ 0xca, 0xf5, 0x82, 0xcf, 0xf1, 0x20, 0x9f, 0x5a,
+ 0x80, 0x9f, 0xba, 0x85, 0x04, 0xce, 0x66, 0x2c },
+ { 0xfb, 0x4c, 0x5e, 0x86, 0xd7, 0xb2, 0x22, 0x9b,
+ 0x99, 0xb8, 0xba, 0x6d, 0x94, 0xc2, 0x47, 0xef,
+ 0x96, 0x4a, 0xa3, 0xa2, 0xba, 0xe8, 0xed, 0xc7,
+ 0x75, 0x69, 0xf2, 0x8d, 0xbb, 0xff, 0x2d, 0x4e },
+ { 0xe9, 0x4f, 0x52, 0x6d, 0xe9, 0x01, 0x96, 0x33,
+ 0xec, 0xd5, 0x4a, 0xc6, 0x12, 0x0f, 0x23, 0x95,
+ 0x8d, 0x77, 0x18, 0xf1, 0xe7, 0x71, 0x7b, 0xf3,
+ 0x29, 0x21, 0x1a, 0x4f, 0xae, 0xed, 0x4e, 0x6d },
+ { 0xcb, 0xd6, 0x66, 0x0a, 0x10, 0xdb, 0x3f, 0x23,
+ 0xf7, 0xa0, 0x3d, 0x4b, 0x9d, 0x40, 0x44, 0xc7,
+ 0x93, 0x2b, 0x28, 0x01, 0xac, 0x89, 0xd6, 0x0b,
+ 0xc9, 0xeb, 0x92, 0xd6, 0x5a, 0x46, 0xc2, 0xa0 },
+ { 0x88, 0x18, 0xbb, 0xd3, 0xdb, 0x4d, 0xc1, 0x23,
+ 0xb2, 0x5c, 0xbb, 0xa5, 0xf5, 0x4c, 0x2b, 0xc4,
+ 0xb3, 0xfc, 0xf9, 0xbf, 0x7d, 0x7a, 0x77, 0x09,
+ 0xf4, 0xae, 0x58, 0x8b, 0x26, 0x7c, 0x4e, 0xce },
+ { 0xc6, 0x53, 0x82, 0x51, 0x3f, 0x07, 0x46, 0x0d,
+ 0xa3, 0x98, 0x33, 0xcb, 0x66, 0x6c, 0x5e, 0xd8,
+ 0x2e, 0x61, 0xb9, 0xe9, 0x98, 0xf4, 0xb0, 0xc4,
+ 0x28, 0x7c, 0xee, 0x56, 0xc3, 0xcc, 0x9b, 0xcd },
+ { 0x89, 0x75, 0xb0, 0x57, 0x7f, 0xd3, 0x55, 0x66,
+ 0xd7, 0x50, 0xb3, 0x62, 0xb0, 0x89, 0x7a, 0x26,
+ 0xc3, 0x99, 0x13, 0x6d, 0xf0, 0x7b, 0xab, 0xab,
+ 0xbd, 0xe6, 0x20, 0x3f, 0xf2, 0x95, 0x4e, 0xd4 },
+ { 0x21, 0xfe, 0x0c, 0xeb, 0x00, 0x52, 0xbe, 0x7f,
+ 0xb0, 0xf0, 0x04, 0x18, 0x7c, 0xac, 0xd7, 0xde,
+ 0x67, 0xfa, 0x6e, 0xb0, 0x93, 0x8d, 0x92, 0x76,
+ 0x77, 0xf2, 0x39, 0x8c, 0x13, 0x23, 0x17, 0xa8 },
+ { 0x2e, 0xf7, 0x3f, 0x3c, 0x26, 0xf1, 0x2d, 0x93,
+ 0x88, 0x9f, 0x3c, 0x78, 0xb6, 0xa6, 0x6c, 0x1d,
+ 0x52, 0xb6, 0x49, 0xdc, 0x9e, 0x85, 0x6e, 0x2c,
+ 0x17, 0x2e, 0xa7, 0xc5, 0x8a, 0xc2, 0xb5, 0xe3 },
+ { 0x38, 0x8a, 0x3c, 0xd5, 0x6d, 0x73, 0x86, 0x7a,
+ 0xbb, 0x5f, 0x84, 0x01, 0x49, 0x2b, 0x6e, 0x26,
+ 0x81, 0xeb, 0x69, 0x85, 0x1e, 0x76, 0x7f, 0xd8,
+ 0x42, 0x10, 0xa5, 0x60, 0x76, 0xfb, 0x3d, 0xd3 },
+ { 0xaf, 0x53, 0x3e, 0x02, 0x2f, 0xc9, 0x43, 0x9e,
+ 0x4e, 0x3c, 0xb8, 0x38, 0xec, 0xd1, 0x86, 0x92,
+ 0x23, 0x2a, 0xdf, 0x6f, 0xe9, 0x83, 0x95, 0x26,
+ 0xd3, 0xc3, 0xdd, 0x1b, 0x71, 0x91, 0x0b, 0x1a },
+ { 0x75, 0x1c, 0x09, 0xd4, 0x1a, 0x93, 0x43, 0x88,
+ 0x2a, 0x81, 0xcd, 0x13, 0xee, 0x40, 0x81, 0x8d,
+ 0x12, 0xeb, 0x44, 0xc6, 0xc7, 0xf4, 0x0d, 0xf1,
+ 0x6e, 0x4a, 0xea, 0x8f, 0xab, 0x91, 0x97, 0x2a },
+ { 0x5b, 0x73, 0xdd, 0xb6, 0x8d, 0x9d, 0x2b, 0x0a,
+ 0xa2, 0x65, 0xa0, 0x79, 0x88, 0xd6, 0xb8, 0x8a,
+ 0xe9, 0xaa, 0xc5, 0x82, 0xaf, 0x83, 0x03, 0x2f,
+ 0x8a, 0x9b, 0x21, 0xa2, 0xe1, 0xb7, 0xbf, 0x18 },
+ { 0x3d, 0xa2, 0x91, 0x26, 0xc7, 0xc5, 0xd7, 0xf4,
+ 0x3e, 0x64, 0x24, 0x2a, 0x79, 0xfe, 0xaa, 0x4e,
+ 0xf3, 0x45, 0x9c, 0xde, 0xcc, 0xc8, 0x98, 0xed,
+ 0x59, 0xa9, 0x7f, 0x6e, 0xc9, 0x3b, 0x9d, 0xab },
+ { 0x56, 0x6d, 0xc9, 0x20, 0x29, 0x3d, 0xa5, 0xcb,
+ 0x4f, 0xe0, 0xaa, 0x8a, 0xbd, 0xa8, 0xbb, 0xf5,
+ 0x6f, 0x55, 0x23, 0x13, 0xbf, 0xf1, 0x90, 0x46,
+ 0x64, 0x1e, 0x36, 0x15, 0xc1, 0xe3, 0xed, 0x3f },
+ { 0x41, 0x15, 0xbe, 0xa0, 0x2f, 0x73, 0xf9, 0x7f,
+ 0x62, 0x9e, 0x5c, 0x55, 0x90, 0x72, 0x0c, 0x01,
+ 0xe7, 0xe4, 0x49, 0xae, 0x2a, 0x66, 0x97, 0xd4,
+ 0xd2, 0x78, 0x33, 0x21, 0x30, 0x36, 0x92, 0xf9 },
+ { 0x4c, 0xe0, 0x8f, 0x47, 0x62, 0x46, 0x8a, 0x76,
+ 0x70, 0x01, 0x21, 0x64, 0x87, 0x8d, 0x68, 0x34,
+ 0x0c, 0x52, 0xa3, 0x5e, 0x66, 0xc1, 0x88, 0x4d,
+ 0x5c, 0x86, 0x48, 0x89, 0xab, 0xc9, 0x66, 0x77 },
+ { 0x81, 0xea, 0x0b, 0x78, 0x04, 0x12, 0x4e, 0x0c,
+ 0x22, 0xea, 0x5f, 0xc7, 0x11, 0x04, 0xa2, 0xaf,
+ 0xcb, 0x52, 0xa1, 0xfa, 0x81, 0x6f, 0x3e, 0xcb,
+ 0x7d, 0xcb, 0x5d, 0x9d, 0xea, 0x17, 0x86, 0xd0 },
+ { 0xfe, 0x36, 0x27, 0x33, 0xb0, 0x5f, 0x6b, 0xed,
+ 0xaf, 0x93, 0x79, 0xd7, 0xf7, 0x93, 0x6e, 0xde,
+ 0x20, 0x9b, 0x1f, 0x83, 0x23, 0xc3, 0x92, 0x25,
+ 0x49, 0xd9, 0xe7, 0x36, 0x81, 0xb5, 0xdb, 0x7b },
+ { 0xef, 0xf3, 0x7d, 0x30, 0xdf, 0xd2, 0x03, 0x59,
+ 0xbe, 0x4e, 0x73, 0xfd, 0xf4, 0x0d, 0x27, 0x73,
+ 0x4b, 0x3d, 0xf9, 0x0a, 0x97, 0xa5, 0x5e, 0xd7,
+ 0x45, 0x29, 0x72, 0x94, 0xca, 0x85, 0xd0, 0x9f },
+ { 0x17, 0x2f, 0xfc, 0x67, 0x15, 0x3d, 0x12, 0xe0,
+ 0xca, 0x76, 0xa8, 0xb6, 0xcd, 0x5d, 0x47, 0x31,
+ 0x88, 0x5b, 0x39, 0xce, 0x0c, 0xac, 0x93, 0xa8,
+ 0x97, 0x2a, 0x18, 0x00, 0x6c, 0x8b, 0x8b, 0xaf },
+ { 0xc4, 0x79, 0x57, 0xf1, 0xcc, 0x88, 0xe8, 0x3e,
+ 0xf9, 0x44, 0x58, 0x39, 0x70, 0x9a, 0x48, 0x0a,
+ 0x03, 0x6b, 0xed, 0x5f, 0x88, 0xac, 0x0f, 0xcc,
+ 0x8e, 0x1e, 0x70, 0x3f, 0xfa, 0xac, 0x13, 0x2c },
+ { 0x30, 0xf3, 0x54, 0x83, 0x70, 0xcf, 0xdc, 0xed,
+ 0xa5, 0xc3, 0x7b, 0x56, 0x9b, 0x61, 0x75, 0xe7,
+ 0x99, 0xee, 0xf1, 0xa6, 0x2a, 0xaa, 0x94, 0x32,
+ 0x45, 0xae, 0x76, 0x69, 0xc2, 0x27, 0xa7, 0xb5 },
+ { 0xc9, 0x5d, 0xcb, 0x3c, 0xf1, 0xf2, 0x7d, 0x0e,
+ 0xef, 0x2f, 0x25, 0xd2, 0x41, 0x38, 0x70, 0x90,
+ 0x4a, 0x87, 0x7c, 0x4a, 0x56, 0xc2, 0xde, 0x1e,
+ 0x83, 0xe2, 0xbc, 0x2a, 0xe2, 0xe4, 0x68, 0x21 },
+ { 0xd5, 0xd0, 0xb5, 0xd7, 0x05, 0x43, 0x4c, 0xd4,
+ 0x6b, 0x18, 0x57, 0x49, 0xf6, 0x6b, 0xfb, 0x58,
+ 0x36, 0xdc, 0xdf, 0x6e, 0xe5, 0x49, 0xa2, 0xb7,
+ 0xa4, 0xae, 0xe7, 0xf5, 0x80, 0x07, 0xca, 0xaf },
+ { 0xbb, 0xc1, 0x24, 0xa7, 0x12, 0xf1, 0x5d, 0x07,
+ 0xc3, 0x00, 0xe0, 0x5b, 0x66, 0x83, 0x89, 0xa4,
+ 0x39, 0xc9, 0x17, 0x77, 0xf7, 0x21, 0xf8, 0x32,
+ 0x0c, 0x1c, 0x90, 0x78, 0x06, 0x6d, 0x2c, 0x7e },
+ { 0xa4, 0x51, 0xb4, 0x8c, 0x35, 0xa6, 0xc7, 0x85,
+ 0x4c, 0xfa, 0xae, 0x60, 0x26, 0x2e, 0x76, 0x99,
+ 0x08, 0x16, 0x38, 0x2a, 0xc0, 0x66, 0x7e, 0x5a,
+ 0x5c, 0x9e, 0x1b, 0x46, 0xc4, 0x34, 0x2d, 0xdf },
+ { 0xb0, 0xd1, 0x50, 0xfb, 0x55, 0xe7, 0x78, 0xd0,
+ 0x11, 0x47, 0xf0, 0xb5, 0xd8, 0x9d, 0x99, 0xec,
+ 0xb2, 0x0f, 0xf0, 0x7e, 0x5e, 0x67, 0x60, 0xd6,
+ 0xb6, 0x45, 0xeb, 0x5b, 0x65, 0x4c, 0x62, 0x2b },
+ { 0x34, 0xf7, 0x37, 0xc0, 0xab, 0x21, 0x99, 0x51,
+ 0xee, 0xe8, 0x9a, 0x9f, 0x8d, 0xac, 0x29, 0x9c,
+ 0x9d, 0x4c, 0x38, 0xf3, 0x3f, 0xa4, 0x94, 0xc5,
+ 0xc6, 0xee, 0xfc, 0x92, 0xb6, 0xdb, 0x08, 0xbc },
+ { 0x1a, 0x62, 0xcc, 0x3a, 0x00, 0x80, 0x0d, 0xcb,
+ 0xd9, 0x98, 0x91, 0x08, 0x0c, 0x1e, 0x09, 0x84,
+ 0x58, 0x19, 0x3a, 0x8c, 0xc9, 0xf9, 0x70, 0xea,
+ 0x99, 0xfb, 0xef, 0xf0, 0x03, 0x18, 0xc2, 0x89 },
+ { 0xcf, 0xce, 0x55, 0xeb, 0xaf, 0xc8, 0x40, 0xd7,
+ 0xae, 0x48, 0x28, 0x1c, 0x7f, 0xd5, 0x7e, 0xc8,
+ 0xb4, 0x82, 0xd4, 0xb7, 0x04, 0x43, 0x74, 0x95,
+ 0x49, 0x5a, 0xc4, 0x14, 0xcf, 0x4a, 0x37, 0x4b },
+ { 0x67, 0x46, 0xfa, 0xcf, 0x71, 0x14, 0x6d, 0x99,
+ 0x9d, 0xab, 0xd0, 0x5d, 0x09, 0x3a, 0xe5, 0x86,
+ 0x64, 0x8d, 0x1e, 0xe2, 0x8e, 0x72, 0x61, 0x7b,
+ 0x99, 0xd0, 0xf0, 0x08, 0x6e, 0x1e, 0x45, 0xbf },
+ { 0x57, 0x1c, 0xed, 0x28, 0x3b, 0x3f, 0x23, 0xb4,
+ 0xe7, 0x50, 0xbf, 0x12, 0xa2, 0xca, 0xf1, 0x78,
+ 0x18, 0x47, 0xbd, 0x89, 0x0e, 0x43, 0x60, 0x3c,
+ 0xdc, 0x59, 0x76, 0x10, 0x2b, 0x7b, 0xb1, 0x1b },
+ { 0xcf, 0xcb, 0x76, 0x5b, 0x04, 0x8e, 0x35, 0x02,
+ 0x2c, 0x5d, 0x08, 0x9d, 0x26, 0xe8, 0x5a, 0x36,
+ 0xb0, 0x05, 0xa2, 0xb8, 0x04, 0x93, 0xd0, 0x3a,
+ 0x14, 0x4e, 0x09, 0xf4, 0x09, 0xb6, 0xaf, 0xd1 },
+ { 0x40, 0x50, 0xc7, 0xa2, 0x77, 0x05, 0xbb, 0x27,
+ 0xf4, 0x20, 0x89, 0xb2, 0x99, 0xf3, 0xcb, 0xe5,
+ 0x05, 0x4e, 0xad, 0x68, 0x72, 0x7e, 0x8e, 0xf9,
+ 0x31, 0x8c, 0xe6, 0xf2, 0x5c, 0xd6, 0xf3, 0x1d },
+ { 0x18, 0x40, 0x70, 0xbd, 0x5d, 0x26, 0x5f, 0xbd,
+ 0xc1, 0x42, 0xcd, 0x1c, 0x5c, 0xd0, 0xd7, 0xe4,
+ 0x14, 0xe7, 0x03, 0x69, 0xa2, 0x66, 0xd6, 0x27,
+ 0xc8, 0xfb, 0xa8, 0x4f, 0xa5, 0xe8, 0x4c, 0x34 },
+ { 0x9e, 0xdd, 0xa9, 0xa4, 0x44, 0x39, 0x02, 0xa9,
+ 0x58, 0x8c, 0x0d, 0x0c, 0xcc, 0x62, 0xb9, 0x30,
+ 0x21, 0x84, 0x79, 0xa6, 0x84, 0x1e, 0x6f, 0xe7,
+ 0xd4, 0x30, 0x03, 0xf0, 0x4b, 0x1f, 0xd6, 0x43 },
+ { 0xe4, 0x12, 0xfe, 0xef, 0x79, 0x08, 0x32, 0x4a,
+ 0x6d, 0xa1, 0x84, 0x16, 0x29, 0xf3, 0x5d, 0x3d,
+ 0x35, 0x86, 0x42, 0x01, 0x93, 0x10, 0xec, 0x57,
+ 0xc6, 0x14, 0x83, 0x6b, 0x63, 0xd3, 0x07, 0x63 },
+ { 0x1a, 0x2b, 0x8e, 0xdf, 0xf3, 0xf9, 0xac, 0xc1,
+ 0x55, 0x4f, 0xcb, 0xae, 0x3c, 0xf1, 0xd6, 0x29,
+ 0x8c, 0x64, 0x62, 0xe2, 0x2e, 0x5e, 0xb0, 0x25,
+ 0x96, 0x84, 0xf8, 0x35, 0x01, 0x2b, 0xd1, 0x3f },
+ { 0x28, 0x8c, 0x4a, 0xd9, 0xb9, 0x40, 0x97, 0x62,
+ 0xea, 0x07, 0xc2, 0x4a, 0x41, 0xf0, 0x4f, 0x69,
+ 0xa7, 0xd7, 0x4b, 0xee, 0x2d, 0x95, 0x43, 0x53,
+ 0x74, 0xbd, 0xe9, 0x46, 0xd7, 0x24, 0x1c, 0x7b },
+ { 0x80, 0x56, 0x91, 0xbb, 0x28, 0x67, 0x48, 0xcf,
+ 0xb5, 0x91, 0xd3, 0xae, 0xbe, 0x7e, 0x6f, 0x4e,
+ 0x4d, 0xc6, 0xe2, 0x80, 0x8c, 0x65, 0x14, 0x3c,
+ 0xc0, 0x04, 0xe4, 0xeb, 0x6f, 0xd0, 0x9d, 0x43 },
+ { 0xd4, 0xac, 0x8d, 0x3a, 0x0a, 0xfc, 0x6c, 0xfa,
+ 0x7b, 0x46, 0x0a, 0xe3, 0x00, 0x1b, 0xae, 0xb3,
+ 0x6d, 0xad, 0xb3, 0x7d, 0xa0, 0x7d, 0x2e, 0x8a,
+ 0xc9, 0x18, 0x22, 0xdf, 0x34, 0x8a, 0xed, 0x3d },
+ { 0xc3, 0x76, 0x61, 0x70, 0x14, 0xd2, 0x01, 0x58,
+ 0xbc, 0xed, 0x3d, 0x3b, 0xa5, 0x52, 0xb6, 0xec,
+ 0xcf, 0x84, 0xe6, 0x2a, 0xa3, 0xeb, 0x65, 0x0e,
+ 0x90, 0x02, 0x9c, 0x84, 0xd1, 0x3e, 0xea, 0x69 },
+ { 0xc4, 0x1f, 0x09, 0xf4, 0x3c, 0xec, 0xae, 0x72,
+ 0x93, 0xd6, 0x00, 0x7c, 0xa0, 0xa3, 0x57, 0x08,
+ 0x7d, 0x5a, 0xe5, 0x9b, 0xe5, 0x00, 0xc1, 0xcd,
+ 0x5b, 0x28, 0x9e, 0xe8, 0x10, 0xc7, 0xb0, 0x82 },
+ { 0x03, 0xd1, 0xce, 0xd1, 0xfb, 0xa5, 0xc3, 0x91,
+ 0x55, 0xc4, 0x4b, 0x77, 0x65, 0xcb, 0x76, 0x0c,
+ 0x78, 0x70, 0x8d, 0xcf, 0xc8, 0x0b, 0x0b, 0xd8,
+ 0xad, 0xe3, 0xa5, 0x6d, 0xa8, 0x83, 0x0b, 0x29 },
+ { 0x09, 0xbd, 0xe6, 0xf1, 0x52, 0x21, 0x8d, 0xc9,
+ 0x2c, 0x41, 0xd7, 0xf4, 0x53, 0x87, 0xe6, 0x3e,
+ 0x58, 0x69, 0xd8, 0x07, 0xec, 0x70, 0xb8, 0x21,
+ 0x40, 0x5d, 0xbd, 0x88, 0x4b, 0x7f, 0xcf, 0x4b },
+ { 0x71, 0xc9, 0x03, 0x6e, 0x18, 0x17, 0x9b, 0x90,
+ 0xb3, 0x7d, 0x39, 0xe9, 0xf0, 0x5e, 0xb8, 0x9c,
+ 0xc5, 0xfc, 0x34, 0x1f, 0xd7, 0xc4, 0x77, 0xd0,
+ 0xd7, 0x49, 0x32, 0x85, 0xfa, 0xca, 0x08, 0xa4 },
+ { 0x59, 0x16, 0x83, 0x3e, 0xbb, 0x05, 0xcd, 0x91,
+ 0x9c, 0xa7, 0xfe, 0x83, 0xb6, 0x92, 0xd3, 0x20,
+ 0x5b, 0xef, 0x72, 0x39, 0x2b, 0x2c, 0xf6, 0xbb,
+ 0x0a, 0x6d, 0x43, 0xf9, 0x94, 0xf9, 0x5f, 0x11 },
+ { 0xf6, 0x3a, 0xab, 0x3e, 0xc6, 0x41, 0xb3, 0xb0,
+ 0x24, 0x96, 0x4c, 0x2b, 0x43, 0x7c, 0x04, 0xf6,
+ 0x04, 0x3c, 0x4c, 0x7e, 0x02, 0x79, 0x23, 0x99,
+ 0x95, 0x40, 0x19, 0x58, 0xf8, 0x6b, 0xbe, 0x54 },
+ { 0xf1, 0x72, 0xb1, 0x80, 0xbf, 0xb0, 0x97, 0x40,
+ 0x49, 0x31, 0x20, 0xb6, 0x32, 0x6c, 0xbd, 0xc5,
+ 0x61, 0xe4, 0x77, 0xde, 0xf9, 0xbb, 0xcf, 0xd2,
+ 0x8c, 0xc8, 0xc1, 0xc5, 0xe3, 0x37, 0x9a, 0x31 },
+ { 0xcb, 0x9b, 0x89, 0xcc, 0x18, 0x38, 0x1d, 0xd9,
+ 0x14, 0x1a, 0xde, 0x58, 0x86, 0x54, 0xd4, 0xe6,
+ 0xa2, 0x31, 0xd5, 0xbf, 0x49, 0xd4, 0xd5, 0x9a,
+ 0xc2, 0x7d, 0x86, 0x9c, 0xbe, 0x10, 0x0c, 0xf3 },
+ { 0x7b, 0xd8, 0x81, 0x50, 0x46, 0xfd, 0xd8, 0x10,
+ 0xa9, 0x23, 0xe1, 0x98, 0x4a, 0xae, 0xbd, 0xcd,
+ 0xf8, 0x4d, 0x87, 0xc8, 0x99, 0x2d, 0x68, 0xb5,
+ 0xee, 0xb4, 0x60, 0xf9, 0x3e, 0xb3, 0xc8, 0xd7 },
+ { 0x60, 0x7b, 0xe6, 0x68, 0x62, 0xfd, 0x08, 0xee,
+ 0x5b, 0x19, 0xfa, 0xca, 0xc0, 0x9d, 0xfd, 0xbc,
+ 0xd4, 0x0c, 0x31, 0x21, 0x01, 0xd6, 0x6e, 0x6e,
+ 0xbd, 0x2b, 0x84, 0x1f, 0x1b, 0x9a, 0x93, 0x25 },
+ { 0x9f, 0xe0, 0x3b, 0xbe, 0x69, 0xab, 0x18, 0x34,
+ 0xf5, 0x21, 0x9b, 0x0d, 0xa8, 0x8a, 0x08, 0xb3,
+ 0x0a, 0x66, 0xc5, 0x91, 0x3f, 0x01, 0x51, 0x96,
+ 0x3c, 0x36, 0x05, 0x60, 0xdb, 0x03, 0x87, 0xb3 },
+ { 0x90, 0xa8, 0x35, 0x85, 0x71, 0x7b, 0x75, 0xf0,
+ 0xe9, 0xb7, 0x25, 0xe0, 0x55, 0xee, 0xee, 0xb9,
+ 0xe7, 0xa0, 0x28, 0xea, 0x7e, 0x6c, 0xbc, 0x07,
+ 0xb2, 0x09, 0x17, 0xec, 0x03, 0x63, 0xe3, 0x8c },
+ { 0x33, 0x6e, 0xa0, 0x53, 0x0f, 0x4a, 0x74, 0x69,
+ 0x12, 0x6e, 0x02, 0x18, 0x58, 0x7e, 0xbb, 0xde,
+ 0x33, 0x58, 0xa0, 0xb3, 0x1c, 0x29, 0xd2, 0x00,
+ 0xf7, 0xdc, 0x7e, 0xb1, 0x5c, 0x6a, 0xad, 0xd8 },
+ { 0xa7, 0x9e, 0x76, 0xdc, 0x0a, 0xbc, 0xa4, 0x39,
+ 0x6f, 0x07, 0x47, 0xcd, 0x7b, 0x74, 0x8d, 0xf9,
+ 0x13, 0x00, 0x76, 0x26, 0xb1, 0xd6, 0x59, 0xda,
+ 0x0c, 0x1f, 0x78, 0xb9, 0x30, 0x3d, 0x01, 0xa3 },
+ { 0x44, 0xe7, 0x8a, 0x77, 0x37, 0x56, 0xe0, 0x95,
+ 0x15, 0x19, 0x50, 0x4d, 0x70, 0x38, 0xd2, 0x8d,
+ 0x02, 0x13, 0xa3, 0x7e, 0x0c, 0xe3, 0x75, 0x37,
+ 0x17, 0x57, 0xbc, 0x99, 0x63, 0x11, 0xe3, 0xb8 },
+ { 0x77, 0xac, 0x01, 0x2a, 0x3f, 0x75, 0x4d, 0xcf,
+ 0xea, 0xb5, 0xeb, 0x99, 0x6b, 0xe9, 0xcd, 0x2d,
+ 0x1f, 0x96, 0x11, 0x1b, 0x6e, 0x49, 0xf3, 0x99,
+ 0x4d, 0xf1, 0x81, 0xf2, 0x85, 0x69, 0xd8, 0x25 },
+ { 0xce, 0x5a, 0x10, 0xdb, 0x6f, 0xcc, 0xda, 0xf1,
+ 0x40, 0xaa, 0xa4, 0xde, 0xd6, 0x25, 0x0a, 0x9c,
+ 0x06, 0xe9, 0x22, 0x2b, 0xc9, 0xf9, 0xf3, 0x65,
+ 0x8a, 0x4a, 0xff, 0x93, 0x5f, 0x2b, 0x9f, 0x3a },
+ { 0xec, 0xc2, 0x03, 0xa7, 0xfe, 0x2b, 0xe4, 0xab,
+ 0xd5, 0x5b, 0xb5, 0x3e, 0x6e, 0x67, 0x35, 0x72,
+ 0xe0, 0x07, 0x8d, 0xa8, 0xcd, 0x37, 0x5e, 0xf4,
+ 0x30, 0xcc, 0x97, 0xf9, 0xf8, 0x00, 0x83, 0xaf },
+ { 0x14, 0xa5, 0x18, 0x6d, 0xe9, 0xd7, 0xa1, 0x8b,
+ 0x04, 0x12, 0xb8, 0x56, 0x3e, 0x51, 0xcc, 0x54,
+ 0x33, 0x84, 0x0b, 0x4a, 0x12, 0x9a, 0x8f, 0xf9,
+ 0x63, 0xb3, 0x3a, 0x3c, 0x4a, 0xfe, 0x8e, 0xbb },
+ { 0x13, 0xf8, 0xef, 0x95, 0xcb, 0x86, 0xe6, 0xa6,
+ 0x38, 0x93, 0x1c, 0x8e, 0x10, 0x76, 0x73, 0xeb,
+ 0x76, 0xba, 0x10, 0xd7, 0xc2, 0xcd, 0x70, 0xb9,
+ 0xd9, 0x92, 0x0b, 0xbe, 0xed, 0x92, 0x94, 0x09 },
+ { 0x0b, 0x33, 0x8f, 0x4e, 0xe1, 0x2f, 0x2d, 0xfc,
+ 0xb7, 0x87, 0x13, 0x37, 0x79, 0x41, 0xe0, 0xb0,
+ 0x63, 0x21, 0x52, 0x58, 0x1d, 0x13, 0x32, 0x51,
+ 0x6e, 0x4a, 0x2c, 0xab, 0x19, 0x42, 0xcc, 0xa4 },
+ { 0xea, 0xab, 0x0e, 0xc3, 0x7b, 0x3b, 0x8a, 0xb7,
+ 0x96, 0xe9, 0xf5, 0x72, 0x38, 0xde, 0x14, 0xa2,
+ 0x64, 0xa0, 0x76, 0xf3, 0x88, 0x7d, 0x86, 0xe2,
+ 0x9b, 0xb5, 0x90, 0x6d, 0xb5, 0xa0, 0x0e, 0x02 },
+ { 0x23, 0xcb, 0x68, 0xb8, 0xc0, 0xe6, 0xdc, 0x26,
+ 0xdc, 0x27, 0x76, 0x6d, 0xdc, 0x0a, 0x13, 0xa9,
+ 0x94, 0x38, 0xfd, 0x55, 0x61, 0x7a, 0xa4, 0x09,
+ 0x5d, 0x8f, 0x96, 0x97, 0x20, 0xc8, 0x72, 0xdf },
+ { 0x09, 0x1d, 0x8e, 0xe3, 0x0d, 0x6f, 0x29, 0x68,
+ 0xd4, 0x6b, 0x68, 0x7d, 0xd6, 0x52, 0x92, 0x66,
+ 0x57, 0x42, 0xde, 0x0b, 0xb8, 0x3d, 0xcc, 0x00,
+ 0x04, 0xc7, 0x2c, 0xe1, 0x00, 0x07, 0xa5, 0x49 },
+ { 0x7f, 0x50, 0x7a, 0xbc, 0x6d, 0x19, 0xba, 0x00,
+ 0xc0, 0x65, 0xa8, 0x76, 0xec, 0x56, 0x57, 0x86,
+ 0x88, 0x82, 0xd1, 0x8a, 0x22, 0x1b, 0xc4, 0x6c,
+ 0x7a, 0x69, 0x12, 0x54, 0x1f, 0x5b, 0xc7, 0xba },
+ { 0xa0, 0x60, 0x7c, 0x24, 0xe1, 0x4e, 0x8c, 0x22,
+ 0x3d, 0xb0, 0xd7, 0x0b, 0x4d, 0x30, 0xee, 0x88,
+ 0x01, 0x4d, 0x60, 0x3f, 0x43, 0x7e, 0x9e, 0x02,
+ 0xaa, 0x7d, 0xaf, 0xa3, 0xcd, 0xfb, 0xad, 0x94 },
+ { 0xdd, 0xbf, 0xea, 0x75, 0xcc, 0x46, 0x78, 0x82,
+ 0xeb, 0x34, 0x83, 0xce, 0x5e, 0x2e, 0x75, 0x6a,
+ 0x4f, 0x47, 0x01, 0xb7, 0x6b, 0x44, 0x55, 0x19,
+ 0xe8, 0x9f, 0x22, 0xd6, 0x0f, 0xa8, 0x6e, 0x06 },
+ { 0x0c, 0x31, 0x1f, 0x38, 0xc3, 0x5a, 0x4f, 0xb9,
+ 0x0d, 0x65, 0x1c, 0x28, 0x9d, 0x48, 0x68, 0x56,
+ 0xcd, 0x14, 0x13, 0xdf, 0x9b, 0x06, 0x77, 0xf5,
+ 0x3e, 0xce, 0x2c, 0xd9, 0xe4, 0x77, 0xc6, 0x0a },
+ { 0x46, 0xa7, 0x3a, 0x8d, 0xd3, 0xe7, 0x0f, 0x59,
+ 0xd3, 0x94, 0x2c, 0x01, 0xdf, 0x59, 0x9d, 0xef,
+ 0x78, 0x3c, 0x9d, 0xa8, 0x2f, 0xd8, 0x32, 0x22,
+ 0xcd, 0x66, 0x2b, 0x53, 0xdc, 0xe7, 0xdb, 0xdf },
+ { 0xad, 0x03, 0x8f, 0xf9, 0xb1, 0x4d, 0xe8, 0x4a,
+ 0x80, 0x1e, 0x4e, 0x62, 0x1c, 0xe5, 0xdf, 0x02,
+ 0x9d, 0xd9, 0x35, 0x20, 0xd0, 0xc2, 0xfa, 0x38,
+ 0xbf, 0xf1, 0x76, 0xa8, 0xb1, 0xd1, 0x69, 0x8c },
+ { 0xab, 0x70, 0xc5, 0xdf, 0xbd, 0x1e, 0xa8, 0x17,
+ 0xfe, 0xd0, 0xcd, 0x06, 0x72, 0x93, 0xab, 0xf3,
+ 0x19, 0xe5, 0xd7, 0x90, 0x1c, 0x21, 0x41, 0xd5,
+ 0xd9, 0x9b, 0x23, 0xf0, 0x3a, 0x38, 0xe7, 0x48 },
+ { 0x1f, 0xff, 0xda, 0x67, 0x93, 0x2b, 0x73, 0xc8,
+ 0xec, 0xaf, 0x00, 0x9a, 0x34, 0x91, 0xa0, 0x26,
+ 0x95, 0x3b, 0xab, 0xfe, 0x1f, 0x66, 0x3b, 0x06,
+ 0x97, 0xc3, 0xc4, 0xae, 0x8b, 0x2e, 0x7d, 0xcb },
+ { 0xb0, 0xd2, 0xcc, 0x19, 0x47, 0x2d, 0xd5, 0x7f,
+ 0x2b, 0x17, 0xef, 0xc0, 0x3c, 0x8d, 0x58, 0xc2,
+ 0x28, 0x3d, 0xbb, 0x19, 0xda, 0x57, 0x2f, 0x77,
+ 0x55, 0x85, 0x5a, 0xa9, 0x79, 0x43, 0x17, 0xa0 },
+ { 0xa0, 0xd1, 0x9a, 0x6e, 0xe3, 0x39, 0x79, 0xc3,
+ 0x25, 0x51, 0x0e, 0x27, 0x66, 0x22, 0xdf, 0x41,
+ 0xf7, 0x15, 0x83, 0xd0, 0x75, 0x01, 0xb8, 0x70,
+ 0x71, 0x12, 0x9a, 0x0a, 0xd9, 0x47, 0x32, 0xa5 },
+ { 0x72, 0x46, 0x42, 0xa7, 0x03, 0x2d, 0x10, 0x62,
+ 0xb8, 0x9e, 0x52, 0xbe, 0xa3, 0x4b, 0x75, 0xdf,
+ 0x7d, 0x8f, 0xe7, 0x72, 0xd9, 0xfe, 0x3c, 0x93,
+ 0xdd, 0xf3, 0xc4, 0x54, 0x5a, 0xb5, 0xa9, 0x9b },
+ { 0xad, 0xe5, 0xea, 0xa7, 0xe6, 0x1f, 0x67, 0x2d,
+ 0x58, 0x7e, 0xa0, 0x3d, 0xae, 0x7d, 0x7b, 0x55,
+ 0x22, 0x9c, 0x01, 0xd0, 0x6b, 0xc0, 0xa5, 0x70,
+ 0x14, 0x36, 0xcb, 0xd1, 0x83, 0x66, 0xa6, 0x26 },
+ { 0x01, 0x3b, 0x31, 0xeb, 0xd2, 0x28, 0xfc, 0xdd,
+ 0xa5, 0x1f, 0xab, 0xb0, 0x3b, 0xb0, 0x2d, 0x60,
+ 0xac, 0x20, 0xca, 0x21, 0x5a, 0xaf, 0xa8, 0x3b,
+ 0xdd, 0x85, 0x5e, 0x37, 0x55, 0xa3, 0x5f, 0x0b },
+ { 0x33, 0x2e, 0xd4, 0x0b, 0xb1, 0x0d, 0xde, 0x3c,
+ 0x95, 0x4a, 0x75, 0xd7, 0xb8, 0x99, 0x9d, 0x4b,
+ 0x26, 0xa1, 0xc0, 0x63, 0xc1, 0xdc, 0x6e, 0x32,
+ 0xc1, 0xd9, 0x1b, 0xab, 0x7b, 0xbb, 0x7d, 0x16 },
+ { 0xc7, 0xa1, 0x97, 0xb3, 0xa0, 0x5b, 0x56, 0x6b,
+ 0xcc, 0x9f, 0xac, 0xd2, 0x0e, 0x44, 0x1d, 0x6f,
+ 0x6c, 0x28, 0x60, 0xac, 0x96, 0x51, 0xcd, 0x51,
+ 0xd6, 0xb9, 0xd2, 0xcd, 0xee, 0xea, 0x03, 0x90 },
+ { 0xbd, 0x9c, 0xf6, 0x4e, 0xa8, 0x95, 0x3c, 0x03,
+ 0x71, 0x08, 0xe6, 0xf6, 0x54, 0x91, 0x4f, 0x39,
+ 0x58, 0xb6, 0x8e, 0x29, 0xc1, 0x67, 0x00, 0xdc,
+ 0x18, 0x4d, 0x94, 0xa2, 0x17, 0x08, 0xff, 0x60 },
+ { 0x88, 0x35, 0xb0, 0xac, 0x02, 0x11, 0x51, 0xdf,
+ 0x71, 0x64, 0x74, 0xce, 0x27, 0xce, 0x4d, 0x3c,
+ 0x15, 0xf0, 0xb2, 0xda, 0xb4, 0x80, 0x03, 0xcf,
+ 0x3f, 0x3e, 0xfd, 0x09, 0x45, 0x10, 0x6b, 0x9a },
+ { 0x3b, 0xfe, 0xfa, 0x33, 0x01, 0xaa, 0x55, 0xc0,
+ 0x80, 0x19, 0x0c, 0xff, 0xda, 0x8e, 0xae, 0x51,
+ 0xd9, 0xaf, 0x48, 0x8b, 0x4c, 0x1f, 0x24, 0xc3,
+ 0xd9, 0xa7, 0x52, 0x42, 0xfd, 0x8e, 0xa0, 0x1d },
+ { 0x08, 0x28, 0x4d, 0x14, 0x99, 0x3c, 0xd4, 0x7d,
+ 0x53, 0xeb, 0xae, 0xcf, 0x0d, 0xf0, 0x47, 0x8c,
+ 0xc1, 0x82, 0xc8, 0x9c, 0x00, 0xe1, 0x85, 0x9c,
+ 0x84, 0x85, 0x16, 0x86, 0xdd, 0xf2, 0xc1, 0xb7 },
+ { 0x1e, 0xd7, 0xef, 0x9f, 0x04, 0xc2, 0xac, 0x8d,
+ 0xb6, 0xa8, 0x64, 0xdb, 0x13, 0x10, 0x87, 0xf2,
+ 0x70, 0x65, 0x09, 0x8e, 0x69, 0xc3, 0xfe, 0x78,
+ 0x71, 0x8d, 0x9b, 0x94, 0x7f, 0x4a, 0x39, 0xd0 },
+ { 0xc1, 0x61, 0xf2, 0xdc, 0xd5, 0x7e, 0x9c, 0x14,
+ 0x39, 0xb3, 0x1a, 0x9d, 0xd4, 0x3d, 0x8f, 0x3d,
+ 0x7d, 0xd8, 0xf0, 0xeb, 0x7c, 0xfa, 0xc6, 0xfb,
+ 0x25, 0xa0, 0xf2, 0x8e, 0x30, 0x6f, 0x06, 0x61 },
+ { 0xc0, 0x19, 0x69, 0xad, 0x34, 0xc5, 0x2c, 0xaf,
+ 0x3d, 0xc4, 0xd8, 0x0d, 0x19, 0x73, 0x5c, 0x29,
+ 0x73, 0x1a, 0xc6, 0xe7, 0xa9, 0x20, 0x85, 0xab,
+ 0x92, 0x50, 0xc4, 0x8d, 0xea, 0x48, 0xa3, 0xfc },
+ { 0x17, 0x20, 0xb3, 0x65, 0x56, 0x19, 0xd2, 0xa5,
+ 0x2b, 0x35, 0x21, 0xae, 0x0e, 0x49, 0xe3, 0x45,
+ 0xcb, 0x33, 0x89, 0xeb, 0xd6, 0x20, 0x8a, 0xca,
+ 0xf9, 0xf1, 0x3f, 0xda, 0xcc, 0xa8, 0xbe, 0x49 },
+ { 0x75, 0x62, 0x88, 0x36, 0x1c, 0x83, 0xe2, 0x4c,
+ 0x61, 0x7c, 0xf9, 0x5c, 0x90, 0x5b, 0x22, 0xd0,
+ 0x17, 0xcd, 0xc8, 0x6f, 0x0b, 0xf1, 0xd6, 0x58,
+ 0xf4, 0x75, 0x6c, 0x73, 0x79, 0x87, 0x3b, 0x7f },
+ { 0xe7, 0xd0, 0xed, 0xa3, 0x45, 0x26, 0x93, 0xb7,
+ 0x52, 0xab, 0xcd, 0xa1, 0xb5, 0x5e, 0x27, 0x6f,
+ 0x82, 0x69, 0x8f, 0x5f, 0x16, 0x05, 0x40, 0x3e,
+ 0xff, 0x83, 0x0b, 0xea, 0x00, 0x71, 0xa3, 0x94 },
+ { 0x2c, 0x82, 0xec, 0xaa, 0x6b, 0x84, 0x80, 0x3e,
+ 0x04, 0x4a, 0xf6, 0x31, 0x18, 0xaf, 0xe5, 0x44,
+ 0x68, 0x7c, 0xb6, 0xe6, 0xc7, 0xdf, 0x49, 0xed,
+ 0x76, 0x2d, 0xfd, 0x7c, 0x86, 0x93, 0xa1, 0xbc },
+ { 0x61, 0x36, 0xcb, 0xf4, 0xb4, 0x41, 0x05, 0x6f,
+ 0xa1, 0xe2, 0x72, 0x24, 0x98, 0x12, 0x5d, 0x6d,
+ 0xed, 0x45, 0xe1, 0x7b, 0x52, 0x14, 0x39, 0x59,
+ 0xc7, 0xf4, 0xd4, 0xe3, 0x95, 0x21, 0x8a, 0xc2 },
+ { 0x72, 0x1d, 0x32, 0x45, 0xaa, 0xfe, 0xf2, 0x7f,
+ 0x6a, 0x62, 0x4f, 0x47, 0x95, 0x4b, 0x6c, 0x25,
+ 0x50, 0x79, 0x52, 0x6f, 0xfa, 0x25, 0xe9, 0xff,
+ 0x77, 0xe5, 0xdc, 0xff, 0x47, 0x3b, 0x15, 0x97 },
+ { 0x9d, 0xd2, 0xfb, 0xd8, 0xce, 0xf1, 0x6c, 0x35,
+ 0x3c, 0x0a, 0xc2, 0x11, 0x91, 0xd5, 0x09, 0xeb,
+ 0x28, 0xdd, 0x9e, 0x3e, 0x0d, 0x8c, 0xea, 0x5d,
+ 0x26, 0xca, 0x83, 0x93, 0x93, 0x85, 0x1c, 0x3a },
+ { 0xb2, 0x39, 0x4c, 0xea, 0xcd, 0xeb, 0xf2, 0x1b,
+ 0xf9, 0xdf, 0x2c, 0xed, 0x98, 0xe5, 0x8f, 0x1c,
+ 0x3a, 0x4b, 0xbb, 0xff, 0x66, 0x0d, 0xd9, 0x00,
+ 0xf6, 0x22, 0x02, 0xd6, 0x78, 0x5c, 0xc4, 0x6e },
+ { 0x57, 0x08, 0x9f, 0x22, 0x27, 0x49, 0xad, 0x78,
+ 0x71, 0x76, 0x5f, 0x06, 0x2b, 0x11, 0x4f, 0x43,
+ 0xba, 0x20, 0xec, 0x56, 0x42, 0x2a, 0x8b, 0x1e,
+ 0x3f, 0x87, 0x19, 0x2c, 0x0e, 0xa7, 0x18, 0xc6 },
+ { 0xe4, 0x9a, 0x94, 0x59, 0x96, 0x1c, 0xd3, 0x3c,
+ 0xdf, 0x4a, 0xae, 0x1b, 0x10, 0x78, 0xa5, 0xde,
+ 0xa7, 0xc0, 0x40, 0xe0, 0xfe, 0xa3, 0x40, 0xc9,
+ 0x3a, 0x72, 0x48, 0x72, 0xfc, 0x4a, 0xf8, 0x06 },
+ { 0xed, 0xe6, 0x7f, 0x72, 0x0e, 0xff, 0xd2, 0xca,
+ 0x9c, 0x88, 0x99, 0x41, 0x52, 0xd0, 0x20, 0x1d,
+ 0xee, 0x6b, 0x0a, 0x2d, 0x2c, 0x07, 0x7a, 0xca,
+ 0x6d, 0xae, 0x29, 0xf7, 0x3f, 0x8b, 0x63, 0x09 },
+ { 0xe0, 0xf4, 0x34, 0xbf, 0x22, 0xe3, 0x08, 0x80,
+ 0x39, 0xc2, 0x1f, 0x71, 0x9f, 0xfc, 0x67, 0xf0,
+ 0xf2, 0xcb, 0x5e, 0x98, 0xa7, 0xa0, 0x19, 0x4c,
+ 0x76, 0xe9, 0x6b, 0xf4, 0xe8, 0xe1, 0x7e, 0x61 },
+ { 0x27, 0x7c, 0x04, 0xe2, 0x85, 0x34, 0x84, 0xa4,
+ 0xeb, 0xa9, 0x10, 0xad, 0x33, 0x6d, 0x01, 0xb4,
+ 0x77, 0xb6, 0x7c, 0xc2, 0x00, 0xc5, 0x9f, 0x3c,
+ 0x8d, 0x77, 0xee, 0xf8, 0x49, 0x4f, 0x29, 0xcd },
+ { 0x15, 0x6d, 0x57, 0x47, 0xd0, 0xc9, 0x9c, 0x7f,
+ 0x27, 0x09, 0x7d, 0x7b, 0x7e, 0x00, 0x2b, 0x2e,
+ 0x18, 0x5c, 0xb7, 0x2d, 0x8d, 0xd7, 0xeb, 0x42,
+ 0x4a, 0x03, 0x21, 0x52, 0x81, 0x61, 0x21, 0x9f },
+ { 0x20, 0xdd, 0xd1, 0xed, 0x9b, 0x1c, 0xa8, 0x03,
+ 0x94, 0x6d, 0x64, 0xa8, 0x3a, 0xe4, 0x65, 0x9d,
+ 0xa6, 0x7f, 0xba, 0x7a, 0x1a, 0x3e, 0xdd, 0xb1,
+ 0xe1, 0x03, 0xc0, 0xf5, 0xe0, 0x3e, 0x3a, 0x2c },
+ { 0xf0, 0xaf, 0x60, 0x4d, 0x3d, 0xab, 0xbf, 0x9a,
+ 0x0f, 0x2a, 0x7d, 0x3d, 0xda, 0x6b, 0xd3, 0x8b,
+ 0xba, 0x72, 0xc6, 0xd0, 0x9b, 0xe4, 0x94, 0xfc,
+ 0xef, 0x71, 0x3f, 0xf1, 0x01, 0x89, 0xb6, 0xe6 },
+ { 0x98, 0x02, 0xbb, 0x87, 0xde, 0xf4, 0xcc, 0x10,
+ 0xc4, 0xa5, 0xfd, 0x49, 0xaa, 0x58, 0xdf, 0xe2,
+ 0xf3, 0xfd, 0xdb, 0x46, 0xb4, 0x70, 0x88, 0x14,
+ 0xea, 0xd8, 0x1d, 0x23, 0xba, 0x95, 0x13, 0x9b },
+ { 0x4f, 0x8c, 0xe1, 0xe5, 0x1d, 0x2f, 0xe7, 0xf2,
+ 0x40, 0x43, 0xa9, 0x04, 0xd8, 0x98, 0xeb, 0xfc,
+ 0x91, 0x97, 0x54, 0x18, 0x75, 0x34, 0x13, 0xaa,
+ 0x09, 0x9b, 0x79, 0x5e, 0xcb, 0x35, 0xce, 0xdb },
+ { 0xbd, 0xdc, 0x65, 0x14, 0xd7, 0xee, 0x6a, 0xce,
+ 0x0a, 0x4a, 0xc1, 0xd0, 0xe0, 0x68, 0x11, 0x22,
+ 0x88, 0xcb, 0xcf, 0x56, 0x04, 0x54, 0x64, 0x27,
+ 0x05, 0x63, 0x01, 0x77, 0xcb, 0xa6, 0x08, 0xbd },
+ { 0xd6, 0x35, 0x99, 0x4f, 0x62, 0x91, 0x51, 0x7b,
+ 0x02, 0x81, 0xff, 0xdd, 0x49, 0x6a, 0xfa, 0x86,
+ 0x27, 0x12, 0xe5, 0xb3, 0xc4, 0xe5, 0x2e, 0x4c,
+ 0xd5, 0xfd, 0xae, 0x8c, 0x0e, 0x72, 0xfb, 0x08 },
+ { 0x87, 0x8d, 0x9c, 0xa6, 0x00, 0xcf, 0x87, 0xe7,
+ 0x69, 0xcc, 0x30, 0x5c, 0x1b, 0x35, 0x25, 0x51,
+ 0x86, 0x61, 0x5a, 0x73, 0xa0, 0xda, 0x61, 0x3b,
+ 0x5f, 0x1c, 0x98, 0xdb, 0xf8, 0x12, 0x83, 0xea },
+ { 0xa6, 0x4e, 0xbe, 0x5d, 0xc1, 0x85, 0xde, 0x9f,
+ 0xdd, 0xe7, 0x60, 0x7b, 0x69, 0x98, 0x70, 0x2e,
+ 0xb2, 0x34, 0x56, 0x18, 0x49, 0x57, 0x30, 0x7d,
+ 0x2f, 0xa7, 0x2e, 0x87, 0xa4, 0x77, 0x02, 0xd6 },
+ { 0xce, 0x50, 0xea, 0xb7, 0xb5, 0xeb, 0x52, 0xbd,
+ 0xc9, 0xad, 0x8e, 0x5a, 0x48, 0x0a, 0xb7, 0x80,
+ 0xca, 0x93, 0x20, 0xe4, 0x43, 0x60, 0xb1, 0xfe,
+ 0x37, 0xe0, 0x3f, 0x2f, 0x7a, 0xd7, 0xde, 0x01 },
+ { 0xee, 0xdd, 0xb7, 0xc0, 0xdb, 0x6e, 0x30, 0xab,
+ 0xe6, 0x6d, 0x79, 0xe3, 0x27, 0x51, 0x1e, 0x61,
+ 0xfc, 0xeb, 0xbc, 0x29, 0xf1, 0x59, 0xb4, 0x0a,
+ 0x86, 0xb0, 0x46, 0xec, 0xf0, 0x51, 0x38, 0x23 },
+ { 0x78, 0x7f, 0xc9, 0x34, 0x40, 0xc1, 0xec, 0x96,
+ 0xb5, 0xad, 0x01, 0xc1, 0x6c, 0xf7, 0x79, 0x16,
+ 0xa1, 0x40, 0x5f, 0x94, 0x26, 0x35, 0x6e, 0xc9,
+ 0x21, 0xd8, 0xdf, 0xf3, 0xea, 0x63, 0xb7, 0xe0 },
+ { 0x7f, 0x0d, 0x5e, 0xab, 0x47, 0xee, 0xfd, 0xa6,
+ 0x96, 0xc0, 0xbf, 0x0f, 0xbf, 0x86, 0xab, 0x21,
+ 0x6f, 0xce, 0x46, 0x1e, 0x93, 0x03, 0xab, 0xa6,
+ 0xac, 0x37, 0x41, 0x20, 0xe8, 0x90, 0xe8, 0xdf },
+ { 0xb6, 0x80, 0x04, 0xb4, 0x2f, 0x14, 0xad, 0x02,
+ 0x9f, 0x4c, 0x2e, 0x03, 0xb1, 0xd5, 0xeb, 0x76,
+ 0xd5, 0x71, 0x60, 0xe2, 0x64, 0x76, 0xd2, 0x11,
+ 0x31, 0xbe, 0xf2, 0x0a, 0xda, 0x7d, 0x27, 0xf4 },
+ { 0xb0, 0xc4, 0xeb, 0x18, 0xae, 0x25, 0x0b, 0x51,
+ 0xa4, 0x13, 0x82, 0xea, 0xd9, 0x2d, 0x0d, 0xc7,
+ 0x45, 0x5f, 0x93, 0x79, 0xfc, 0x98, 0x84, 0x42,
+ 0x8e, 0x47, 0x70, 0x60, 0x8d, 0xb0, 0xfa, 0xec },
+ { 0xf9, 0x2b, 0x7a, 0x87, 0x0c, 0x05, 0x9f, 0x4d,
+ 0x46, 0x46, 0x4c, 0x82, 0x4e, 0xc9, 0x63, 0x55,
+ 0x14, 0x0b, 0xdc, 0xe6, 0x81, 0x32, 0x2c, 0xc3,
+ 0xa9, 0x92, 0xff, 0x10, 0x3e, 0x3f, 0xea, 0x52 },
+ { 0x53, 0x64, 0x31, 0x26, 0x14, 0x81, 0x33, 0x98,
+ 0xcc, 0x52, 0x5d, 0x4c, 0x4e, 0x14, 0x6e, 0xde,
+ 0xb3, 0x71, 0x26, 0x5f, 0xba, 0x19, 0x13, 0x3a,
+ 0x2c, 0x3d, 0x21, 0x59, 0x29, 0x8a, 0x17, 0x42 },
+ { 0xf6, 0x62, 0x0e, 0x68, 0xd3, 0x7f, 0xb2, 0xaf,
+ 0x50, 0x00, 0xfc, 0x28, 0xe2, 0x3b, 0x83, 0x22,
+ 0x97, 0xec, 0xd8, 0xbc, 0xe9, 0x9e, 0x8b, 0xe4,
+ 0xd0, 0x4e, 0x85, 0x30, 0x9e, 0x3d, 0x33, 0x74 },
+ { 0x53, 0x16, 0xa2, 0x79, 0x69, 0xd7, 0xfe, 0x04,
+ 0xff, 0x27, 0xb2, 0x83, 0x96, 0x1b, 0xff, 0xc3,
+ 0xbf, 0x5d, 0xfb, 0x32, 0xfb, 0x6a, 0x89, 0xd1,
+ 0x01, 0xc6, 0xc3, 0xb1, 0x93, 0x7c, 0x28, 0x71 },
+ { 0x81, 0xd1, 0x66, 0x4f, 0xdf, 0x3c, 0xb3, 0x3c,
+ 0x24, 0xee, 0xba, 0xc0, 0xbd, 0x64, 0x24, 0x4b,
+ 0x77, 0xc4, 0xab, 0xea, 0x90, 0xbb, 0xe8, 0xb5,
+ 0xee, 0x0b, 0x2a, 0xaf, 0xcf, 0x2d, 0x6a, 0x53 },
+ { 0x34, 0x57, 0x82, 0xf2, 0x95, 0xb0, 0x88, 0x03,
+ 0x52, 0xe9, 0x24, 0xa0, 0x46, 0x7b, 0x5f, 0xbc,
+ 0x3e, 0x8f, 0x3b, 0xfb, 0xc3, 0xc7, 0xe4, 0x8b,
+ 0x67, 0x09, 0x1f, 0xb5, 0xe8, 0x0a, 0x94, 0x42 },
+ { 0x79, 0x41, 0x11, 0xea, 0x6c, 0xd6, 0x5e, 0x31,
+ 0x1f, 0x74, 0xee, 0x41, 0xd4, 0x76, 0xcb, 0x63,
+ 0x2c, 0xe1, 0xe4, 0xb0, 0x51, 0xdc, 0x1d, 0x9e,
+ 0x9d, 0x06, 0x1a, 0x19, 0xe1, 0xd0, 0xbb, 0x49 },
+ { 0x2a, 0x85, 0xda, 0xf6, 0x13, 0x88, 0x16, 0xb9,
+ 0x9b, 0xf8, 0xd0, 0x8b, 0xa2, 0x11, 0x4b, 0x7a,
+ 0xb0, 0x79, 0x75, 0xa7, 0x84, 0x20, 0xc1, 0xa3,
+ 0xb0, 0x6a, 0x77, 0x7c, 0x22, 0xdd, 0x8b, 0xcb },
+ { 0x89, 0xb0, 0xd5, 0xf2, 0x89, 0xec, 0x16, 0x40,
+ 0x1a, 0x06, 0x9a, 0x96, 0x0d, 0x0b, 0x09, 0x3e,
+ 0x62, 0x5d, 0xa3, 0xcf, 0x41, 0xee, 0x29, 0xb5,
+ 0x9b, 0x93, 0x0c, 0x58, 0x20, 0x14, 0x54, 0x55 },
+ { 0xd0, 0xfd, 0xcb, 0x54, 0x39, 0x43, 0xfc, 0x27,
+ 0xd2, 0x08, 0x64, 0xf5, 0x21, 0x81, 0x47, 0x1b,
+ 0x94, 0x2c, 0xc7, 0x7c, 0xa6, 0x75, 0xbc, 0xb3,
+ 0x0d, 0xf3, 0x1d, 0x35, 0x8e, 0xf7, 0xb1, 0xeb },
+ { 0xb1, 0x7e, 0xa8, 0xd7, 0x70, 0x63, 0xc7, 0x09,
+ 0xd4, 0xdc, 0x6b, 0x87, 0x94, 0x13, 0xc3, 0x43,
+ 0xe3, 0x79, 0x0e, 0x9e, 0x62, 0xca, 0x85, 0xb7,
+ 0x90, 0x0b, 0x08, 0x6f, 0x6b, 0x75, 0xc6, 0x72 },
+ { 0xe7, 0x1a, 0x3e, 0x2c, 0x27, 0x4d, 0xb8, 0x42,
+ 0xd9, 0x21, 0x14, 0xf2, 0x17, 0xe2, 0xc0, 0xea,
+ 0xc8, 0xb4, 0x50, 0x93, 0xfd, 0xfd, 0x9d, 0xf4,
+ 0xca, 0x71, 0x62, 0x39, 0x48, 0x62, 0xd5, 0x01 },
+ { 0xc0, 0x47, 0x67, 0x59, 0xab, 0x7a, 0xa3, 0x33,
+ 0x23, 0x4f, 0x6b, 0x44, 0xf5, 0xfd, 0x85, 0x83,
+ 0x90, 0xec, 0x23, 0x69, 0x4c, 0x62, 0x2c, 0xb9,
+ 0x86, 0xe7, 0x69, 0xc7, 0x8e, 0xdd, 0x73, 0x3e },
+ { 0x9a, 0xb8, 0xea, 0xbb, 0x14, 0x16, 0x43, 0x4d,
+ 0x85, 0x39, 0x13, 0x41, 0xd5, 0x69, 0x93, 0xc5,
+ 0x54, 0x58, 0x16, 0x7d, 0x44, 0x18, 0xb1, 0x9a,
+ 0x0f, 0x2a, 0xd8, 0xb7, 0x9a, 0x83, 0xa7, 0x5b },
+ { 0x79, 0x92, 0xd0, 0xbb, 0xb1, 0x5e, 0x23, 0x82,
+ 0x6f, 0x44, 0x3e, 0x00, 0x50, 0x5d, 0x68, 0xd3,
+ 0xed, 0x73, 0x72, 0x99, 0x5a, 0x5c, 0x3e, 0x49,
+ 0x86, 0x54, 0x10, 0x2f, 0xbc, 0xd0, 0x96, 0x4e },
+ { 0xc0, 0x21, 0xb3, 0x00, 0x85, 0x15, 0x14, 0x35,
+ 0xdf, 0x33, 0xb0, 0x07, 0xcc, 0xec, 0xc6, 0x9d,
+ 0xf1, 0x26, 0x9f, 0x39, 0xba, 0x25, 0x09, 0x2b,
+ 0xed, 0x59, 0xd9, 0x32, 0xac, 0x0f, 0xdc, 0x28 },
+ { 0x91, 0xa2, 0x5e, 0xc0, 0xec, 0x0d, 0x9a, 0x56,
+ 0x7f, 0x89, 0xc4, 0xbf, 0xe1, 0xa6, 0x5a, 0x0e,
+ 0x43, 0x2d, 0x07, 0x06, 0x4b, 0x41, 0x90, 0xe2,
+ 0x7d, 0xfb, 0x81, 0x90, 0x1f, 0xd3, 0x13, 0x9b },
+ { 0x59, 0x50, 0xd3, 0x9a, 0x23, 0xe1, 0x54, 0x5f,
+ 0x30, 0x12, 0x70, 0xaa, 0x1a, 0x12, 0xf2, 0xe6,
+ 0xc4, 0x53, 0x77, 0x6e, 0x4d, 0x63, 0x55, 0xde,
+ 0x42, 0x5c, 0xc1, 0x53, 0xf9, 0x81, 0x88, 0x67 },
+ { 0xd7, 0x9f, 0x14, 0x72, 0x0c, 0x61, 0x0a, 0xf1,
+ 0x79, 0xa3, 0x76, 0x5d, 0x4b, 0x7c, 0x09, 0x68,
+ 0xf9, 0x77, 0x96, 0x2d, 0xbf, 0x65, 0x5b, 0x52,
+ 0x12, 0x72, 0xb6, 0xf1, 0xe1, 0x94, 0x48, 0x8e },
+ { 0xe9, 0x53, 0x1b, 0xfc, 0x8b, 0x02, 0x99, 0x5a,
+ 0xea, 0xa7, 0x5b, 0xa2, 0x70, 0x31, 0xfa, 0xdb,
+ 0xcb, 0xf4, 0xa0, 0xda, 0xb8, 0x96, 0x1d, 0x92,
+ 0x96, 0xcd, 0x7e, 0x84, 0xd2, 0x5d, 0x60, 0x06 },
+ { 0x34, 0xe9, 0xc2, 0x6a, 0x01, 0xd7, 0xf1, 0x61,
+ 0x81, 0xb4, 0x54, 0xa9, 0xd1, 0x62, 0x3c, 0x23,
+ 0x3c, 0xb9, 0x9d, 0x31, 0xc6, 0x94, 0x65, 0x6e,
+ 0x94, 0x13, 0xac, 0xa3, 0xe9, 0x18, 0x69, 0x2f },
+ { 0xd9, 0xd7, 0x42, 0x2f, 0x43, 0x7b, 0xd4, 0x39,
+ 0xdd, 0xd4, 0xd8, 0x83, 0xda, 0xe2, 0xa0, 0x83,
+ 0x50, 0x17, 0x34, 0x14, 0xbe, 0x78, 0x15, 0x51,
+ 0x33, 0xff, 0xf1, 0x96, 0x4c, 0x3d, 0x79, 0x72 },
+ { 0x4a, 0xee, 0x0c, 0x7a, 0xaf, 0x07, 0x54, 0x14,
+ 0xff, 0x17, 0x93, 0xea, 0xd7, 0xea, 0xca, 0x60,
+ 0x17, 0x75, 0xc6, 0x15, 0xdb, 0xd6, 0x0b, 0x64,
+ 0x0b, 0x0a, 0x9f, 0x0c, 0xe5, 0x05, 0xd4, 0x35 },
+ { 0x6b, 0xfd, 0xd1, 0x54, 0x59, 0xc8, 0x3b, 0x99,
+ 0xf0, 0x96, 0xbf, 0xb4, 0x9e, 0xe8, 0x7b, 0x06,
+ 0x3d, 0x69, 0xc1, 0x97, 0x4c, 0x69, 0x28, 0xac,
+ 0xfc, 0xfb, 0x40, 0x99, 0xf8, 0xc4, 0xef, 0x67 },
+ { 0x9f, 0xd1, 0xc4, 0x08, 0xfd, 0x75, 0xc3, 0x36,
+ 0x19, 0x3a, 0x2a, 0x14, 0xd9, 0x4f, 0x6a, 0xf5,
+ 0xad, 0xf0, 0x50, 0xb8, 0x03, 0x87, 0xb4, 0xb0,
+ 0x10, 0xfb, 0x29, 0xf4, 0xcc, 0x72, 0x70, 0x7c },
+ { 0x13, 0xc8, 0x84, 0x80, 0xa5, 0xd0, 0x0d, 0x6c,
+ 0x8c, 0x7a, 0xd2, 0x11, 0x0d, 0x76, 0xa8, 0x2d,
+ 0x9b, 0x70, 0xf4, 0xfa, 0x66, 0x96, 0xd4, 0xe5,
+ 0xdd, 0x42, 0xa0, 0x66, 0xdc, 0xaf, 0x99, 0x20 },
+ { 0x82, 0x0e, 0x72, 0x5e, 0xe2, 0x5f, 0xe8, 0xfd,
+ 0x3a, 0x8d, 0x5a, 0xbe, 0x4c, 0x46, 0xc3, 0xba,
+ 0x88, 0x9d, 0xe6, 0xfa, 0x91, 0x91, 0xaa, 0x22,
+ 0xba, 0x67, 0xd5, 0x70, 0x54, 0x21, 0x54, 0x2b },
+ { 0x32, 0xd9, 0x3a, 0x0e, 0xb0, 0x2f, 0x42, 0xfb,
+ 0xbc, 0xaf, 0x2b, 0xad, 0x00, 0x85, 0xb2, 0x82,
+ 0xe4, 0x60, 0x46, 0xa4, 0xdf, 0x7a, 0xd1, 0x06,
+ 0x57, 0xc9, 0xd6, 0x47, 0x63, 0x75, 0xb9, 0x3e },
+ { 0xad, 0xc5, 0x18, 0x79, 0x05, 0xb1, 0x66, 0x9c,
+ 0xd8, 0xec, 0x9c, 0x72, 0x1e, 0x19, 0x53, 0x78,
+ 0x6b, 0x9d, 0x89, 0xa9, 0xba, 0xe3, 0x07, 0x80,
+ 0xf1, 0xe1, 0xea, 0xb2, 0x4a, 0x00, 0x52, 0x3c },
+ { 0xe9, 0x07, 0x56, 0xff, 0x7f, 0x9a, 0xd8, 0x10,
+ 0xb2, 0x39, 0xa1, 0x0c, 0xed, 0x2c, 0xf9, 0xb2,
+ 0x28, 0x43, 0x54, 0xc1, 0xf8, 0xc7, 0xe0, 0xac,
+ 0xcc, 0x24, 0x61, 0xdc, 0x79, 0x6d, 0x6e, 0x89 },
+ { 0x12, 0x51, 0xf7, 0x6e, 0x56, 0x97, 0x84, 0x81,
+ 0x87, 0x53, 0x59, 0x80, 0x1d, 0xb5, 0x89, 0xa0,
+ 0xb2, 0x2f, 0x86, 0xd8, 0xd6, 0x34, 0xdc, 0x04,
+ 0x50, 0x6f, 0x32, 0x2e, 0xd7, 0x8f, 0x17, 0xe8 },
+ { 0x3a, 0xfa, 0x89, 0x9f, 0xd9, 0x80, 0xe7, 0x3e,
+ 0xcb, 0x7f, 0x4d, 0x8b, 0x8f, 0x29, 0x1d, 0xc9,
+ 0xaf, 0x79, 0x6b, 0xc6, 0x5d, 0x27, 0xf9, 0x74,
+ 0xc6, 0xf1, 0x93, 0xc9, 0x19, 0x1a, 0x09, 0xfd },
+ { 0xaa, 0x30, 0x5b, 0xe2, 0x6e, 0x5d, 0xed, 0xdc,
+ 0x3c, 0x10, 0x10, 0xcb, 0xc2, 0x13, 0xf9, 0x5f,
+ 0x05, 0x1c, 0x78, 0x5c, 0x5b, 0x43, 0x1e, 0x6a,
+ 0x7c, 0xd0, 0x48, 0xf1, 0x61, 0x78, 0x75, 0x28 },
+ { 0x8e, 0xa1, 0x88, 0x4f, 0xf3, 0x2e, 0x9d, 0x10,
+ 0xf0, 0x39, 0xb4, 0x07, 0xd0, 0xd4, 0x4e, 0x7e,
+ 0x67, 0x0a, 0xbd, 0x88, 0x4a, 0xee, 0xe0, 0xfb,
+ 0x75, 0x7a, 0xe9, 0x4e, 0xaa, 0x97, 0x37, 0x3d },
+ { 0xd4, 0x82, 0xb2, 0x15, 0x5d, 0x4d, 0xec, 0x6b,
+ 0x47, 0x36, 0xa1, 0xf1, 0x61, 0x7b, 0x53, 0xaa,
+ 0xa3, 0x73, 0x10, 0x27, 0x7d, 0x3f, 0xef, 0x0c,
+ 0x37, 0xad, 0x41, 0x76, 0x8f, 0xc2, 0x35, 0xb4 },
+ { 0x4d, 0x41, 0x39, 0x71, 0x38, 0x7e, 0x7a, 0x88,
+ 0x98, 0xa8, 0xdc, 0x2a, 0x27, 0x50, 0x07, 0x78,
+ 0x53, 0x9e, 0xa2, 0x14, 0xa2, 0xdf, 0xe9, 0xb3,
+ 0xd7, 0xe8, 0xeb, 0xdc, 0xe5, 0xcf, 0x3d, 0xb3 },
+ { 0x69, 0x6e, 0x5d, 0x46, 0xe6, 0xc5, 0x7e, 0x87,
+ 0x96, 0xe4, 0x73, 0x5d, 0x08, 0x91, 0x6e, 0x0b,
+ 0x79, 0x29, 0xb3, 0xcf, 0x29, 0x8c, 0x29, 0x6d,
+ 0x22, 0xe9, 0xd3, 0x01, 0x96, 0x53, 0x37, 0x1c },
+ { 0x1f, 0x56, 0x47, 0xc1, 0xd3, 0xb0, 0x88, 0x22,
+ 0x88, 0x85, 0x86, 0x5c, 0x89, 0x40, 0x90, 0x8b,
+ 0xf4, 0x0d, 0x1a, 0x82, 0x72, 0x82, 0x19, 0x73,
+ 0xb1, 0x60, 0x00, 0x8e, 0x7a, 0x3c, 0xe2, 0xeb },
+ { 0xb6, 0xe7, 0x6c, 0x33, 0x0f, 0x02, 0x1a, 0x5b,
+ 0xda, 0x65, 0x87, 0x50, 0x10, 0xb0, 0xed, 0xf0,
+ 0x91, 0x26, 0xc0, 0xf5, 0x10, 0xea, 0x84, 0x90,
+ 0x48, 0x19, 0x20, 0x03, 0xae, 0xf4, 0xc6, 0x1c },
+ { 0x3c, 0xd9, 0x52, 0xa0, 0xbe, 0xad, 0xa4, 0x1a,
+ 0xbb, 0x42, 0x4c, 0xe4, 0x7f, 0x94, 0xb4, 0x2b,
+ 0xe6, 0x4e, 0x1f, 0xfb, 0x0f, 0xd0, 0x78, 0x22,
+ 0x76, 0x80, 0x79, 0x46, 0xd0, 0xd0, 0xbc, 0x55 },
+ { 0x98, 0xd9, 0x26, 0x77, 0x43, 0x9b, 0x41, 0xb7,
+ 0xbb, 0x51, 0x33, 0x12, 0xaf, 0xb9, 0x2b, 0xcc,
+ 0x8e, 0xe9, 0x68, 0xb2, 0xe3, 0xb2, 0x38, 0xce,
+ 0xcb, 0x9b, 0x0f, 0x34, 0xc9, 0xbb, 0x63, 0xd0 },
+ { 0xec, 0xbc, 0xa2, 0xcf, 0x08, 0xae, 0x57, 0xd5,
+ 0x17, 0xad, 0x16, 0x15, 0x8a, 0x32, 0xbf, 0xa7,
+ 0xdc, 0x03, 0x82, 0xea, 0xed, 0xa1, 0x28, 0xe9,
+ 0x18, 0x86, 0x73, 0x4c, 0x24, 0xa0, 0xb2, 0x9d },
+ { 0x94, 0x2c, 0xc7, 0xc0, 0xb5, 0x2e, 0x2b, 0x16,
+ 0xa4, 0xb8, 0x9f, 0xa4, 0xfc, 0x7e, 0x0b, 0xf6,
+ 0x09, 0xe2, 0x9a, 0x08, 0xc1, 0xa8, 0x54, 0x34,
+ 0x52, 0xb7, 0x7c, 0x7b, 0xfd, 0x11, 0xbb, 0x28 },
+ { 0x8a, 0x06, 0x5d, 0x8b, 0x61, 0xa0, 0xdf, 0xfb,
+ 0x17, 0x0d, 0x56, 0x27, 0x73, 0x5a, 0x76, 0xb0,
+ 0xe9, 0x50, 0x60, 0x37, 0x80, 0x8c, 0xba, 0x16,
+ 0xc3, 0x45, 0x00, 0x7c, 0x9f, 0x79, 0xcf, 0x8f },
+ { 0x1b, 0x9f, 0xa1, 0x97, 0x14, 0x65, 0x9c, 0x78,
+ 0xff, 0x41, 0x38, 0x71, 0x84, 0x92, 0x15, 0x36,
+ 0x10, 0x29, 0xac, 0x80, 0x2b, 0x1c, 0xbc, 0xd5,
+ 0x4e, 0x40, 0x8b, 0xd8, 0x72, 0x87, 0xf8, 0x1f },
+ { 0x8d, 0xab, 0x07, 0x1b, 0xcd, 0x6c, 0x72, 0x92,
+ 0xa9, 0xef, 0x72, 0x7b, 0x4a, 0xe0, 0xd8, 0x67,
+ 0x13, 0x30, 0x1d, 0xa8, 0x61, 0x8d, 0x9a, 0x48,
+ 0xad, 0xce, 0x55, 0xf3, 0x03, 0xa8, 0x69, 0xa1 },
+ { 0x82, 0x53, 0xe3, 0xe7, 0xc7, 0xb6, 0x84, 0xb9,
+ 0xcb, 0x2b, 0xeb, 0x01, 0x4c, 0xe3, 0x30, 0xff,
+ 0x3d, 0x99, 0xd1, 0x7a, 0xbb, 0xdb, 0xab, 0xe4,
+ 0xf4, 0xd6, 0x74, 0xde, 0xd5, 0x3f, 0xfc, 0x6b },
+ { 0xf1, 0x95, 0xf3, 0x21, 0xe9, 0xe3, 0xd6, 0xbd,
+ 0x7d, 0x07, 0x45, 0x04, 0xdd, 0x2a, 0xb0, 0xe6,
+ 0x24, 0x1f, 0x92, 0xe7, 0x84, 0xb1, 0xaa, 0x27,
+ 0x1f, 0xf6, 0x48, 0xb1, 0xca, 0xb6, 0xd7, 0xf6 },
+ { 0x27, 0xe4, 0xcc, 0x72, 0x09, 0x0f, 0x24, 0x12,
+ 0x66, 0x47, 0x6a, 0x7c, 0x09, 0x49, 0x5f, 0x2d,
+ 0xb1, 0x53, 0xd5, 0xbc, 0xbd, 0x76, 0x19, 0x03,
+ 0xef, 0x79, 0x27, 0x5e, 0xc5, 0x6b, 0x2e, 0xd8 },
+ { 0x89, 0x9c, 0x24, 0x05, 0x78, 0x8e, 0x25, 0xb9,
+ 0x9a, 0x18, 0x46, 0x35, 0x5e, 0x64, 0x6d, 0x77,
+ 0xcf, 0x40, 0x00, 0x83, 0x41, 0x5f, 0x7d, 0xc5,
+ 0xaf, 0xe6, 0x9d, 0x6e, 0x17, 0xc0, 0x00, 0x23 },
+ { 0xa5, 0x9b, 0x78, 0xc4, 0x90, 0x57, 0x44, 0x07,
+ 0x6b, 0xfe, 0xe8, 0x94, 0xde, 0x70, 0x7d, 0x4f,
+ 0x12, 0x0b, 0x5c, 0x68, 0x93, 0xea, 0x04, 0x00,
+ 0x29, 0x7d, 0x0b, 0xb8, 0x34, 0x72, 0x76, 0x32 },
+ { 0x59, 0xdc, 0x78, 0xb1, 0x05, 0x64, 0x97, 0x07,
+ 0xa2, 0xbb, 0x44, 0x19, 0xc4, 0x8f, 0x00, 0x54,
+ 0x00, 0xd3, 0x97, 0x3d, 0xe3, 0x73, 0x66, 0x10,
+ 0x23, 0x04, 0x35, 0xb1, 0x04, 0x24, 0xb2, 0x4f },
+ { 0xc0, 0x14, 0x9d, 0x1d, 0x7e, 0x7a, 0x63, 0x53,
+ 0xa6, 0xd9, 0x06, 0xef, 0xe7, 0x28, 0xf2, 0xf3,
+ 0x29, 0xfe, 0x14, 0xa4, 0x14, 0x9a, 0x3e, 0xa7,
+ 0x76, 0x09, 0xbc, 0x42, 0xb9, 0x75, 0xdd, 0xfa },
+ { 0xa3, 0x2f, 0x24, 0x14, 0x74, 0xa6, 0xc1, 0x69,
+ 0x32, 0xe9, 0x24, 0x3b, 0xe0, 0xcf, 0x09, 0xbc,
+ 0xdc, 0x7e, 0x0c, 0xa0, 0xe7, 0xa6, 0xa1, 0xb9,
+ 0xb1, 0xa0, 0xf0, 0x1e, 0x41, 0x50, 0x23, 0x77 },
+ { 0xb2, 0x39, 0xb2, 0xe4, 0xf8, 0x18, 0x41, 0x36,
+ 0x1c, 0x13, 0x39, 0xf6, 0x8e, 0x2c, 0x35, 0x9f,
+ 0x92, 0x9a, 0xf9, 0xad, 0x9f, 0x34, 0xe0, 0x1a,
+ 0xab, 0x46, 0x31, 0xad, 0x6d, 0x55, 0x00, 0xb0 },
+ { 0x85, 0xfb, 0x41, 0x9c, 0x70, 0x02, 0xa3, 0xe0,
+ 0xb4, 0xb6, 0xea, 0x09, 0x3b, 0x4c, 0x1a, 0xc6,
+ 0x93, 0x66, 0x45, 0xb6, 0x5d, 0xac, 0x5a, 0xc1,
+ 0x5a, 0x85, 0x28, 0xb7, 0xb9, 0x4c, 0x17, 0x54 },
+ { 0x96, 0x19, 0x72, 0x06, 0x25, 0xf1, 0x90, 0xb9,
+ 0x3a, 0x3f, 0xad, 0x18, 0x6a, 0xb3, 0x14, 0x18,
+ 0x96, 0x33, 0xc0, 0xd3, 0xa0, 0x1e, 0x6f, 0x9b,
+ 0xc8, 0xc4, 0xa8, 0xf8, 0x2f, 0x38, 0x3d, 0xbf },
+ { 0x7d, 0x62, 0x0d, 0x90, 0xfe, 0x69, 0xfa, 0x46,
+ 0x9a, 0x65, 0x38, 0x38, 0x89, 0x70, 0xa1, 0xaa,
+ 0x09, 0xbb, 0x48, 0xa2, 0xd5, 0x9b, 0x34, 0x7b,
+ 0x97, 0xe8, 0xce, 0x71, 0xf4, 0x8c, 0x7f, 0x46 },
+ { 0x29, 0x43, 0x83, 0x56, 0x85, 0x96, 0xfb, 0x37,
+ 0xc7, 0x5b, 0xba, 0xcd, 0x97, 0x9c, 0x5f, 0xf6,
+ 0xf2, 0x0a, 0x55, 0x6b, 0xf8, 0x87, 0x9c, 0xc7,
+ 0x29, 0x24, 0x85, 0x5d, 0xf9, 0xb8, 0x24, 0x0e },
+ { 0x16, 0xb1, 0x8a, 0xb3, 0x14, 0x35, 0x9c, 0x2b,
+ 0x83, 0x3c, 0x1c, 0x69, 0x86, 0xd4, 0x8c, 0x55,
+ 0xa9, 0xfc, 0x97, 0xcd, 0xe9, 0xa3, 0xc1, 0xf1,
+ 0x0a, 0x31, 0x77, 0x14, 0x0f, 0x73, 0xf7, 0x38 },
+ { 0x8c, 0xbb, 0xdd, 0x14, 0xbc, 0x33, 0xf0, 0x4c,
+ 0xf4, 0x58, 0x13, 0xe4, 0xa1, 0x53, 0xa2, 0x73,
+ 0xd3, 0x6a, 0xda, 0xd5, 0xce, 0x71, 0xf4, 0x99,
+ 0xee, 0xb8, 0x7f, 0xb8, 0xac, 0x63, 0xb7, 0x29 },
+ { 0x69, 0xc9, 0xa4, 0x98, 0xdb, 0x17, 0x4e, 0xca,
+ 0xef, 0xcc, 0x5a, 0x3a, 0xc9, 0xfd, 0xed, 0xf0,
+ 0xf8, 0x13, 0xa5, 0xbe, 0xc7, 0x27, 0xf1, 0xe7,
+ 0x75, 0xba, 0xbd, 0xec, 0x77, 0x18, 0x81, 0x6e },
+ { 0xb4, 0x62, 0xc3, 0xbe, 0x40, 0x44, 0x8f, 0x1d,
+ 0x4f, 0x80, 0x62, 0x62, 0x54, 0xe5, 0x35, 0xb0,
+ 0x8b, 0xc9, 0xcd, 0xcf, 0xf5, 0x99, 0xa7, 0x68,
+ 0x57, 0x8d, 0x4b, 0x28, 0x81, 0xa8, 0xe3, 0xf0 },
+ { 0x55, 0x3e, 0x9d, 0x9c, 0x5f, 0x36, 0x0a, 0xc0,
+ 0xb7, 0x4a, 0x7d, 0x44, 0xe5, 0xa3, 0x91, 0xda,
+ 0xd4, 0xce, 0xd0, 0x3e, 0x0c, 0x24, 0x18, 0x3b,
+ 0x7e, 0x8e, 0xca, 0xbd, 0xf1, 0x71, 0x5a, 0x64 },
+ { 0x7a, 0x7c, 0x55, 0xa5, 0x6f, 0xa9, 0xae, 0x51,
+ 0xe6, 0x55, 0xe0, 0x19, 0x75, 0xd8, 0xa6, 0xff,
+ 0x4a, 0xe9, 0xe4, 0xb4, 0x86, 0xfc, 0xbe, 0x4e,
+ 0xac, 0x04, 0x45, 0x88, 0xf2, 0x45, 0xeb, 0xea },
+ { 0x2a, 0xfd, 0xf3, 0xc8, 0x2a, 0xbc, 0x48, 0x67,
+ 0xf5, 0xde, 0x11, 0x12, 0x86, 0xc2, 0xb3, 0xbe,
+ 0x7d, 0x6e, 0x48, 0x65, 0x7b, 0xa9, 0x23, 0xcf,
+ 0xbf, 0x10, 0x1a, 0x6d, 0xfc, 0xf9, 0xdb, 0x9a },
+ { 0x41, 0x03, 0x7d, 0x2e, 0xdc, 0xdc, 0xe0, 0xc4,
+ 0x9b, 0x7f, 0xb4, 0xa6, 0xaa, 0x09, 0x99, 0xca,
+ 0x66, 0x97, 0x6c, 0x74, 0x83, 0xaf, 0xe6, 0x31,
+ 0xd4, 0xed, 0xa2, 0x83, 0x14, 0x4f, 0x6d, 0xfc },
+ { 0xc4, 0x46, 0x6f, 0x84, 0x97, 0xca, 0x2e, 0xeb,
+ 0x45, 0x83, 0xa0, 0xb0, 0x8e, 0x9d, 0x9a, 0xc7,
+ 0x43, 0x95, 0x70, 0x9f, 0xda, 0x10, 0x9d, 0x24,
+ 0xf2, 0xe4, 0x46, 0x21, 0x96, 0x77, 0x9c, 0x5d },
+ { 0x75, 0xf6, 0x09, 0x33, 0x8a, 0xa6, 0x7d, 0x96,
+ 0x9a, 0x2a, 0xe2, 0xa2, 0x36, 0x2b, 0x2d, 0xa9,
+ 0xd7, 0x7c, 0x69, 0x5d, 0xfd, 0x1d, 0xf7, 0x22,
+ 0x4a, 0x69, 0x01, 0xdb, 0x93, 0x2c, 0x33, 0x64 },
+ { 0x68, 0x60, 0x6c, 0xeb, 0x98, 0x9d, 0x54, 0x88,
+ 0xfc, 0x7c, 0xf6, 0x49, 0xf3, 0xd7, 0xc2, 0x72,
+ 0xef, 0x05, 0x5d, 0xa1, 0xa9, 0x3f, 0xae, 0xcd,
+ 0x55, 0xfe, 0x06, 0xf6, 0x96, 0x70, 0x98, 0xca },
+ { 0x44, 0x34, 0x6b, 0xde, 0xb7, 0xe0, 0x52, 0xf6,
+ 0x25, 0x50, 0x48, 0xf0, 0xd9, 0xb4, 0x2c, 0x42,
+ 0x5b, 0xab, 0x9c, 0x3d, 0xd2, 0x41, 0x68, 0x21,
+ 0x2c, 0x3e, 0xcf, 0x1e, 0xbf, 0x34, 0xe6, 0xae },
+ { 0x8e, 0x9c, 0xf6, 0xe1, 0xf3, 0x66, 0x47, 0x1f,
+ 0x2a, 0xc7, 0xd2, 0xee, 0x9b, 0x5e, 0x62, 0x66,
+ 0xfd, 0xa7, 0x1f, 0x8f, 0x2e, 0x41, 0x09, 0xf2,
+ 0x23, 0x7e, 0xd5, 0xf8, 0x81, 0x3f, 0xc7, 0x18 },
+ { 0x84, 0xbb, 0xeb, 0x84, 0x06, 0xd2, 0x50, 0x95,
+ 0x1f, 0x8c, 0x1b, 0x3e, 0x86, 0xa7, 0xc0, 0x10,
+ 0x08, 0x29, 0x21, 0x83, 0x3d, 0xfd, 0x95, 0x55,
+ 0xa2, 0xf9, 0x09, 0xb1, 0x08, 0x6e, 0xb4, 0xb8 },
+ { 0xee, 0x66, 0x6f, 0x3e, 0xef, 0x0f, 0x7e, 0x2a,
+ 0x9c, 0x22, 0x29, 0x58, 0xc9, 0x7e, 0xaf, 0x35,
+ 0xf5, 0x1c, 0xed, 0x39, 0x3d, 0x71, 0x44, 0x85,
+ 0xab, 0x09, 0xa0, 0x69, 0x34, 0x0f, 0xdf, 0x88 },
+ { 0xc1, 0x53, 0xd3, 0x4a, 0x65, 0xc4, 0x7b, 0x4a,
+ 0x62, 0xc5, 0xca, 0xcf, 0x24, 0x01, 0x09, 0x75,
+ 0xd0, 0x35, 0x6b, 0x2f, 0x32, 0xc8, 0xf5, 0xda,
+ 0x53, 0x0d, 0x33, 0x88, 0x16, 0xad, 0x5d, 0xe6 },
+ { 0x9f, 0xc5, 0x45, 0x01, 0x09, 0xe1, 0xb7, 0x79,
+ 0xf6, 0xc7, 0xae, 0x79, 0xd5, 0x6c, 0x27, 0x63,
+ 0x5c, 0x8d, 0xd4, 0x26, 0xc5, 0xa9, 0xd5, 0x4e,
+ 0x25, 0x78, 0xdb, 0x98, 0x9b, 0x8c, 0x3b, 0x4e },
+ { 0xd1, 0x2b, 0xf3, 0x73, 0x2e, 0xf4, 0xaf, 0x5c,
+ 0x22, 0xfa, 0x90, 0x35, 0x6a, 0xf8, 0xfc, 0x50,
+ 0xfc, 0xb4, 0x0f, 0x8f, 0x2e, 0xa5, 0xc8, 0x59,
+ 0x47, 0x37, 0xa3, 0xb3, 0xd5, 0xab, 0xdb, 0xd7 },
+ { 0x11, 0x03, 0x0b, 0x92, 0x89, 0xbb, 0xa5, 0xaf,
+ 0x65, 0x26, 0x06, 0x72, 0xab, 0x6f, 0xee, 0x88,
+ 0xb8, 0x74, 0x20, 0xac, 0xef, 0x4a, 0x17, 0x89,
+ 0xa2, 0x07, 0x3b, 0x7e, 0xc2, 0xf2, 0xa0, 0x9e },
+ { 0x69, 0xcb, 0x19, 0x2b, 0x84, 0x44, 0x00, 0x5c,
+ 0x8c, 0x0c, 0xeb, 0x12, 0xc8, 0x46, 0x86, 0x07,
+ 0x68, 0x18, 0x8c, 0xda, 0x0a, 0xec, 0x27, 0xa9,
+ 0xc8, 0xa5, 0x5c, 0xde, 0xe2, 0x12, 0x36, 0x32 },
+ { 0xdb, 0x44, 0x4c, 0x15, 0x59, 0x7b, 0x5f, 0x1a,
+ 0x03, 0xd1, 0xf9, 0xed, 0xd1, 0x6e, 0x4a, 0x9f,
+ 0x43, 0xa6, 0x67, 0xcc, 0x27, 0x51, 0x75, 0xdf,
+ 0xa2, 0xb7, 0x04, 0xe3, 0xbb, 0x1a, 0x9b, 0x83 },
+ { 0x3f, 0xb7, 0x35, 0x06, 0x1a, 0xbc, 0x51, 0x9d,
+ 0xfe, 0x97, 0x9e, 0x54, 0xc1, 0xee, 0x5b, 0xfa,
+ 0xd0, 0xa9, 0xd8, 0x58, 0xb3, 0x31, 0x5b, 0xad,
+ 0x34, 0xbd, 0xe9, 0x99, 0xef, 0xd7, 0x24, 0xdd }
+};
+
+static bool __init blake2s_selftest(void)
+{
+ u8 key[BLAKE2S_KEY_SIZE];
+ u8 buf[ARRAY_SIZE(blake2s_testvecs)];
+ u8 hash[BLAKE2S_HASH_SIZE];
+ size_t i;
+ bool success = true;
+
+ for (i = 0; i < BLAKE2S_KEY_SIZE; ++i)
+ key[i] = (u8)i;
+
+ for (i = 0; i < ARRAY_SIZE(blake2s_testvecs); ++i)
+ buf[i] = (u8)i;
+
+ for (i = 0; i < ARRAY_SIZE(blake2s_keyed_testvecs); ++i) {
+ blake2s(hash, buf, key, BLAKE2S_HASH_SIZE, i, BLAKE2S_KEY_SIZE);
+ if (memcmp(hash, blake2s_keyed_testvecs[i], BLAKE2S_HASH_SIZE)) {
+ pr_err("blake2s keyed self-test %zu: FAIL\n", i + 1);
+ success = false;
+ }
+ }
+
+ for (i = 0; i < ARRAY_SIZE(blake2s_testvecs); ++i) {
+ blake2s(hash, buf, NULL, BLAKE2S_HASH_SIZE, i, 0);
+ if (memcmp(hash, blake2s_testvecs[i], BLAKE2S_HASH_SIZE)) {
+ pr_err("blake2s unkeyed self-test %zu: FAIL\n", i + 1);
+ success = false;
+ }
+ }
+ return success;
+}
--
2.19.0
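
The self-test above follows a known-answer pattern: hash every prefix of the byte sequence 00 01 02 ..., compare each digest with a stored table, and report the 1-based number of any vector that fails. Below is a minimal userspace sketch of that pattern only. It is not BLAKE2s: toy_hash() is a stand-in (32-bit FNV-1a), and every name in it (toy_hash, toy_make_vectors, toy_selftest, TOY_HASH_SIZE, TOY_MAX_VECS) is hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define TOY_HASH_SIZE 4		/* digest size of the stand-in hash */
#define TOY_MAX_VECS 16		/* upper bound on table entries */

/* Stand-in digest: 32-bit FNV-1a, stored little-endian. NOT BLAKE2s. */
static void toy_hash(uint8_t out[TOY_HASH_SIZE], const uint8_t *in,
		     size_t inlen)
{
	uint32_t h = 2166136261u;
	size_t i;

	for (i = 0; i < inlen; ++i) {
		h ^= in[i];
		h *= 16777619u;
	}
	for (i = 0; i < TOY_HASH_SIZE; ++i)
		out[i] = (uint8_t)(h >> (8 * i));
}

/* Fill entry i of the flat table with the digest of the i-byte prefix. */
static void toy_make_vectors(uint8_t *vecs, size_t nvecs)
{
	uint8_t buf[TOY_MAX_VECS];
	size_t i;

	assert(nvecs <= TOY_MAX_VECS);
	for (i = 0; i < nvecs; ++i)
		buf[i] = (uint8_t)i;
	for (i = 0; i < nvecs; ++i)
		toy_hash(vecs + i * TOY_HASH_SIZE, buf, i);
}

/*
 * Re-hash every prefix and compare it with the table. Returns the number
 * of mismatches and stores the 1-based number of the first failing
 * vector in *first_bad (0 if every vector matched).
 */
static size_t toy_selftest(const uint8_t *vecs, size_t nvecs,
			   size_t *first_bad)
{
	uint8_t buf[TOY_MAX_VECS];
	uint8_t hash[TOY_HASH_SIZE];
	size_t i, failures = 0;

	assert(nvecs <= TOY_MAX_VECS);
	*first_bad = 0;
	for (i = 0; i < nvecs; ++i)
		buf[i] = (uint8_t)i;

	for (i = 0; i < nvecs; ++i) {
		toy_hash(hash, buf, i);
		if (memcmp(hash, vecs + i * TOY_HASH_SIZE, TOY_HASH_SIZE)) {
			fprintf(stderr, "toy self-test %zu: FAIL\n", i + 1);
			if (!failures)
				*first_bad = i + 1;
			++failures;
		}
	}
	return failures;
}
```

Like the loops in the patch, the sketch keeps going after a mismatch instead of stopping at the first one, so a single run lists every failing vector.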
* [PATCH net-next v7 20/28] zinc: BLAKE2s x86_64 implementation
[not found] <20181006025709.4019-1-Jason@zx2c4.com>
` (16 preceding siblings ...)
2018-10-06 2:57 ` [PATCH net-next v7 19/28] zinc: BLAKE2s generic C implementation " Jason A. Donenfeld
@ 2018-10-06 2:57 ` Jason A. Donenfeld
2018-10-06 2:57 ` [PATCH net-next v7 21/28] zinc: Curve25519 generic C implementations and selftest Jason A. Donenfeld
` (5 subsequent siblings)
23 siblings, 0 replies; 28+ messages in thread
From: Jason A. Donenfeld @ 2018-10-06 2:57 UTC (permalink / raw)
To: linux-kernel, netdev, davem, gregkh
Cc: Jason A. Donenfeld, Samuel Neves, Thomas Gleixner, Ingo Molnar,
x86, Jean-Philippe Aumasson, Andy Lutomirski, Andrew Morton,
Linus Torvalds, kernel-hardening, linux-crypto
These implementations from Samuel Neves support AVX and AVX-512VL.
Originally this used AVX-512F, but Skylake thermal throttling made
AVX-512VL more attractive, and the switch was possible with negligible
difference.
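As a rough illustration of what both assembly paths compute, here is a
user-space sketch of the BLAKE2s G mixing step with its 16/12/8/7-bit
right rotations. The AVX path in this patch builds the 12- and 7-bit
rotations from vpslld/vpsrld/vpxor and the 16- and 8-bit ones from
vpshufb with the ROT16 and ROR328 constants, whereas the AVX-512VL path
uses vprord directly. The helper names below (ror32_shift_xor,
blake2s_g) are made up for this sketch and are not part of the patch.

```c
#include <stdint.h>

/* Rotate a 32-bit word right by n bits, for 0 < n < 32. */
static inline uint32_t ror32(uint32_t x, unsigned int n)
{
	return (x >> n) | (x << (32 - n));
}

/*
 * The same rotation spelled as shift-right, shift-left, XOR, which is the
 * shape of the vpsrld/vpslld/vpxor triples in the AVX path. The two shifted
 * halves never overlap, so XOR and OR give the same result.
 */
static inline uint32_t ror32_shift_xor(uint32_t x, unsigned int n)
{
	return (x >> n) ^ (x << (32 - n));
}

/*
 * BLAKE2s G function (RFC 7693) on one column or diagonal of the state.
 * v holds the four words (a, b, c, d); m0 and m1 are the two message words
 * selected by the round's permutation.
 */
static void blake2s_g(uint32_t v[4], uint32_t m0, uint32_t m1)
{
	uint32_t a = v[0], b = v[1], c = v[2], d = v[3];

	a += b + m0;
	d = ror32(d ^ a, 16);
	c += d;
	b = ror32(b ^ c, 12);
	a += b + m1;
	d = ror32(d ^ a, 8);
	c += d;
	b = ror32(b ^ c, 7);

	v[0] = a;
	v[1] = b;
	v[2] = c;
	v[3] = d;
}
```

The vector code applies this step to four columns (and then four
diagonals) at once, one 32-bit lane each, which is why a native vector
rotate shortens the AVX-512VL round loop so much.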
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Samuel Neves <sneves@dei.uc.pt>
Co-developed-by: Samuel Neves <sneves@dei.uc.pt>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: x86@kernel.org
Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: kernel-hardening@lists.openwall.com
Cc: linux-crypto@vger.kernel.org
---
lib/zinc/Makefile | 1 +
lib/zinc/blake2s/blake2s-x86_64-glue.c | 71 +++
lib/zinc/blake2s/blake2s-x86_64.S | 685 +++++++++++++++++++++++++
lib/zinc/blake2s/blake2s.c | 4 +
4 files changed, 761 insertions(+)
create mode 100644 lib/zinc/blake2s/blake2s-x86_64-glue.c
create mode 100644 lib/zinc/blake2s/blake2s-x86_64.S
diff --git a/lib/zinc/Makefile b/lib/zinc/Makefile
index d2ec55c33ef0..67ad837c822c 100644
--- a/lib/zinc/Makefile
+++ b/lib/zinc/Makefile
@@ -23,4 +23,5 @@ zinc_chacha20poly1305-y := chacha20poly1305.o
obj-$(CONFIG_ZINC_CHACHA20POLY1305) += zinc_chacha20poly1305.o
zinc_blake2s-y := blake2s/blake2s.o
+zinc_blake2s-$(CONFIG_ZINC_ARCH_X86_64) += blake2s/blake2s-x86_64.o
obj-$(CONFIG_ZINC_BLAKE2S) += zinc_blake2s.o
diff --git a/lib/zinc/blake2s/blake2s-x86_64-glue.c b/lib/zinc/blake2s/blake2s-x86_64-glue.c
new file mode 100644
index 000000000000..615c38861579
--- /dev/null
+++ b/lib/zinc/blake2s/blake2s-x86_64-glue.c
@@ -0,0 +1,71 @@
+// SPDX-License-Identifier: GPL-2.0 OR MIT
+/*
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#include <linux/simd.h>
+#include <asm/cpufeature.h>
+#include <asm/processor.h>
+#include <asm/fpu/api.h>
+
+asmlinkage void blake2s_compress_avx(struct blake2s_state *state,
+ const u8 *block, const size_t nblocks,
+ const u32 inc);
+asmlinkage void blake2s_compress_avx512(struct blake2s_state *state,
+ const u8 *block, const size_t nblocks,
+ const u32 inc);
+
+static bool blake2s_use_avx __ro_after_init;
+static bool blake2s_use_avx512 __ro_after_init;
+static bool *const blake2s_nobs[] __initconst = { &blake2s_use_avx512 };
+
+static void __init blake2s_fpu_init(void)
+{
+ blake2s_use_avx =
+ boot_cpu_has(X86_FEATURE_AVX) &&
+ cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL);
+ blake2s_use_avx512 =
+ boot_cpu_has(X86_FEATURE_AVX) &&
+ boot_cpu_has(X86_FEATURE_AVX2) &&
+ boot_cpu_has(X86_FEATURE_AVX512F) &&
+ boot_cpu_has(X86_FEATURE_AVX512VL) &&
+ cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM |
+ XFEATURE_MASK_AVX512, NULL);
+}
+
+static inline bool blake2s_compress_arch(struct blake2s_state *state,
+ const u8 *block, size_t nblocks,
+ const u32 inc)
+{
+ simd_context_t simd_context;
+ bool used_arch = false;
+
+ /* SIMD disables preemption, so relax after processing each page. */
+ BUILD_BUG_ON(PAGE_SIZE / BLAKE2S_BLOCK_SIZE < 8);
+
+ simd_get(&simd_context);
+
+ if (!IS_ENABLED(CONFIG_AS_AVX) || !blake2s_use_avx ||
+ !simd_use(&simd_context))
+ goto out;
+ used_arch = true;
+
+ for (;;) {
+ const size_t blocks = min_t(size_t, nblocks,
+ PAGE_SIZE / BLAKE2S_BLOCK_SIZE);
+
+ if (IS_ENABLED(CONFIG_AS_AVX512) && blake2s_use_avx512)
+ blake2s_compress_avx512(state, block, blocks, inc);
+ else
+ blake2s_compress_avx(state, block, blocks, inc);
+
+ nblocks -= blocks;
+ if (!nblocks)
+ break;
+ block += blocks * BLAKE2S_BLOCK_SIZE;
+ simd_relax(&simd_context);
+ }
+out:
+ simd_put(&simd_context);
+ return used_arch;
+}
diff --git a/lib/zinc/blake2s/blake2s-x86_64.S b/lib/zinc/blake2s/blake2s-x86_64.S
new file mode 100644
index 000000000000..1407a5ffb5c2
--- /dev/null
+++ b/lib/zinc/blake2s/blake2s-x86_64.S
@@ -0,0 +1,685 @@
+/* SPDX-License-Identifier: GPL-2.0 OR MIT */
+/*
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ * Copyright (C) 2017 Samuel Neves <sneves@dei.uc.pt>. All Rights Reserved.
+ */
+
+#include <linux/linkage.h>
+
+.section .rodata.cst32.BLAKE2S_IV, "aM", @progbits, 32
+.align 32
+IV: .octa 0xA54FF53A3C6EF372BB67AE856A09E667
+ .octa 0x5BE0CD191F83D9AB9B05688C510E527F
+.section .rodata.cst16.ROT16, "aM", @progbits, 16
+.align 16
+ROT16: .octa 0x0D0C0F0E09080B0A0504070601000302
+.section .rodata.cst16.ROR328, "aM", @progbits, 16
+.align 16
+ROR328: .octa 0x0C0F0E0D080B0A090407060500030201
+#ifdef CONFIG_AS_AVX512
+.section .rodata.cst64.BLAKE2S_SIGMA, "aM", @progbits, 640
+.align 64
+SIGMA:
+.long 0, 2, 4, 6, 1, 3, 5, 7, 8, 10, 12, 14, 9, 11, 13, 15
+.long 11, 2, 12, 14, 9, 8, 15, 3, 4, 0, 13, 6, 10, 1, 7, 5
+.long 10, 12, 11, 6, 5, 9, 13, 3, 4, 15, 14, 2, 0, 7, 8, 1
+.long 10, 9, 7, 0, 11, 14, 1, 12, 6, 2, 15, 3, 13, 8, 5, 4
+.long 4, 9, 8, 13, 14, 0, 10, 11, 7, 3, 12, 1, 5, 6, 15, 2
+.long 2, 10, 4, 14, 13, 3, 9, 11, 6, 5, 7, 12, 15, 1, 8, 0
+.long 4, 11, 14, 8, 13, 10, 12, 5, 2, 1, 15, 3, 9, 7, 0, 6
+.long 6, 12, 0, 13, 15, 2, 1, 10, 4, 5, 11, 14, 8, 3, 9, 7
+.long 14, 5, 4, 12, 9, 7, 3, 10, 2, 0, 6, 15, 11, 1, 13, 8
+.long 11, 7, 13, 10, 12, 14, 0, 15, 4, 5, 6, 9, 2, 1, 8, 3
+#endif /* CONFIG_AS_AVX512 */
+
+.text
+#ifdef CONFIG_AS_AVX
+ENTRY(blake2s_compress_avx)
+ movl %ecx, %ecx
+ testq %rdx, %rdx
+ je .Lendofloop
+ .align 32
+.Lbeginofloop:
+ addq %rcx, 32(%rdi)
+ vmovdqu IV+16(%rip), %xmm1
+ vmovdqu (%rsi), %xmm4
+ vpxor 32(%rdi), %xmm1, %xmm1
+ vmovdqu 16(%rsi), %xmm3
+ vshufps $136, %xmm3, %xmm4, %xmm6
+ vmovdqa ROT16(%rip), %xmm7
+ vpaddd (%rdi), %xmm6, %xmm6
+ vpaddd 16(%rdi), %xmm6, %xmm6
+ vpxor %xmm6, %xmm1, %xmm1
+ vmovdqu IV(%rip), %xmm8
+ vpshufb %xmm7, %xmm1, %xmm1
+ vmovdqu 48(%rsi), %xmm5
+ vpaddd %xmm1, %xmm8, %xmm8
+ vpxor 16(%rdi), %xmm8, %xmm9
+ vmovdqu 32(%rsi), %xmm2
+ vpblendw $12, %xmm3, %xmm5, %xmm13
+ vshufps $221, %xmm5, %xmm2, %xmm12
+ vpunpckhqdq %xmm2, %xmm4, %xmm14
+ vpslld $20, %xmm9, %xmm0
+ vpsrld $12, %xmm9, %xmm9
+ vpxor %xmm0, %xmm9, %xmm0
+ vshufps $221, %xmm3, %xmm4, %xmm9
+ vpaddd %xmm9, %xmm6, %xmm9
+ vpaddd %xmm0, %xmm9, %xmm9
+ vpxor %xmm9, %xmm1, %xmm1
+ vmovdqa ROR328(%rip), %xmm6
+ vpshufb %xmm6, %xmm1, %xmm1
+ vpaddd %xmm1, %xmm8, %xmm8
+ vpxor %xmm8, %xmm0, %xmm0
+ vpshufd $147, %xmm1, %xmm1
+ vpshufd $78, %xmm8, %xmm8
+ vpslld $25, %xmm0, %xmm10
+ vpsrld $7, %xmm0, %xmm0
+ vpxor %xmm10, %xmm0, %xmm0
+ vshufps $136, %xmm5, %xmm2, %xmm10
+ vpshufd $57, %xmm0, %xmm0
+ vpaddd %xmm10, %xmm9, %xmm9
+ vpaddd %xmm0, %xmm9, %xmm9
+ vpxor %xmm9, %xmm1, %xmm1
+ vpaddd %xmm12, %xmm9, %xmm9
+ vpblendw $12, %xmm2, %xmm3, %xmm12
+ vpshufb %xmm7, %xmm1, %xmm1
+ vpaddd %xmm1, %xmm8, %xmm8
+ vpxor %xmm8, %xmm0, %xmm10
+ vpslld $20, %xmm10, %xmm0
+ vpsrld $12, %xmm10, %xmm10
+ vpxor %xmm0, %xmm10, %xmm0
+ vpaddd %xmm0, %xmm9, %xmm9
+ vpxor %xmm9, %xmm1, %xmm1
+ vpshufb %xmm6, %xmm1, %xmm1
+ vpaddd %xmm1, %xmm8, %xmm8
+ vpxor %xmm8, %xmm0, %xmm0
+ vpshufd $57, %xmm1, %xmm1
+ vpshufd $78, %xmm8, %xmm8
+ vpslld $25, %xmm0, %xmm10
+ vpsrld $7, %xmm0, %xmm0
+ vpxor %xmm10, %xmm0, %xmm0
+ vpslldq $4, %xmm5, %xmm10
+ vpblendw $240, %xmm10, %xmm12, %xmm12
+ vpshufd $147, %xmm0, %xmm0
+ vpshufd $147, %xmm12, %xmm12
+ vpaddd %xmm9, %xmm12, %xmm12
+ vpaddd %xmm0, %xmm12, %xmm12
+ vpxor %xmm12, %xmm1, %xmm1
+ vpshufb %xmm7, %xmm1, %xmm1
+ vpaddd %xmm1, %xmm8, %xmm8
+ vpxor %xmm8, %xmm0, %xmm11
+ vpslld $20, %xmm11, %xmm9
+ vpsrld $12, %xmm11, %xmm11
+ vpxor %xmm9, %xmm11, %xmm0
+ vpshufd $8, %xmm2, %xmm9
+ vpblendw $192, %xmm5, %xmm3, %xmm11
+ vpblendw $240, %xmm11, %xmm9, %xmm9
+ vpshufd $177, %xmm9, %xmm9
+ vpaddd %xmm12, %xmm9, %xmm9
+ vpaddd %xmm0, %xmm9, %xmm11
+ vpxor %xmm11, %xmm1, %xmm1
+ vpshufb %xmm6, %xmm1, %xmm1
+ vpaddd %xmm1, %xmm8, %xmm8
+ vpxor %xmm8, %xmm0, %xmm9
+ vpshufd $147, %xmm1, %xmm1
+ vpshufd $78, %xmm8, %xmm8
+ vpslld $25, %xmm9, %xmm0
+ vpsrld $7, %xmm9, %xmm9
+ vpxor %xmm0, %xmm9, %xmm0
+ vpslldq $4, %xmm3, %xmm9
+ vpblendw $48, %xmm9, %xmm2, %xmm9
+ vpblendw $240, %xmm9, %xmm4, %xmm9
+ vpshufd $57, %xmm0, %xmm0
+ vpshufd $177, %xmm9, %xmm9
+ vpaddd %xmm11, %xmm9, %xmm9
+ vpaddd %xmm0, %xmm9, %xmm9
+ vpxor %xmm9, %xmm1, %xmm1
+ vpshufb %xmm7, %xmm1, %xmm1
+ vpaddd %xmm1, %xmm8, %xmm11
+ vpxor %xmm11, %xmm0, %xmm0
+ vpslld $20, %xmm0, %xmm8
+ vpsrld $12, %xmm0, %xmm0
+ vpxor %xmm8, %xmm0, %xmm0
+ vpunpckhdq %xmm3, %xmm4, %xmm8
+ vpblendw $12, %xmm10, %xmm8, %xmm12
+ vpshufd $177, %xmm12, %xmm12
+ vpaddd %xmm9, %xmm12, %xmm9
+ vpaddd %xmm0, %xmm9, %xmm9
+ vpxor %xmm9, %xmm1, %xmm1
+ vpshufb %xmm6, %xmm1, %xmm1
+ vpaddd %xmm1, %xmm11, %xmm11
+ vpxor %xmm11, %xmm0, %xmm0
+ vpshufd $57, %xmm1, %xmm1
+ vpshufd $78, %xmm11, %xmm11
+ vpslld $25, %xmm0, %xmm12
+ vpsrld $7, %xmm0, %xmm0
+ vpxor %xmm12, %xmm0, %xmm0
+ vpunpckhdq %xmm5, %xmm2, %xmm12
+ vpshufd $147, %xmm0, %xmm0
+ vpblendw $15, %xmm13, %xmm12, %xmm12
+ vpslldq $8, %xmm5, %xmm13
+ vpshufd $210, %xmm12, %xmm12
+ vpaddd %xmm9, %xmm12, %xmm9
+ vpaddd %xmm0, %xmm9, %xmm9
+ vpxor %xmm9, %xmm1, %xmm1
+ vpshufb %xmm7, %xmm1, %xmm1
+ vpaddd %xmm1, %xmm11, %xmm11
+ vpxor %xmm11, %xmm0, %xmm0
+ vpslld $20, %xmm0, %xmm12
+ vpsrld $12, %xmm0, %xmm0
+ vpxor %xmm12, %xmm0, %xmm0
+ vpunpckldq %xmm4, %xmm2, %xmm12
+ vpblendw $240, %xmm4, %xmm12, %xmm12
+ vpblendw $192, %xmm13, %xmm12, %xmm12
+ vpsrldq $12, %xmm3, %xmm13
+ vpaddd %xmm12, %xmm9, %xmm9
+ vpaddd %xmm0, %xmm9, %xmm9
+ vpxor %xmm9, %xmm1, %xmm1
+ vpshufb %xmm6, %xmm1, %xmm1
+ vpaddd %xmm1, %xmm11, %xmm11
+ vpxor %xmm11, %xmm0, %xmm0
+ vpshufd $147, %xmm1, %xmm1
+ vpshufd $78, %xmm11, %xmm11
+ vpslld $25, %xmm0, %xmm12
+ vpsrld $7, %xmm0, %xmm0
+ vpxor %xmm12, %xmm0, %xmm0
+ vpblendw $60, %xmm2, %xmm4, %xmm12
+ vpblendw $3, %xmm13, %xmm12, %xmm12
+ vpshufd $57, %xmm0, %xmm0
+ vpshufd $78, %xmm12, %xmm12
+ vpaddd %xmm9, %xmm12, %xmm9
+ vpaddd %xmm0, %xmm9, %xmm9
+ vpxor %xmm9, %xmm1, %xmm1
+ vpshufb %xmm7, %xmm1, %xmm1
+ vpaddd %xmm1, %xmm11, %xmm11
+ vpxor %xmm11, %xmm0, %xmm12
+ vpslld $20, %xmm12, %xmm13
+ vpsrld $12, %xmm12, %xmm0
+ vpblendw $51, %xmm3, %xmm4, %xmm12
+ vpxor %xmm13, %xmm0, %xmm0
+ vpblendw $192, %xmm10, %xmm12, %xmm10
+ vpslldq $8, %xmm2, %xmm12
+ vpshufd $27, %xmm10, %xmm10
+ vpaddd %xmm9, %xmm10, %xmm9
+ vpaddd %xmm0, %xmm9, %xmm9
+ vpxor %xmm9, %xmm1, %xmm1
+ vpshufb %xmm6, %xmm1, %xmm1
+ vpaddd %xmm1, %xmm11, %xmm11
+ vpxor %xmm11, %xmm0, %xmm0
+ vpshufd $57, %xmm1, %xmm1
+ vpshufd $78, %xmm11, %xmm11
+ vpslld $25, %xmm0, %xmm10
+ vpsrld $7, %xmm0, %xmm0
+ vpxor %xmm10, %xmm0, %xmm0
+ vpunpckhdq %xmm2, %xmm8, %xmm10
+ vpshufd $147, %xmm0, %xmm0
+ vpblendw $12, %xmm5, %xmm10, %xmm10
+ vpshufd $210, %xmm10, %xmm10
+ vpaddd %xmm9, %xmm10, %xmm9
+ vpaddd %xmm0, %xmm9, %xmm9
+ vpxor %xmm9, %xmm1, %xmm1
+ vpshufb %xmm7, %xmm1, %xmm1
+ vpaddd %xmm1, %xmm11, %xmm11
+ vpxor %xmm11, %xmm0, %xmm10
+ vpslld $20, %xmm10, %xmm0
+ vpsrld $12, %xmm10, %xmm10
+ vpxor %xmm0, %xmm10, %xmm0
+ vpblendw $12, %xmm4, %xmm5, %xmm10
+ vpblendw $192, %xmm12, %xmm10, %xmm10
+ vpunpckldq %xmm2, %xmm4, %xmm12
+ vpshufd $135, %xmm10, %xmm10
+ vpaddd %xmm9, %xmm10, %xmm9
+ vpaddd %xmm0, %xmm9, %xmm9
+ vpxor %xmm9, %xmm1, %xmm1
+ vpshufb %xmm6, %xmm1, %xmm1
+ vpaddd %xmm1, %xmm11, %xmm13
+ vpxor %xmm13, %xmm0, %xmm0
+ vpshufd $147, %xmm1, %xmm1
+ vpshufd $78, %xmm13, %xmm13
+ vpslld $25, %xmm0, %xmm10
+ vpsrld $7, %xmm0, %xmm0
+ vpxor %xmm10, %xmm0, %xmm0
+ vpblendw $15, %xmm3, %xmm4, %xmm10
+ vpblendw $192, %xmm5, %xmm10, %xmm10
+ vpshufd $57, %xmm0, %xmm0
+ vpshufd $198, %xmm10, %xmm10
+ vpaddd %xmm9, %xmm10, %xmm10
+ vpaddd %xmm0, %xmm10, %xmm10
+ vpxor %xmm10, %xmm1, %xmm1
+ vpshufb %xmm7, %xmm1, %xmm1
+ vpaddd %xmm1, %xmm13, %xmm13
+ vpxor %xmm13, %xmm0, %xmm9
+ vpslld $20, %xmm9, %xmm0
+ vpsrld $12, %xmm9, %xmm9
+ vpxor %xmm0, %xmm9, %xmm0
+ vpunpckhdq %xmm2, %xmm3, %xmm9
+ vpunpcklqdq %xmm12, %xmm9, %xmm15
+ vpunpcklqdq %xmm12, %xmm8, %xmm12
+ vpblendw $15, %xmm5, %xmm8, %xmm8
+ vpaddd %xmm15, %xmm10, %xmm15
+ vpaddd %xmm0, %xmm15, %xmm15
+ vpxor %xmm15, %xmm1, %xmm1
+ vpshufd $141, %xmm8, %xmm8
+ vpshufb %xmm6, %xmm1, %xmm1
+ vpaddd %xmm1, %xmm13, %xmm13
+ vpxor %xmm13, %xmm0, %xmm0
+ vpshufd $57, %xmm1, %xmm1
+ vpshufd $78, %xmm13, %xmm13
+ vpslld $25, %xmm0, %xmm10
+ vpsrld $7, %xmm0, %xmm0
+ vpxor %xmm10, %xmm0, %xmm0
+ vpunpcklqdq %xmm2, %xmm3, %xmm10
+ vpshufd $147, %xmm0, %xmm0
+ vpblendw $51, %xmm14, %xmm10, %xmm14
+ vpshufd $135, %xmm14, %xmm14
+ vpaddd %xmm15, %xmm14, %xmm14
+ vpaddd %xmm0, %xmm14, %xmm14
+ vpxor %xmm14, %xmm1, %xmm1
+ vpunpcklqdq %xmm3, %xmm4, %xmm15
+ vpshufb %xmm7, %xmm1, %xmm1
+ vpaddd %xmm1, %xmm13, %xmm13
+ vpxor %xmm13, %xmm0, %xmm0
+ vpslld $20, %xmm0, %xmm11
+ vpsrld $12, %xmm0, %xmm0
+ vpxor %xmm11, %xmm0, %xmm0
+ vpunpckhqdq %xmm5, %xmm3, %xmm11
+ vpblendw $51, %xmm15, %xmm11, %xmm11
+ vpunpckhqdq %xmm3, %xmm5, %xmm15
+ vpaddd %xmm11, %xmm14, %xmm11
+ vpaddd %xmm0, %xmm11, %xmm11
+ vpxor %xmm11, %xmm1, %xmm1
+ vpshufb %xmm6, %xmm1, %xmm1
+ vpaddd %xmm1, %xmm13, %xmm13
+ vpxor %xmm13, %xmm0, %xmm0
+ vpshufd $147, %xmm1, %xmm1
+ vpshufd $78, %xmm13, %xmm13
+ vpslld $25, %xmm0, %xmm14
+ vpsrld $7, %xmm0, %xmm0
+ vpxor %xmm14, %xmm0, %xmm14
+ vpunpckhqdq %xmm4, %xmm2, %xmm0
+ vpshufd $57, %xmm14, %xmm14
+ vpblendw $51, %xmm15, %xmm0, %xmm15
+ vpaddd %xmm15, %xmm11, %xmm15
+ vpaddd %xmm14, %xmm15, %xmm15
+ vpxor %xmm15, %xmm1, %xmm1
+ vpshufb %xmm7, %xmm1, %xmm1
+ vpaddd %xmm1, %xmm13, %xmm13
+ vpxor %xmm13, %xmm14, %xmm14
+ vpslld $20, %xmm14, %xmm11
+ vpsrld $12, %xmm14, %xmm14
+ vpxor %xmm11, %xmm14, %xmm14
+ vpblendw $3, %xmm2, %xmm4, %xmm11
+ vpslldq $8, %xmm11, %xmm0
+ vpblendw $15, %xmm5, %xmm0, %xmm0
+ vpshufd $99, %xmm0, %xmm0
+ vpaddd %xmm15, %xmm0, %xmm15
+ vpaddd %xmm14, %xmm15, %xmm15
+ vpxor %xmm15, %xmm1, %xmm0
+ vpaddd %xmm12, %xmm15, %xmm15
+ vpshufb %xmm6, %xmm0, %xmm0
+ vpaddd %xmm0, %xmm13, %xmm13
+ vpxor %xmm13, %xmm14, %xmm14
+ vpshufd $57, %xmm0, %xmm0
+ vpshufd $78, %xmm13, %xmm13
+ vpslld $25, %xmm14, %xmm1
+ vpsrld $7, %xmm14, %xmm14
+ vpxor %xmm1, %xmm14, %xmm14
+ vpblendw $3, %xmm5, %xmm4, %xmm1
+ vpshufd $147, %xmm14, %xmm14
+ vpaddd %xmm14, %xmm15, %xmm15
+ vpxor %xmm15, %xmm0, %xmm0
+ vpshufb %xmm7, %xmm0, %xmm0
+ vpaddd %xmm0, %xmm13, %xmm13
+ vpxor %xmm13, %xmm14, %xmm14
+ vpslld $20, %xmm14, %xmm12
+ vpsrld $12, %xmm14, %xmm14
+ vpxor %xmm12, %xmm14, %xmm14
+ vpsrldq $4, %xmm2, %xmm12
+ vpblendw $60, %xmm12, %xmm1, %xmm1
+ vpaddd %xmm1, %xmm15, %xmm15
+ vpaddd %xmm14, %xmm15, %xmm15
+ vpxor %xmm15, %xmm0, %xmm0
+ vpblendw $12, %xmm4, %xmm3, %xmm1
+ vpshufb %xmm6, %xmm0, %xmm0
+ vpaddd %xmm0, %xmm13, %xmm13
+ vpxor %xmm13, %xmm14, %xmm14
+ vpshufd $147, %xmm0, %xmm0
+ vpshufd $78, %xmm13, %xmm13
+ vpslld $25, %xmm14, %xmm12
+ vpsrld $7, %xmm14, %xmm14
+ vpxor %xmm12, %xmm14, %xmm14
+ vpsrldq $4, %xmm5, %xmm12
+ vpblendw $48, %xmm12, %xmm1, %xmm1
+ vpshufd $33, %xmm5, %xmm12
+ vpshufd $57, %xmm14, %xmm14
+ vpshufd $108, %xmm1, %xmm1
+ vpblendw $51, %xmm12, %xmm10, %xmm12
+ vpaddd %xmm15, %xmm1, %xmm15
+ vpaddd %xmm14, %xmm15, %xmm15
+ vpxor %xmm15, %xmm0, %xmm0
+ vpaddd %xmm12, %xmm15, %xmm15
+ vpshufb %xmm7, %xmm0, %xmm0
+ vpaddd %xmm0, %xmm13, %xmm1
+ vpxor %xmm1, %xmm14, %xmm14
+ vpslld $20, %xmm14, %xmm13
+ vpsrld $12, %xmm14, %xmm14
+ vpxor %xmm13, %xmm14, %xmm14
+ vpslldq $12, %xmm3, %xmm13
+ vpaddd %xmm14, %xmm15, %xmm15
+ vpxor %xmm15, %xmm0, %xmm0
+ vpshufb %xmm6, %xmm0, %xmm0
+ vpaddd %xmm0, %xmm1, %xmm1
+ vpxor %xmm1, %xmm14, %xmm14
+ vpshufd $57, %xmm0, %xmm0
+ vpshufd $78, %xmm1, %xmm1
+ vpslld $25, %xmm14, %xmm12
+ vpsrld $7, %xmm14, %xmm14
+ vpxor %xmm12, %xmm14, %xmm14
+ vpblendw $51, %xmm5, %xmm4, %xmm12
+ vpshufd $147, %xmm14, %xmm14
+ vpblendw $192, %xmm13, %xmm12, %xmm12
+ vpaddd %xmm12, %xmm15, %xmm15
+ vpaddd %xmm14, %xmm15, %xmm15
+ vpxor %xmm15, %xmm0, %xmm0
+ vpsrldq $4, %xmm3, %xmm12
+ vpshufb %xmm7, %xmm0, %xmm0
+ vpaddd %xmm0, %xmm1, %xmm1
+ vpxor %xmm1, %xmm14, %xmm14
+ vpslld $20, %xmm14, %xmm13
+ vpsrld $12, %xmm14, %xmm14
+ vpxor %xmm13, %xmm14, %xmm14
+ vpblendw $48, %xmm2, %xmm5, %xmm13
+ vpblendw $3, %xmm12, %xmm13, %xmm13
+ vpshufd $156, %xmm13, %xmm13
+ vpaddd %xmm15, %xmm13, %xmm15
+ vpaddd %xmm14, %xmm15, %xmm15
+ vpxor %xmm15, %xmm0, %xmm0
+ vpshufb %xmm6, %xmm0, %xmm0
+ vpaddd %xmm0, %xmm1, %xmm1
+ vpxor %xmm1, %xmm14, %xmm14
+ vpshufd $147, %xmm0, %xmm0
+ vpshufd $78, %xmm1, %xmm1
+ vpslld $25, %xmm14, %xmm13
+ vpsrld $7, %xmm14, %xmm14
+ vpxor %xmm13, %xmm14, %xmm14
+ vpunpcklqdq %xmm2, %xmm4, %xmm13
+ vpshufd $57, %xmm14, %xmm14
+ vpblendw $12, %xmm12, %xmm13, %xmm12
+ vpshufd $180, %xmm12, %xmm12
+ vpaddd %xmm15, %xmm12, %xmm15
+ vpaddd %xmm14, %xmm15, %xmm15
+ vpxor %xmm15, %xmm0, %xmm0
+ vpshufb %xmm7, %xmm0, %xmm0
+ vpaddd %xmm0, %xmm1, %xmm1
+ vpxor %xmm1, %xmm14, %xmm14
+ vpslld $20, %xmm14, %xmm12
+ vpsrld $12, %xmm14, %xmm14
+ vpxor %xmm12, %xmm14, %xmm14
+ vpunpckhqdq %xmm9, %xmm4, %xmm12
+ vpshufd $198, %xmm12, %xmm12
+ vpaddd %xmm15, %xmm12, %xmm15
+ vpaddd %xmm14, %xmm15, %xmm15
+ vpxor %xmm15, %xmm0, %xmm0
+ vpaddd %xmm15, %xmm8, %xmm15
+ vpshufb %xmm6, %xmm0, %xmm0
+ vpaddd %xmm0, %xmm1, %xmm1
+ vpxor %xmm1, %xmm14, %xmm14
+ vpshufd $57, %xmm0, %xmm0
+ vpshufd $78, %xmm1, %xmm1
+ vpslld $25, %xmm14, %xmm12
+ vpsrld $7, %xmm14, %xmm14
+ vpxor %xmm12, %xmm14, %xmm14
+ vpsrldq $4, %xmm4, %xmm12
+ vpshufd $147, %xmm14, %xmm14
+ vpaddd %xmm14, %xmm15, %xmm15
+ vpxor %xmm15, %xmm0, %xmm0
+ vpshufb %xmm7, %xmm0, %xmm0
+ vpaddd %xmm0, %xmm1, %xmm1
+ vpxor %xmm1, %xmm14, %xmm14
+ vpslld $20, %xmm14, %xmm8
+ vpsrld $12, %xmm14, %xmm14
+ vpxor %xmm14, %xmm8, %xmm14
+ vpblendw $48, %xmm5, %xmm2, %xmm8
+ vpblendw $3, %xmm12, %xmm8, %xmm8
+ vpunpckhqdq %xmm5, %xmm4, %xmm12
+ vpshufd $75, %xmm8, %xmm8
+ vpblendw $60, %xmm10, %xmm12, %xmm10
+ vpaddd %xmm15, %xmm8, %xmm15
+ vpaddd %xmm14, %xmm15, %xmm15
+ vpxor %xmm0, %xmm15, %xmm0
+ vpshufd $45, %xmm10, %xmm10
+ vpshufb %xmm6, %xmm0, %xmm0
+ vpaddd %xmm15, %xmm10, %xmm15
+ vpaddd %xmm0, %xmm1, %xmm1
+ vpxor %xmm1, %xmm14, %xmm14
+ vpshufd $147, %xmm0, %xmm0
+ vpshufd $78, %xmm1, %xmm1
+ vpslld $25, %xmm14, %xmm8
+ vpsrld $7, %xmm14, %xmm14
+ vpxor %xmm14, %xmm8, %xmm8
+ vpshufd $57, %xmm8, %xmm8
+ vpaddd %xmm8, %xmm15, %xmm15
+ vpxor %xmm0, %xmm15, %xmm0
+ vpshufb %xmm7, %xmm0, %xmm0
+ vpaddd %xmm0, %xmm1, %xmm1
+ vpxor %xmm8, %xmm1, %xmm8
+ vpslld $20, %xmm8, %xmm10
+ vpsrld $12, %xmm8, %xmm8
+ vpxor %xmm8, %xmm10, %xmm10
+ vpunpckldq %xmm3, %xmm4, %xmm8
+ vpunpcklqdq %xmm9, %xmm8, %xmm9
+ vpaddd %xmm9, %xmm15, %xmm9
+ vpaddd %xmm10, %xmm9, %xmm9
+ vpxor %xmm0, %xmm9, %xmm8
+ vpshufb %xmm6, %xmm8, %xmm8
+ vpaddd %xmm8, %xmm1, %xmm1
+ vpxor %xmm1, %xmm10, %xmm10
+ vpshufd $57, %xmm8, %xmm8
+ vpshufd $78, %xmm1, %xmm1
+ vpslld $25, %xmm10, %xmm12
+ vpsrld $7, %xmm10, %xmm10
+ vpxor %xmm10, %xmm12, %xmm10
+ vpblendw $48, %xmm4, %xmm3, %xmm12
+ vpshufd $147, %xmm10, %xmm0
+ vpunpckhdq %xmm5, %xmm3, %xmm10
+ vpshufd $78, %xmm12, %xmm12
+ vpunpcklqdq %xmm4, %xmm10, %xmm10
+ vpblendw $192, %xmm2, %xmm10, %xmm10
+ vpshufhw $78, %xmm10, %xmm10
+ vpaddd %xmm10, %xmm9, %xmm10
+ vpaddd %xmm0, %xmm10, %xmm10
+ vpxor %xmm8, %xmm10, %xmm8
+ vpshufb %xmm7, %xmm8, %xmm8
+ vpaddd %xmm8, %xmm1, %xmm1
+ vpxor %xmm0, %xmm1, %xmm9
+ vpslld $20, %xmm9, %xmm0
+ vpsrld $12, %xmm9, %xmm9
+ vpxor %xmm9, %xmm0, %xmm0
+ vpunpckhdq %xmm5, %xmm4, %xmm9
+ vpblendw $240, %xmm9, %xmm2, %xmm13
+ vpshufd $39, %xmm13, %xmm13
+ vpaddd %xmm10, %xmm13, %xmm10
+ vpaddd %xmm0, %xmm10, %xmm10
+ vpxor %xmm8, %xmm10, %xmm8
+ vpblendw $12, %xmm4, %xmm2, %xmm13
+ vpshufb %xmm6, %xmm8, %xmm8
+ vpslldq $4, %xmm13, %xmm13
+ vpblendw $15, %xmm5, %xmm13, %xmm13
+ vpaddd %xmm8, %xmm1, %xmm1
+ vpxor %xmm1, %xmm0, %xmm0
+ vpaddd %xmm13, %xmm10, %xmm13
+ vpshufd $147, %xmm8, %xmm8
+ vpshufd $78, %xmm1, %xmm1
+ vpslld $25, %xmm0, %xmm14
+ vpsrld $7, %xmm0, %xmm0
+ vpxor %xmm0, %xmm14, %xmm14
+ vpshufd $57, %xmm14, %xmm14
+ vpaddd %xmm14, %xmm13, %xmm13
+ vpxor %xmm8, %xmm13, %xmm8
+ vpaddd %xmm13, %xmm12, %xmm12
+ vpshufb %xmm7, %xmm8, %xmm8
+ vpaddd %xmm8, %xmm1, %xmm1
+ vpxor %xmm14, %xmm1, %xmm14
+ vpslld $20, %xmm14, %xmm10
+ vpsrld $12, %xmm14, %xmm14
+ vpxor %xmm14, %xmm10, %xmm10
+ vpaddd %xmm10, %xmm12, %xmm12
+ vpxor %xmm8, %xmm12, %xmm8
+ vpshufb %xmm6, %xmm8, %xmm8
+ vpaddd %xmm8, %xmm1, %xmm1
+ vpxor %xmm1, %xmm10, %xmm0
+ vpshufd $57, %xmm8, %xmm8
+ vpshufd $78, %xmm1, %xmm1
+ vpslld $25, %xmm0, %xmm10
+ vpsrld $7, %xmm0, %xmm0
+ vpxor %xmm0, %xmm10, %xmm10
+ vpblendw $48, %xmm2, %xmm3, %xmm0
+ vpblendw $15, %xmm11, %xmm0, %xmm0
+ vpshufd $147, %xmm10, %xmm10
+ vpshufd $114, %xmm0, %xmm0
+ vpaddd %xmm12, %xmm0, %xmm0
+ vpaddd %xmm10, %xmm0, %xmm0
+ vpxor %xmm8, %xmm0, %xmm8
+ vpshufb %xmm7, %xmm8, %xmm8
+ vpaddd %xmm8, %xmm1, %xmm1
+ vpxor %xmm10, %xmm1, %xmm10
+ vpslld $20, %xmm10, %xmm11
+ vpsrld $12, %xmm10, %xmm10
+ vpxor %xmm10, %xmm11, %xmm10
+ vpslldq $4, %xmm4, %xmm11
+ vpblendw $192, %xmm11, %xmm3, %xmm3
+ vpunpckldq %xmm5, %xmm4, %xmm4
+ vpshufd $99, %xmm3, %xmm3
+ vpaddd %xmm0, %xmm3, %xmm3
+ vpaddd %xmm10, %xmm3, %xmm3
+ vpxor %xmm8, %xmm3, %xmm11
+ vpunpckldq %xmm5, %xmm2, %xmm0
+ vpblendw $192, %xmm2, %xmm5, %xmm2
+ vpshufb %xmm6, %xmm11, %xmm11
+ vpunpckhqdq %xmm0, %xmm9, %xmm0
+ vpblendw $15, %xmm4, %xmm2, %xmm4
+ vpaddd %xmm11, %xmm1, %xmm1
+ vpxor %xmm1, %xmm10, %xmm10
+ vpshufd $147, %xmm11, %xmm11
+ vpshufd $201, %xmm0, %xmm0
+ vpslld $25, %xmm10, %xmm8
+ vpsrld $7, %xmm10, %xmm10
+ vpxor %xmm10, %xmm8, %xmm10
+ vpshufd $78, %xmm1, %xmm1
+ vpaddd %xmm3, %xmm0, %xmm0
+ vpshufd $27, %xmm4, %xmm4
+ vpshufd $57, %xmm10, %xmm10
+ vpaddd %xmm10, %xmm0, %xmm0
+ vpxor %xmm11, %xmm0, %xmm11
+ vpaddd %xmm0, %xmm4, %xmm0
+ vpshufb %xmm7, %xmm11, %xmm7
+ vpaddd %xmm7, %xmm1, %xmm1
+ vpxor %xmm10, %xmm1, %xmm10
+ vpslld $20, %xmm10, %xmm8
+ vpsrld $12, %xmm10, %xmm10
+ vpxor %xmm10, %xmm8, %xmm8
+ vpaddd %xmm8, %xmm0, %xmm0
+ vpxor %xmm7, %xmm0, %xmm7
+ vpshufb %xmm6, %xmm7, %xmm6
+ vpaddd %xmm6, %xmm1, %xmm1
+ vpxor %xmm1, %xmm8, %xmm8
+ vpshufd $78, %xmm1, %xmm1
+ vpshufd $57, %xmm6, %xmm6
+ vpslld $25, %xmm8, %xmm2
+ vpsrld $7, %xmm8, %xmm8
+ vpxor %xmm8, %xmm2, %xmm8
+ vpxor (%rdi), %xmm1, %xmm1
+ vpshufd $147, %xmm8, %xmm8
+ vpxor %xmm0, %xmm1, %xmm0
+ vmovups %xmm0, (%rdi)
+ vpxor 16(%rdi), %xmm8, %xmm0
+ vpxor %xmm6, %xmm0, %xmm6
+ vmovups %xmm6, 16(%rdi)
+ addq $64, %rsi
+ decq %rdx
+ jnz .Lbeginofloop
+.Lendofloop:
+ ret
+ENDPROC(blake2s_compress_avx)
+#endif /* CONFIG_AS_AVX */
+
+#ifdef CONFIG_AS_AVX512
+ENTRY(blake2s_compress_avx512)
+ vmovdqu (%rdi),%xmm0
+ vmovdqu 0x10(%rdi),%xmm1
+ vmovdqu 0x20(%rdi),%xmm4
+ vmovq %rcx,%xmm5
+ vmovdqa IV(%rip),%xmm14
+ vmovdqa IV+16(%rip),%xmm15
+ jmp .Lblake2s_compress_avx512_mainloop
+.align 32
+.Lblake2s_compress_avx512_mainloop:
+ vmovdqa %xmm0,%xmm10
+ vmovdqa %xmm1,%xmm11
+ vpaddq %xmm5,%xmm4,%xmm4
+ vmovdqa %xmm14,%xmm2
+ vpxor %xmm15,%xmm4,%xmm3
+ vmovdqu (%rsi),%ymm6
+ vmovdqu 0x20(%rsi),%ymm7
+ addq $0x40,%rsi
+ leaq SIGMA(%rip),%rax
+ movb $0xa,%cl
+.Lblake2s_compress_avx512_roundloop:
+ addq $0x40,%rax
+ vmovdqa -0x40(%rax),%ymm8
+ vmovdqa -0x20(%rax),%ymm9
+ vpermi2d %ymm7,%ymm6,%ymm8
+ vpermi2d %ymm7,%ymm6,%ymm9
+ vmovdqa %ymm8,%ymm6
+ vmovdqa %ymm9,%ymm7
+ vpaddd %xmm8,%xmm0,%xmm0
+ vpaddd %xmm1,%xmm0,%xmm0
+ vpxor %xmm0,%xmm3,%xmm3
+ vprord $0x10,%xmm3,%xmm3
+ vpaddd %xmm3,%xmm2,%xmm2
+ vpxor %xmm2,%xmm1,%xmm1
+ vprord $0xc,%xmm1,%xmm1
+ vextracti128 $0x1,%ymm8,%xmm8
+ vpaddd %xmm8,%xmm0,%xmm0
+ vpaddd %xmm1,%xmm0,%xmm0
+ vpxor %xmm0,%xmm3,%xmm3
+ vprord $0x8,%xmm3,%xmm3
+ vpaddd %xmm3,%xmm2,%xmm2
+ vpxor %xmm2,%xmm1,%xmm1
+ vprord $0x7,%xmm1,%xmm1
+ vpshufd $0x39,%xmm1,%xmm1
+ vpshufd $0x4e,%xmm2,%xmm2
+ vpshufd $0x93,%xmm3,%xmm3
+ vpaddd %xmm9,%xmm0,%xmm0
+ vpaddd %xmm1,%xmm0,%xmm0
+ vpxor %xmm0,%xmm3,%xmm3
+ vprord $0x10,%xmm3,%xmm3
+ vpaddd %xmm3,%xmm2,%xmm2
+ vpxor %xmm2,%xmm1,%xmm1
+ vprord $0xc,%xmm1,%xmm1
+ vextracti128 $0x1,%ymm9,%xmm9
+ vpaddd %xmm9,%xmm0,%xmm0
+ vpaddd %xmm1,%xmm0,%xmm0
+ vpxor %xmm0,%xmm3,%xmm3
+ vprord $0x8,%xmm3,%xmm3
+ vpaddd %xmm3,%xmm2,%xmm2
+ vpxor %xmm2,%xmm1,%xmm1
+ vprord $0x7,%xmm1,%xmm1
+ vpshufd $0x93,%xmm1,%xmm1
+ vpshufd $0x4e,%xmm2,%xmm2
+ vpshufd $0x39,%xmm3,%xmm3
+ decb %cl
+ jne .Lblake2s_compress_avx512_roundloop
+ vpxor %xmm10,%xmm0,%xmm0
+ vpxor %xmm11,%xmm1,%xmm1
+ vpxor %xmm2,%xmm0,%xmm0
+ vpxor %xmm3,%xmm1,%xmm1
+ decq %rdx
+ jne .Lblake2s_compress_avx512_mainloop
+ vmovdqu %xmm0,(%rdi)
+ vmovdqu %xmm1,0x10(%rdi)
+ vmovdqu %xmm4,0x20(%rdi)
+ vzeroupper
+ retq
+ENDPROC(blake2s_compress_avx512)
+#endif /* CONFIG_AS_AVX512 */
diff --git a/lib/zinc/blake2s/blake2s.c b/lib/zinc/blake2s/blake2s.c
index 58d7e9378bd4..59db1ce2f7ef 100644
--- a/lib/zinc/blake2s/blake2s.c
+++ b/lib/zinc/blake2s/blake2s.c
@@ -110,6 +110,9 @@ void blake2s_init_key(struct blake2s_state *state, const size_t outlen,
}
EXPORT_SYMBOL(blake2s_init_key);
+#if defined(CONFIG_ZINC_ARCH_X86_64)
+#include "blake2s-x86_64-glue.c"
+#else
static bool *const blake2s_nobs[] __initconst = { };
static void __init blake2s_fpu_init(void)
{
@@ -120,6 +123,7 @@ static inline bool blake2s_compress_arch(struct blake2s_state *state,
{
return false;
}
+#endif
static inline void blake2s_compress(struct blake2s_state *state,
const u8 *block, size_t nblocks,
--
2.19.0
* [PATCH net-next v7 21/28] zinc: Curve25519 generic C implementations and selftest
[not found] <20181006025709.4019-1-Jason@zx2c4.com>
` (17 preceding siblings ...)
2018-10-06 2:57 ` [PATCH net-next v7 20/28] zinc: BLAKE2s x86_64 implementation Jason A. Donenfeld
@ 2018-10-06 2:57 ` Jason A. Donenfeld
2018-10-06 2:57 ` [PATCH net-next v7 22/28] zinc: Curve25519 x86_64 implementation Jason A. Donenfeld
` (4 subsequent siblings)
23 siblings, 0 replies; 28+ messages in thread
From: Jason A. Donenfeld @ 2018-10-06 2:57 UTC (permalink / raw)
To: linux-kernel, netdev, davem, gregkh
Cc: Jason A. Donenfeld, Karthikeyan Bhargavan, Samuel Neves,
Jean-Philippe Aumasson, Andy Lutomirski, Andrew Morton,
Linus Torvalds, kernel-hardening, linux-crypto
This contains two formally verified C implementations of the Curve25519
scalar multiplication function, one for 32-bit systems, and one for
64-bit systems whose compiler supports efficient 128-bit integer types.
Not only are these implementations formally verified, but they are also
the fastest available C implementations. They have been modified to be
friendly to kernel space and to be generally less horrendous looking,
but an effort has still been made to retain their formally verified
character, and so the C might look slightly unidiomatic.
The 64-bit version comes from HACL*: https://github.com/project-everest/hacl-star
The 32-bit version comes from Fiat: https://github.com/mit-plv/fiat-crypto
Information: https://cr.yp.to/ecdh.html
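For readers unfamiliar with the 32-bit representation, the following
user-space sketch shows two ideas the fiat32 code relies on: a field
element is split into ten limbs of alternating 26 and 25 bits (255 bits
in total, matching the 2^26, 2^51, 2^77, ... weights in the header
comment), and selection between two values is done with a mask rather
than a branch, as in cmovznz32. The names limb_bits, limb_offset and
select_u32 are invented for this illustration and do not appear in the
patch.

```c
#include <stdint.h>

/* Bit widths of the ten limbs: 26, 25, 26, 25, ... */
static const unsigned int limb_bits[10] = {
	26, 25, 26, 25, 26, 25, 26, 25, 26, 25
};

/*
 * Bit position at which limb i starts, i.e. the exponent of 2 that
 * multiplies t[i]. Passing i == 10 yields the total width.
 */
static unsigned int limb_offset(unsigned int i)
{
	unsigned int off = 0, j;

	for (j = 0; j < i && j < 10; ++j)
		off += limb_bits[j];
	return off;
}

/*
 * Branch-free select in the style of cmovznz32: returns nz when t is
 * nonzero and z when t is zero, without a data-dependent branch.
 */
static uint32_t select_u32(uint32_t t, uint32_t z, uint32_t nz)
{
	uint32_t mask = -(uint32_t)!!t; /* all ones if t != 0, else 0 */

	return (mask & nz) | (~mask & z);
}
```

Keeping limbs a few bits narrower than 32 leaves headroom for carries,
so additions can be chained before a reduction is needed.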
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Karthikeyan Bhargavan <karthik.bhargavan@gmail.com>
Cc: Samuel Neves <sneves@dei.uc.pt>
Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: kernel-hardening@lists.openwall.com
Cc: linux-crypto@vger.kernel.org
---
include/zinc/curve25519.h | 22 +
lib/zinc/Kconfig | 4 +
lib/zinc/Makefile | 3 +
lib/zinc/curve25519/curve25519-fiat32.h | 860 +++++++++++++++
lib/zinc/curve25519/curve25519-hacl64.h | 784 ++++++++++++++
lib/zinc/curve25519/curve25519.c | 108 ++
lib/zinc/selftest/curve25519.c | 1315 +++++++++++++++++++++++
7 files changed, 3096 insertions(+)
create mode 100644 include/zinc/curve25519.h
create mode 100644 lib/zinc/curve25519/curve25519-fiat32.h
create mode 100644 lib/zinc/curve25519/curve25519-hacl64.h
create mode 100644 lib/zinc/curve25519/curve25519.c
create mode 100644 lib/zinc/selftest/curve25519.c
diff --git a/include/zinc/curve25519.h b/include/zinc/curve25519.h
new file mode 100644
index 000000000000..def173e736fc
--- /dev/null
+++ b/include/zinc/curve25519.h
@@ -0,0 +1,22 @@
+/* SPDX-License-Identifier: GPL-2.0 OR MIT */
+/*
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#ifndef _ZINC_CURVE25519_H
+#define _ZINC_CURVE25519_H
+
+#include <linux/types.h>
+
+enum curve25519_lengths {
+ CURVE25519_KEY_SIZE = 32
+};
+
+bool __must_check curve25519(u8 mypublic[CURVE25519_KEY_SIZE],
+ const u8 secret[CURVE25519_KEY_SIZE],
+ const u8 basepoint[CURVE25519_KEY_SIZE]);
+void curve25519_generate_secret(u8 secret[CURVE25519_KEY_SIZE]);
+bool __must_check curve25519_generate_public(
+ u8 pub[CURVE25519_KEY_SIZE], const u8 secret[CURVE25519_KEY_SIZE]);
+
+#endif /* _ZINC_CURVE25519_H */
diff --git a/lib/zinc/Kconfig b/lib/zinc/Kconfig
index 9fc21f93ee9f..f1840c4e9fde 100644
--- a/lib/zinc/Kconfig
+++ b/lib/zinc/Kconfig
@@ -14,6 +14,10 @@ config ZINC_CHACHA20POLY1305
config ZINC_BLAKE2S
tristate
+config ZINC_CURVE25519
+ tristate
+ select CRYPTO
+
config ZINC_SELFTEST
bool "Zinc cryptography library self-tests"
help
diff --git a/lib/zinc/Makefile b/lib/zinc/Makefile
index 67ad837c822c..65440438c6e5 100644
--- a/lib/zinc/Makefile
+++ b/lib/zinc/Makefile
@@ -25,3 +25,6 @@ obj-$(CONFIG_ZINC_CHACHA20POLY1305) += zinc_chacha20poly1305.o
zinc_blake2s-y := blake2s/blake2s.o
zinc_blake2s-$(CONFIG_ZINC_ARCH_X86_64) += blake2s/blake2s-x86_64.o
obj-$(CONFIG_ZINC_BLAKE2S) += zinc_blake2s.o
+
+zinc_curve25519-y := curve25519/curve25519.o
+obj-$(CONFIG_ZINC_CURVE25519) += zinc_curve25519.o
diff --git a/lib/zinc/curve25519/curve25519-fiat32.h b/lib/zinc/curve25519/curve25519-fiat32.h
new file mode 100644
index 000000000000..32b5ec7aa040
--- /dev/null
+++ b/lib/zinc/curve25519/curve25519-fiat32.h
@@ -0,0 +1,860 @@
+/* SPDX-License-Identifier: GPL-2.0 OR MIT */
+/*
+ * Copyright (C) 2015-2016 The fiat-crypto Authors.
+ * Copyright (C) 2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ *
+ * This is a machine-generated formally verified implementation of Curve25519
+ * ECDH from: <https://github.com/mit-plv/fiat-crypto>. Though originally
+ * machine generated, it has been tweaked to be suitable for use in the kernel.
+ * It is optimized for 32-bit machines and machines that cannot work efficiently
+ * with 128-bit integer types.
+ */
+
+/* fe means field element. Here the field is \Z/(2^255-19). An element t,
+ * entries t[0]...t[9], represents the integer t[0]+2^26 t[1]+2^51 t[2]+2^77
+ * t[3]+2^102 t[4]+...+2^230 t[9].
+ * fe limbs are bounded by 1.125*2^26,1.125*2^25,1.125*2^26,1.125*2^25,etc.
+ * Multiplication and carrying produce fe from fe_loose.
+ */
+typedef struct fe { u32 v[10]; } fe;
+
+/* fe_loose limbs are bounded by 3.375*2^26,3.375*2^25,3.375*2^26,3.375*2^25,etc
+ * Addition and subtraction produce fe_loose from (fe, fe).
+ */
+typedef struct fe_loose { u32 v[10]; } fe_loose;
+
+static __always_inline void fe_frombytes_impl(u32 h[10], const u8 *s)
+{
+ /* Ignores top bit of s. */
+ u32 a0 = get_unaligned_le32(s);
+ u32 a1 = get_unaligned_le32(s+4);
+ u32 a2 = get_unaligned_le32(s+8);
+ u32 a3 = get_unaligned_le32(s+12);
+ u32 a4 = get_unaligned_le32(s+16);
+ u32 a5 = get_unaligned_le32(s+20);
+ u32 a6 = get_unaligned_le32(s+24);
+ u32 a7 = get_unaligned_le32(s+28);
+ h[0] = a0&((1<<26)-1); /* 26 used, 32-26 left. 26 */
+ h[1] = (a0>>26) | ((a1&((1<<19)-1))<< 6); /* (32-26) + 19 = 6+19 = 25 */
+ h[2] = (a1>>19) | ((a2&((1<<13)-1))<<13); /* (32-19) + 13 = 13+13 = 26 */
+ h[3] = (a2>>13) | ((a3&((1<< 6)-1))<<19); /* (32-13) + 6 = 19+ 6 = 25 */
+ h[4] = (a3>> 6); /* (32- 6) = 26 */
+ h[5] = a4&((1<<25)-1); /* 25 */
+ h[6] = (a4>>25) | ((a5&((1<<19)-1))<< 7); /* (32-25) + 19 = 7+19 = 26 */
+ h[7] = (a5>>19) | ((a6&((1<<12)-1))<<13); /* (32-19) + 12 = 13+12 = 25 */
+ h[8] = (a6>>12) | ((a7&((1<< 6)-1))<<20); /* (32-12) + 6 = 20+ 6 = 26 */
+ h[9] = (a7>> 6)&((1<<25)-1); /* 25 */
+}
+
+static __always_inline void fe_frombytes(fe *h, const u8 *s)
+{
+ fe_frombytes_impl(h->v, s);
+}
+
+static __always_inline u8 /*bool*/
+addcarryx_u25(u8 /*bool*/ c, u32 a, u32 b, u32 *low)
+{
+ /* This function extracts 25 bits of result and 1 bit of carry
+ * (26 total), so a 32-bit intermediate is sufficient.
+ */
+ u32 x = a + b + c;
+ *low = x & ((1 << 25) - 1);
+ return (x >> 25) & 1;
+}
+
+static __always_inline u8 /*bool*/
+addcarryx_u26(u8 /*bool*/ c, u32 a, u32 b, u32 *low)
+{
+ /* This function extracts 26 bits of result and 1 bit of carry
+ * (27 total), so a 32-bit intermediate is sufficient.
+ */
+ u32 x = a + b + c;
+ *low = x & ((1 << 26) - 1);
+ return (x >> 26) & 1;
+}
+
+static __always_inline u8 /*bool*/
+subborrow_u25(u8 /*bool*/ c, u32 a, u32 b, u32 *low)
+{
+ /* This function extracts 25 bits of result and 1 bit of borrow
+ * (26 total), so a 32-bit intermediate is sufficient.
+ */
+ u32 x = a - b - c;
+ *low = x & ((1 << 25) - 1);
+ return x >> 31;
+}
+
+static __always_inline u8 /*bool*/
+subborrow_u26(u8 /*bool*/ c, u32 a, u32 b, u32 *low)
+{
+ /* This function extracts 26 bits of result and 1 bit of borrow
+ * (27 total), so a 32-bit intermediate is sufficient.
+ */
+ u32 x = a - b - c;
+ *low = x & ((1 << 26) - 1);
+ return x >> 31;
+}
+
+static __always_inline u32 cmovznz32(u32 t, u32 z, u32 nz)
+{
+ t = -!!t; /* all set if nonzero, 0 if 0 */
+ return (t&nz) | ((~t)&z);
+}
+
+static __always_inline void fe_freeze(u32 out[10], const u32 in1[10])
+{
+ { const u32 x17 = in1[9];
+ { const u32 x18 = in1[8];
+ { const u32 x16 = in1[7];
+ { const u32 x14 = in1[6];
+ { const u32 x12 = in1[5];
+ { const u32 x10 = in1[4];
+ { const u32 x8 = in1[3];
+ { const u32 x6 = in1[2];
+ { const u32 x4 = in1[1];
+ { const u32 x2 = in1[0];
+ { u32 x20; u8/*bool*/ x21 = subborrow_u26(0x0, x2, 0x3ffffed, &x20);
+ { u32 x23; u8/*bool*/ x24 = subborrow_u25(x21, x4, 0x1ffffff, &x23);
+ { u32 x26; u8/*bool*/ x27 = subborrow_u26(x24, x6, 0x3ffffff, &x26);
+ { u32 x29; u8/*bool*/ x30 = subborrow_u25(x27, x8, 0x1ffffff, &x29);
+ { u32 x32; u8/*bool*/ x33 = subborrow_u26(x30, x10, 0x3ffffff, &x32);
+ { u32 x35; u8/*bool*/ x36 = subborrow_u25(x33, x12, 0x1ffffff, &x35);
+ { u32 x38; u8/*bool*/ x39 = subborrow_u26(x36, x14, 0x3ffffff, &x38);
+ { u32 x41; u8/*bool*/ x42 = subborrow_u25(x39, x16, 0x1ffffff, &x41);
+ { u32 x44; u8/*bool*/ x45 = subborrow_u26(x42, x18, 0x3ffffff, &x44);
+ { u32 x47; u8/*bool*/ x48 = subborrow_u25(x45, x17, 0x1ffffff, &x47);
+ { u32 x49 = cmovznz32(x48, 0x0, 0xffffffff);
+ { u32 x50 = (x49 & 0x3ffffed);
+ { u32 x52; u8/*bool*/ x53 = addcarryx_u26(0x0, x20, x50, &x52);
+ { u32 x54 = (x49 & 0x1ffffff);
+ { u32 x56; u8/*bool*/ x57 = addcarryx_u25(x53, x23, x54, &x56);
+ { u32 x58 = (x49 & 0x3ffffff);
+ { u32 x60; u8/*bool*/ x61 = addcarryx_u26(x57, x26, x58, &x60);
+ { u32 x62 = (x49 & 0x1ffffff);
+ { u32 x64; u8/*bool*/ x65 = addcarryx_u25(x61, x29, x62, &x64);
+ { u32 x66 = (x49 & 0x3ffffff);
+ { u32 x68; u8/*bool*/ x69 = addcarryx_u26(x65, x32, x66, &x68);
+ { u32 x70 = (x49 & 0x1ffffff);
+ { u32 x72; u8/*bool*/ x73 = addcarryx_u25(x69, x35, x70, &x72);
+ { u32 x74 = (x49 & 0x3ffffff);
+ { u32 x76; u8/*bool*/ x77 = addcarryx_u26(x73, x38, x74, &x76);
+ { u32 x78 = (x49 & 0x1ffffff);
+ { u32 x80; u8/*bool*/ x81 = addcarryx_u25(x77, x41, x78, &x80);
+ { u32 x82 = (x49 & 0x3ffffff);
+ { u32 x84; u8/*bool*/ x85 = addcarryx_u26(x81, x44, x82, &x84);
+ { u32 x86 = (x49 & 0x1ffffff);
+ { u32 x88; addcarryx_u25(x85, x47, x86, &x88);
+ out[0] = x52;
+ out[1] = x56;
+ out[2] = x60;
+ out[3] = x64;
+ out[4] = x68;
+ out[5] = x72;
+ out[6] = x76;
+ out[7] = x80;
+ out[8] = x84;
+ out[9] = x88;
+ }}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}
+}
+
+static __always_inline void fe_tobytes(u8 s[32], const fe *f)
+{
+ u32 h[10];
+ fe_freeze(h, f->v);
+ s[0] = h[0] >> 0;
+ s[1] = h[0] >> 8;
+ s[2] = h[0] >> 16;
+ s[3] = (h[0] >> 24) | (h[1] << 2);
+ s[4] = h[1] >> 6;
+ s[5] = h[1] >> 14;
+ s[6] = (h[1] >> 22) | (h[2] << 3);
+ s[7] = h[2] >> 5;
+ s[8] = h[2] >> 13;
+ s[9] = (h[2] >> 21) | (h[3] << 5);
+ s[10] = h[3] >> 3;
+ s[11] = h[3] >> 11;
+ s[12] = (h[3] >> 19) | (h[4] << 6);
+ s[13] = h[4] >> 2;
+ s[14] = h[4] >> 10;
+ s[15] = h[4] >> 18;
+ s[16] = h[5] >> 0;
+ s[17] = h[5] >> 8;
+ s[18] = h[5] >> 16;
+ s[19] = (h[5] >> 24) | (h[6] << 1);
+ s[20] = h[6] >> 7;
+ s[21] = h[6] >> 15;
+ s[22] = (h[6] >> 23) | (h[7] << 3);
+ s[23] = h[7] >> 5;
+ s[24] = h[7] >> 13;
+ s[25] = (h[7] >> 21) | (h[8] << 4);
+ s[26] = h[8] >> 4;
+ s[27] = h[8] >> 12;
+ s[28] = (h[8] >> 20) | (h[9] << 6);
+ s[29] = h[9] >> 2;
+ s[30] = h[9] >> 10;
+ s[31] = h[9] >> 18;
+}
+
+/* h = f */
+static __always_inline void fe_copy(fe *h, const fe *f)
+{
+ memmove(h, f, sizeof(u32) * 10);
+}
+
+static __always_inline void fe_copy_lt(fe_loose *h, const fe *f)
+{
+ memmove(h, f, sizeof(u32) * 10);
+}
+
+/* h = 0 */
+static __always_inline void fe_0(fe *h)
+{
+ memset(h, 0, sizeof(u32) * 10);
+}
+
+/* h = 1 */
+static __always_inline void fe_1(fe *h)
+{
+ memset(h, 0, sizeof(u32) * 10);
+ h->v[0] = 1;
+}
+
+static void fe_add_impl(u32 out[10], const u32 in1[10], const u32 in2[10])
+{
+ { const u32 x20 = in1[9];
+ { const u32 x21 = in1[8];
+ { const u32 x19 = in1[7];
+ { const u32 x17 = in1[6];
+ { const u32 x15 = in1[5];
+ { const u32 x13 = in1[4];
+ { const u32 x11 = in1[3];
+ { const u32 x9 = in1[2];
+ { const u32 x7 = in1[1];
+ { const u32 x5 = in1[0];
+ { const u32 x38 = in2[9];
+ { const u32 x39 = in2[8];
+ { const u32 x37 = in2[7];
+ { const u32 x35 = in2[6];
+ { const u32 x33 = in2[5];
+ { const u32 x31 = in2[4];
+ { const u32 x29 = in2[3];
+ { const u32 x27 = in2[2];
+ { const u32 x25 = in2[1];
+ { const u32 x23 = in2[0];
+ out[0] = (x5 + x23);
+ out[1] = (x7 + x25);
+ out[2] = (x9 + x27);
+ out[3] = (x11 + x29);
+ out[4] = (x13 + x31);
+ out[5] = (x15 + x33);
+ out[6] = (x17 + x35);
+ out[7] = (x19 + x37);
+ out[8] = (x21 + x39);
+ out[9] = (x20 + x38);
+ }}}}}}}}}}}}}}}}}}}}
+}
+
+/* h = f + g
+ * Can overlap h with f or g.
+ */
+static __always_inline void fe_add(fe_loose *h, const fe *f, const fe *g)
+{
+ fe_add_impl(h->v, f->v, g->v);
+}
+
+static void fe_sub_impl(u32 out[10], const u32 in1[10], const u32 in2[10])
+{
+ { const u32 x20 = in1[9];
+ { const u32 x21 = in1[8];
+ { const u32 x19 = in1[7];
+ { const u32 x17 = in1[6];
+ { const u32 x15 = in1[5];
+ { const u32 x13 = in1[4];
+ { const u32 x11 = in1[3];
+ { const u32 x9 = in1[2];
+ { const u32 x7 = in1[1];
+ { const u32 x5 = in1[0];
+ { const u32 x38 = in2[9];
+ { const u32 x39 = in2[8];
+ { const u32 x37 = in2[7];
+ { const u32 x35 = in2[6];
+ { const u32 x33 = in2[5];
+ { const u32 x31 = in2[4];
+ { const u32 x29 = in2[3];
+ { const u32 x27 = in2[2];
+ { const u32 x25 = in2[1];
+ { const u32 x23 = in2[0];
+ out[0] = ((0x7ffffda + x5) - x23);
+ out[1] = ((0x3fffffe + x7) - x25);
+ out[2] = ((0x7fffffe + x9) - x27);
+ out[3] = ((0x3fffffe + x11) - x29);
+ out[4] = ((0x7fffffe + x13) - x31);
+ out[5] = ((0x3fffffe + x15) - x33);
+ out[6] = ((0x7fffffe + x17) - x35);
+ out[7] = ((0x3fffffe + x19) - x37);
+ out[8] = ((0x7fffffe + x21) - x39);
+ out[9] = ((0x3fffffe + x20) - x38);
+ }}}}}}}}}}}}}}}}}}}}
+}
+
+/* h = f - g
+ * Can overlap h with f or g.
+ */
+static __always_inline void fe_sub(fe_loose *h, const fe *f, const fe *g)
+{
+ fe_sub_impl(h->v, f->v, g->v);
+}
+
+static void fe_mul_impl(u32 out[10], const u32 in1[10], const u32 in2[10])
+{
+ { const u32 x20 = in1[9];
+ { const u32 x21 = in1[8];
+ { const u32 x19 = in1[7];
+ { const u32 x17 = in1[6];
+ { const u32 x15 = in1[5];
+ { const u32 x13 = in1[4];
+ { const u32 x11 = in1[3];
+ { const u32 x9 = in1[2];
+ { const u32 x7 = in1[1];
+ { const u32 x5 = in1[0];
+ { const u32 x38 = in2[9];
+ { const u32 x39 = in2[8];
+ { const u32 x37 = in2[7];
+ { const u32 x35 = in2[6];
+ { const u32 x33 = in2[5];
+ { const u32 x31 = in2[4];
+ { const u32 x29 = in2[3];
+ { const u32 x27 = in2[2];
+ { const u32 x25 = in2[1];
+ { const u32 x23 = in2[0];
+ { u64 x40 = ((u64)x23 * x5);
+ { u64 x41 = (((u64)x23 * x7) + ((u64)x25 * x5));
+ { u64 x42 = ((((u64)(0x2 * x25) * x7) + ((u64)x23 * x9)) + ((u64)x27 * x5));
+ { u64 x43 = (((((u64)x25 * x9) + ((u64)x27 * x7)) + ((u64)x23 * x11)) + ((u64)x29 * x5));
+ { u64 x44 = (((((u64)x27 * x9) + (0x2 * (((u64)x25 * x11) + ((u64)x29 * x7)))) + ((u64)x23 * x13)) + ((u64)x31 * x5));
+ { u64 x45 = (((((((u64)x27 * x11) + ((u64)x29 * x9)) + ((u64)x25 * x13)) + ((u64)x31 * x7)) + ((u64)x23 * x15)) + ((u64)x33 * x5));
+ { u64 x46 = (((((0x2 * ((((u64)x29 * x11) + ((u64)x25 * x15)) + ((u64)x33 * x7))) + ((u64)x27 * x13)) + ((u64)x31 * x9)) + ((u64)x23 * x17)) + ((u64)x35 * x5));
+ { u64 x47 = (((((((((u64)x29 * x13) + ((u64)x31 * x11)) + ((u64)x27 * x15)) + ((u64)x33 * x9)) + ((u64)x25 * x17)) + ((u64)x35 * x7)) + ((u64)x23 * x19)) + ((u64)x37 * x5));
+ { u64 x48 = (((((((u64)x31 * x13) + (0x2 * (((((u64)x29 * x15) + ((u64)x33 * x11)) + ((u64)x25 * x19)) + ((u64)x37 * x7)))) + ((u64)x27 * x17)) + ((u64)x35 * x9)) + ((u64)x23 * x21)) + ((u64)x39 * x5));
+ { u64 x49 = (((((((((((u64)x31 * x15) + ((u64)x33 * x13)) + ((u64)x29 * x17)) + ((u64)x35 * x11)) + ((u64)x27 * x19)) + ((u64)x37 * x9)) + ((u64)x25 * x21)) + ((u64)x39 * x7)) + ((u64)x23 * x20)) + ((u64)x38 * x5));
+ { u64 x50 = (((((0x2 * ((((((u64)x33 * x15) + ((u64)x29 * x19)) + ((u64)x37 * x11)) + ((u64)x25 * x20)) + ((u64)x38 * x7))) + ((u64)x31 * x17)) + ((u64)x35 * x13)) + ((u64)x27 * x21)) + ((u64)x39 * x9));
+ { u64 x51 = (((((((((u64)x33 * x17) + ((u64)x35 * x15)) + ((u64)x31 * x19)) + ((u64)x37 * x13)) + ((u64)x29 * x21)) + ((u64)x39 * x11)) + ((u64)x27 * x20)) + ((u64)x38 * x9));
+ { u64 x52 = (((((u64)x35 * x17) + (0x2 * (((((u64)x33 * x19) + ((u64)x37 * x15)) + ((u64)x29 * x20)) + ((u64)x38 * x11)))) + ((u64)x31 * x21)) + ((u64)x39 * x13));
+ { u64 x53 = (((((((u64)x35 * x19) + ((u64)x37 * x17)) + ((u64)x33 * x21)) + ((u64)x39 * x15)) + ((u64)x31 * x20)) + ((u64)x38 * x13));
+ { u64 x54 = (((0x2 * ((((u64)x37 * x19) + ((u64)x33 * x20)) + ((u64)x38 * x15))) + ((u64)x35 * x21)) + ((u64)x39 * x17));
+ { u64 x55 = (((((u64)x37 * x21) + ((u64)x39 * x19)) + ((u64)x35 * x20)) + ((u64)x38 * x17));
+ { u64 x56 = (((u64)x39 * x21) + (0x2 * (((u64)x37 * x20) + ((u64)x38 * x19))));
+ { u64 x57 = (((u64)x39 * x20) + ((u64)x38 * x21));
+ { u64 x58 = ((u64)(0x2 * x38) * x20);
+ { u64 x59 = (x48 + (x58 << 0x4));
+ { u64 x60 = (x59 + (x58 << 0x1));
+ { u64 x61 = (x60 + x58);
+ { u64 x62 = (x47 + (x57 << 0x4));
+ { u64 x63 = (x62 + (x57 << 0x1));
+ { u64 x64 = (x63 + x57);
+ { u64 x65 = (x46 + (x56 << 0x4));
+ { u64 x66 = (x65 + (x56 << 0x1));
+ { u64 x67 = (x66 + x56);
+ { u64 x68 = (x45 + (x55 << 0x4));
+ { u64 x69 = (x68 + (x55 << 0x1));
+ { u64 x70 = (x69 + x55);
+ { u64 x71 = (x44 + (x54 << 0x4));
+ { u64 x72 = (x71 + (x54 << 0x1));
+ { u64 x73 = (x72 + x54);
+ { u64 x74 = (x43 + (x53 << 0x4));
+ { u64 x75 = (x74 + (x53 << 0x1));
+ { u64 x76 = (x75 + x53);
+ { u64 x77 = (x42 + (x52 << 0x4));
+ { u64 x78 = (x77 + (x52 << 0x1));
+ { u64 x79 = (x78 + x52);
+ { u64 x80 = (x41 + (x51 << 0x4));
+ { u64 x81 = (x80 + (x51 << 0x1));
+ { u64 x82 = (x81 + x51);
+ { u64 x83 = (x40 + (x50 << 0x4));
+ { u64 x84 = (x83 + (x50 << 0x1));
+ { u64 x85 = (x84 + x50);
+ { u64 x86 = (x85 >> 0x1a);
+ { u32 x87 = ((u32)x85 & 0x3ffffff);
+ { u64 x88 = (x86 + x82);
+ { u64 x89 = (x88 >> 0x19);
+ { u32 x90 = ((u32)x88 & 0x1ffffff);
+ { u64 x91 = (x89 + x79);
+ { u64 x92 = (x91 >> 0x1a);
+ { u32 x93 = ((u32)x91 & 0x3ffffff);
+ { u64 x94 = (x92 + x76);
+ { u64 x95 = (x94 >> 0x19);
+ { u32 x96 = ((u32)x94 & 0x1ffffff);
+ { u64 x97 = (x95 + x73);
+ { u64 x98 = (x97 >> 0x1a);
+ { u32 x99 = ((u32)x97 & 0x3ffffff);
+ { u64 x100 = (x98 + x70);
+ { u64 x101 = (x100 >> 0x19);
+ { u32 x102 = ((u32)x100 & 0x1ffffff);
+ { u64 x103 = (x101 + x67);
+ { u64 x104 = (x103 >> 0x1a);
+ { u32 x105 = ((u32)x103 & 0x3ffffff);
+ { u64 x106 = (x104 + x64);
+ { u64 x107 = (x106 >> 0x19);
+ { u32 x108 = ((u32)x106 & 0x1ffffff);
+ { u64 x109 = (x107 + x61);
+ { u64 x110 = (x109 >> 0x1a);
+ { u32 x111 = ((u32)x109 & 0x3ffffff);
+ { u64 x112 = (x110 + x49);
+ { u64 x113 = (x112 >> 0x19);
+ { u32 x114 = ((u32)x112 & 0x1ffffff);
+ { u64 x115 = (x87 + (0x13 * x113));
+ { u32 x116 = (u32) (x115 >> 0x1a);
+ { u32 x117 = ((u32)x115 & 0x3ffffff);
+ { u32 x118 = (x116 + x90);
+ { u32 x119 = (x118 >> 0x19);
+ { u32 x120 = (x118 & 0x1ffffff);
+ out[0] = x117;
+ out[1] = x120;
+ out[2] = (x119 + x93);
+ out[3] = x96;
+ out[4] = x99;
+ out[5] = x102;
+ out[6] = x105;
+ out[7] = x108;
+ out[8] = x111;
+ out[9] = x114;
+ }}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}
+}
+
+static __always_inline void fe_mul_ttt(fe *h, const fe *f, const fe *g)
+{
+ fe_mul_impl(h->v, f->v, g->v);
+}
+
+static __always_inline void fe_mul_tlt(fe *h, const fe_loose *f, const fe *g)
+{
+ fe_mul_impl(h->v, f->v, g->v);
+}
+
+static __always_inline void
+fe_mul_tll(fe *h, const fe_loose *f, const fe_loose *g)
+{
+ fe_mul_impl(h->v, f->v, g->v);
+}
+
+static void fe_sqr_impl(u32 out[10], const u32 in1[10])
+{
+ { const u32 x17 = in1[9];
+ { const u32 x18 = in1[8];
+ { const u32 x16 = in1[7];
+ { const u32 x14 = in1[6];
+ { const u32 x12 = in1[5];
+ { const u32 x10 = in1[4];
+ { const u32 x8 = in1[3];
+ { const u32 x6 = in1[2];
+ { const u32 x4 = in1[1];
+ { const u32 x2 = in1[0];
+ { u64 x19 = ((u64)x2 * x2);
+ { u64 x20 = ((u64)(0x2 * x2) * x4);
+ { u64 x21 = (0x2 * (((u64)x4 * x4) + ((u64)x2 * x6)));
+ { u64 x22 = (0x2 * (((u64)x4 * x6) + ((u64)x2 * x8)));
+ { u64 x23 = ((((u64)x6 * x6) + ((u64)(0x4 * x4) * x8)) + ((u64)(0x2 * x2) * x10));
+ { u64 x24 = (0x2 * ((((u64)x6 * x8) + ((u64)x4 * x10)) + ((u64)x2 * x12)));
+ { u64 x25 = (0x2 * (((((u64)x8 * x8) + ((u64)x6 * x10)) + ((u64)x2 * x14)) + ((u64)(0x2 * x4) * x12)));
+ { u64 x26 = (0x2 * (((((u64)x8 * x10) + ((u64)x6 * x12)) + ((u64)x4 * x14)) + ((u64)x2 * x16)));
+ { u64 x27 = (((u64)x10 * x10) + (0x2 * ((((u64)x6 * x14) + ((u64)x2 * x18)) + (0x2 * (((u64)x4 * x16) + ((u64)x8 * x12))))));
+ { u64 x28 = (0x2 * ((((((u64)x10 * x12) + ((u64)x8 * x14)) + ((u64)x6 * x16)) + ((u64)x4 * x18)) + ((u64)x2 * x17)));
+ { u64 x29 = (0x2 * (((((u64)x12 * x12) + ((u64)x10 * x14)) + ((u64)x6 * x18)) + (0x2 * (((u64)x8 * x16) + ((u64)x4 * x17)))));
+ { u64 x30 = (0x2 * (((((u64)x12 * x14) + ((u64)x10 * x16)) + ((u64)x8 * x18)) + ((u64)x6 * x17)));
+ { u64 x31 = (((u64)x14 * x14) + (0x2 * (((u64)x10 * x18) + (0x2 * (((u64)x12 * x16) + ((u64)x8 * x17))))));
+ { u64 x32 = (0x2 * ((((u64)x14 * x16) + ((u64)x12 * x18)) + ((u64)x10 * x17)));
+ { u64 x33 = (0x2 * ((((u64)x16 * x16) + ((u64)x14 * x18)) + ((u64)(0x2 * x12) * x17)));
+ { u64 x34 = (0x2 * (((u64)x16 * x18) + ((u64)x14 * x17)));
+ { u64 x35 = (((u64)x18 * x18) + ((u64)(0x4 * x16) * x17));
+ { u64 x36 = ((u64)(0x2 * x18) * x17);
+ { u64 x37 = ((u64)(0x2 * x17) * x17);
+ { u64 x38 = (x27 + (x37 << 0x4));
+ { u64 x39 = (x38 + (x37 << 0x1));
+ { u64 x40 = (x39 + x37);
+ { u64 x41 = (x26 + (x36 << 0x4));
+ { u64 x42 = (x41 + (x36 << 0x1));
+ { u64 x43 = (x42 + x36);
+ { u64 x44 = (x25 + (x35 << 0x4));
+ { u64 x45 = (x44 + (x35 << 0x1));
+ { u64 x46 = (x45 + x35);
+ { u64 x47 = (x24 + (x34 << 0x4));
+ { u64 x48 = (x47 + (x34 << 0x1));
+ { u64 x49 = (x48 + x34);
+ { u64 x50 = (x23 + (x33 << 0x4));
+ { u64 x51 = (x50 + (x33 << 0x1));
+ { u64 x52 = (x51 + x33);
+ { u64 x53 = (x22 + (x32 << 0x4));
+ { u64 x54 = (x53 + (x32 << 0x1));
+ { u64 x55 = (x54 + x32);
+ { u64 x56 = (x21 + (x31 << 0x4));
+ { u64 x57 = (x56 + (x31 << 0x1));
+ { u64 x58 = (x57 + x31);
+ { u64 x59 = (x20 + (x30 << 0x4));
+ { u64 x60 = (x59 + (x30 << 0x1));
+ { u64 x61 = (x60 + x30);
+ { u64 x62 = (x19 + (x29 << 0x4));
+ { u64 x63 = (x62 + (x29 << 0x1));
+ { u64 x64 = (x63 + x29);
+ { u64 x65 = (x64 >> 0x1a);
+ { u32 x66 = ((u32)x64 & 0x3ffffff);
+ { u64 x67 = (x65 + x61);
+ { u64 x68 = (x67 >> 0x19);
+ { u32 x69 = ((u32)x67 & 0x1ffffff);
+ { u64 x70 = (x68 + x58);
+ { u64 x71 = (x70 >> 0x1a);
+ { u32 x72 = ((u32)x70 & 0x3ffffff);
+ { u64 x73 = (x71 + x55);
+ { u64 x74 = (x73 >> 0x19);
+ { u32 x75 = ((u32)x73 & 0x1ffffff);
+ { u64 x76 = (x74 + x52);
+ { u64 x77 = (x76 >> 0x1a);
+ { u32 x78 = ((u32)x76 & 0x3ffffff);
+ { u64 x79 = (x77 + x49);
+ { u64 x80 = (x79 >> 0x19);
+ { u32 x81 = ((u32)x79 & 0x1ffffff);
+ { u64 x82 = (x80 + x46);
+ { u64 x83 = (x82 >> 0x1a);
+ { u32 x84 = ((u32)x82 & 0x3ffffff);
+ { u64 x85 = (x83 + x43);
+ { u64 x86 = (x85 >> 0x19);
+ { u32 x87 = ((u32)x85 & 0x1ffffff);
+ { u64 x88 = (x86 + x40);
+ { u64 x89 = (x88 >> 0x1a);
+ { u32 x90 = ((u32)x88 & 0x3ffffff);
+ { u64 x91 = (x89 + x28);
+ { u64 x92 = (x91 >> 0x19);
+ { u32 x93 = ((u32)x91 & 0x1ffffff);
+ { u64 x94 = (x66 + (0x13 * x92));
+ { u32 x95 = (u32) (x94 >> 0x1a);
+ { u32 x96 = ((u32)x94 & 0x3ffffff);
+ { u32 x97 = (x95 + x69);
+ { u32 x98 = (x97 >> 0x19);
+ { u32 x99 = (x97 & 0x1ffffff);
+ out[0] = x96;
+ out[1] = x99;
+ out[2] = (x98 + x72);
+ out[3] = x75;
+ out[4] = x78;
+ out[5] = x81;
+ out[6] = x84;
+ out[7] = x87;
+ out[8] = x90;
+ out[9] = x93;
+ }}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}
+}
+
+static __always_inline void fe_sq_tl(fe *h, const fe_loose *f)
+{
+ fe_sqr_impl(h->v, f->v);
+}
+
+static __always_inline void fe_sq_tt(fe *h, const fe *f)
+{
+ fe_sqr_impl(h->v, f->v);
+}
+
+static __always_inline void fe_loose_invert(fe *out, const fe_loose *z)
+{
+ fe t0;
+ fe t1;
+ fe t2;
+ fe t3;
+ int i;
+
+ fe_sq_tl(&t0, z);
+ fe_sq_tt(&t1, &t0);
+ for (i = 1; i < 2; ++i)
+ fe_sq_tt(&t1, &t1);
+ fe_mul_tlt(&t1, z, &t1);
+ fe_mul_ttt(&t0, &t0, &t1);
+ fe_sq_tt(&t2, &t0);
+ fe_mul_ttt(&t1, &t1, &t2);
+ fe_sq_tt(&t2, &t1);
+ for (i = 1; i < 5; ++i)
+ fe_sq_tt(&t2, &t2);
+ fe_mul_ttt(&t1, &t2, &t1);
+ fe_sq_tt(&t2, &t1);
+ for (i = 1; i < 10; ++i)
+ fe_sq_tt(&t2, &t2);
+ fe_mul_ttt(&t2, &t2, &t1);
+ fe_sq_tt(&t3, &t2);
+ for (i = 1; i < 20; ++i)
+ fe_sq_tt(&t3, &t3);
+ fe_mul_ttt(&t2, &t3, &t2);
+ fe_sq_tt(&t2, &t2);
+ for (i = 1; i < 10; ++i)
+ fe_sq_tt(&t2, &t2);
+ fe_mul_ttt(&t1, &t2, &t1);
+ fe_sq_tt(&t2, &t1);
+ for (i = 1; i < 50; ++i)
+ fe_sq_tt(&t2, &t2);
+ fe_mul_ttt(&t2, &t2, &t1);
+ fe_sq_tt(&t3, &t2);
+ for (i = 1; i < 100; ++i)
+ fe_sq_tt(&t3, &t3);
+ fe_mul_ttt(&t2, &t3, &t2);
+ fe_sq_tt(&t2, &t2);
+ for (i = 1; i < 50; ++i)
+ fe_sq_tt(&t2, &t2);
+ fe_mul_ttt(&t1, &t2, &t1);
+ fe_sq_tt(&t1, &t1);
+ for (i = 1; i < 5; ++i)
+ fe_sq_tt(&t1, &t1);
+ fe_mul_ttt(out, &t1, &t0);
+}
+
+static __always_inline void fe_invert(fe *out, const fe *z)
+{
+ fe_loose l;
+ fe_copy_lt(&l, z);
+ fe_loose_invert(out, &l);
+}
+
+/* Replace (f,g) with (g,f) if b == 1;
+ * replace (f,g) with (f,g) if b == 0.
+ *
+ * Preconditions: b in {0,1}
+ */
+static __always_inline void fe_cswap(fe *f, fe *g, unsigned int b)
+{
+ unsigned int i;
+ b = 0 - b;
+ for (i = 0; i < 10; i++) {
+ u32 x = f->v[i] ^ g->v[i];
+ x &= b;
+ f->v[i] ^= x;
+ g->v[i] ^= x;
+ }
+}
+
+/* NOTE: based on fiat-crypto fe_mul, edited for in2 = {121666, 0, 0, ...}. */
+static __always_inline void fe_mul_121666_impl(u32 out[10], const u32 in1[10])
+{
+ { const u32 x20 = in1[9];
+ { const u32 x21 = in1[8];
+ { const u32 x19 = in1[7];
+ { const u32 x17 = in1[6];
+ { const u32 x15 = in1[5];
+ { const u32 x13 = in1[4];
+ { const u32 x11 = in1[3];
+ { const u32 x9 = in1[2];
+ { const u32 x7 = in1[1];
+ { const u32 x5 = in1[0];
+ { const u32 x38 = 0;
+ { const u32 x39 = 0;
+ { const u32 x37 = 0;
+ { const u32 x35 = 0;
+ { const u32 x33 = 0;
+ { const u32 x31 = 0;
+ { const u32 x29 = 0;
+ { const u32 x27 = 0;
+ { const u32 x25 = 0;
+ { const u32 x23 = 121666;
+ { u64 x40 = ((u64)x23 * x5);
+ { u64 x41 = (((u64)x23 * x7) + ((u64)x25 * x5));
+ { u64 x42 = ((((u64)(0x2 * x25) * x7) + ((u64)x23 * x9)) + ((u64)x27 * x5));
+ { u64 x43 = (((((u64)x25 * x9) + ((u64)x27 * x7)) + ((u64)x23 * x11)) + ((u64)x29 * x5));
+ { u64 x44 = (((((u64)x27 * x9) + (0x2 * (((u64)x25 * x11) + ((u64)x29 * x7)))) + ((u64)x23 * x13)) + ((u64)x31 * x5));
+ { u64 x45 = (((((((u64)x27 * x11) + ((u64)x29 * x9)) + ((u64)x25 * x13)) + ((u64)x31 * x7)) + ((u64)x23 * x15)) + ((u64)x33 * x5));
+ { u64 x46 = (((((0x2 * ((((u64)x29 * x11) + ((u64)x25 * x15)) + ((u64)x33 * x7))) + ((u64)x27 * x13)) + ((u64)x31 * x9)) + ((u64)x23 * x17)) + ((u64)x35 * x5));
+ { u64 x47 = (((((((((u64)x29 * x13) + ((u64)x31 * x11)) + ((u64)x27 * x15)) + ((u64)x33 * x9)) + ((u64)x25 * x17)) + ((u64)x35 * x7)) + ((u64)x23 * x19)) + ((u64)x37 * x5));
+ { u64 x48 = (((((((u64)x31 * x13) + (0x2 * (((((u64)x29 * x15) + ((u64)x33 * x11)) + ((u64)x25 * x19)) + ((u64)x37 * x7)))) + ((u64)x27 * x17)) + ((u64)x35 * x9)) + ((u64)x23 * x21)) + ((u64)x39 * x5));
+ { u64 x49 = (((((((((((u64)x31 * x15) + ((u64)x33 * x13)) + ((u64)x29 * x17)) + ((u64)x35 * x11)) + ((u64)x27 * x19)) + ((u64)x37 * x9)) + ((u64)x25 * x21)) + ((u64)x39 * x7)) + ((u64)x23 * x20)) + ((u64)x38 * x5));
+ { u64 x50 = (((((0x2 * ((((((u64)x33 * x15) + ((u64)x29 * x19)) + ((u64)x37 * x11)) + ((u64)x25 * x20)) + ((u64)x38 * x7))) + ((u64)x31 * x17)) + ((u64)x35 * x13)) + ((u64)x27 * x21)) + ((u64)x39 * x9));
+ { u64 x51 = (((((((((u64)x33 * x17) + ((u64)x35 * x15)) + ((u64)x31 * x19)) + ((u64)x37 * x13)) + ((u64)x29 * x21)) + ((u64)x39 * x11)) + ((u64)x27 * x20)) + ((u64)x38 * x9));
+ { u64 x52 = (((((u64)x35 * x17) + (0x2 * (((((u64)x33 * x19) + ((u64)x37 * x15)) + ((u64)x29 * x20)) + ((u64)x38 * x11)))) + ((u64)x31 * x21)) + ((u64)x39 * x13));
+ { u64 x53 = (((((((u64)x35 * x19) + ((u64)x37 * x17)) + ((u64)x33 * x21)) + ((u64)x39 * x15)) + ((u64)x31 * x20)) + ((u64)x38 * x13));
+ { u64 x54 = (((0x2 * ((((u64)x37 * x19) + ((u64)x33 * x20)) + ((u64)x38 * x15))) + ((u64)x35 * x21)) + ((u64)x39 * x17));
+ { u64 x55 = (((((u64)x37 * x21) + ((u64)x39 * x19)) + ((u64)x35 * x20)) + ((u64)x38 * x17));
+ { u64 x56 = (((u64)x39 * x21) + (0x2 * (((u64)x37 * x20) + ((u64)x38 * x19))));
+ { u64 x57 = (((u64)x39 * x20) + ((u64)x38 * x21));
+ { u64 x58 = ((u64)(0x2 * x38) * x20);
+ { u64 x59 = (x48 + (x58 << 0x4));
+ { u64 x60 = (x59 + (x58 << 0x1));
+ { u64 x61 = (x60 + x58);
+ { u64 x62 = (x47 + (x57 << 0x4));
+ { u64 x63 = (x62 + (x57 << 0x1));
+ { u64 x64 = (x63 + x57);
+ { u64 x65 = (x46 + (x56 << 0x4));
+ { u64 x66 = (x65 + (x56 << 0x1));
+ { u64 x67 = (x66 + x56);
+ { u64 x68 = (x45 + (x55 << 0x4));
+ { u64 x69 = (x68 + (x55 << 0x1));
+ { u64 x70 = (x69 + x55);
+ { u64 x71 = (x44 + (x54 << 0x4));
+ { u64 x72 = (x71 + (x54 << 0x1));
+ { u64 x73 = (x72 + x54);
+ { u64 x74 = (x43 + (x53 << 0x4));
+ { u64 x75 = (x74 + (x53 << 0x1));
+ { u64 x76 = (x75 + x53);
+ { u64 x77 = (x42 + (x52 << 0x4));
+ { u64 x78 = (x77 + (x52 << 0x1));
+ { u64 x79 = (x78 + x52);
+ { u64 x80 = (x41 + (x51 << 0x4));
+ { u64 x81 = (x80 + (x51 << 0x1));
+ { u64 x82 = (x81 + x51);
+ { u64 x83 = (x40 + (x50 << 0x4));
+ { u64 x84 = (x83 + (x50 << 0x1));
+ { u64 x85 = (x84 + x50);
+ { u64 x86 = (x85 >> 0x1a);
+ { u32 x87 = ((u32)x85 & 0x3ffffff);
+ { u64 x88 = (x86 + x82);
+ { u64 x89 = (x88 >> 0x19);
+ { u32 x90 = ((u32)x88 & 0x1ffffff);
+ { u64 x91 = (x89 + x79);
+ { u64 x92 = (x91 >> 0x1a);
+ { u32 x93 = ((u32)x91 & 0x3ffffff);
+ { u64 x94 = (x92 + x76);
+ { u64 x95 = (x94 >> 0x19);
+ { u32 x96 = ((u32)x94 & 0x1ffffff);
+ { u64 x97 = (x95 + x73);
+ { u64 x98 = (x97 >> 0x1a);
+ { u32 x99 = ((u32)x97 & 0x3ffffff);
+ { u64 x100 = (x98 + x70);
+ { u64 x101 = (x100 >> 0x19);
+ { u32 x102 = ((u32)x100 & 0x1ffffff);
+ { u64 x103 = (x101 + x67);
+ { u64 x104 = (x103 >> 0x1a);
+ { u32 x105 = ((u32)x103 & 0x3ffffff);
+ { u64 x106 = (x104 + x64);
+ { u64 x107 = (x106 >> 0x19);
+ { u32 x108 = ((u32)x106 & 0x1ffffff);
+ { u64 x109 = (x107 + x61);
+ { u64 x110 = (x109 >> 0x1a);
+ { u32 x111 = ((u32)x109 & 0x3ffffff);
+ { u64 x112 = (x110 + x49);
+ { u64 x113 = (x112 >> 0x19);
+ { u32 x114 = ((u32)x112 & 0x1ffffff);
+ { u64 x115 = (x87 + (0x13 * x113));
+ { u32 x116 = (u32) (x115 >> 0x1a);
+ { u32 x117 = ((u32)x115 & 0x3ffffff);
+ { u32 x118 = (x116 + x90);
+ { u32 x119 = (x118 >> 0x19);
+ { u32 x120 = (x118 & 0x1ffffff);
+ out[0] = x117;
+ out[1] = x120;
+ out[2] = (x119 + x93);
+ out[3] = x96;
+ out[4] = x99;
+ out[5] = x102;
+ out[6] = x105;
+ out[7] = x108;
+ out[8] = x111;
+ out[9] = x114;
+ }}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}
+}
+
+static __always_inline void fe_mul121666(fe *h, const fe_loose *f)
+{
+ fe_mul_121666_impl(h->v, f->v);
+}
+
+static void curve25519_generic(u8 out[CURVE25519_KEY_SIZE],
+ const u8 scalar[CURVE25519_KEY_SIZE],
+ const u8 point[CURVE25519_KEY_SIZE])
+{
+ fe x1, x2, z2, x3, z3;
+ fe_loose x2l, z2l, x3l;
+ unsigned int swap = 0;
+ int pos;
+ u8 e[32];
+
+ memcpy(e, scalar, 32);
+ normalize_secret(e);
+
+ /* The following implementation was transcribed to Coq and proven to
+ * correspond to unary scalar multiplication in affine coordinates given
+ * that x1 != 0 is the x coordinate of some point on the curve. It was
+ * also checked in Coq that doing a ladderstep with x1 = x3 = 0 gives
+ * z2' = z3' = 0, and z2 = z3 = 0 gives z2' = z3' = 0. The statement was
+ * quantified over the underlying field, so it applies to Curve25519
+ * itself and the quadratic twist of Curve25519. It was not proven in
+ * Coq that prime-field arithmetic correctly simulates extension-field
+ * arithmetic on prime-field values. The decoding of the byte array
+ * representation of e was not considered.
+ *
+ * Specification of Montgomery curves in affine coordinates:
+ * <https://github.com/mit-plv/fiat-crypto/blob/2456d821825521f7e03e65882cc3521795b0320f/src/Spec/MontgomeryCurve.v#L27>
+ *
+ * Proof that these form a group that is isomorphic to a Weierstrass
+ * curve:
+ * <https://github.com/mit-plv/fiat-crypto/blob/2456d821825521f7e03e65882cc3521795b0320f/src/Curves/Montgomery/AffineProofs.v#L35>
+ *
+ * Coq transcription and correctness proof of the loop
+ * (where scalarbits=255):
+ * <https://github.com/mit-plv/fiat-crypto/blob/2456d821825521f7e03e65882cc3521795b0320f/src/Curves/Montgomery/XZ.v#L118>
+ * <https://github.com/mit-plv/fiat-crypto/blob/2456d821825521f7e03e65882cc3521795b0320f/src/Curves/Montgomery/XZProofs.v#L278>
+ * preconditions: 0 <= e < 2^255 (not necessarily e < order),
+ * fe_invert(0) = 0
+ */
+ fe_frombytes(&x1, point);
+ fe_1(&x2);
+ fe_0(&z2);
+ fe_copy(&x3, &x1);
+ fe_1(&z3);
+
+ for (pos = 254; pos >= 0; --pos) {
+ fe tmp0, tmp1;
+ fe_loose tmp0l, tmp1l;
+ /* loop invariant as of right before the test, for the case
+ * where x1 != 0:
+ * pos >= -1; if z2 = 0 then x2 is nonzero; if z3 = 0 then x3
+ * is nonzero
+ * let r := e >> (pos+1) in the following equalities of
+ * projective points:
+ * to_xz (r*P) === if swap then (x3, z3) else (x2, z2)
+ * to_xz ((r+1)*P) === if swap then (x2, z2) else (x3, z3)
+ * x1 is the nonzero x coordinate of the nonzero
+ * point (r*P-(r+1)*P)
+ */
+ unsigned int b = 1 & (e[pos / 8] >> (pos & 7));
+ swap ^= b;
+ fe_cswap(&x2, &x3, swap);
+ fe_cswap(&z2, &z3, swap);
+ swap = b;
+ /* Coq transcription of ladderstep formula (called from
+ * transcribed loop):
+ * <https://github.com/mit-plv/fiat-crypto/blob/2456d821825521f7e03e65882cc3521795b0320f/src/Curves/Montgomery/XZ.v#L89>
+ * <https://github.com/mit-plv/fiat-crypto/blob/2456d821825521f7e03e65882cc3521795b0320f/src/Curves/Montgomery/XZProofs.v#L131>
+ * x1 != 0 <https://github.com/mit-plv/fiat-crypto/blob/2456d821825521f7e03e65882cc3521795b0320f/src/Curves/Montgomery/XZProofs.v#L217>
+ * x1 = 0 <https://github.com/mit-plv/fiat-crypto/blob/2456d821825521f7e03e65882cc3521795b0320f/src/Curves/Montgomery/XZProofs.v#L147>
+ */
+ fe_sub(&tmp0l, &x3, &z3);
+ fe_sub(&tmp1l, &x2, &z2);
+ fe_add(&x2l, &x2, &z2);
+ fe_add(&z2l, &x3, &z3);
+ fe_mul_tll(&z3, &tmp0l, &x2l);
+ fe_mul_tll(&z2, &z2l, &tmp1l);
+ fe_sq_tl(&tmp0, &tmp1l);
+ fe_sq_tl(&tmp1, &x2l);
+ fe_add(&x3l, &z3, &z2);
+ fe_sub(&z2l, &z3, &z2);
+ fe_mul_ttt(&x2, &tmp1, &tmp0);
+ fe_sub(&tmp1l, &tmp1, &tmp0);
+ fe_sq_tl(&z2, &z2l);
+ fe_mul121666(&z3, &tmp1l);
+ fe_sq_tl(&x3, &x3l);
+ fe_add(&tmp0l, &tmp0, &z3);
+ fe_mul_ttt(&z3, &x1, &z2);
+ fe_mul_tll(&z2, &tmp1l, &tmp0l);
+ }
+ /* here pos=-1, so r=e, so to_xz (e*P) === if swap then (x3, z3)
+ * else (x2, z2)
+ */
+ fe_cswap(&x2, &x3, swap);
+ fe_cswap(&z2, &z3, swap);
+
+ fe_invert(&z2, &z2);
+ fe_mul_ttt(&x2, &x2, &z2);
+ fe_tobytes(out, &x2);
+
+ memzero_explicit(&x1, sizeof(x1));
+ memzero_explicit(&x2, sizeof(x2));
+ memzero_explicit(&z2, sizeof(z2));
+ memzero_explicit(&x3, sizeof(x3));
+ memzero_explicit(&z3, sizeof(z3));
+ memzero_explicit(&x2l, sizeof(x2l));
+ memzero_explicit(&z2l, sizeof(z2l));
+ memzero_explicit(&x3l, sizeof(x3l));
+ memzero_explicit(&e, sizeof(e));
+}
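For readers following the ladder loop in curve25519_generic() above, here is a
minimal userspace sketch of the masked conditional swap that fe_cswap()
performs. It assumes plain <stdint.h> types in place of the kernel's u32, and
the names demo_fe, demo_cswap and demo_cswap_selfcheck are hypothetical; they
are not part of this patch. The sketch only illustrates that the swap is
selected by an all-ones or all-zeros mask rather than by a branch on the
secret scalar bit.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_LIMBS 10

/* Hypothetical stand-in for the patch's "fe" type: ten 32-bit limbs. */
struct demo_fe {
	uint32_t v[DEMO_LIMBS];
};

/*
 * Swap f and g if b == 1, leave them untouched if b == 0.
 * Precondition: b is 0 or 1. No branch depends on b.
 */
static void demo_cswap(struct demo_fe *f, struct demo_fe *g, unsigned int b)
{
	/* 0 - 1 wraps to 0xffffffff, 0 - 0 stays 0. */
	uint32_t mask = 0u - (uint32_t)b;
	unsigned int i;

	for (i = 0; i < DEMO_LIMBS; i++) {
		/* x is f ^ g when swapping, 0 otherwise. */
		uint32_t x = (f->v[i] ^ g->v[i]) & mask;

		f->v[i] ^= x;
		g->v[i] ^= x;
	}
}

/* Returns 1 if both the b == 0 and b == 1 cases behave as described. */
static int demo_cswap_selfcheck(void)
{
	struct demo_fe f, g;
	unsigned int i;

	for (i = 0; i < DEMO_LIMBS; i++) {
		f.v[i] = i;
		g.v[i] = 100 + i;
	}

	demo_cswap(&f, &g, 0);	/* must be a no-op */
	for (i = 0; i < DEMO_LIMBS; i++)
		assert(f.v[i] == i && g.v[i] == 100 + i);

	demo_cswap(&f, &g, 1);	/* must exchange every limb */
	for (i = 0; i < DEMO_LIMBS; i++)
		assert(f.v[i] == 100 + i && g.v[i] == i);

	return 1;
}
```

Calling demo_cswap_selfcheck() from a small main() is enough to exercise both
cases. In the ladder itself the swap bit is the XOR of the current and the
previous scalar bit, which is why the loop keeps "swap" across iterations.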
diff --git a/lib/zinc/curve25519/curve25519-hacl64.h b/lib/zinc/curve25519/curve25519-hacl64.h
new file mode 100644
index 000000000000..598be44622a2
--- /dev/null
+++ b/lib/zinc/curve25519/curve25519-hacl64.h
@@ -0,0 +1,784 @@
+/* SPDX-License-Identifier: GPL-2.0 OR MIT */
+/*
+ * Copyright (C) 2016-2017 INRIA and Microsoft Corporation.
+ * Copyright (C) 2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ *
+ * This is a machine-generated formally verified implementation of Curve25519
+ * ECDH from: <https://github.com/mitls/hacl-star>. Though originally machine
+ * generated, it has been tweaked to be suitable for use in the kernel. It is
+ * optimized for 64-bit machines that can efficiently work with 128-bit
+ * integer types.
+ */
+
+typedef __uint128_t u128;
+
+static __always_inline u64 u64_eq_mask(u64 a, u64 b)
+{
+ u64 x = a ^ b;
+ u64 minus_x = ~x + (u64)1U;
+ u64 x_or_minus_x = x | minus_x;
+ u64 xnx = x_or_minus_x >> (u32)63U;
+ u64 c = xnx - (u64)1U;
+ return c;
+}
+
+static __always_inline u64 u64_gte_mask(u64 a, u64 b)
+{
+ u64 x = a;
+ u64 y = b;
+ u64 x_xor_y = x ^ y;
+ u64 x_sub_y = x - y;
+ u64 x_sub_y_xor_y = x_sub_y ^ y;
+ u64 q = x_xor_y | x_sub_y_xor_y;
+ u64 x_xor_q = x ^ q;
+ u64 x_xor_q_ = x_xor_q >> (u32)63U;
+ u64 c = x_xor_q_ - (u64)1U;
+ return c;
+}
+
+static __always_inline void modulo_carry_top(u64 *b)
+{
+ u64 b4 = b[4];
+ u64 b0 = b[0];
+ u64 b4_ = b4 & 0x7ffffffffffffLLU;
+ u64 b0_ = b0 + 19 * (b4 >> 51);
+ b[4] = b4_;
+ b[0] = b0_;
+}
+
+static __always_inline void fproduct_copy_from_wide_(u64 *output, u128 *input)
+{
+ {
+ u128 xi = input[0];
+ output[0] = ((u64)(xi));
+ }
+ {
+ u128 xi = input[1];
+ output[1] = ((u64)(xi));
+ }
+ {
+ u128 xi = input[2];
+ output[2] = ((u64)(xi));
+ }
+ {
+ u128 xi = input[3];
+ output[3] = ((u64)(xi));
+ }
+ {
+ u128 xi = input[4];
+ output[4] = ((u64)(xi));
+ }
+}
+
+static __always_inline void
+fproduct_sum_scalar_multiplication_(u128 *output, u64 *input, u64 s)
+{
+ output[0] += (u128)input[0] * s;
+ output[1] += (u128)input[1] * s;
+ output[2] += (u128)input[2] * s;
+ output[3] += (u128)input[3] * s;
+ output[4] += (u128)input[4] * s;
+}
+
+static __always_inline void fproduct_carry_wide_(u128 *tmp)
+{
+ {
+ u32 ctr = 0;
+ u128 tctr = tmp[ctr];
+ u128 tctrp1 = tmp[ctr + 1];
+ u64 r0 = ((u64)(tctr)) & 0x7ffffffffffffLLU;
+ u128 c = ((tctr) >> (51));
+ tmp[ctr] = ((u128)(r0));
+ tmp[ctr + 1] = ((tctrp1) + (c));
+ }
+ {
+ u32 ctr = 1;
+ u128 tctr = tmp[ctr];
+ u128 tctrp1 = tmp[ctr + 1];
+ u64 r0 = ((u64)(tctr)) & 0x7ffffffffffffLLU;
+ u128 c = ((tctr) >> (51));
+ tmp[ctr] = ((u128)(r0));
+ tmp[ctr + 1] = ((tctrp1) + (c));
+ }
+
+ {
+ u32 ctr = 2;
+ u128 tctr = tmp[ctr];
+ u128 tctrp1 = tmp[ctr + 1];
+ u64 r0 = ((u64)(tctr)) & 0x7ffffffffffffLLU;
+ u128 c = ((tctr) >> (51));
+ tmp[ctr] = ((u128)(r0));
+ tmp[ctr + 1] = ((tctrp1) + (c));
+ }
+ {
+ u32 ctr = 3;
+ u128 tctr = tmp[ctr];
+ u128 tctrp1 = tmp[ctr + 1];
+ u64 r0 = ((u64)(tctr)) & 0x7ffffffffffffLLU;
+ u128 c = ((tctr) >> (51));
+ tmp[ctr] = ((u128)(r0));
+ tmp[ctr + 1] = ((tctrp1) + (c));
+ }
+}
+
+static __always_inline void fmul_shift_reduce(u64 *output)
+{
+ u64 tmp = output[4];
+ u64 b0;
+ {
+ u32 ctr = 5 - 0 - 1;
+ u64 z = output[ctr - 1];
+ output[ctr] = z;
+ }
+ {
+ u32 ctr = 5 - 1 - 1;
+ u64 z = output[ctr - 1];
+ output[ctr] = z;
+ }
+ {
+ u32 ctr = 5 - 2 - 1;
+ u64 z = output[ctr - 1];
+ output[ctr] = z;
+ }
+ {
+ u32 ctr = 5 - 3 - 1;
+ u64 z = output[ctr - 1];
+ output[ctr] = z;
+ }
+ output[0] = tmp;
+ b0 = output[0];
+ output[0] = 19 * b0;
+}
+
+static __always_inline void fmul_mul_shift_reduce_(u128 *output, u64 *input,
+ u64 *input21)
+{
+ u32 i;
+ u64 input2i;
+ {
+ u64 input2i = input21[0];
+ fproduct_sum_scalar_multiplication_(output, input, input2i);
+ fmul_shift_reduce(input);
+ }
+ {
+ u64 input2i = input21[1];
+ fproduct_sum_scalar_multiplication_(output, input, input2i);
+ fmul_shift_reduce(input);
+ }
+ {
+ u64 input2i = input21[2];
+ fproduct_sum_scalar_multiplication_(output, input, input2i);
+ fmul_shift_reduce(input);
+ }
+ {
+ u64 input2i = input21[3];
+ fproduct_sum_scalar_multiplication_(output, input, input2i);
+ fmul_shift_reduce(input);
+ }
+ i = 4;
+ input2i = input21[i];
+ fproduct_sum_scalar_multiplication_(output, input, input2i);
+}
+
+static __always_inline void fmul_fmul(u64 *output, u64 *input, u64 *input21)
+{
+ u64 tmp[5] = { input[0], input[1], input[2], input[3], input[4] };
+ {
+ u128 b4;
+ u128 b0;
+ u128 b4_;
+ u128 b0_;
+ u64 i0;
+ u64 i1;
+ u64 i0_;
+ u64 i1_;
+ u128 t[5] = { 0 };
+ fmul_mul_shift_reduce_(t, tmp, input21);
+ fproduct_carry_wide_(t);
+ b4 = t[4];
+ b0 = t[0];
+ b4_ = ((b4) & (((u128)(0x7ffffffffffffLLU))));
+ b0_ = ((b0) + (((u128)(19) * (((u64)(((b4) >> (51))))))));
+ t[4] = b4_;
+ t[0] = b0_;
+ fproduct_copy_from_wide_(output, t);
+ i0 = output[0];
+ i1 = output[1];
+ i0_ = i0 & 0x7ffffffffffffLLU;
+ i1_ = i1 + (i0 >> 51);
+ output[0] = i0_;
+ output[1] = i1_;
+ }
+}
+
+static __always_inline void fsquare_fsquare__(u128 *tmp, u64 *output)
+{
+ u64 r0 = output[0];
+ u64 r1 = output[1];
+ u64 r2 = output[2];
+ u64 r3 = output[3];
+ u64 r4 = output[4];
+ u64 d0 = r0 * 2;
+ u64 d1 = r1 * 2;
+ u64 d2 = r2 * 2 * 19;
+ u64 d419 = r4 * 19;
+ u64 d4 = d419 * 2;
+ u128 s0 = ((((((u128)(r0) * (r0))) + (((u128)(d4) * (r1))))) +
+ (((u128)(d2) * (r3))));
+ u128 s1 = ((((((u128)(d0) * (r1))) + (((u128)(d4) * (r2))))) +
+ (((u128)(r3 * 19) * (r3))));
+ u128 s2 = ((((((u128)(d0) * (r2))) + (((u128)(r1) * (r1))))) +
+ (((u128)(d4) * (r3))));
+ u128 s3 = ((((((u128)(d0) * (r3))) + (((u128)(d1) * (r2))))) +
+ (((u128)(r4) * (d419))));
+ u128 s4 = ((((((u128)(d0) * (r4))) + (((u128)(d1) * (r3))))) +
+ (((u128)(r2) * (r2))));
+ tmp[0] = s0;
+ tmp[1] = s1;
+ tmp[2] = s2;
+ tmp[3] = s3;
+ tmp[4] = s4;
+}
+
+static __always_inline void fsquare_fsquare_(u128 *tmp, u64 *output)
+{
+ u128 b4;
+ u128 b0;
+ u128 b4_;
+ u128 b0_;
+ u64 i0;
+ u64 i1;
+ u64 i0_;
+ u64 i1_;
+ fsquare_fsquare__(tmp, output);
+ fproduct_carry_wide_(tmp);
+ b4 = tmp[4];
+ b0 = tmp[0];
+ b4_ = ((b4) & (((u128)(0x7ffffffffffffLLU))));
+ b0_ = ((b0) + (((u128)(19) * (((u64)(((b4) >> (51))))))));
+ tmp[4] = b4_;
+ tmp[0] = b0_;
+ fproduct_copy_from_wide_(output, tmp);
+ i0 = output[0];
+ i1 = output[1];
+ i0_ = i0 & 0x7ffffffffffffLLU;
+ i1_ = i1 + (i0 >> 51);
+ output[0] = i0_;
+ output[1] = i1_;
+}
+
+static __always_inline void fsquare_fsquare_times_(u64 *output, u128 *tmp,
+ u32 count1)
+{
+ u32 i;
+ fsquare_fsquare_(tmp, output);
+ for (i = 1; i < count1; ++i)
+ fsquare_fsquare_(tmp, output);
+}
+
+static __always_inline void fsquare_fsquare_times(u64 *output, u64 *input,
+ u32 count1)
+{
+ u128 t[5];
+ memcpy(output, input, 5 * sizeof(*input));
+ fsquare_fsquare_times_(output, t, count1);
+}
+
+static __always_inline void fsquare_fsquare_times_inplace(u64 *output,
+ u32 count1)
+{
+ u128 t[5];
+ fsquare_fsquare_times_(output, t, count1);
+}
+
+static __always_inline void crecip_crecip(u64 *out, u64 *z)
+{
+ u64 buf[20] = { 0 };
+ u64 *a0 = buf;
+ u64 *t00 = buf + 5;
+ u64 *b0 = buf + 10;
+ u64 *t01;
+ u64 *b1;
+ u64 *c0;
+ u64 *a;
+ u64 *t0;
+ u64 *b;
+ u64 *c;
+ fsquare_fsquare_times(a0, z, 1);
+ fsquare_fsquare_times(t00, a0, 2);
+ fmul_fmul(b0, t00, z);
+ fmul_fmul(a0, b0, a0);
+ fsquare_fsquare_times(t00, a0, 1);
+ fmul_fmul(b0, t00, b0);
+ fsquare_fsquare_times(t00, b0, 5);
+ t01 = buf + 5;
+ b1 = buf + 10;
+ c0 = buf + 15;
+ fmul_fmul(b1, t01, b1);
+ fsquare_fsquare_times(t01, b1, 10);
+ fmul_fmul(c0, t01, b1);
+ fsquare_fsquare_times(t01, c0, 20);
+ fmul_fmul(t01, t01, c0);
+ fsquare_fsquare_times_inplace(t01, 10);
+ fmul_fmul(b1, t01, b1);
+ fsquare_fsquare_times(t01, b1, 50);
+ a = buf;
+ t0 = buf + 5;
+ b = buf + 10;
+ c = buf + 15;
+ fmul_fmul(c, t0, b);
+ fsquare_fsquare_times(t0, c, 100);
+ fmul_fmul(t0, t0, c);
+ fsquare_fsquare_times_inplace(t0, 50);
+ fmul_fmul(t0, t0, b);
+ fsquare_fsquare_times_inplace(t0, 5);
+ fmul_fmul(out, t0, a);
+}
+
+static __always_inline void fsum(u64 *a, u64 *b)
+{
+ a[0] += b[0];
+ a[1] += b[1];
+ a[2] += b[2];
+ a[3] += b[3];
+ a[4] += b[4];
+}
+
+static __always_inline void fdifference(u64 *a, u64 *b)
+{
+ u64 tmp[5] = { 0 };
+ u64 b0;
+ u64 b1;
+ u64 b2;
+ u64 b3;
+ u64 b4;
+ memcpy(tmp, b, 5 * sizeof(*b));
+ b0 = tmp[0];
+ b1 = tmp[1];
+ b2 = tmp[2];
+ b3 = tmp[3];
+ b4 = tmp[4];
+ tmp[0] = b0 + 0x3fffffffffff68LLU;
+ tmp[1] = b1 + 0x3ffffffffffff8LLU;
+ tmp[2] = b2 + 0x3ffffffffffff8LLU;
+ tmp[3] = b3 + 0x3ffffffffffff8LLU;
+ tmp[4] = b4 + 0x3ffffffffffff8LLU;
+ {
+ u64 xi = a[0];
+ u64 yi = tmp[0];
+ a[0] = yi - xi;
+ }
+ {
+ u64 xi = a[1];
+ u64 yi = tmp[1];
+ a[1] = yi - xi;
+ }
+ {
+ u64 xi = a[2];
+ u64 yi = tmp[2];
+ a[2] = yi - xi;
+ }
+ {
+ u64 xi = a[3];
+ u64 yi = tmp[3];
+ a[3] = yi - xi;
+ }
+ {
+ u64 xi = a[4];
+ u64 yi = tmp[4];
+ a[4] = yi - xi;
+ }
+}
+
+static __always_inline void fscalar(u64 *output, u64 *b, u64 s)
+{
+ u128 tmp[5];
+ u128 b4;
+ u128 b0;
+ u128 b4_;
+ u128 b0_;
+ {
+ u64 xi = b[0];
+ tmp[0] = ((u128)(xi) * (s));
+ }
+ {
+ u64 xi = b[1];
+ tmp[1] = ((u128)(xi) * (s));
+ }
+ {
+ u64 xi = b[2];
+ tmp[2] = ((u128)(xi) * (s));
+ }
+ {
+ u64 xi = b[3];
+ tmp[3] = ((u128)(xi) * (s));
+ }
+ {
+ u64 xi = b[4];
+ tmp[4] = ((u128)(xi) * (s));
+ }
+ fproduct_carry_wide_(tmp);
+ b4 = tmp[4];
+ b0 = tmp[0];
+ b4_ = ((b4) & (((u128)(0x7ffffffffffffLLU))));
+ b0_ = ((b0) + (((u128)(19) * (((u64)(((b4) >> (51))))))));
+ tmp[4] = b4_;
+ tmp[0] = b0_;
+ fproduct_copy_from_wide_(output, tmp);
+}
+
+static __always_inline void fmul(u64 *output, u64 *a, u64 *b)
+{
+ fmul_fmul(output, a, b);
+}
+
+static __always_inline void crecip(u64 *output, u64 *input)
+{
+ crecip_crecip(output, input);
+}
+
+static __always_inline void point_swap_conditional_step(u64 *a, u64 *b,
+ u64 swap1, u32 ctr)
+{
+ u32 i = ctr - 1;
+ u64 ai = a[i];
+ u64 bi = b[i];
+ u64 x = swap1 & (ai ^ bi);
+ u64 ai1 = ai ^ x;
+ u64 bi1 = bi ^ x;
+ a[i] = ai1;
+ b[i] = bi1;
+}
+
+static __always_inline void point_swap_conditional5(u64 *a, u64 *b, u64 swap1)
+{
+ point_swap_conditional_step(a, b, swap1, 5);
+ point_swap_conditional_step(a, b, swap1, 4);
+ point_swap_conditional_step(a, b, swap1, 3);
+ point_swap_conditional_step(a, b, swap1, 2);
+ point_swap_conditional_step(a, b, swap1, 1);
+}
+
+static __always_inline void point_swap_conditional(u64 *a, u64 *b, u64 iswap)
+{
+ u64 swap1 = 0 - iswap;
+ point_swap_conditional5(a, b, swap1);
+ point_swap_conditional5(a + 5, b + 5, swap1);
+}
+
+static __always_inline void point_copy(u64 *output, u64 *input)
+{
+ memcpy(output, input, 5 * sizeof(*input));
+ memcpy(output + 5, input + 5, 5 * sizeof(*input));
+}
+
+static __always_inline void addanddouble_fmonty(u64 *pp, u64 *ppq, u64 *p,
+ u64 *pq, u64 *qmqp)
+{
+ u64 *qx = qmqp;
+ u64 *x2 = pp;
+ u64 *z2 = pp + 5;
+ u64 *x3 = ppq;
+ u64 *z3 = ppq + 5;
+ u64 *x = p;
+ u64 *z = p + 5;
+ u64 *xprime = pq;
+ u64 *zprime = pq + 5;
+ u64 buf[40] = { 0 };
+ u64 *origx = buf;
+ u64 *origxprime0 = buf + 5;
+ u64 *xxprime0;
+ u64 *zzprime0;
+ u64 *origxprime;
+ xxprime0 = buf + 25;
+ zzprime0 = buf + 30;
+ memcpy(origx, x, 5 * sizeof(*x));
+ fsum(x, z);
+ fdifference(z, origx);
+ memcpy(origxprime0, xprime, 5 * sizeof(*xprime));
+ fsum(xprime, zprime);
+ fdifference(zprime, origxprime0);
+ fmul(xxprime0, xprime, z);
+ fmul(zzprime0, x, zprime);
+ origxprime = buf + 5;
+ {
+ u64 *xx0;
+ u64 *zz0;
+ u64 *xxprime;
+ u64 *zzprime;
+ u64 *zzzprime;
+ xx0 = buf + 15;
+ zz0 = buf + 20;
+ xxprime = buf + 25;
+ zzprime = buf + 30;
+ zzzprime = buf + 35;
+ memcpy(origxprime, xxprime, 5 * sizeof(*xxprime));
+ fsum(xxprime, zzprime);
+ fdifference(zzprime, origxprime);
+ fsquare_fsquare_times(x3, xxprime, 1);
+ fsquare_fsquare_times(zzzprime, zzprime, 1);
+ fmul(z3, zzzprime, qx);
+ fsquare_fsquare_times(xx0, x, 1);
+ fsquare_fsquare_times(zz0, z, 1);
+ {
+ u64 *zzz;
+ u64 *xx;
+ u64 *zz;
+ u64 scalar;
+ zzz = buf + 10;
+ xx = buf + 15;
+ zz = buf + 20;
+ fmul(x2, xx, zz);
+ fdifference(zz, xx);
+ scalar = 121665;
+ fscalar(zzz, zz, scalar);
+ fsum(zzz, xx);
+ fmul(z2, zzz, zz);
+ }
+ }
+}
+
+static __always_inline void
+ladder_smallloop_cmult_small_loop_step(u64 *nq, u64 *nqpq, u64 *nq2, u64 *nqpq2,
+ u64 *q, u8 byt)
+{
+ u64 bit0 = (u64)(byt >> 7);
+ u64 bit;
+ point_swap_conditional(nq, nqpq, bit0);
+ addanddouble_fmonty(nq2, nqpq2, nq, nqpq, q);
+ bit = (u64)(byt >> 7);
+ point_swap_conditional(nq2, nqpq2, bit);
+}
+
+static __always_inline void
+ladder_smallloop_cmult_small_loop_double_step(u64 *nq, u64 *nqpq, u64 *nq2,
+ u64 *nqpq2, u64 *q, u8 byt)
+{
+ u8 byt1;
+ ladder_smallloop_cmult_small_loop_step(nq, nqpq, nq2, nqpq2, q, byt);
+ byt1 = byt << 1;
+ ladder_smallloop_cmult_small_loop_step(nq2, nqpq2, nq, nqpq, q, byt1);
+}
+
+static __always_inline void
+ladder_smallloop_cmult_small_loop(u64 *nq, u64 *nqpq, u64 *nq2, u64 *nqpq2,
+ u64 *q, u8 byt, u32 i)
+{
+ while (i--) {
+ ladder_smallloop_cmult_small_loop_double_step(nq, nqpq, nq2,
+ nqpq2, q, byt);
+ byt <<= 2;
+ }
+}
+
+static __always_inline void ladder_bigloop_cmult_big_loop(u8 *n1, u64 *nq,
+ u64 *nqpq, u64 *nq2,
+ u64 *nqpq2, u64 *q,
+ u32 i)
+{
+ while (i--) {
+ u8 byte = n1[i];
+ ladder_smallloop_cmult_small_loop(nq, nqpq, nq2, nqpq2, q,
+ byte, 4);
+ }
+}
+
+static void ladder_cmult(u64 *result, u8 *n1, u64 *q)
+{
+ u64 point_buf[40] = { 0 };
+ u64 *nq = point_buf;
+ u64 *nqpq = point_buf + 10;
+ u64 *nq2 = point_buf + 20;
+ u64 *nqpq2 = point_buf + 30;
+ point_copy(nqpq, q);
+ nq[0] = 1;
+ ladder_bigloop_cmult_big_loop(n1, nq, nqpq, nq2, nqpq2, q, 32);
+ point_copy(result, nq);
+}
+
+static __always_inline void format_fexpand(u64 *output, const u8 *input)
+{
+ const u8 *x00 = input + 6;
+ const u8 *x01 = input + 12;
+ const u8 *x02 = input + 19;
+ const u8 *x0 = input + 24;
+ u64 i0, i1, i2, i3, i4, output0, output1, output2, output3, output4;
+ i0 = get_unaligned_le64(input);
+ i1 = get_unaligned_le64(x00);
+ i2 = get_unaligned_le64(x01);
+ i3 = get_unaligned_le64(x02);
+ i4 = get_unaligned_le64(x0);
+ output0 = i0 & 0x7ffffffffffffLLU;
+ output1 = i1 >> 3 & 0x7ffffffffffffLLU;
+ output2 = i2 >> 6 & 0x7ffffffffffffLLU;
+ output3 = i3 >> 1 & 0x7ffffffffffffLLU;
+ output4 = i4 >> 12 & 0x7ffffffffffffLLU;
+ output[0] = output0;
+ output[1] = output1;
+ output[2] = output2;
+ output[3] = output3;
+ output[4] = output4;
+}
+
+static __always_inline void format_fcontract_first_carry_pass(u64 *input)
+{
+ u64 t0 = input[0];
+ u64 t1 = input[1];
+ u64 t2 = input[2];
+ u64 t3 = input[3];
+ u64 t4 = input[4];
+ u64 t1_ = t1 + (t0 >> 51);
+ u64 t0_ = t0 & 0x7ffffffffffffLLU;
+ u64 t2_ = t2 + (t1_ >> 51);
+ u64 t1__ = t1_ & 0x7ffffffffffffLLU;
+ u64 t3_ = t3 + (t2_ >> 51);
+ u64 t2__ = t2_ & 0x7ffffffffffffLLU;
+ u64 t4_ = t4 + (t3_ >> 51);
+ u64 t3__ = t3_ & 0x7ffffffffffffLLU;
+ input[0] = t0_;
+ input[1] = t1__;
+ input[2] = t2__;
+ input[3] = t3__;
+ input[4] = t4_;
+}
+
+static __always_inline void format_fcontract_first_carry_full(u64 *input)
+{
+ format_fcontract_first_carry_pass(input);
+ modulo_carry_top(input);
+}
+
+static __always_inline void format_fcontract_second_carry_pass(u64 *input)
+{
+ u64 t0 = input[0];
+ u64 t1 = input[1];
+ u64 t2 = input[2];
+ u64 t3 = input[3];
+ u64 t4 = input[4];
+ u64 t1_ = t1 + (t0 >> 51);
+ u64 t0_ = t0 & 0x7ffffffffffffLLU;
+ u64 t2_ = t2 + (t1_ >> 51);
+ u64 t1__ = t1_ & 0x7ffffffffffffLLU;
+ u64 t3_ = t3 + (t2_ >> 51);
+ u64 t2__ = t2_ & 0x7ffffffffffffLLU;
+ u64 t4_ = t4 + (t3_ >> 51);
+ u64 t3__ = t3_ & 0x7ffffffffffffLLU;
+ input[0] = t0_;
+ input[1] = t1__;
+ input[2] = t2__;
+ input[3] = t3__;
+ input[4] = t4_;
+}
+
+static __always_inline void format_fcontract_second_carry_full(u64 *input)
+{
+ u64 i0;
+ u64 i1;
+ u64 i0_;
+ u64 i1_;
+ format_fcontract_second_carry_pass(input);
+ modulo_carry_top(input);
+ i0 = input[0];
+ i1 = input[1];
+ i0_ = i0 & 0x7ffffffffffffLLU;
+ i1_ = i1 + (i0 >> 51);
+ input[0] = i0_;
+ input[1] = i1_;
+}
+
+static __always_inline void format_fcontract_trim(u64 *input)
+{
+ u64 a0 = input[0];
+ u64 a1 = input[1];
+ u64 a2 = input[2];
+ u64 a3 = input[3];
+ u64 a4 = input[4];
+ u64 mask0 = u64_gte_mask(a0, 0x7ffffffffffedLLU);
+ u64 mask1 = u64_eq_mask(a1, 0x7ffffffffffffLLU);
+ u64 mask2 = u64_eq_mask(a2, 0x7ffffffffffffLLU);
+ u64 mask3 = u64_eq_mask(a3, 0x7ffffffffffffLLU);
+ u64 mask4 = u64_eq_mask(a4, 0x7ffffffffffffLLU);
+ u64 mask = (((mask0 & mask1) & mask2) & mask3) & mask4;
+ u64 a0_ = a0 - (0x7ffffffffffedLLU & mask);
+ u64 a1_ = a1 - (0x7ffffffffffffLLU & mask);
+ u64 a2_ = a2 - (0x7ffffffffffffLLU & mask);
+ u64 a3_ = a3 - (0x7ffffffffffffLLU & mask);
+ u64 a4_ = a4 - (0x7ffffffffffffLLU & mask);
+ input[0] = a0_;
+ input[1] = a1_;
+ input[2] = a2_;
+ input[3] = a3_;
+ input[4] = a4_;
+}
+
+static __always_inline void format_fcontract_store(u8 *output, u64 *input)
+{
+ u64 t0 = input[0];
+ u64 t1 = input[1];
+ u64 t2 = input[2];
+ u64 t3 = input[3];
+ u64 t4 = input[4];
+ u64 o0 = t1 << 51 | t0;
+ u64 o1 = t2 << 38 | t1 >> 13;
+ u64 o2 = t3 << 25 | t2 >> 26;
+ u64 o3 = t4 << 12 | t3 >> 39;
+ u8 *b0 = output;
+ u8 *b1 = output + 8;
+ u8 *b2 = output + 16;
+ u8 *b3 = output + 24;
+ put_unaligned_le64(o0, b0);
+ put_unaligned_le64(o1, b1);
+ put_unaligned_le64(o2, b2);
+ put_unaligned_le64(o3, b3);
+}
+
+static __always_inline void format_fcontract(u8 *output, u64 *input)
+{
+ format_fcontract_first_carry_full(input);
+ format_fcontract_second_carry_full(input);
+ format_fcontract_trim(input);
+ format_fcontract_store(output, input);
+}
+
+static __always_inline void format_scalar_of_point(u8 *scalar, u64 *point)
+{
+ u64 *x = point;
+ u64 *z = point + 5;
+ u64 buf[10] __aligned(32) = { 0 };
+ u64 *zmone = buf;
+ u64 *sc = buf + 5;
+ crecip(zmone, z);
+ fmul(sc, x, zmone);
+ format_fcontract(scalar, sc);
+}
+
+static void curve25519_generic(u8 mypublic[CURVE25519_KEY_SIZE],
+ const u8 secret[CURVE25519_KEY_SIZE],
+ const u8 basepoint[CURVE25519_KEY_SIZE])
+{
+ u64 buf0[10] __aligned(32) = { 0 };
+ u64 *x0 = buf0;
+ u64 *z = buf0 + 5;
+ u64 *q;
+ format_fexpand(x0, basepoint);
+ z[0] = 1;
+ q = buf0;
+ {
+ u8 e[32] __aligned(32) = { 0 };
+ u8 *scalar;
+ memcpy(e, secret, 32);
+ normalize_secret(e);
+ scalar = e;
+ {
+ u64 buf[15] = { 0 };
+ u64 *nq = buf;
+ u64 *x = nq;
+ x[0] = 1;
+ ladder_cmult(nq, scalar, q);
+ format_scalar_of_point(mypublic, nq);
+ memzero_explicit(buf, sizeof(buf));
+ }
+ memzero_explicit(e, sizeof(e));
+ }
+ memzero_explicit(buf0, sizeof(buf0));
+}
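
Note on the limb format used by format_fexpand() and
format_fcontract_store() above: a field element is held as five 51-bit
limbs (radix 2^51), unpacked from and repacked into a 32-byte little-endian
string. The following is a minimal userspace sketch of that packing only,
not the patch's code. The names load_le64, store_le64, limbs_expand and
limbs_store are hypothetical stand-ins (the patch uses get_unaligned_le64
and put_unaligned_le64), and the store step assumes the limbs are already
fully reduced, which format_fcontract() arranges via its carry and trim
passes.

```c
#include <stdint.h>

#define LIMB_MASK 0x7ffffffffffffULL /* 2^51 - 1 */

/* Portable little-endian load; stand-in for get_unaligned_le64(). */
static uint64_t load_le64(const uint8_t *p)
{
	uint64_t v = 0;
	int i;

	for (i = 7; i >= 0; --i)
		v = (v << 8) | p[i];
	return v;
}

/* Portable little-endian store; stand-in for put_unaligned_le64(). */
static void store_le64(uint64_t v, uint8_t *p)
{
	int i;

	for (i = 0; i < 8; ++i) {
		p[i] = (uint8_t)v;
		v >>= 8;
	}
}

/*
 * Split bits 0..254 of a 32-byte string into five 51-bit limbs.
 * Limb k starts at bit 51*k; each load begins at the byte containing
 * that bit and the shift discards the leftover low bits.
 * Bit 255 of the input is dropped by the final mask.
 */
static void limbs_expand(uint64_t out[5], const uint8_t in[32])
{
	out[0] = load_le64(in) & LIMB_MASK;              /* bit   0 */
	out[1] = (load_le64(in + 6) >> 3) & LIMB_MASK;   /* bit  51 */
	out[2] = (load_le64(in + 12) >> 6) & LIMB_MASK;  /* bit 102 */
	out[3] = (load_le64(in + 19) >> 1) & LIMB_MASK;  /* bit 153 */
	out[4] = (load_le64(in + 24) >> 12) & LIMB_MASK; /* bit 204 */
}

/* Repack five limbs, each assumed to be below 2^51, into 32 bytes. */
static void limbs_store(uint8_t out[32], const uint64_t in[5])
{
	store_le64((in[1] << 51) | in[0], out);
	store_le64((in[2] << 38) | (in[1] >> 13), out + 8);
	store_le64((in[3] << 25) | (in[2] >> 26), out + 16);
	store_le64((in[4] << 12) | (in[3] >> 39), out + 24);
}
```

Expanding a 32-byte string whose top bit is clear and storing the limbs
again should give back the same bytes. Unreduced limbs would overlap in
the OR operations, which is why the carry passes have to come first.
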
diff --git a/lib/zinc/curve25519/curve25519.c b/lib/zinc/curve25519/curve25519.c
new file mode 100644
index 000000000000..2f613d2a7519
--- /dev/null
+++ b/lib/zinc/curve25519/curve25519.c
@@ -0,0 +1,108 @@
+// SPDX-License-Identifier: GPL-2.0 OR MIT
+/*
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ *
+ * This is an implementation of the Curve25519 ECDH algorithm, using either
+ * a 32-bit implementation or a 64-bit implementation with 128-bit integers,
+ * depending on what is supported by the target compiler.
+ *
+ * Information: https://cr.yp.to/ecdh.html
+ */
+
+#include <zinc/curve25519.h>
+#include "../selftest/run.h"
+
+#include <asm/unaligned.h>
+#include <linux/version.h>
+#include <linux/string.h>
+#include <linux/random.h>
+#include <linux/module.h>
+#include <linux/init.h>
+#include <crypto/algapi.h> // For crypto_memneq.
+
+static bool *const curve25519_nobs[] __initconst = { };
+static void __init curve25519_fpu_init(void)
+{
+}
+static inline bool curve25519_arch(u8 mypublic[CURVE25519_KEY_SIZE],
+ const u8 secret[CURVE25519_KEY_SIZE],
+ const u8 basepoint[CURVE25519_KEY_SIZE])
+{
+ return false;
+}
+static inline bool curve25519_base_arch(u8 pub[CURVE25519_KEY_SIZE],
+ const u8 secret[CURVE25519_KEY_SIZE])
+{
+ return false;
+}
+
+static __always_inline void normalize_secret(u8 secret[CURVE25519_KEY_SIZE])
+{
+ secret[0] &= 248;
+ secret[31] &= 127;
+ secret[31] |= 64;
+}
+
+#if defined(CONFIG_ARCH_SUPPORTS_INT128) && defined(__SIZEOF_INT128__)
+#include "curve25519-hacl64.h"
+#else
+#include "curve25519-fiat32.h"
+#endif
+
+static const u8 null_point[CURVE25519_KEY_SIZE] = { 0 };
+
+bool curve25519(u8 mypublic[CURVE25519_KEY_SIZE],
+ const u8 secret[CURVE25519_KEY_SIZE],
+ const u8 basepoint[CURVE25519_KEY_SIZE])
+{
+ if (!curve25519_arch(mypublic, secret, basepoint))
+ curve25519_generic(mypublic, secret, basepoint);
+ return crypto_memneq(mypublic, null_point, CURVE25519_KEY_SIZE);
+}
+EXPORT_SYMBOL(curve25519);
+
+bool curve25519_generate_public(u8 pub[CURVE25519_KEY_SIZE],
+ const u8 secret[CURVE25519_KEY_SIZE])
+{
+ static const u8 basepoint[CURVE25519_KEY_SIZE] __aligned(32) = { 9 };
+
+ if (unlikely(!crypto_memneq(secret, null_point, CURVE25519_KEY_SIZE)))
+ return false;
+
+ if (curve25519_base_arch(pub, secret))
+ return crypto_memneq(pub, null_point, CURVE25519_KEY_SIZE);
+ return curve25519(pub, secret, basepoint);
+}
+EXPORT_SYMBOL(curve25519_generate_public);
+
+void curve25519_generate_secret(u8 secret[CURVE25519_KEY_SIZE])
+{
+ get_random_bytes_wait(secret, CURVE25519_KEY_SIZE);
+ normalize_secret(secret);
+}
+EXPORT_SYMBOL(curve25519_generate_secret);
+
+#include "../selftest/curve25519.c"
+
+static bool nosimd __initdata = false;
+
+static int __init mod_init(void)
+{
+ if (!nosimd)
+ curve25519_fpu_init();
+ if (!selftest_run("curve25519", curve25519_selftest, curve25519_nobs,
+ ARRAY_SIZE(curve25519_nobs)))
+ return -ENOTRECOVERABLE;
+ return 0;
+}
+
+static void __exit mod_exit(void)
+{
+}
+
+module_param(nosimd, bool, 0);
+module_init(mod_init);
+module_exit(mod_exit);
+MODULE_LICENSE("GPL v2");
+MODULE_DESCRIPTION("Curve25519 scalar multiplication");
+MODULE_AUTHOR("Jason A. Donenfeld <Jason@zx2c4.com>");
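
Note on normalize_secret() and the return value of curve25519() above:
the secret is clamped before the ladder runs, and the function reports
failure when the computed output is all zeroes. A minimal userspace
sketch of both steps follows. KEY_SIZE, clamp_secret and ct_is_nonzero are
hypothetical names for illustration only; ct_is_nonzero is a simple
stand-in for the comparison against null_point and is not the kernel's
crypto_memneq().

```c
#include <stddef.h>
#include <stdint.h>

#define KEY_SIZE 32 /* assumed to match CURVE25519_KEY_SIZE */

/* Same three masks as normalize_secret() in the patch. */
static void clamp_secret(uint8_t s[KEY_SIZE])
{
	s[0] &= 248;  /* clear the low three bits */
	s[31] &= 127; /* clear bit 255 */
	s[31] |= 64;  /* set bit 254 */
}

/*
 * Return 1 if any byte of buf is non-zero, else 0. Every byte is
 * visited and no branch depends on the data.
 */
static int ct_is_nonzero(const uint8_t *buf, size_t len)
{
	uint8_t acc = 0;
	size_t i;

	for (i = 0; i < len; ++i)
		acc |= buf[i];
	/* acc is in 0..255, so (acc + 255) >> 8 is 1 iff acc != 0. */
	return (int)(((unsigned int)acc + 0xffu) >> 8);
}
```

With these masks a clamped secret is a multiple of 8 and always has bit 254
set, so even an all-zero input buffer becomes a non-zero scalar after
clamping.
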
diff --git a/lib/zinc/selftest/curve25519.c b/lib/zinc/selftest/curve25519.c
new file mode 100644
index 000000000000..fa653d47df5a
--- /dev/null
+++ b/lib/zinc/selftest/curve25519.c
@@ -0,0 +1,1315 @@
+// SPDX-License-Identifier: GPL-2.0 OR MIT
+/*
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+struct curve25519_test_vector {
+ u8 private[CURVE25519_KEY_SIZE];
+ u8 public[CURVE25519_KEY_SIZE];
+ u8 result[CURVE25519_KEY_SIZE];
+ bool valid;
+};
+static const struct curve25519_test_vector curve25519_test_vectors[] __initconst = {
+ {
+ .private = { 0x77, 0x07, 0x6d, 0x0a, 0x73, 0x18, 0xa5, 0x7d,
+ 0x3c, 0x16, 0xc1, 0x72, 0x51, 0xb2, 0x66, 0x45,
+ 0xdf, 0x4c, 0x2f, 0x87, 0xeb, 0xc0, 0x99, 0x2a,
+ 0xb1, 0x77, 0xfb, 0xa5, 0x1d, 0xb9, 0x2c, 0x2a },
+ .public = { 0xde, 0x9e, 0xdb, 0x7d, 0x7b, 0x7d, 0xc1, 0xb4,
+ 0xd3, 0x5b, 0x61, 0xc2, 0xec, 0xe4, 0x35, 0x37,
+ 0x3f, 0x83, 0x43, 0xc8, 0x5b, 0x78, 0x67, 0x4d,
+ 0xad, 0xfc, 0x7e, 0x14, 0x6f, 0x88, 0x2b, 0x4f },
+ .result = { 0x4a, 0x5d, 0x9d, 0x5b, 0xa4, 0xce, 0x2d, 0xe1,
+ 0x72, 0x8e, 0x3b, 0xf4, 0x80, 0x35, 0x0f, 0x25,
+ 0xe0, 0x7e, 0x21, 0xc9, 0x47, 0xd1, 0x9e, 0x33,
+ 0x76, 0xf0, 0x9b, 0x3c, 0x1e, 0x16, 0x17, 0x42 },
+ .valid = true
+ },
+ {
+ .private = { 0x5d, 0xab, 0x08, 0x7e, 0x62, 0x4a, 0x8a, 0x4b,
+ 0x79, 0xe1, 0x7f, 0x8b, 0x83, 0x80, 0x0e, 0xe6,
+ 0x6f, 0x3b, 0xb1, 0x29, 0x26, 0x18, 0xb6, 0xfd,
+ 0x1c, 0x2f, 0x8b, 0x27, 0xff, 0x88, 0xe0, 0xeb },
+ .public = { 0x85, 0x20, 0xf0, 0x09, 0x89, 0x30, 0xa7, 0x54,
+ 0x74, 0x8b, 0x7d, 0xdc, 0xb4, 0x3e, 0xf7, 0x5a,
+ 0x0d, 0xbf, 0x3a, 0x0d, 0x26, 0x38, 0x1a, 0xf4,
+ 0xeb, 0xa4, 0xa9, 0x8e, 0xaa, 0x9b, 0x4e, 0x6a },
+ .result = { 0x4a, 0x5d, 0x9d, 0x5b, 0xa4, 0xce, 0x2d, 0xe1,
+ 0x72, 0x8e, 0x3b, 0xf4, 0x80, 0x35, 0x0f, 0x25,
+ 0xe0, 0x7e, 0x21, 0xc9, 0x47, 0xd1, 0x9e, 0x33,
+ 0x76, 0xf0, 0x9b, 0x3c, 0x1e, 0x16, 0x17, 0x42 },
+ .valid = true
+ },
+ {
+ .private = { 1 },
+ .public = { 0x25, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
+ .result = { 0x3c, 0x77, 0x77, 0xca, 0xf9, 0x97, 0xb2, 0x64,
+ 0x41, 0x60, 0x77, 0x66, 0x5b, 0x4e, 0x22, 0x9d,
+ 0x0b, 0x95, 0x48, 0xdc, 0x0c, 0xd8, 0x19, 0x98,
+ 0xdd, 0xcd, 0xc5, 0xc8, 0x53, 0x3c, 0x79, 0x7f },
+ .valid = true
+ },
+ {
+ .private = { 1 },
+ .public = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff },
+ .result = { 0xb3, 0x2d, 0x13, 0x62, 0xc2, 0x48, 0xd6, 0x2f,
+ 0xe6, 0x26, 0x19, 0xcf, 0xf0, 0x4d, 0xd4, 0x3d,
+ 0xb7, 0x3f, 0xfc, 0x1b, 0x63, 0x08, 0xed, 0xe3,
+ 0x0b, 0x78, 0xd8, 0x73, 0x80, 0xf1, 0xe8, 0x34 },
+ .valid = true
+ },
+ {
+ .private = { 0xa5, 0x46, 0xe3, 0x6b, 0xf0, 0x52, 0x7c, 0x9d,
+ 0x3b, 0x16, 0x15, 0x4b, 0x82, 0x46, 0x5e, 0xdd,
+ 0x62, 0x14, 0x4c, 0x0a, 0xc1, 0xfc, 0x5a, 0x18,
+ 0x50, 0x6a, 0x22, 0x44, 0xba, 0x44, 0x9a, 0xc4 },
+ .public = { 0xe6, 0xdb, 0x68, 0x67, 0x58, 0x30, 0x30, 0xdb,
+ 0x35, 0x94, 0xc1, 0xa4, 0x24, 0xb1, 0x5f, 0x7c,
+ 0x72, 0x66, 0x24, 0xec, 0x26, 0xb3, 0x35, 0x3b,
+ 0x10, 0xa9, 0x03, 0xa6, 0xd0, 0xab, 0x1c, 0x4c },
+ .result = { 0xc3, 0xda, 0x55, 0x37, 0x9d, 0xe9, 0xc6, 0x90,
+ 0x8e, 0x94, 0xea, 0x4d, 0xf2, 0x8d, 0x08, 0x4f,
+ 0x32, 0xec, 0xcf, 0x03, 0x49, 0x1c, 0x71, 0xf7,
+ 0x54, 0xb4, 0x07, 0x55, 0x77, 0xa2, 0x85, 0x52 },
+ .valid = true
+ },
+ {
+ .private = { 1, 2, 3, 4 },
+ .public = { 0 },
+ .result = { 0 },
+ .valid = false
+ },
+ {
+ .private = { 2, 4, 6, 8 },
+ .public = { 0xe0, 0xeb, 0x7a, 0x7c, 0x3b, 0x41, 0xb8, 0xae,
+ 0x16, 0x56, 0xe3, 0xfa, 0xf1, 0x9f, 0xc4, 0x6a,
+ 0xda, 0x09, 0x8d, 0xeb, 0x9c, 0x32, 0xb1, 0xfd,
+ 0x86, 0x62, 0x05, 0x16, 0x5f, 0x49, 0xb8 },
+ .result = { 0 },
+ .valid = false
+ },
+ {
+ .private = { 0xff, 0xff, 0xff, 0xff, 0x0a, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff },
+ .public = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0x0a, 0x00, 0xfb, 0x9f },
+ .result = { 0x77, 0x52, 0xb6, 0x18, 0xc1, 0x2d, 0x48, 0xd2,
+ 0xc6, 0x93, 0x46, 0x83, 0x81, 0x7c, 0xc6, 0x57,
+ 0xf3, 0x31, 0x03, 0x19, 0x49, 0x48, 0x20, 0x05,
+ 0x42, 0x2b, 0x4e, 0xae, 0x8d, 0x1d, 0x43, 0x23 },
+ .valid = true
+ },
+ {
+ .private = { 0x8e, 0x0a, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
+ .public = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x8e, 0x06 },
+ .result = { 0x5a, 0xdf, 0xaa, 0x25, 0x86, 0x8e, 0x32, 0x3d,
+ 0xae, 0x49, 0x62, 0xc1, 0x01, 0x5c, 0xb3, 0x12,
+ 0xe1, 0xc5, 0xc7, 0x9e, 0x95, 0x3f, 0x03, 0x99,
+ 0xb0, 0xba, 0x16, 0x22, 0xf3, 0xb6, 0xf7, 0x0c },
+ .valid = true
+ },
+ /* wycheproof - normal case */
+ {
+ .private = { 0x48, 0x52, 0x83, 0x4d, 0x9d, 0x6b, 0x77, 0xda,
+ 0xde, 0xab, 0xaa, 0xf2, 0xe1, 0x1d, 0xca, 0x66,
+ 0xd1, 0x9f, 0xe7, 0x49, 0x93, 0xa7, 0xbe, 0xc3,
+ 0x6c, 0x6e, 0x16, 0xa0, 0x98, 0x3f, 0xea, 0xba },
+ .public = { 0x9c, 0x64, 0x7d, 0x9a, 0xe5, 0x89, 0xb9, 0xf5,
+ 0x8f, 0xdc, 0x3c, 0xa4, 0x94, 0x7e, 0xfb, 0xc9,
+ 0x15, 0xc4, 0xb2, 0xe0, 0x8e, 0x74, 0x4a, 0x0e,
+ 0xdf, 0x46, 0x9d, 0xac, 0x59, 0xc8, 0xf8, 0x5a },
+ .result = { 0x87, 0xb7, 0xf2, 0x12, 0xb6, 0x27, 0xf7, 0xa5,
+ 0x4c, 0xa5, 0xe0, 0xbc, 0xda, 0xdd, 0xd5, 0x38,
+ 0x9d, 0x9d, 0xe6, 0x15, 0x6c, 0xdb, 0xcf, 0x8e,
+ 0xbe, 0x14, 0xff, 0xbc, 0xfb, 0x43, 0x65, 0x51 },
+ .valid = true
+ },
+ /* wycheproof - public key on twist */
+ {
+ .private = { 0x58, 0x8c, 0x06, 0x1a, 0x50, 0x80, 0x4a, 0xc4,
+ 0x88, 0xad, 0x77, 0x4a, 0xc7, 0x16, 0xc3, 0xf5,
+ 0xba, 0x71, 0x4b, 0x27, 0x12, 0xe0, 0x48, 0x49,
+ 0x13, 0x79, 0xa5, 0x00, 0x21, 0x19, 0x98, 0xa8 },
+ .public = { 0x63, 0xaa, 0x40, 0xc6, 0xe3, 0x83, 0x46, 0xc5,
+ 0xca, 0xf2, 0x3a, 0x6d, 0xf0, 0xa5, 0xe6, 0xc8,
+ 0x08, 0x89, 0xa0, 0x86, 0x47, 0xe5, 0x51, 0xb3,
+ 0x56, 0x34, 0x49, 0xbe, 0xfc, 0xfc, 0x97, 0x33 },
+ .result = { 0xb1, 0xa7, 0x07, 0x51, 0x94, 0x95, 0xff, 0xff,
+ 0xb2, 0x98, 0xff, 0x94, 0x17, 0x16, 0xb0, 0x6d,
+ 0xfa, 0xb8, 0x7c, 0xf8, 0xd9, 0x11, 0x23, 0xfe,
+ 0x2b, 0xe9, 0xa2, 0x33, 0xdd, 0xa2, 0x22, 0x12 },
+ .valid = true
+ },
+ /* wycheproof - public key on twist */
+ {
+ .private = { 0xb0, 0x5b, 0xfd, 0x32, 0xe5, 0x53, 0x25, 0xd9,
+ 0xfd, 0x64, 0x8c, 0xb3, 0x02, 0x84, 0x80, 0x39,
+ 0x00, 0x0b, 0x39, 0x0e, 0x44, 0xd5, 0x21, 0xe5,
+ 0x8a, 0xab, 0x3b, 0x29, 0xa6, 0x96, 0x0b, 0xa8 },
+ .public = { 0x0f, 0x83, 0xc3, 0x6f, 0xde, 0xd9, 0xd3, 0x2f,
+ 0xad, 0xf4, 0xef, 0xa3, 0xae, 0x93, 0xa9, 0x0b,
+ 0xb5, 0xcf, 0xa6, 0x68, 0x93, 0xbc, 0x41, 0x2c,
+ 0x43, 0xfa, 0x72, 0x87, 0xdb, 0xb9, 0x97, 0x79 },
+ .result = { 0x67, 0xdd, 0x4a, 0x6e, 0x16, 0x55, 0x33, 0x53,
+ 0x4c, 0x0e, 0x3f, 0x17, 0x2e, 0x4a, 0xb8, 0x57,
+ 0x6b, 0xca, 0x92, 0x3a, 0x5f, 0x07, 0xb2, 0xc0,
+ 0x69, 0xb4, 0xc3, 0x10, 0xff, 0x2e, 0x93, 0x5b },
+ .valid = true
+ },
+ /* wycheproof - public key on twist */
+ {
+ .private = { 0x70, 0xe3, 0x4b, 0xcb, 0xe1, 0xf4, 0x7f, 0xbc,
+ 0x0f, 0xdd, 0xfd, 0x7c, 0x1e, 0x1a, 0xa5, 0x3d,
+ 0x57, 0xbf, 0xe0, 0xf6, 0x6d, 0x24, 0x30, 0x67,
+ 0xb4, 0x24, 0xbb, 0x62, 0x10, 0xbe, 0xd1, 0x9c },
+ .public = { 0x0b, 0x82, 0x11, 0xa2, 0xb6, 0x04, 0x90, 0x97,
+ 0xf6, 0x87, 0x1c, 0x6c, 0x05, 0x2d, 0x3c, 0x5f,
+ 0xc1, 0xba, 0x17, 0xda, 0x9e, 0x32, 0xae, 0x45,
+ 0x84, 0x03, 0xb0, 0x5b, 0xb2, 0x83, 0x09, 0x2a },
+ .result = { 0x4a, 0x06, 0x38, 0xcf, 0xaa, 0x9e, 0xf1, 0x93,
+ 0x3b, 0x47, 0xf8, 0x93, 0x92, 0x96, 0xa6, 0xb2,
+ 0x5b, 0xe5, 0x41, 0xef, 0x7f, 0x70, 0xe8, 0x44,
+ 0xc0, 0xbc, 0xc0, 0x0b, 0x13, 0x4d, 0xe6, 0x4a },
+ .valid = true
+ },
+ /* wycheproof - public key on twist */
+ {
+ .private = { 0x68, 0xc1, 0xf3, 0xa6, 0x53, 0xa4, 0xcd, 0xb1,
+ 0xd3, 0x7b, 0xba, 0x94, 0x73, 0x8f, 0x8b, 0x95,
+ 0x7a, 0x57, 0xbe, 0xb2, 0x4d, 0x64, 0x6e, 0x99,
+ 0x4d, 0xc2, 0x9a, 0x27, 0x6a, 0xad, 0x45, 0x8d },
+ .public = { 0x34, 0x3a, 0xc2, 0x0a, 0x3b, 0x9c, 0x6a, 0x27,
+ 0xb1, 0x00, 0x81, 0x76, 0x50, 0x9a, 0xd3, 0x07,
+ 0x35, 0x85, 0x6e, 0xc1, 0xc8, 0xd8, 0xfc, 0xae,
+ 0x13, 0x91, 0x2d, 0x08, 0xd1, 0x52, 0xf4, 0x6c },
+ .result = { 0x39, 0x94, 0x91, 0xfc, 0xe8, 0xdf, 0xab, 0x73,
+ 0xb4, 0xf9, 0xf6, 0x11, 0xde, 0x8e, 0xa0, 0xb2,
+ 0x7b, 0x28, 0xf8, 0x59, 0x94, 0x25, 0x0b, 0x0f,
+ 0x47, 0x5d, 0x58, 0x5d, 0x04, 0x2a, 0xc2, 0x07 },
+ .valid = true
+ },
+ /* wycheproof - public key on twist */
+ {
+ .private = { 0xd8, 0x77, 0xb2, 0x6d, 0x06, 0xdf, 0xf9, 0xd9,
+ 0xf7, 0xfd, 0x4c, 0x5b, 0x37, 0x69, 0xf8, 0xcd,
+ 0xd5, 0xb3, 0x05, 0x16, 0xa5, 0xab, 0x80, 0x6b,
+ 0xe3, 0x24, 0xff, 0x3e, 0xb6, 0x9e, 0xa0, 0xb2 },
+ .public = { 0xfa, 0x69, 0x5f, 0xc7, 0xbe, 0x8d, 0x1b, 0xe5,
+ 0xbf, 0x70, 0x48, 0x98, 0xf3, 0x88, 0xc4, 0x52,
+ 0xba, 0xfd, 0xd3, 0xb8, 0xea, 0xe8, 0x05, 0xf8,
+ 0x68, 0x1a, 0x8d, 0x15, 0xc2, 0xd4, 0xe1, 0x42 },
+ .result = { 0x2c, 0x4f, 0xe1, 0x1d, 0x49, 0x0a, 0x53, 0x86,
+ 0x17, 0x76, 0xb1, 0x3b, 0x43, 0x54, 0xab, 0xd4,
+ 0xcf, 0x5a, 0x97, 0x69, 0x9d, 0xb6, 0xe6, 0xc6,
+ 0x8c, 0x16, 0x26, 0xd0, 0x76, 0x62, 0xf7, 0x58 },
+ .valid = true
+ },
+ /* wycheproof - public key = 0 */
+ {
+ .private = { 0x20, 0x74, 0x94, 0x03, 0x8f, 0x2b, 0xb8, 0x11,
+ 0xd4, 0x78, 0x05, 0xbc, 0xdf, 0x04, 0xa2, 0xac,
+ 0x58, 0x5a, 0xda, 0x7f, 0x2f, 0x23, 0x38, 0x9b,
+ 0xfd, 0x46, 0x58, 0xf9, 0xdd, 0xd4, 0xde, 0xbc },
+ .public = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
+ .result = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
+ .valid = false
+ },
+ /* wycheproof - public key = 1 */
+ {
+ .private = { 0x20, 0x2e, 0x89, 0x72, 0xb6, 0x1c, 0x7e, 0x61,
+ 0x93, 0x0e, 0xb9, 0x45, 0x0b, 0x50, 0x70, 0xea,
+ 0xe1, 0xc6, 0x70, 0x47, 0x56, 0x85, 0x54, 0x1f,
+ 0x04, 0x76, 0x21, 0x7e, 0x48, 0x18, 0xcf, 0xab },
+ .public = { 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
+ .result = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
+ .valid = false
+ },
+ /* wycheproof - edge case on twist */
+ {
+ .private = { 0x38, 0xdd, 0xe9, 0xf3, 0xe7, 0xb7, 0x99, 0x04,
+ 0x5f, 0x9a, 0xc3, 0x79, 0x3d, 0x4a, 0x92, 0x77,
+ 0xda, 0xde, 0xad, 0xc4, 0x1b, 0xec, 0x02, 0x90,
+ 0xf8, 0x1f, 0x74, 0x4f, 0x73, 0x77, 0x5f, 0x84 },
+ .public = { 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
+ .result = { 0x9a, 0x2c, 0xfe, 0x84, 0xff, 0x9c, 0x4a, 0x97,
+ 0x39, 0x62, 0x5c, 0xae, 0x4a, 0x3b, 0x82, 0xa9,
+ 0x06, 0x87, 0x7a, 0x44, 0x19, 0x46, 0xf8, 0xd7,
+ 0xb3, 0xd7, 0x95, 0xfe, 0x8f, 0x5d, 0x16, 0x39 },
+ .valid = true
+ },
+ /* wycheproof - edge case on twist */
+ {
+ .private = { 0x98, 0x57, 0xa9, 0x14, 0xe3, 0xc2, 0x90, 0x36,
+ 0xfd, 0x9a, 0x44, 0x2b, 0xa5, 0x26, 0xb5, 0xcd,
+ 0xcd, 0xf2, 0x82, 0x16, 0x15, 0x3e, 0x63, 0x6c,
+ 0x10, 0x67, 0x7a, 0xca, 0xb6, 0xbd, 0x6a, 0xa5 },
+ .public = { 0x03, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
+ .result = { 0x4d, 0xa4, 0xe0, 0xaa, 0x07, 0x2c, 0x23, 0x2e,
+ 0xe2, 0xf0, 0xfa, 0x4e, 0x51, 0x9a, 0xe5, 0x0b,
+ 0x52, 0xc1, 0xed, 0xd0, 0x8a, 0x53, 0x4d, 0x4e,
+ 0xf3, 0x46, 0xc2, 0xe1, 0x06, 0xd2, 0x1d, 0x60 },
+ .valid = true
+ },
+ /* wycheproof - edge case on twist */
+ {
+ .private = { 0x48, 0xe2, 0x13, 0x0d, 0x72, 0x33, 0x05, 0xed,
+ 0x05, 0xe6, 0xe5, 0x89, 0x4d, 0x39, 0x8a, 0x5e,
+ 0x33, 0x36, 0x7a, 0x8c, 0x6a, 0xac, 0x8f, 0xcd,
+ 0xf0, 0xa8, 0x8e, 0x4b, 0x42, 0x82, 0x0d, 0xb7 },
+ .public = { 0xff, 0xff, 0xff, 0x03, 0x00, 0x00, 0xf8, 0xff,
+ 0xff, 0x1f, 0x00, 0x00, 0xc0, 0xff, 0xff, 0xff,
+ 0x00, 0x00, 0x00, 0xfe, 0xff, 0xff, 0x07, 0x00,
+ 0x00, 0xf0, 0xff, 0xff, 0x3f, 0x00, 0x00, 0x00 },
+ .result = { 0x9e, 0xd1, 0x0c, 0x53, 0x74, 0x7f, 0x64, 0x7f,
+ 0x82, 0xf4, 0x51, 0x25, 0xd3, 0xde, 0x15, 0xa1,
+ 0xe6, 0xb8, 0x24, 0x49, 0x6a, 0xb4, 0x04, 0x10,
+ 0xff, 0xcc, 0x3c, 0xfe, 0x95, 0x76, 0x0f, 0x3b },
+ .valid = true
+ },
+ /* wycheproof - edge case on twist */
+ {
+ .private = { 0x28, 0xf4, 0x10, 0x11, 0x69, 0x18, 0x51, 0xb3,
+ 0xa6, 0x2b, 0x64, 0x15, 0x53, 0xb3, 0x0d, 0x0d,
+ 0xfd, 0xdc, 0xb8, 0xff, 0xfc, 0xf5, 0x37, 0x00,
+ 0xa7, 0xbe, 0x2f, 0x6a, 0x87, 0x2e, 0x9f, 0xb0 },
+ .public = { 0x00, 0x00, 0x00, 0xfc, 0xff, 0xff, 0x07, 0x00,
+ 0x00, 0xe0, 0xff, 0xff, 0x3f, 0x00, 0x00, 0x00,
+ 0xff, 0xff, 0xff, 0x01, 0x00, 0x00, 0xf8, 0xff,
+ 0xff, 0x0f, 0x00, 0x00, 0xc0, 0xff, 0xff, 0x7f },
+ .result = { 0xcf, 0x72, 0xb4, 0xaa, 0x6a, 0xa1, 0xc9, 0xf8,
+ 0x94, 0xf4, 0x16, 0x5b, 0x86, 0x10, 0x9a, 0xa4,
+ 0x68, 0x51, 0x76, 0x48, 0xe1, 0xf0, 0xcc, 0x70,
+ 0xe1, 0xab, 0x08, 0x46, 0x01, 0x76, 0x50, 0x6b },
+ .valid = true
+ },
+ /* wycheproof - edge case on twist */
+ {
+ .private = { 0x18, 0xa9, 0x3b, 0x64, 0x99, 0xb9, 0xf6, 0xb3,
+ 0x22, 0x5c, 0xa0, 0x2f, 0xef, 0x41, 0x0e, 0x0a,
+ 0xde, 0xc2, 0x35, 0x32, 0x32, 0x1d, 0x2d, 0x8e,
+ 0xf1, 0xa6, 0xd6, 0x02, 0xa8, 0xc6, 0x5b, 0x83 },
+ .public = { 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff,
+ 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff,
+ 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff,
+ 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0x7f },
+ .result = { 0x5d, 0x50, 0xb6, 0x28, 0x36, 0xbb, 0x69, 0x57,
+ 0x94, 0x10, 0x38, 0x6c, 0xf7, 0xbb, 0x81, 0x1c,
+ 0x14, 0xbf, 0x85, 0xb1, 0xc7, 0xb1, 0x7e, 0x59,
+ 0x24, 0xc7, 0xff, 0xea, 0x91, 0xef, 0x9e, 0x12 },
+ .valid = true
+ },
+ /* wycheproof - edge case on twist */
+ {
+ .private = { 0xc0, 0x1d, 0x13, 0x05, 0xa1, 0x33, 0x8a, 0x1f,
+ 0xca, 0xc2, 0xba, 0x7e, 0x2e, 0x03, 0x2b, 0x42,
+ 0x7e, 0x0b, 0x04, 0x90, 0x31, 0x65, 0xac, 0xa9,
+ 0x57, 0xd8, 0xd0, 0x55, 0x3d, 0x87, 0x17, 0xb0 },
+ .public = { 0xea, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x7f },
+ .result = { 0x19, 0x23, 0x0e, 0xb1, 0x48, 0xd5, 0xd6, 0x7c,
+ 0x3c, 0x22, 0xab, 0x1d, 0xae, 0xff, 0x80, 0xa5,
+ 0x7e, 0xae, 0x42, 0x65, 0xce, 0x28, 0x72, 0x65,
+ 0x7b, 0x2c, 0x80, 0x99, 0xfc, 0x69, 0x8e, 0x50 },
+ .valid = true
+ },
+ /* wycheproof - edge case for public key */
+ {
+ .private = { 0x38, 0x6f, 0x7f, 0x16, 0xc5, 0x07, 0x31, 0xd6,
+ 0x4f, 0x82, 0xe6, 0xa1, 0x70, 0xb1, 0x42, 0xa4,
+ 0xe3, 0x4f, 0x31, 0xfd, 0x77, 0x68, 0xfc, 0xb8,
+ 0x90, 0x29, 0x25, 0xe7, 0xd1, 0xe2, 0x1a, 0xbe },
+ .public = { 0x04, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
+ .result = { 0x0f, 0xca, 0xb5, 0xd8, 0x42, 0xa0, 0x78, 0xd7,
+ 0xa7, 0x1f, 0xc5, 0x9b, 0x57, 0xbf, 0xb4, 0xca,
+ 0x0b, 0xe6, 0x87, 0x3b, 0x49, 0xdc, 0xdb, 0x9f,
+ 0x44, 0xe1, 0x4a, 0xe8, 0xfb, 0xdf, 0xa5, 0x42 },
+ .valid = true
+ },
+ /* wycheproof - edge case for public key */
+ {
+ .private = { 0xe0, 0x23, 0xa2, 0x89, 0xbd, 0x5e, 0x90, 0xfa,
+ 0x28, 0x04, 0xdd, 0xc0, 0x19, 0xa0, 0x5e, 0xf3,
+ 0xe7, 0x9d, 0x43, 0x4b, 0xb6, 0xea, 0x2f, 0x52,
+ 0x2e, 0xcb, 0x64, 0x3a, 0x75, 0x29, 0x6e, 0x95 },
+ .public = { 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
+ 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
+ 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
+ 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00 },
+ .result = { 0x54, 0xce, 0x8f, 0x22, 0x75, 0xc0, 0x77, 0xe3,
+ 0xb1, 0x30, 0x6a, 0x39, 0x39, 0xc5, 0xe0, 0x3e,
+ 0xef, 0x6b, 0xbb, 0x88, 0x06, 0x05, 0x44, 0x75,
+ 0x8d, 0x9f, 0xef, 0x59, 0xb0, 0xbc, 0x3e, 0x4f },
+ .valid = true
+ },
+ /* wycheproof - edge case for public key */
+ {
+ .private = { 0x68, 0xf0, 0x10, 0xd6, 0x2e, 0xe8, 0xd9, 0x26,
+ 0x05, 0x3a, 0x36, 0x1c, 0x3a, 0x75, 0xc6, 0xea,
+ 0x4e, 0xbd, 0xc8, 0x60, 0x6a, 0xb2, 0x85, 0x00,
+ 0x3a, 0x6f, 0x8f, 0x40, 0x76, 0xb0, 0x1e, 0x83 },
+ .public = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x03 },
+ .result = { 0xf1, 0x36, 0x77, 0x5c, 0x5b, 0xeb, 0x0a, 0xf8,
+ 0x11, 0x0a, 0xf1, 0x0b, 0x20, 0x37, 0x23, 0x32,
+ 0x04, 0x3c, 0xab, 0x75, 0x24, 0x19, 0x67, 0x87,
+ 0x75, 0xa2, 0x23, 0xdf, 0x57, 0xc9, 0xd3, 0x0d },
+ .valid = true
+ },
+ /* wycheproof - edge case for public key */
+ {
+ .private = { 0x58, 0xeb, 0xcb, 0x35, 0xb0, 0xf8, 0x84, 0x5c,
+ 0xaf, 0x1e, 0xc6, 0x30, 0xf9, 0x65, 0x76, 0xb6,
+ 0x2c, 0x4b, 0x7b, 0x6c, 0x36, 0xb2, 0x9d, 0xeb,
+ 0x2c, 0xb0, 0x08, 0x46, 0x51, 0x75, 0x5c, 0x96 },
+ .public = { 0xff, 0xff, 0xff, 0xfb, 0xff, 0xff, 0xfb, 0xff,
+ 0xff, 0xdf, 0xff, 0xff, 0xdf, 0xff, 0xff, 0xff,
+ 0xfe, 0xff, 0xff, 0xfe, 0xff, 0xff, 0xf7, 0xff,
+ 0xff, 0xf7, 0xff, 0xff, 0xbf, 0xff, 0xff, 0x3f },
+ .result = { 0xbf, 0x9a, 0xff, 0xd0, 0x6b, 0x84, 0x40, 0x85,
+ 0x58, 0x64, 0x60, 0x96, 0x2e, 0xf2, 0x14, 0x6f,
+ 0xf3, 0xd4, 0x53, 0x3d, 0x94, 0x44, 0xaa, 0xb0,
+ 0x06, 0xeb, 0x88, 0xcc, 0x30, 0x54, 0x40, 0x7d },
+ .valid = true
+ },
+ /* wycheproof - edge case for public key */
+ {
+ .private = { 0x18, 0x8c, 0x4b, 0xc5, 0xb9, 0xc4, 0x4b, 0x38,
+ 0xbb, 0x65, 0x8b, 0x9b, 0x2a, 0xe8, 0x2d, 0x5b,
+ 0x01, 0x01, 0x5e, 0x09, 0x31, 0x84, 0xb1, 0x7c,
+ 0xb7, 0x86, 0x35, 0x03, 0xa7, 0x83, 0xe1, 0xbb },
+ .public = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x3f },
+ .result = { 0xd4, 0x80, 0xde, 0x04, 0xf6, 0x99, 0xcb, 0x3b,
+ 0xe0, 0x68, 0x4a, 0x9c, 0xc2, 0xe3, 0x12, 0x81,
+ 0xea, 0x0b, 0xc5, 0xa9, 0xdc, 0xc1, 0x57, 0xd3,
+ 0xd2, 0x01, 0x58, 0xd4, 0x6c, 0xa5, 0x24, 0x6d },
+ .valid = true
+ },
+ /* wycheproof - edge case for public key */
+ {
+ .private = { 0xe0, 0x6c, 0x11, 0xbb, 0x2e, 0x13, 0xce, 0x3d,
+ 0xc7, 0x67, 0x3f, 0x67, 0xf5, 0x48, 0x22, 0x42,
+ 0x90, 0x94, 0x23, 0xa9, 0xae, 0x95, 0xee, 0x98,
+ 0x6a, 0x98, 0x8d, 0x98, 0xfa, 0xee, 0x23, 0xa2 },
+ .public = { 0xff, 0xff, 0xff, 0xff, 0xfe, 0xff, 0xff, 0x7f,
+ 0xff, 0xff, 0xff, 0xff, 0xfe, 0xff, 0xff, 0x7f,
+ 0xff, 0xff, 0xff, 0xff, 0xfe, 0xff, 0xff, 0x7f,
+ 0xff, 0xff, 0xff, 0xff, 0xfe, 0xff, 0xff, 0x7f },
+ .result = { 0x4c, 0x44, 0x01, 0xcc, 0xe6, 0xb5, 0x1e, 0x4c,
+ 0xb1, 0x8f, 0x27, 0x90, 0x24, 0x6c, 0x9b, 0xf9,
+ 0x14, 0xdb, 0x66, 0x77, 0x50, 0xa1, 0xcb, 0x89,
+ 0x06, 0x90, 0x92, 0xaf, 0x07, 0x29, 0x22, 0x76 },
+ .valid = true
+ },
+ /* wycheproof - edge case for public key */
+ {
+ .private = { 0xc0, 0x65, 0x8c, 0x46, 0xdd, 0xe1, 0x81, 0x29,
+ 0x29, 0x38, 0x77, 0x53, 0x5b, 0x11, 0x62, 0xb6,
+ 0xf9, 0xf5, 0x41, 0x4a, 0x23, 0xcf, 0x4d, 0x2c,
+ 0xbc, 0x14, 0x0a, 0x4d, 0x99, 0xda, 0x2b, 0x8f },
+ .public = { 0xeb, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x7f },
+ .result = { 0x57, 0x8b, 0xa8, 0xcc, 0x2d, 0xbd, 0xc5, 0x75,
+ 0xaf, 0xcf, 0x9d, 0xf2, 0xb3, 0xee, 0x61, 0x89,
+ 0xf5, 0x33, 0x7d, 0x68, 0x54, 0xc7, 0x9b, 0x4c,
+ 0xe1, 0x65, 0xea, 0x12, 0x29, 0x3b, 0x3a, 0x0f },
+ .valid = true
+ },
+ /* wycheproof - public key with low order */
+ {
+ .private = { 0x10, 0x25, 0x5c, 0x92, 0x30, 0xa9, 0x7a, 0x30,
+ 0xa4, 0x58, 0xca, 0x28, 0x4a, 0x62, 0x96, 0x69,
+ 0x29, 0x3a, 0x31, 0x89, 0x0c, 0xda, 0x9d, 0x14,
+ 0x7f, 0xeb, 0xc7, 0xd1, 0xe2, 0x2d, 0x6b, 0xb1 },
+ .public = { 0xe0, 0xeb, 0x7a, 0x7c, 0x3b, 0x41, 0xb8, 0xae,
+ 0x16, 0x56, 0xe3, 0xfa, 0xf1, 0x9f, 0xc4, 0x6a,
+ 0xda, 0x09, 0x8d, 0xeb, 0x9c, 0x32, 0xb1, 0xfd,
+ 0x86, 0x62, 0x05, 0x16, 0x5f, 0x49, 0xb8, 0x00 },
+ .result = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
+ .valid = false
+ },
+ /* wycheproof - public key with low order */
+ {
+ .private = { 0x78, 0xf1, 0xe8, 0xed, 0xf1, 0x44, 0x81, 0xb3,
+ 0x89, 0x44, 0x8d, 0xac, 0x8f, 0x59, 0xc7, 0x0b,
+ 0x03, 0x8e, 0x7c, 0xf9, 0x2e, 0xf2, 0xc7, 0xef,
+ 0xf5, 0x7a, 0x72, 0x46, 0x6e, 0x11, 0x52, 0x96 },
+ .public = { 0x5f, 0x9c, 0x95, 0xbc, 0xa3, 0x50, 0x8c, 0x24,
+ 0xb1, 0xd0, 0xb1, 0x55, 0x9c, 0x83, 0xef, 0x5b,
+ 0x04, 0x44, 0x5c, 0xc4, 0x58, 0x1c, 0x8e, 0x86,
+ 0xd8, 0x22, 0x4e, 0xdd, 0xd0, 0x9f, 0x11, 0x57 },
+ .result = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
+ .valid = false
+ },
+ /* wycheproof - public key with low order */
+ {
+ .private = { 0xa0, 0xa0, 0x5a, 0x3e, 0x8f, 0x9f, 0x44, 0x20,
+ 0x4d, 0x5f, 0x80, 0x59, 0xa9, 0x4a, 0xc7, 0xdf,
+ 0xc3, 0x9a, 0x49, 0xac, 0x01, 0x6d, 0xd7, 0x43,
+ 0xdb, 0xfa, 0x43, 0xc5, 0xd6, 0x71, 0xfd, 0x88 },
+ .public = { 0xec, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x7f },
+ .result = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
+ .valid = false
+ },
+ /* wycheproof - public key with low order */
+ {
+ .private = { 0xd0, 0xdb, 0xb3, 0xed, 0x19, 0x06, 0x66, 0x3f,
+ 0x15, 0x42, 0x0a, 0xf3, 0x1f, 0x4e, 0xaf, 0x65,
+ 0x09, 0xd9, 0xa9, 0x94, 0x97, 0x23, 0x50, 0x06,
+ 0x05, 0xad, 0x7c, 0x1c, 0x6e, 0x74, 0x50, 0xa9 },
+ .public = { 0xed, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x7f },
+ .result = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
+ .valid = false
+ },
+ /* wycheproof - public key with low order */
+ {
+ .private = { 0xc0, 0xb1, 0xd0, 0xeb, 0x22, 0xb2, 0x44, 0xfe,
+ 0x32, 0x91, 0x14, 0x00, 0x72, 0xcd, 0xd9, 0xd9,
+ 0x89, 0xb5, 0xf0, 0xec, 0xd9, 0x6c, 0x10, 0x0f,
+ 0xeb, 0x5b, 0xca, 0x24, 0x1c, 0x1d, 0x9f, 0x8f },
+ .public = { 0xee, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x7f },
+ .result = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
+ .valid = false
+ },
+ /* wycheproof - public key with low order */
+ {
+ .private = { 0x48, 0x0b, 0xf4, 0x5f, 0x59, 0x49, 0x42, 0xa8,
+ 0xbc, 0x0f, 0x33, 0x53, 0xc6, 0xe8, 0xb8, 0x85,
+ 0x3d, 0x77, 0xf3, 0x51, 0xf1, 0xc2, 0xca, 0x6c,
+ 0x2d, 0x1a, 0xbf, 0x8a, 0x00, 0xb4, 0x22, 0x9c },
+ .public = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80 },
+ .result = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
+ .valid = false
+ },
+ /* wycheproof - public key with low order */
+ {
+ .private = { 0x30, 0xf9, 0x93, 0xfc, 0xf8, 0x51, 0x4f, 0xc8,
+ 0x9b, 0xd8, 0xdb, 0x14, 0xcd, 0x43, 0xba, 0x0d,
+ 0x4b, 0x25, 0x30, 0xe7, 0x3c, 0x42, 0x76, 0xa0,
+ 0x5e, 0x1b, 0x14, 0x5d, 0x42, 0x0c, 0xed, 0xb4 },
+ .public = { 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80 },
+ .result = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
+ .valid = false
+ },
+ /* wycheproof - public key with low order */
+ {
+ .private = { 0xc0, 0x49, 0x74, 0xb7, 0x58, 0x38, 0x0e, 0x2a,
+ 0x5b, 0x5d, 0xf6, 0xeb, 0x09, 0xbb, 0x2f, 0x6b,
+ 0x34, 0x34, 0xf9, 0x82, 0x72, 0x2a, 0x8e, 0x67,
+ 0x6d, 0x3d, 0xa2, 0x51, 0xd1, 0xb3, 0xde, 0x83 },
+ .public = { 0xe0, 0xeb, 0x7a, 0x7c, 0x3b, 0x41, 0xb8, 0xae,
+ 0x16, 0x56, 0xe3, 0xfa, 0xf1, 0x9f, 0xc4, 0x6a,
+ 0xda, 0x09, 0x8d, 0xeb, 0x9c, 0x32, 0xb1, 0xfd,
+ 0x86, 0x62, 0x05, 0x16, 0x5f, 0x49, 0xb8, 0x80 },
+ .result = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
+ .valid = false
+ },
+ /* wycheproof - public key with low order */
+ {
+ .private = { 0x50, 0x2a, 0x31, 0x37, 0x3d, 0xb3, 0x24, 0x46,
+ 0x84, 0x2f, 0xe5, 0xad, 0xd3, 0xe0, 0x24, 0x02,
+ 0x2e, 0xa5, 0x4f, 0x27, 0x41, 0x82, 0xaf, 0xc3,
+ 0xd9, 0xf1, 0xbb, 0x3d, 0x39, 0x53, 0x4e, 0xb5 },
+ .public = { 0x5f, 0x9c, 0x95, 0xbc, 0xa3, 0x50, 0x8c, 0x24,
+ 0xb1, 0xd0, 0xb1, 0x55, 0x9c, 0x83, 0xef, 0x5b,
+ 0x04, 0x44, 0x5c, 0xc4, 0x58, 0x1c, 0x8e, 0x86,
+ 0xd8, 0x22, 0x4e, 0xdd, 0xd0, 0x9f, 0x11, 0xd7 },
+ .result = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
+ .valid = false
+ },
+ /* wycheproof - public key with low order */
+ {
+ .private = { 0x90, 0xfa, 0x64, 0x17, 0xb0, 0xe3, 0x70, 0x30,
+ 0xfd, 0x6e, 0x43, 0xef, 0xf2, 0xab, 0xae, 0xf1,
+ 0x4c, 0x67, 0x93, 0x11, 0x7a, 0x03, 0x9c, 0xf6,
+ 0x21, 0x31, 0x8b, 0xa9, 0x0f, 0x4e, 0x98, 0xbe },
+ .public = { 0xec, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff },
+ .result = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
+ .valid = false
+ },
+ /* wycheproof - public key with low order */
+ {
+ .private = { 0x78, 0xad, 0x3f, 0x26, 0x02, 0x7f, 0x1c, 0x9f,
+ 0xdd, 0x97, 0x5a, 0x16, 0x13, 0xb9, 0x47, 0x77,
+ 0x9b, 0xad, 0x2c, 0xf2, 0xb7, 0x41, 0xad, 0xe0,
+ 0x18, 0x40, 0x88, 0x5a, 0x30, 0xbb, 0x97, 0x9c },
+ .public = { 0xed, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff },
+ .result = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
+ .valid = false
+ },
+ /* wycheproof - public key with low order */
+ {
+ .private = { 0x98, 0xe2, 0x3d, 0xe7, 0xb1, 0xe0, 0x92, 0x6e,
+ 0xd9, 0xc8, 0x7e, 0x7b, 0x14, 0xba, 0xf5, 0x5f,
+ 0x49, 0x7a, 0x1d, 0x70, 0x96, 0xf9, 0x39, 0x77,
+ 0x68, 0x0e, 0x44, 0xdc, 0x1c, 0x7b, 0x7b, 0x8b },
+ .public = { 0xee, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff },
+ .result = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
+ .valid = false
+ },
+ /* wycheproof - public key >= p */
+ {
+ .private = { 0xf0, 0x1e, 0x48, 0xda, 0xfa, 0xc9, 0xd7, 0xbc,
+ 0xf5, 0x89, 0xcb, 0xc3, 0x82, 0xc8, 0x78, 0xd1,
+ 0x8b, 0xda, 0x35, 0x50, 0x58, 0x9f, 0xfb, 0x5d,
+ 0x50, 0xb5, 0x23, 0xbe, 0xbe, 0x32, 0x9d, 0xae },
+ .public = { 0xef, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x7f },
+ .result = { 0xbd, 0x36, 0xa0, 0x79, 0x0e, 0xb8, 0x83, 0x09,
+ 0x8c, 0x98, 0x8b, 0x21, 0x78, 0x67, 0x73, 0xde,
+ 0x0b, 0x3a, 0x4d, 0xf1, 0x62, 0x28, 0x2c, 0xf1,
+ 0x10, 0xde, 0x18, 0xdd, 0x48, 0x4c, 0xe7, 0x4b },
+ .valid = true
+ },
+ /* wycheproof - public key >= p */
+ {
+ .private = { 0x28, 0x87, 0x96, 0xbc, 0x5a, 0xff, 0x4b, 0x81,
+ 0xa3, 0x75, 0x01, 0x75, 0x7b, 0xc0, 0x75, 0x3a,
+ 0x3c, 0x21, 0x96, 0x47, 0x90, 0xd3, 0x86, 0x99,
+ 0x30, 0x8d, 0xeb, 0xc1, 0x7a, 0x6e, 0xaf, 0x8d },
+ .public = { 0xf0, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x7f },
+ .result = { 0xb4, 0xe0, 0xdd, 0x76, 0xda, 0x7b, 0x07, 0x17,
+ 0x28, 0xb6, 0x1f, 0x85, 0x67, 0x71, 0xaa, 0x35,
+ 0x6e, 0x57, 0xed, 0xa7, 0x8a, 0x5b, 0x16, 0x55,
+ 0xcc, 0x38, 0x20, 0xfb, 0x5f, 0x85, 0x4c, 0x5c },
+ .valid = true
+ },
+ /* wycheproof - public key >= p */
+ {
+ .private = { 0x98, 0xdf, 0x84, 0x5f, 0x66, 0x51, 0xbf, 0x11,
+ 0x38, 0x22, 0x1f, 0x11, 0x90, 0x41, 0xf7, 0x2b,
+ 0x6d, 0xbc, 0x3c, 0x4a, 0xce, 0x71, 0x43, 0xd9,
+ 0x9f, 0xd5, 0x5a, 0xd8, 0x67, 0x48, 0x0d, 0xa8 },
+ .public = { 0xf1, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x7f },
+ .result = { 0x6f, 0xdf, 0x6c, 0x37, 0x61, 0x1d, 0xbd, 0x53,
+ 0x04, 0xdc, 0x0f, 0x2e, 0xb7, 0xc9, 0x51, 0x7e,
+ 0xb3, 0xc5, 0x0e, 0x12, 0xfd, 0x05, 0x0a, 0xc6,
+ 0xde, 0xc2, 0x70, 0x71, 0xd4, 0xbf, 0xc0, 0x34 },
+ .valid = true
+ },
+ /* wycheproof - public key >= p */
+ {
+ .private = { 0xf0, 0x94, 0x98, 0xe4, 0x6f, 0x02, 0xf8, 0x78,
+ 0x82, 0x9e, 0x78, 0xb8, 0x03, 0xd3, 0x16, 0xa2,
+ 0xed, 0x69, 0x5d, 0x04, 0x98, 0xa0, 0x8a, 0xbd,
+ 0xf8, 0x27, 0x69, 0x30, 0xe2, 0x4e, 0xdc, 0xb0 },
+ .public = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x7f },
+ .result = { 0x4c, 0x8f, 0xc4, 0xb1, 0xc6, 0xab, 0x88, 0xfb,
+ 0x21, 0xf1, 0x8f, 0x6d, 0x4c, 0x81, 0x02, 0x40,
+ 0xd4, 0xe9, 0x46, 0x51, 0xba, 0x44, 0xf7, 0xa2,
+ 0xc8, 0x63, 0xce, 0xc7, 0xdc, 0x56, 0x60, 0x2d },
+ .valid = true
+ },
+ /* wycheproof - public key >= p */
+ {
+ .private = { 0x18, 0x13, 0xc1, 0x0a, 0x5c, 0x7f, 0x21, 0xf9,
+ 0x6e, 0x17, 0xf2, 0x88, 0xc0, 0xcc, 0x37, 0x60,
+ 0x7c, 0x04, 0xc5, 0xf5, 0xae, 0xa2, 0xdb, 0x13,
+ 0x4f, 0x9e, 0x2f, 0xfc, 0x66, 0xbd, 0x9d, 0xb8 },
+ .public = { 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80 },
+ .result = { 0x1c, 0xd0, 0xb2, 0x82, 0x67, 0xdc, 0x54, 0x1c,
+ 0x64, 0x2d, 0x6d, 0x7d, 0xca, 0x44, 0xa8, 0xb3,
+ 0x8a, 0x63, 0x73, 0x6e, 0xef, 0x5c, 0x4e, 0x65,
+ 0x01, 0xff, 0xbb, 0xb1, 0x78, 0x0c, 0x03, 0x3c },
+ .valid = true
+ },
+ /* wycheproof - public key >= p */
+ {
+ .private = { 0x78, 0x57, 0xfb, 0x80, 0x86, 0x53, 0x64, 0x5a,
+ 0x0b, 0xeb, 0x13, 0x8a, 0x64, 0xf5, 0xf4, 0xd7,
+ 0x33, 0xa4, 0x5e, 0xa8, 0x4c, 0x3c, 0xda, 0x11,
+ 0xa9, 0xc0, 0x6f, 0x7e, 0x71, 0x39, 0x14, 0x9e },
+ .public = { 0x03, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80 },
+ .result = { 0x87, 0x55, 0xbe, 0x01, 0xc6, 0x0a, 0x7e, 0x82,
+ 0x5c, 0xff, 0x3e, 0x0e, 0x78, 0xcb, 0x3a, 0xa4,
+ 0x33, 0x38, 0x61, 0x51, 0x6a, 0xa5, 0x9b, 0x1c,
+ 0x51, 0xa8, 0xb2, 0xa5, 0x43, 0xdf, 0xa8, 0x22 },
+ .valid = true
+ },
+ /* wycheproof - public key >= p */
+ {
+ .private = { 0xe0, 0x3a, 0xa8, 0x42, 0xe2, 0xab, 0xc5, 0x6e,
+ 0x81, 0xe8, 0x7b, 0x8b, 0x9f, 0x41, 0x7b, 0x2a,
+ 0x1e, 0x59, 0x13, 0xc7, 0x23, 0xee, 0xd2, 0x8d,
+ 0x75, 0x2f, 0x8d, 0x47, 0xa5, 0x9f, 0x49, 0x8f },
+ .public = { 0x04, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80 },
+ .result = { 0x54, 0xc9, 0xa1, 0xed, 0x95, 0xe5, 0x46, 0xd2,
+ 0x78, 0x22, 0xa3, 0x60, 0x93, 0x1d, 0xda, 0x60,
+ 0xa1, 0xdf, 0x04, 0x9d, 0xa6, 0xf9, 0x04, 0x25,
+ 0x3c, 0x06, 0x12, 0xbb, 0xdc, 0x08, 0x74, 0x76 },
+ .valid = true
+ },
+ /* wycheproof - public key >= p */
+ {
+ .private = { 0xf8, 0xf7, 0x07, 0xb7, 0x99, 0x9b, 0x18, 0xcb,
+ 0x0d, 0x6b, 0x96, 0x12, 0x4f, 0x20, 0x45, 0x97,
+ 0x2c, 0xa2, 0x74, 0xbf, 0xc1, 0x54, 0xad, 0x0c,
+ 0x87, 0x03, 0x8c, 0x24, 0xc6, 0xd0, 0xd4, 0xb2 },
+ .public = { 0xda, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff },
+ .result = { 0xcc, 0x1f, 0x40, 0xd7, 0x43, 0xcd, 0xc2, 0x23,
+ 0x0e, 0x10, 0x43, 0xda, 0xba, 0x8b, 0x75, 0xe8,
+ 0x10, 0xf1, 0xfb, 0xab, 0x7f, 0x25, 0x52, 0x69,
+ 0xbd, 0x9e, 0xbb, 0x29, 0xe6, 0xbf, 0x49, 0x4f },
+ .valid = true
+ },
+ /* wycheproof - public key >= p */
+ {
+ .private = { 0xa0, 0x34, 0xf6, 0x84, 0xfa, 0x63, 0x1e, 0x1a,
+ 0x34, 0x81, 0x18, 0xc1, 0xce, 0x4c, 0x98, 0x23,
+ 0x1f, 0x2d, 0x9e, 0xec, 0x9b, 0xa5, 0x36, 0x5b,
+ 0x4a, 0x05, 0xd6, 0x9a, 0x78, 0x5b, 0x07, 0x96 },
+ .public = { 0xdb, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff },
+ .result = { 0x54, 0x99, 0x8e, 0xe4, 0x3a, 0x5b, 0x00, 0x7b,
+ 0xf4, 0x99, 0xf0, 0x78, 0xe7, 0x36, 0x52, 0x44,
+ 0x00, 0xa8, 0xb5, 0xc7, 0xe9, 0xb9, 0xb4, 0x37,
+ 0x71, 0x74, 0x8c, 0x7c, 0xdf, 0x88, 0x04, 0x12 },
+ .valid = true
+ },
+ /* wycheproof - public key >= p */
+ {
+ .private = { 0x30, 0xb6, 0xc6, 0xa0, 0xf2, 0xff, 0xa6, 0x80,
+ 0x76, 0x8f, 0x99, 0x2b, 0xa8, 0x9e, 0x15, 0x2d,
+ 0x5b, 0xc9, 0x89, 0x3d, 0x38, 0xc9, 0x11, 0x9b,
+ 0xe4, 0xf7, 0x67, 0xbf, 0xab, 0x6e, 0x0c, 0xa5 },
+ .public = { 0xdc, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff },
+ .result = { 0xea, 0xd9, 0xb3, 0x8e, 0xfd, 0xd7, 0x23, 0x63,
+ 0x79, 0x34, 0xe5, 0x5a, 0xb7, 0x17, 0xa7, 0xae,
+ 0x09, 0xeb, 0x86, 0xa2, 0x1d, 0xc3, 0x6a, 0x3f,
+ 0xee, 0xb8, 0x8b, 0x75, 0x9e, 0x39, 0x1e, 0x09 },
+ .valid = true
+ },
+ /* wycheproof - public key >= p */
+ {
+ .private = { 0x90, 0x1b, 0x9d, 0xcf, 0x88, 0x1e, 0x01, 0xe0,
+ 0x27, 0x57, 0x50, 0x35, 0xd4, 0x0b, 0x43, 0xbd,
+ 0xc1, 0xc5, 0x24, 0x2e, 0x03, 0x08, 0x47, 0x49,
+ 0x5b, 0x0c, 0x72, 0x86, 0x46, 0x9b, 0x65, 0x91 },
+ .public = { 0xea, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff },
+ .result = { 0x60, 0x2f, 0xf4, 0x07, 0x89, 0xb5, 0x4b, 0x41,
+ 0x80, 0x59, 0x15, 0xfe, 0x2a, 0x62, 0x21, 0xf0,
+ 0x7a, 0x50, 0xff, 0xc2, 0xc3, 0xfc, 0x94, 0xcf,
+ 0x61, 0xf1, 0x3d, 0x79, 0x04, 0xe8, 0x8e, 0x0e },
+ .valid = true
+ },
+ /* wycheproof - public key >= p */
+ {
+ .private = { 0x80, 0x46, 0x67, 0x7c, 0x28, 0xfd, 0x82, 0xc9,
+ 0xa1, 0xbd, 0xb7, 0x1a, 0x1a, 0x1a, 0x34, 0xfa,
+ 0xba, 0x12, 0x25, 0xe2, 0x50, 0x7f, 0xe3, 0xf5,
+ 0x4d, 0x10, 0xbd, 0x5b, 0x0d, 0x86, 0x5f, 0x8e },
+ .public = { 0xeb, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff },
+ .result = { 0xe0, 0x0a, 0xe8, 0xb1, 0x43, 0x47, 0x12, 0x47,
+ 0xba, 0x24, 0xf1, 0x2c, 0x88, 0x55, 0x36, 0xc3,
+ 0xcb, 0x98, 0x1b, 0x58, 0xe1, 0xe5, 0x6b, 0x2b,
+ 0xaf, 0x35, 0xc1, 0x2a, 0xe1, 0xf7, 0x9c, 0x26 },
+ .valid = true
+ },
+ /* wycheproof - public key >= p */
+ {
+ .private = { 0x60, 0x2f, 0x7e, 0x2f, 0x68, 0xa8, 0x46, 0xb8,
+ 0x2c, 0xc2, 0x69, 0xb1, 0xd4, 0x8e, 0x93, 0x98,
+ 0x86, 0xae, 0x54, 0xfd, 0x63, 0x6c, 0x1f, 0xe0,
+ 0x74, 0xd7, 0x10, 0x12, 0x7d, 0x47, 0x24, 0x91 },
+ .public = { 0xef, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff },
+ .result = { 0x98, 0xcb, 0x9b, 0x50, 0xdd, 0x3f, 0xc2, 0xb0,
+ 0xd4, 0xf2, 0xd2, 0xbf, 0x7c, 0x5c, 0xfd, 0xd1,
+ 0x0c, 0x8f, 0xcd, 0x31, 0xfc, 0x40, 0xaf, 0x1a,
+ 0xd4, 0x4f, 0x47, 0xc1, 0x31, 0x37, 0x63, 0x62 },
+ .valid = true
+ },
+ /* wycheproof - public key >= p */
+ {
+ .private = { 0x60, 0x88, 0x7b, 0x3d, 0xc7, 0x24, 0x43, 0x02,
+ 0x6e, 0xbe, 0xdb, 0xbb, 0xb7, 0x06, 0x65, 0xf4,
+ 0x2b, 0x87, 0xad, 0xd1, 0x44, 0x0e, 0x77, 0x68,
+ 0xfb, 0xd7, 0xe8, 0xe2, 0xce, 0x5f, 0x63, 0x9d },
+ .public = { 0xf0, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff },
+ .result = { 0x38, 0xd6, 0x30, 0x4c, 0x4a, 0x7e, 0x6d, 0x9f,
+ 0x79, 0x59, 0x33, 0x4f, 0xb5, 0x24, 0x5b, 0xd2,
+ 0xc7, 0x54, 0x52, 0x5d, 0x4c, 0x91, 0xdb, 0x95,
+ 0x02, 0x06, 0x92, 0x62, 0x34, 0xc1, 0xf6, 0x33 },
+ .valid = true
+ },
+ /* wycheproof - public key >= p */
+ {
+ .private = { 0x78, 0xd3, 0x1d, 0xfa, 0x85, 0x44, 0x97, 0xd7,
+ 0x2d, 0x8d, 0xef, 0x8a, 0x1b, 0x7f, 0xb0, 0x06,
+ 0xce, 0xc2, 0xd8, 0xc4, 0x92, 0x46, 0x47, 0xc9,
+ 0x38, 0x14, 0xae, 0x56, 0xfa, 0xed, 0xa4, 0x95 },
+ .public = { 0xf1, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff },
+ .result = { 0x78, 0x6c, 0xd5, 0x49, 0x96, 0xf0, 0x14, 0xa5,
+ 0xa0, 0x31, 0xec, 0x14, 0xdb, 0x81, 0x2e, 0xd0,
+ 0x83, 0x55, 0x06, 0x1f, 0xdb, 0x5d, 0xe6, 0x80,
+ 0xa8, 0x00, 0xac, 0x52, 0x1f, 0x31, 0x8e, 0x23 },
+ .valid = true
+ },
+ /* wycheproof - public key >= p */
+ {
+ .private = { 0xc0, 0x4c, 0x5b, 0xae, 0xfa, 0x83, 0x02, 0xdd,
+ 0xde, 0xd6, 0xa4, 0xbb, 0x95, 0x77, 0x61, 0xb4,
+ 0xeb, 0x97, 0xae, 0xfa, 0x4f, 0xc3, 0xb8, 0x04,
+ 0x30, 0x85, 0xf9, 0x6a, 0x56, 0x59, 0xb3, 0xa5 },
+ .public = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff },
+ .result = { 0x29, 0xae, 0x8b, 0xc7, 0x3e, 0x9b, 0x10, 0xa0,
+ 0x8b, 0x4f, 0x68, 0x1c, 0x43, 0xc3, 0xe0, 0xac,
+ 0x1a, 0x17, 0x1d, 0x31, 0xb3, 0x8f, 0x1a, 0x48,
+ 0xef, 0xba, 0x29, 0xae, 0x63, 0x9e, 0xa1, 0x34 },
+ .valid = true
+ },
+ /* wycheproof - RFC 7748 */
+ {
+ .private = { 0xa0, 0x46, 0xe3, 0x6b, 0xf0, 0x52, 0x7c, 0x9d,
+ 0x3b, 0x16, 0x15, 0x4b, 0x82, 0x46, 0x5e, 0xdd,
+ 0x62, 0x14, 0x4c, 0x0a, 0xc1, 0xfc, 0x5a, 0x18,
+ 0x50, 0x6a, 0x22, 0x44, 0xba, 0x44, 0x9a, 0x44 },
+ .public = { 0xe6, 0xdb, 0x68, 0x67, 0x58, 0x30, 0x30, 0xdb,
+ 0x35, 0x94, 0xc1, 0xa4, 0x24, 0xb1, 0x5f, 0x7c,
+ 0x72, 0x66, 0x24, 0xec, 0x26, 0xb3, 0x35, 0x3b,
+ 0x10, 0xa9, 0x03, 0xa6, 0xd0, 0xab, 0x1c, 0x4c },
+ .result = { 0xc3, 0xda, 0x55, 0x37, 0x9d, 0xe9, 0xc6, 0x90,
+ 0x8e, 0x94, 0xea, 0x4d, 0xf2, 0x8d, 0x08, 0x4f,
+ 0x32, 0xec, 0xcf, 0x03, 0x49, 0x1c, 0x71, 0xf7,
+ 0x54, 0xb4, 0x07, 0x55, 0x77, 0xa2, 0x85, 0x52 },
+ .valid = true
+ },
+ /* wycheproof - RFC 7748 */
+ {
+ .private = { 0x48, 0x66, 0xe9, 0xd4, 0xd1, 0xb4, 0x67, 0x3c,
+ 0x5a, 0xd2, 0x26, 0x91, 0x95, 0x7d, 0x6a, 0xf5,
+ 0xc1, 0x1b, 0x64, 0x21, 0xe0, 0xea, 0x01, 0xd4,
+ 0x2c, 0xa4, 0x16, 0x9e, 0x79, 0x18, 0xba, 0x4d },
+ .public = { 0xe5, 0x21, 0x0f, 0x12, 0x78, 0x68, 0x11, 0xd3,
+ 0xf4, 0xb7, 0x95, 0x9d, 0x05, 0x38, 0xae, 0x2c,
+ 0x31, 0xdb, 0xe7, 0x10, 0x6f, 0xc0, 0x3c, 0x3e,
+ 0xfc, 0x4c, 0xd5, 0x49, 0xc7, 0x15, 0xa4, 0x13 },
+ .result = { 0x95, 0xcb, 0xde, 0x94, 0x76, 0xe8, 0x90, 0x7d,
+ 0x7a, 0xad, 0xe4, 0x5c, 0xb4, 0xb8, 0x73, 0xf8,
+ 0x8b, 0x59, 0x5a, 0x68, 0x79, 0x9f, 0xa1, 0x52,
+ 0xe6, 0xf8, 0xf7, 0x64, 0x7a, 0xac, 0x79, 0x57 },
+ .valid = true
+ },
+ /* wycheproof - edge case for shared secret */
+ {
+ .private = { 0xa0, 0xa4, 0xf1, 0x30, 0xb9, 0x8a, 0x5b, 0xe4,
+ 0xb1, 0xce, 0xdb, 0x7c, 0xb8, 0x55, 0x84, 0xa3,
+ 0x52, 0x0e, 0x14, 0x2d, 0x47, 0x4d, 0xc9, 0xcc,
+ 0xb9, 0x09, 0xa0, 0x73, 0xa9, 0x76, 0xbf, 0x63 },
+ .public = { 0x0a, 0xb4, 0xe7, 0x63, 0x80, 0xd8, 0x4d, 0xde,
+ 0x4f, 0x68, 0x33, 0xc5, 0x8f, 0x2a, 0x9f, 0xb8,
+ 0xf8, 0x3b, 0xb0, 0x16, 0x9b, 0x17, 0x2b, 0xe4,
+ 0xb6, 0xe0, 0x59, 0x28, 0x87, 0x74, 0x1a, 0x36 },
+ .result = { 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
+ .valid = true
+ },
+ /* wycheproof - edge case for shared secret */
+ {
+ .private = { 0xa0, 0xa4, 0xf1, 0x30, 0xb9, 0x8a, 0x5b, 0xe4,
+ 0xb1, 0xce, 0xdb, 0x7c, 0xb8, 0x55, 0x84, 0xa3,
+ 0x52, 0x0e, 0x14, 0x2d, 0x47, 0x4d, 0xc9, 0xcc,
+ 0xb9, 0x09, 0xa0, 0x73, 0xa9, 0x76, 0xbf, 0x63 },
+ .public = { 0x89, 0xe1, 0x0d, 0x57, 0x01, 0xb4, 0x33, 0x7d,
+ 0x2d, 0x03, 0x21, 0x81, 0x53, 0x8b, 0x10, 0x64,
+ 0xbd, 0x40, 0x84, 0x40, 0x1c, 0xec, 0xa1, 0xfd,
+ 0x12, 0x66, 0x3a, 0x19, 0x59, 0x38, 0x80, 0x00 },
+ .result = { 0x09, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
+ .valid = true
+ },
+ /* wycheproof - edge case for shared secret */
+ {
+ .private = { 0xa0, 0xa4, 0xf1, 0x30, 0xb9, 0x8a, 0x5b, 0xe4,
+ 0xb1, 0xce, 0xdb, 0x7c, 0xb8, 0x55, 0x84, 0xa3,
+ 0x52, 0x0e, 0x14, 0x2d, 0x47, 0x4d, 0xc9, 0xcc,
+ 0xb9, 0x09, 0xa0, 0x73, 0xa9, 0x76, 0xbf, 0x63 },
+ .public = { 0x2b, 0x55, 0xd3, 0xaa, 0x4a, 0x8f, 0x80, 0xc8,
+ 0xc0, 0xb2, 0xae, 0x5f, 0x93, 0x3e, 0x85, 0xaf,
+ 0x49, 0xbe, 0xac, 0x36, 0xc2, 0xfa, 0x73, 0x94,
+ 0xba, 0xb7, 0x6c, 0x89, 0x33, 0xf8, 0xf8, 0x1d },
+ .result = { 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
+ .valid = true
+ },
+ /* wycheproof - edge case for shared secret */
+ {
+ .private = { 0xa0, 0xa4, 0xf1, 0x30, 0xb9, 0x8a, 0x5b, 0xe4,
+ 0xb1, 0xce, 0xdb, 0x7c, 0xb8, 0x55, 0x84, 0xa3,
+ 0x52, 0x0e, 0x14, 0x2d, 0x47, 0x4d, 0xc9, 0xcc,
+ 0xb9, 0x09, 0xa0, 0x73, 0xa9, 0x76, 0xbf, 0x63 },
+ .public = { 0x63, 0xe5, 0xb1, 0xfe, 0x96, 0x01, 0xfe, 0x84,
+ 0x38, 0x5d, 0x88, 0x66, 0xb0, 0x42, 0x12, 0x62,
+ 0xf7, 0x8f, 0xbf, 0xa5, 0xaf, 0xf9, 0x58, 0x5e,
+ 0x62, 0x66, 0x79, 0xb1, 0x85, 0x47, 0xd9, 0x59 },
+ .result = { 0xfe, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x3f },
+ .valid = true
+ },
+ /* wycheproof - edge case for shared secret */
+ {
+ .private = { 0xa0, 0xa4, 0xf1, 0x30, 0xb9, 0x8a, 0x5b, 0xe4,
+ 0xb1, 0xce, 0xdb, 0x7c, 0xb8, 0x55, 0x84, 0xa3,
+ 0x52, 0x0e, 0x14, 0x2d, 0x47, 0x4d, 0xc9, 0xcc,
+ 0xb9, 0x09, 0xa0, 0x73, 0xa9, 0x76, 0xbf, 0x63 },
+ .public = { 0xe4, 0x28, 0xf3, 0xda, 0xc1, 0x78, 0x09, 0xf8,
+ 0x27, 0xa5, 0x22, 0xce, 0x32, 0x35, 0x50, 0x58,
+ 0xd0, 0x73, 0x69, 0x36, 0x4a, 0xa7, 0x89, 0x02,
+ 0xee, 0x10, 0x13, 0x9b, 0x9f, 0x9d, 0xd6, 0x53 },
+ .result = { 0xfc, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x3f },
+ .valid = true
+ },
+ /* wycheproof - edge case for shared secret */
+ {
+ .private = { 0xa0, 0xa4, 0xf1, 0x30, 0xb9, 0x8a, 0x5b, 0xe4,
+ 0xb1, 0xce, 0xdb, 0x7c, 0xb8, 0x55, 0x84, 0xa3,
+ 0x52, 0x0e, 0x14, 0x2d, 0x47, 0x4d, 0xc9, 0xcc,
+ 0xb9, 0x09, 0xa0, 0x73, 0xa9, 0x76, 0xbf, 0x63 },
+ .public = { 0xb3, 0xb5, 0x0e, 0x3e, 0xd3, 0xa4, 0x07, 0xb9,
+ 0x5d, 0xe9, 0x42, 0xef, 0x74, 0x57, 0x5b, 0x5a,
+ 0xb8, 0xa1, 0x0c, 0x09, 0xee, 0x10, 0x35, 0x44,
+ 0xd6, 0x0b, 0xdf, 0xed, 0x81, 0x38, 0xab, 0x2b },
+ .result = { 0xf9, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x3f },
+ .valid = true
+ },
+ /* wycheproof - edge case for shared secret */
+ {
+ .private = { 0xa0, 0xa4, 0xf1, 0x30, 0xb9, 0x8a, 0x5b, 0xe4,
+ 0xb1, 0xce, 0xdb, 0x7c, 0xb8, 0x55, 0x84, 0xa3,
+ 0x52, 0x0e, 0x14, 0x2d, 0x47, 0x4d, 0xc9, 0xcc,
+ 0xb9, 0x09, 0xa0, 0x73, 0xa9, 0x76, 0xbf, 0x63 },
+ .public = { 0x21, 0x3f, 0xff, 0xe9, 0x3d, 0x5e, 0xa8, 0xcd,
+ 0x24, 0x2e, 0x46, 0x28, 0x44, 0x02, 0x99, 0x22,
+ 0xc4, 0x3c, 0x77, 0xc9, 0xe3, 0xe4, 0x2f, 0x56,
+ 0x2f, 0x48, 0x5d, 0x24, 0xc5, 0x01, 0xa2, 0x0b },
+ .result = { 0xf3, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x3f },
+ .valid = true
+ },
+ /* wycheproof - edge case for shared secret */
+ {
+ .private = { 0xa0, 0xa4, 0xf1, 0x30, 0xb9, 0x8a, 0x5b, 0xe4,
+ 0xb1, 0xce, 0xdb, 0x7c, 0xb8, 0x55, 0x84, 0xa3,
+ 0x52, 0x0e, 0x14, 0x2d, 0x47, 0x4d, 0xc9, 0xcc,
+ 0xb9, 0x09, 0xa0, 0x73, 0xa9, 0x76, 0xbf, 0x63 },
+ .public = { 0x91, 0xb2, 0x32, 0xa1, 0x78, 0xb3, 0xcd, 0x53,
+ 0x09, 0x32, 0x44, 0x1e, 0x61, 0x39, 0x41, 0x8f,
+ 0x72, 0x17, 0x22, 0x92, 0xf1, 0xda, 0x4c, 0x18,
+ 0x34, 0xfc, 0x5e, 0xbf, 0xef, 0xb5, 0x1e, 0x3f },
+ .result = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x03 },
+ .valid = true
+ },
+ /* wycheproof - edge case for shared secret */
+ {
+ .private = { 0xa0, 0xa4, 0xf1, 0x30, 0xb9, 0x8a, 0x5b, 0xe4,
+ 0xb1, 0xce, 0xdb, 0x7c, 0xb8, 0x55, 0x84, 0xa3,
+ 0x52, 0x0e, 0x14, 0x2d, 0x47, 0x4d, 0xc9, 0xcc,
+ 0xb9, 0x09, 0xa0, 0x73, 0xa9, 0x76, 0xbf, 0x63 },
+ .public = { 0x04, 0x5c, 0x6e, 0x11, 0xc5, 0xd3, 0x32, 0x55,
+ 0x6c, 0x78, 0x22, 0xfe, 0x94, 0xeb, 0xf8, 0x9b,
+ 0x56, 0xa3, 0x87, 0x8d, 0xc2, 0x7c, 0xa0, 0x79,
+ 0x10, 0x30, 0x58, 0x84, 0x9f, 0xab, 0xcb, 0x4f },
+ .result = { 0xe5, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x7f },
+ .valid = true
+ },
+ /* wycheproof - edge case for shared secret */
+ {
+ .private = { 0xa0, 0xa4, 0xf1, 0x30, 0xb9, 0x8a, 0x5b, 0xe4,
+ 0xb1, 0xce, 0xdb, 0x7c, 0xb8, 0x55, 0x84, 0xa3,
+ 0x52, 0x0e, 0x14, 0x2d, 0x47, 0x4d, 0xc9, 0xcc,
+ 0xb9, 0x09, 0xa0, 0x73, 0xa9, 0x76, 0xbf, 0x63 },
+ .public = { 0x1c, 0xa2, 0x19, 0x0b, 0x71, 0x16, 0x35, 0x39,
+ 0x06, 0x3c, 0x35, 0x77, 0x3b, 0xda, 0x0c, 0x9c,
+ 0x92, 0x8e, 0x91, 0x36, 0xf0, 0x62, 0x0a, 0xeb,
+ 0x09, 0x3f, 0x09, 0x91, 0x97, 0xb7, 0xf7, 0x4e },
+ .result = { 0xe3, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x7f },
+ .valid = true
+ },
+ /* wycheproof - edge case for shared secret */
+ {
+ .private = { 0xa0, 0xa4, 0xf1, 0x30, 0xb9, 0x8a, 0x5b, 0xe4,
+ 0xb1, 0xce, 0xdb, 0x7c, 0xb8, 0x55, 0x84, 0xa3,
+ 0x52, 0x0e, 0x14, 0x2d, 0x47, 0x4d, 0xc9, 0xcc,
+ 0xb9, 0x09, 0xa0, 0x73, 0xa9, 0x76, 0xbf, 0x63 },
+ .public = { 0xf7, 0x6e, 0x90, 0x10, 0xac, 0x33, 0xc5, 0x04,
+ 0x3b, 0x2d, 0x3b, 0x76, 0xa8, 0x42, 0x17, 0x10,
+ 0x00, 0xc4, 0x91, 0x62, 0x22, 0xe9, 0xe8, 0x58,
+ 0x97, 0xa0, 0xae, 0xc7, 0xf6, 0x35, 0x0b, 0x3c },
+ .result = { 0xdd, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x7f },
+ .valid = true
+ },
+ /* wycheproof - edge case for shared secret */
+ {
+ .private = { 0xa0, 0xa4, 0xf1, 0x30, 0xb9, 0x8a, 0x5b, 0xe4,
+ 0xb1, 0xce, 0xdb, 0x7c, 0xb8, 0x55, 0x84, 0xa3,
+ 0x52, 0x0e, 0x14, 0x2d, 0x47, 0x4d, 0xc9, 0xcc,
+ 0xb9, 0x09, 0xa0, 0x73, 0xa9, 0x76, 0xbf, 0x63 },
+ .public = { 0xbb, 0x72, 0x68, 0x8d, 0x8f, 0x8a, 0xa7, 0xa3,
+ 0x9c, 0xd6, 0x06, 0x0c, 0xd5, 0xc8, 0x09, 0x3c,
+ 0xde, 0xc6, 0xfe, 0x34, 0x19, 0x37, 0xc3, 0x88,
+ 0x6a, 0x99, 0x34, 0x6c, 0xd0, 0x7f, 0xaa, 0x55 },
+ .result = { 0xdb, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x7f },
+ .valid = true
+ },
+ /* wycheproof - edge case for shared secret */
+ {
+ .private = { 0xa0, 0xa4, 0xf1, 0x30, 0xb9, 0x8a, 0x5b, 0xe4,
+ 0xb1, 0xce, 0xdb, 0x7c, 0xb8, 0x55, 0x84, 0xa3,
+ 0x52, 0x0e, 0x14, 0x2d, 0x47, 0x4d, 0xc9, 0xcc,
+ 0xb9, 0x09, 0xa0, 0x73, 0xa9, 0x76, 0xbf, 0x63 },
+ .public = { 0x88, 0xfd, 0xde, 0xa1, 0x93, 0x39, 0x1c, 0x6a,
+ 0x59, 0x33, 0xef, 0x9b, 0x71, 0x90, 0x15, 0x49,
+ 0x44, 0x72, 0x05, 0xaa, 0xe9, 0xda, 0x92, 0x8a,
+ 0x6b, 0x91, 0xa3, 0x52, 0xba, 0x10, 0xf4, 0x1f },
+ .result = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x02 },
+ .valid = true
+ },
+ /* wycheproof - edge case for shared secret */
+ {
+ .private = { 0xa0, 0xa4, 0xf1, 0x30, 0xb9, 0x8a, 0x5b, 0xe4,
+ 0xb1, 0xce, 0xdb, 0x7c, 0xb8, 0x55, 0x84, 0xa3,
+ 0x52, 0x0e, 0x14, 0x2d, 0x47, 0x4d, 0xc9, 0xcc,
+ 0xb9, 0x09, 0xa0, 0x73, 0xa9, 0x76, 0xbf, 0x63 },
+ .public = { 0x30, 0x3b, 0x39, 0x2f, 0x15, 0x31, 0x16, 0xca,
+ 0xd9, 0xcc, 0x68, 0x2a, 0x00, 0xcc, 0xc4, 0x4c,
+ 0x95, 0xff, 0x0d, 0x3b, 0xbe, 0x56, 0x8b, 0xeb,
+ 0x6c, 0x4e, 0x73, 0x9b, 0xaf, 0xdc, 0x2c, 0x68 },
+ .result = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80, 0x00 },
+ .valid = true
+ },
+ /* wycheproof - checking for overflow */
+ {
+ .private = { 0xc8, 0x17, 0x24, 0x70, 0x40, 0x00, 0xb2, 0x6d,
+ 0x31, 0x70, 0x3c, 0xc9, 0x7e, 0x3a, 0x37, 0x8d,
+ 0x56, 0xfa, 0xd8, 0x21, 0x93, 0x61, 0xc8, 0x8c,
+ 0xca, 0x8b, 0xd7, 0xc5, 0x71, 0x9b, 0x12, 0xb2 },
+ .public = { 0xfd, 0x30, 0x0a, 0xeb, 0x40, 0xe1, 0xfa, 0x58,
+ 0x25, 0x18, 0x41, 0x2b, 0x49, 0xb2, 0x08, 0xa7,
+ 0x84, 0x2b, 0x1e, 0x1f, 0x05, 0x6a, 0x04, 0x01,
+ 0x78, 0xea, 0x41, 0x41, 0x53, 0x4f, 0x65, 0x2d },
+ .result = { 0xb7, 0x34, 0x10, 0x5d, 0xc2, 0x57, 0x58, 0x5d,
+ 0x73, 0xb5, 0x66, 0xcc, 0xb7, 0x6f, 0x06, 0x27,
+ 0x95, 0xcc, 0xbe, 0xc8, 0x91, 0x28, 0xe5, 0x2b,
+ 0x02, 0xf3, 0xe5, 0x96, 0x39, 0xf1, 0x3c, 0x46 },
+ .valid = true
+ },
+ /* wycheproof - checking for overflow */
+ {
+ .private = { 0xc8, 0x17, 0x24, 0x70, 0x40, 0x00, 0xb2, 0x6d,
+ 0x31, 0x70, 0x3c, 0xc9, 0x7e, 0x3a, 0x37, 0x8d,
+ 0x56, 0xfa, 0xd8, 0x21, 0x93, 0x61, 0xc8, 0x8c,
+ 0xca, 0x8b, 0xd7, 0xc5, 0x71, 0x9b, 0x12, 0xb2 },
+ .public = { 0xc8, 0xef, 0x79, 0xb5, 0x14, 0xd7, 0x68, 0x26,
+ 0x77, 0xbc, 0x79, 0x31, 0xe0, 0x6e, 0xe5, 0xc2,
+ 0x7c, 0x9b, 0x39, 0x2b, 0x4a, 0xe9, 0x48, 0x44,
+ 0x73, 0xf5, 0x54, 0xe6, 0x67, 0x8e, 0xcc, 0x2e },
+ .result = { 0x64, 0x7a, 0x46, 0xb6, 0xfc, 0x3f, 0x40, 0xd6,
+ 0x21, 0x41, 0xee, 0x3c, 0xee, 0x70, 0x6b, 0x4d,
+ 0x7a, 0x92, 0x71, 0x59, 0x3a, 0x7b, 0x14, 0x3e,
+ 0x8e, 0x2e, 0x22, 0x79, 0x88, 0x3e, 0x45, 0x50 },
+ .valid = true
+ },
+ /* wycheproof - checking for overflow */
+ {
+ .private = { 0xc8, 0x17, 0x24, 0x70, 0x40, 0x00, 0xb2, 0x6d,
+ 0x31, 0x70, 0x3c, 0xc9, 0x7e, 0x3a, 0x37, 0x8d,
+ 0x56, 0xfa, 0xd8, 0x21, 0x93, 0x61, 0xc8, 0x8c,
+ 0xca, 0x8b, 0xd7, 0xc5, 0x71, 0x9b, 0x12, 0xb2 },
+ .public = { 0x64, 0xae, 0xac, 0x25, 0x04, 0x14, 0x48, 0x61,
+ 0x53, 0x2b, 0x7b, 0xbc, 0xb6, 0xc8, 0x7d, 0x67,
+ 0xdd, 0x4c, 0x1f, 0x07, 0xeb, 0xc2, 0xe0, 0x6e,
+ 0xff, 0xb9, 0x5a, 0xec, 0xc6, 0x17, 0x0b, 0x2c },
+ .result = { 0x4f, 0xf0, 0x3d, 0x5f, 0xb4, 0x3c, 0xd8, 0x65,
+ 0x7a, 0x3c, 0xf3, 0x7c, 0x13, 0x8c, 0xad, 0xce,
+ 0xcc, 0xe5, 0x09, 0xe4, 0xeb, 0xa0, 0x89, 0xd0,
+ 0xef, 0x40, 0xb4, 0xe4, 0xfb, 0x94, 0x61, 0x55 },
+ .valid = true
+ },
+ /* wycheproof - checking for overflow */
+ {
+ .private = { 0xc8, 0x17, 0x24, 0x70, 0x40, 0x00, 0xb2, 0x6d,
+ 0x31, 0x70, 0x3c, 0xc9, 0x7e, 0x3a, 0x37, 0x8d,
+ 0x56, 0xfa, 0xd8, 0x21, 0x93, 0x61, 0xc8, 0x8c,
+ 0xca, 0x8b, 0xd7, 0xc5, 0x71, 0x9b, 0x12, 0xb2 },
+ .public = { 0xbf, 0x68, 0xe3, 0x5e, 0x9b, 0xdb, 0x7e, 0xee,
+ 0x1b, 0x50, 0x57, 0x02, 0x21, 0x86, 0x0f, 0x5d,
+ 0xcd, 0xad, 0x8a, 0xcb, 0xab, 0x03, 0x1b, 0x14,
+ 0x97, 0x4c, 0xc4, 0x90, 0x13, 0xc4, 0x98, 0x31 },
+ .result = { 0x21, 0xce, 0xe5, 0x2e, 0xfd, 0xbc, 0x81, 0x2e,
+ 0x1d, 0x02, 0x1a, 0x4a, 0xf1, 0xe1, 0xd8, 0xbc,
+ 0x4d, 0xb3, 0xc4, 0x00, 0xe4, 0xd2, 0xa2, 0xc5,
+ 0x6a, 0x39, 0x26, 0xdb, 0x4d, 0x99, 0xc6, 0x5b },
+ .valid = true
+ },
+ /* wycheproof - checking for overflow */
+ {
+ .private = { 0xc8, 0x17, 0x24, 0x70, 0x40, 0x00, 0xb2, 0x6d,
+ 0x31, 0x70, 0x3c, 0xc9, 0x7e, 0x3a, 0x37, 0x8d,
+ 0x56, 0xfa, 0xd8, 0x21, 0x93, 0x61, 0xc8, 0x8c,
+ 0xca, 0x8b, 0xd7, 0xc5, 0x71, 0x9b, 0x12, 0xb2 },
+ .public = { 0x53, 0x47, 0xc4, 0x91, 0x33, 0x1a, 0x64, 0xb4,
+ 0x3d, 0xdc, 0x68, 0x30, 0x34, 0xe6, 0x77, 0xf5,
+ 0x3d, 0xc3, 0x2b, 0x52, 0xa5, 0x2a, 0x57, 0x7c,
+ 0x15, 0xa8, 0x3b, 0xf2, 0x98, 0xe9, 0x9f, 0x19 },
+ .result = { 0x18, 0xcb, 0x89, 0xe4, 0xe2, 0x0c, 0x0c, 0x2b,
+ 0xd3, 0x24, 0x30, 0x52, 0x45, 0x26, 0x6c, 0x93,
+ 0x27, 0x69, 0x0b, 0xbe, 0x79, 0xac, 0xb8, 0x8f,
+ 0x5b, 0x8f, 0xb3, 0xf7, 0x4e, 0xca, 0x3e, 0x52 },
+ .valid = true
+ },
+ /* wycheproof - private key == -1 (mod order) */
+ {
+ .private = { 0xa0, 0x23, 0xcd, 0xd0, 0x83, 0xef, 0x5b, 0xb8,
+ 0x2f, 0x10, 0xd6, 0x2e, 0x59, 0xe1, 0x5a, 0x68,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x50 },
+ .public = { 0x25, 0x8e, 0x04, 0x52, 0x3b, 0x8d, 0x25, 0x3e,
+ 0xe6, 0x57, 0x19, 0xfc, 0x69, 0x06, 0xc6, 0x57,
+ 0x19, 0x2d, 0x80, 0x71, 0x7e, 0xdc, 0x82, 0x8f,
+ 0xa0, 0xaf, 0x21, 0x68, 0x6e, 0x2f, 0xaa, 0x75 },
+ .result = { 0x25, 0x8e, 0x04, 0x52, 0x3b, 0x8d, 0x25, 0x3e,
+ 0xe6, 0x57, 0x19, 0xfc, 0x69, 0x06, 0xc6, 0x57,
+ 0x19, 0x2d, 0x80, 0x71, 0x7e, 0xdc, 0x82, 0x8f,
+ 0xa0, 0xaf, 0x21, 0x68, 0x6e, 0x2f, 0xaa, 0x75 },
+ .valid = true
+ },
+ /* wycheproof - private key == 1 (mod order) on twist */
+ {
+ .private = { 0x58, 0x08, 0x3d, 0xd2, 0x61, 0xad, 0x91, 0xef,
+ 0xf9, 0x52, 0x32, 0x2e, 0xc8, 0x24, 0xc6, 0x82,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x5f },
+ .public = { 0x2e, 0xae, 0x5e, 0xc3, 0xdd, 0x49, 0x4e, 0x9f,
+ 0x2d, 0x37, 0xd2, 0x58, 0xf8, 0x73, 0xa8, 0xe6,
+ 0xe9, 0xd0, 0xdb, 0xd1, 0xe3, 0x83, 0xef, 0x64,
+ 0xd9, 0x8b, 0xb9, 0x1b, 0x3e, 0x0b, 0xe0, 0x35 },
+ .result = { 0x2e, 0xae, 0x5e, 0xc3, 0xdd, 0x49, 0x4e, 0x9f,
+ 0x2d, 0x37, 0xd2, 0x58, 0xf8, 0x73, 0xa8, 0xe6,
+ 0xe9, 0xd0, 0xdb, 0xd1, 0xe3, 0x83, 0xef, 0x64,
+ 0xd9, 0x8b, 0xb9, 0x1b, 0x3e, 0x0b, 0xe0, 0x35 },
+ .valid = true
+ }
+};
+
+static bool __init curve25519_selftest(void)
+{
+ bool success = true, ret, ret2;
+ size_t i = 0, j;
+ u8 in[CURVE25519_KEY_SIZE];
+ u8 out[CURVE25519_KEY_SIZE], out2[CURVE25519_KEY_SIZE];
+
+ for (i = 0; i < ARRAY_SIZE(curve25519_test_vectors); ++i) {
+ memset(out, 0, CURVE25519_KEY_SIZE);
+ ret = curve25519(out, curve25519_test_vectors[i].private,
+ curve25519_test_vectors[i].public);
+ if (ret != curve25519_test_vectors[i].valid ||
+ memcmp(out, curve25519_test_vectors[i].result,
+ CURVE25519_KEY_SIZE)) {
+ pr_err("curve25519 self-test %zu: FAIL\n", i + 1);
+ success = false;
+ }
+ }
+
+ for (i = 0; i < 5; ++i) {
+ get_random_bytes(in, sizeof(in));
+ ret = curve25519_generate_public(out, in);
+ ret2 = curve25519(out2, in, (u8[CURVE25519_KEY_SIZE]){ 9 });
+ if (ret != ret2 || memcmp(out, out2, CURVE25519_KEY_SIZE)) {
+ pr_err("curve25519 basepoint self-test %zu: FAIL: input - 0x",
+ i + 1);
+ for (j = CURVE25519_KEY_SIZE; j-- > 0;)
+ printk(KERN_CONT "%02x", in[j]);
+ printk(KERN_CONT "\n");
+ success = false;
+ }
+ }
+
+ return success;
+}
--
2.19.0
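For readers following the basepoint self-test above, here is a small
userspace sketch of two details it relies on. A compound literal such as
(u8[CURVE25519_KEY_SIZE]){ 9 } zero-fills every byte after the first, and
the failure path prints the input from the last byte down to the first.
KEY_SIZE and key_to_hex() are hypothetical names for this example only, and
the 32-byte key size is an assumption taken from the test vectors.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

enum { KEY_SIZE = 32 };	/* assumed to match CURVE25519_KEY_SIZE */

/*
 * Hypothetical helper: format a key as hex, last byte first, using the
 * same loop shape as the self-test's failure path. The output buffer
 * must hold 2 * KEY_SIZE characters plus the terminating NUL.
 */
static void key_to_hex(char out[2 * KEY_SIZE + 1], const uint8_t key[KEY_SIZE])
{
	size_t j, pos = 0;

	/* j counts down from KEY_SIZE - 1 to 0 without underflowing. */
	for (j = KEY_SIZE; j-- > 0;)
		pos += (size_t)snprintf(out + pos, 3, "%02x", key[j]);
}
```

Passing (uint8_t[KEY_SIZE]){ 9 } to key_to_hex() should give 62 '0'
characters followed by "09", since only the first byte of the basepoint is
non-zero.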
* [PATCH net-next v7 22/28] zinc: Curve25519 x86_64 implementation
[not found] <20181006025709.4019-1-Jason@zx2c4.com>
` (18 preceding siblings ...)
2018-10-06 2:57 ` [PATCH net-next v7 21/28] zinc: Curve25519 generic C implementations and selftest Jason A. Donenfeld
@ 2018-10-06 2:57 ` Jason A. Donenfeld
2018-10-06 2:57 ` [PATCH net-next v7 23/28] zinc: import Bernstein and Schwabe's Curve25519 ARM implementation Jason A. Donenfeld
` (3 subsequent siblings)
23 siblings, 0 replies; 28+ messages in thread
From: Jason A. Donenfeld @ 2018-10-06 2:57 UTC (permalink / raw)
To: linux-kernel, netdev, davem, gregkh
Cc: Jason A. Donenfeld, Samuel Neves, Armando Faz-Hernández,
Thomas Gleixner, Ingo Molnar, x86, Jean-Philippe Aumasson,
Andy Lutomirski, Andrew Morton, Linus Torvalds, kernel-hardening,
linux-crypto
This implementation is the fastest available x86_64 implementation, and
unlike Sandy2x, it doesn't require use of the floating point registers at
all. Instead it makes use of BMI2 and ADX, available on recent
microarchitectures. The implementation was written by Armando
Faz-Hernández with contributions (upstream) from Samuel Neves and me,
in addition to further changes in the kernel implementation from us.
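As a rough illustration of the selection order used by the glue code in
this patch (not the kernel code itself), the sketch below picks the ADX
path when both BMI2 and ADX are present, the BMI2 path when only BMI2 is
present, and otherwise reports that no accelerated path applies. The
struct cpu_features type, the backend enum and pick_backend() are
hypothetical names made up for this example. Falling back to the generic C
code from earlier in the series is an assumption about what the caller does.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the boot-time CPU feature checks. */
struct cpu_features {
	bool bmi2;
	bool adx;
};

/* Hypothetical labels for the three code paths. */
enum curve25519_backend {
	BACKEND_GENERIC,	/* assumed generic C fallback */
	BACKEND_BMI2,
	BACKEND_ADX
};

static enum curve25519_backend pick_backend(struct cpu_features f)
{
	/*
	 * Mirrors the glue code: the ADX path is only enabled when BMI2
	 * is present as well.
	 */
	if (f.bmi2 && f.adx)
		return BACKEND_ADX;
	if (f.bmi2)
		return BACKEND_BMI2;
	return BACKEND_GENERIC;
}
```

A CPU that reports ADX without BMI2 ends up on the generic path in this
sketch, which matches how curve25519_use_adx is computed in the glue file.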
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Samuel Neves <sneves@dei.uc.pt>
Cc: Armando Faz-Hernández <armfazh@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: x86@kernel.org
Cc: Samuel Neves <sneves@dei.uc.pt>
Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: kernel-hardening@lists.openwall.com
Cc: linux-crypto@vger.kernel.org
---
lib/zinc/curve25519/curve25519-x86_64-glue.c | 48 +
lib/zinc/curve25519/curve25519-x86_64.h | 2333 ++++++++++++++++++
lib/zinc/curve25519/curve25519.c | 4 +
3 files changed, 2385 insertions(+)
create mode 100644 lib/zinc/curve25519/curve25519-x86_64-glue.c
create mode 100644 lib/zinc/curve25519/curve25519-x86_64.h
diff --git a/lib/zinc/curve25519/curve25519-x86_64-glue.c b/lib/zinc/curve25519/curve25519-x86_64-glue.c
new file mode 100644
index 000000000000..a0e35bb41683
--- /dev/null
+++ b/lib/zinc/curve25519/curve25519-x86_64-glue.c
@@ -0,0 +1,48 @@
+// SPDX-License-Identifier: GPL-2.0 OR MIT
+/*
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#include <asm/cpufeature.h>
+#include <asm/processor.h>
+
+#include "curve25519-x86_64.h"
+
+static bool curve25519_use_bmi2 __ro_after_init;
+static bool curve25519_use_adx __ro_after_init;
+static bool *const curve25519_nobs[] __initconst = {
+ &curve25519_use_bmi2, &curve25519_use_adx };
+
+static void __init curve25519_fpu_init(void)
+{
+ curve25519_use_bmi2 = boot_cpu_has(X86_FEATURE_BMI2);
+ curve25519_use_adx = boot_cpu_has(X86_FEATURE_BMI2) &&
+ boot_cpu_has(X86_FEATURE_ADX);
+}
+
+static inline bool curve25519_arch(u8 mypublic[CURVE25519_KEY_SIZE],
+ const u8 secret[CURVE25519_KEY_SIZE],
+ const u8 basepoint[CURVE25519_KEY_SIZE])
+{
+ if (curve25519_use_adx) {
+ curve25519_adx(mypublic, secret, basepoint);
+ return true;
+ } else if (curve25519_use_bmi2) {
+ curve25519_bmi2(mypublic, secret, basepoint);
+ return true;
+ }
+ return false;
+}
+
+static inline bool curve25519_base_arch(u8 pub[CURVE25519_KEY_SIZE],
+ const u8 secret[CURVE25519_KEY_SIZE])
+{
+ if (curve25519_use_adx) {
+ curve25519_adx_base(pub, secret);
+ return true;
+ } else if (curve25519_use_bmi2) {
+ curve25519_bmi2_base(pub, secret);
+ return true;
+ }
+ return false;
+}
diff --git a/lib/zinc/curve25519/curve25519-x86_64.h b/lib/zinc/curve25519/curve25519-x86_64.h
new file mode 100644
index 000000000000..258a30dbe66c
--- /dev/null
+++ b/lib/zinc/curve25519/curve25519-x86_64.h
@@ -0,0 +1,2333 @@
+/* SPDX-License-Identifier: GPL-2.0 OR LGPL-2.1 */
+/*
+ * Copyright (c) 2017 Armando Faz <armfazh@ic.unicamp.br>. All Rights Reserved.
+ * Copyright (C) 2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ * Copyright (C) 2018 Samuel Neves <sneves@dei.uc.pt>. All Rights Reserved.
+ */
+
+enum { NUM_WORDS_ELTFP25519 = 4 };
+typedef __aligned(32) u64 eltfp25519_1w[NUM_WORDS_ELTFP25519];
+typedef __aligned(32) u64 eltfp25519_1w_buffer[2 * NUM_WORDS_ELTFP25519];
+
+#define mul_eltfp25519_1w_adx(c, a, b) do { \
+ mul_256x256_integer_adx(m.buffer, a, b); \
+ red_eltfp25519_1w_adx(c, m.buffer); \
+} while (0)
+
+#define mul_eltfp25519_1w_bmi2(c, a, b) do { \
+ mul_256x256_integer_bmi2(m.buffer, a, b); \
+ red_eltfp25519_1w_bmi2(c, m.buffer); \
+} while (0)
+
+#define sqr_eltfp25519_1w_adx(a) do { \
+ sqr_256x256_integer_adx(m.buffer, a); \
+ red_eltfp25519_1w_adx(a, m.buffer); \
+} while (0)
+
+#define sqr_eltfp25519_1w_bmi2(a) do { \
+ sqr_256x256_integer_bmi2(m.buffer, a); \
+ red_eltfp25519_1w_bmi2(a, m.buffer); \
+} while (0)
+
+#define mul_eltfp25519_2w_adx(c, a, b) do { \
+ mul2_256x256_integer_adx(m.buffer, a, b); \
+ red_eltfp25519_2w_adx(c, m.buffer); \
+} while (0)
+
+#define mul_eltfp25519_2w_bmi2(c, a, b) do { \
+ mul2_256x256_integer_bmi2(m.buffer, a, b); \
+ red_eltfp25519_2w_bmi2(c, m.buffer); \
+} while (0)
+
+#define sqr_eltfp25519_2w_adx(a) do { \
+ sqr2_256x256_integer_adx(m.buffer, a); \
+ red_eltfp25519_2w_adx(a, m.buffer); \
+} while (0)
+
+#define sqr_eltfp25519_2w_bmi2(a) do { \
+ sqr2_256x256_integer_bmi2(m.buffer, a); \
+ red_eltfp25519_2w_bmi2(a, m.buffer); \
+} while (0)
+
+#define sqrn_eltfp25519_1w_adx(a, times) do { \
+ int ____counter = (times); \
+ while (____counter-- > 0) \
+ sqr_eltfp25519_1w_adx(a); \
+} while (0)
+
+#define sqrn_eltfp25519_1w_bmi2(a, times) do { \
+ int ____counter = (times); \
+ while (____counter-- > 0) \
+ sqr_eltfp25519_1w_bmi2(a); \
+} while (0)
+
+#define copy_eltfp25519_1w(C, A) do { \
+ (C)[0] = (A)[0]; \
+ (C)[1] = (A)[1]; \
+ (C)[2] = (A)[2]; \
+ (C)[3] = (A)[3]; \
+} while (0)
+
+#define setzero_eltfp25519_1w(C) do { \
+ (C)[0] = 0; \
+ (C)[1] = 0; \
+ (C)[2] = 0; \
+ (C)[3] = 0; \
+} while (0)
+
+__aligned(32) static const u64 table_ladder_8k[252 * NUM_WORDS_ELTFP25519] = {
+ /* 1 */ 0xfffffffffffffff3UL, 0xffffffffffffffffUL,
+ 0xffffffffffffffffUL, 0x5fffffffffffffffUL,
+ /* 2 */ 0x6b8220f416aafe96UL, 0x82ebeb2b4f566a34UL,
+ 0xd5a9a5b075a5950fUL, 0x5142b2cf4b2488f4UL,
+ /* 3 */ 0x6aaebc750069680cUL, 0x89cf7820a0f99c41UL,
+ 0x2a58d9183b56d0f4UL, 0x4b5aca80e36011a4UL,
+ /* 4 */ 0x329132348c29745dUL, 0xf4a2e616e1642fd7UL,
+ 0x1e45bb03ff67bc34UL, 0x306912d0f42a9b4aUL,
+ /* 5 */ 0xff886507e6af7154UL, 0x04f50e13dfeec82fUL,
+ 0xaa512fe82abab5ceUL, 0x174e251a68d5f222UL,
+ /* 6 */ 0xcf96700d82028898UL, 0x1743e3370a2c02c5UL,
+ 0x379eec98b4e86eaaUL, 0x0c59888a51e0482eUL,
+ /* 7 */ 0xfbcbf1d699b5d189UL, 0xacaef0d58e9fdc84UL,
+ 0xc1c20d06231f7614UL, 0x2938218da274f972UL,
+ /* 8 */ 0xf6af49beff1d7f18UL, 0xcc541c22387ac9c2UL,
+ 0x96fcc9ef4015c56bUL, 0x69c1627c690913a9UL,
+ /* 9 */ 0x7a86fd2f4733db0eUL, 0xfdb8c4f29e087de9UL,
+ 0x095e4b1a8ea2a229UL, 0x1ad7a7c829b37a79UL,
+ /* 10 */ 0x342d89cad17ea0c0UL, 0x67bedda6cced2051UL,
+ 0x19ca31bf2bb42f74UL, 0x3df7b4c84980acbbUL,
+ /* 11 */ 0xa8c6444dc80ad883UL, 0xb91e440366e3ab85UL,
+ 0xc215cda00164f6d8UL, 0x3d867c6ef247e668UL,
+ /* 12 */ 0xc7dd582bcc3e658cUL, 0xfd2c4748ee0e5528UL,
+ 0xa0fd9b95cc9f4f71UL, 0x7529d871b0675ddfUL,
+ /* 13 */ 0xb8f568b42d3cbd78UL, 0x1233011b91f3da82UL,
+ 0x2dce6ccd4a7c3b62UL, 0x75e7fc8e9e498603UL,
+ /* 14 */ 0x2f4f13f1fcd0b6ecUL, 0xf1a8ca1f29ff7a45UL,
+ 0xc249c1a72981e29bUL, 0x6ebe0dbb8c83b56aUL,
+ /* 15 */ 0x7114fa8d170bb222UL, 0x65a2dcd5bf93935fUL,
+ 0xbdc41f68b59c979aUL, 0x2f0eef79a2ce9289UL,
+ /* 16 */ 0x42ecbf0c083c37ceUL, 0x2930bc09ec496322UL,
+ 0xf294b0c19cfeac0dUL, 0x3780aa4bedfabb80UL,
+ /* 17 */ 0x56c17d3e7cead929UL, 0xe7cb4beb2e5722c5UL,
+ 0x0ce931732dbfe15aUL, 0x41b883c7621052f8UL,
+ /* 18 */ 0xdbf75ca0c3d25350UL, 0x2936be086eb1e351UL,
+ 0xc936e03cb4a9b212UL, 0x1d45bf82322225aaUL,
+ /* 19 */ 0xe81ab1036a024cc5UL, 0xe212201c304c9a72UL,
+ 0xc5d73fba6832b1fcUL, 0x20ffdb5a4d839581UL,
+ /* 20 */ 0xa283d367be5d0fadUL, 0x6c2b25ca8b164475UL,
+ 0x9d4935467caaf22eUL, 0x5166408eee85ff49UL,
+ /* 21 */ 0x3c67baa2fab4e361UL, 0xb3e433c67ef35cefUL,
+ 0x5259729241159b1cUL, 0x6a621892d5b0ab33UL,
+ /* 22 */ 0x20b74a387555cdcbUL, 0x532aa10e1208923fUL,
+ 0xeaa17b7762281dd1UL, 0x61ab3443f05c44bfUL,
+ /* 23 */ 0x257a6c422324def8UL, 0x131c6c1017e3cf7fUL,
+ 0x23758739f630a257UL, 0x295a407a01a78580UL,
+ /* 24 */ 0xf8c443246d5da8d9UL, 0x19d775450c52fa5dUL,
+ 0x2afcfc92731bf83dUL, 0x7d10c8e81b2b4700UL,
+ /* 25 */ 0xc8e0271f70baa20bUL, 0x993748867ca63957UL,
+ 0x5412efb3cb7ed4bbUL, 0x3196d36173e62975UL,
+ /* 26 */ 0xde5bcad141c7dffcUL, 0x47cc8cd2b395c848UL,
+ 0xa34cd942e11af3cbUL, 0x0256dbf2d04ecec2UL,
+ /* 27 */ 0x875ab7e94b0e667fUL, 0xcad4dd83c0850d10UL,
+ 0x47f12e8f4e72c79fUL, 0x5f1a87bb8c85b19bUL,
+ /* 28 */ 0x7ae9d0b6437f51b8UL, 0x12c7ce5518879065UL,
+ 0x2ade09fe5cf77aeeUL, 0x23a05a2f7d2c5627UL,
+ /* 29 */ 0x5908e128f17c169aUL, 0xf77498dd8ad0852dUL,
+ 0x74b4c4ceab102f64UL, 0x183abadd10139845UL,
+ /* 30 */ 0xb165ba8daa92aaacUL, 0xd5c5ef9599386705UL,
+ 0xbe2f8f0cf8fc40d1UL, 0x2701e635ee204514UL,
+ /* 31 */ 0x629fa80020156514UL, 0xf223868764a8c1ceUL,
+ 0x5b894fff0b3f060eUL, 0x60d9944cf708a3faUL,
+ /* 32 */ 0xaeea001a1c7a201fUL, 0xebf16a633ee2ce63UL,
+ 0x6f7709594c7a07e1UL, 0x79b958150d0208cbUL,
+ /* 33 */ 0x24b55e5301d410e7UL, 0xe3a34edff3fdc84dUL,
+ 0xd88768e4904032d8UL, 0x131384427b3aaeecUL,
+ /* 34 */ 0x8405e51286234f14UL, 0x14dc4739adb4c529UL,
+ 0xb8a2b5b250634ffdUL, 0x2fe2a94ad8a7ff93UL,
+ /* 35 */ 0xec5c57efe843faddUL, 0x2843ce40f0bb9918UL,
+ 0xa4b561d6cf3d6305UL, 0x743629bde8fb777eUL,
+ /* 36 */ 0x343edd46bbaf738fUL, 0xed981828b101a651UL,
+ 0xa401760b882c797aUL, 0x1fc223e28dc88730UL,
+ /* 37 */ 0x48604e91fc0fba0eUL, 0xb637f78f052c6fa4UL,
+ 0x91ccac3d09e9239cUL, 0x23f7eed4437a687cUL,
+ /* 38 */ 0x5173b1118d9bd800UL, 0x29d641b63189d4a7UL,
+ 0xfdbf177988bbc586UL, 0x2959894fcad81df5UL,
+ /* 39 */ 0xaebc8ef3b4bbc899UL, 0x4148995ab26992b9UL,
+ 0x24e20b0134f92cfbUL, 0x40d158894a05dee8UL,
+ /* 40 */ 0x46b00b1185af76f6UL, 0x26bac77873187a79UL,
+ 0x3dc0bf95ab8fff5fUL, 0x2a608bd8945524d7UL,
+ /* 41 */ 0x26449588bd446302UL, 0x7c4bc21c0388439cUL,
+ 0x8e98a4f383bd11b2UL, 0x26218d7bc9d876b9UL,
+ /* 42 */ 0xe3081542997c178aUL, 0x3c2d29a86fb6606fUL,
+ 0x5c217736fa279374UL, 0x7dde05734afeb1faUL,
+ /* 43 */ 0x3bf10e3906d42babUL, 0xe4f7803e1980649cUL,
+ 0xe6053bf89595bf7aUL, 0x394faf38da245530UL,
+ /* 44 */ 0x7a8efb58896928f4UL, 0xfbc778e9cc6a113cUL,
+ 0x72670ce330af596fUL, 0x48f222a81d3d6cf7UL,
+ /* 45 */ 0xf01fce410d72caa7UL, 0x5a20ecc7213b5595UL,
+ 0x7bc21165c1fa1483UL, 0x07f89ae31da8a741UL,
+ /* 46 */ 0x05d2c2b4c6830ff9UL, 0xd43e330fc6316293UL,
+ 0xa5a5590a96d3a904UL, 0x705edb91a65333b6UL,
+ /* 47 */ 0x048ee15e0bb9a5f7UL, 0x3240cfca9e0aaf5dUL,
+ 0x8f4b71ceedc4a40bUL, 0x621c0da3de544a6dUL,
+ /* 48 */ 0x92872836a08c4091UL, 0xce8375b010c91445UL,
+ 0x8a72eb524f276394UL, 0x2667fcfa7ec83635UL,
+ /* 49 */ 0x7f4c173345e8752aUL, 0x061b47feee7079a5UL,
+ 0x25dd9afa9f86ff34UL, 0x3780cef5425dc89cUL,
+ /* 50 */ 0x1a46035a513bb4e9UL, 0x3e1ef379ac575adaUL,
+ 0xc78c5f1c5fa24b50UL, 0x321a967634fd9f22UL,
+ /* 51 */ 0x946707b8826e27faUL, 0x3dca84d64c506fd0UL,
+ 0xc189218075e91436UL, 0x6d9284169b3b8484UL,
+ /* 52 */ 0x3a67e840383f2ddfUL, 0x33eec9a30c4f9b75UL,
+ 0x3ec7c86fa783ef47UL, 0x26ec449fbac9fbc4UL,
+ /* 53 */ 0x5c0f38cba09b9e7dUL, 0x81168cc762a3478cUL,
+ 0x3e23b0d306fc121cUL, 0x5a238aa0a5efdcddUL,
+ /* 54 */ 0x1ba26121c4ea43ffUL, 0x36f8c77f7c8832b5UL,
+ 0x88fbea0b0adcf99aUL, 0x5ca9938ec25bebf9UL,
+ /* 55 */ 0xd5436a5e51fccda0UL, 0x1dbc4797c2cd893bUL,
+ 0x19346a65d3224a08UL, 0x0f5034e49b9af466UL,
+ /* 56 */ 0xf23c3967a1e0b96eUL, 0xe58b08fa867a4d88UL,
+ 0xfb2fabc6a7341679UL, 0x2a75381eb6026946UL,
+ /* 57 */ 0xc80a3be4c19420acUL, 0x66b1f6c681f2b6dcUL,
+ 0x7cf7036761e93388UL, 0x25abbbd8a660a4c4UL,
+ /* 58 */ 0x91ea12ba14fd5198UL, 0x684950fc4a3cffa9UL,
+ 0xf826842130f5ad28UL, 0x3ea988f75301a441UL,
+ /* 59 */ 0xc978109a695f8c6fUL, 0x1746eb4a0530c3f3UL,
+ 0x444d6d77b4459995UL, 0x75952b8c054e5cc7UL,
+ /* 60 */ 0xa3703f7915f4d6aaUL, 0x66c346202f2647d8UL,
+ 0xd01469df811d644bUL, 0x77fea47d81a5d71fUL,
+ /* 61 */ 0xc5e9529ef57ca381UL, 0x6eeeb4b9ce2f881aUL,
+ 0xb6e91a28e8009bd6UL, 0x4b80be3e9afc3fecUL,
+ /* 62 */ 0x7e3773c526aed2c5UL, 0x1b4afcb453c9a49dUL,
+ 0xa920bdd7baffb24dUL, 0x7c54699f122d400eUL,
+ /* 63 */ 0xef46c8e14fa94bc8UL, 0xe0b074ce2952ed5eUL,
+ 0xbea450e1dbd885d5UL, 0x61b68649320f712cUL,
+ /* 64 */ 0x8a485f7309ccbdd1UL, 0xbd06320d7d4d1a2dUL,
+ 0x25232973322dbef4UL, 0x445dc4758c17f770UL,
+ /* 65 */ 0xdb0434177cc8933cUL, 0xed6fe82175ea059fUL,
+ 0x1efebefdc053db34UL, 0x4adbe867c65daf99UL,
+ /* 66 */ 0x3acd71a2a90609dfUL, 0xe5e991856dd04050UL,
+ 0x1ec69b688157c23cUL, 0x697427f6885cfe4dUL,
+ /* 67 */ 0xd7be7b9b65e1a851UL, 0xa03d28d522c536ddUL,
+ 0x28399d658fd2b645UL, 0x49e5b7e17c2641e1UL,
+ /* 68 */ 0x6f8c3a98700457a4UL, 0x5078f0a25ebb6778UL,
+ 0xd13c3ccbc382960fUL, 0x2e003258a7df84b1UL,
+ /* 69 */ 0x8ad1f39be6296a1cUL, 0xc1eeaa652a5fbfb2UL,
+ 0x33ee0673fd26f3cbUL, 0x59256173a69d2cccUL,
+ /* 70 */ 0x41ea07aa4e18fc41UL, 0xd9fc19527c87a51eUL,
+ 0xbdaacb805831ca6fUL, 0x445b652dc916694fUL,
+ /* 71 */ 0xce92a3a7f2172315UL, 0x1edc282de11b9964UL,
+ 0xa1823aafe04c314aUL, 0x790a2d94437cf586UL,
+ /* 72 */ 0x71c447fb93f6e009UL, 0x8922a56722845276UL,
+ 0xbf70903b204f5169UL, 0x2f7a89891ba319feUL,
+ /* 73 */ 0x02a08eb577e2140cUL, 0xed9a4ed4427bdcf4UL,
+ 0x5253ec44e4323cd1UL, 0x3e88363c14e9355bUL,
+ /* 74 */ 0xaa66c14277110b8cUL, 0x1ae0391610a23390UL,
+ 0x2030bd12c93fc2a2UL, 0x3ee141579555c7abUL,
+ /* 75 */ 0x9214de3a6d6e7d41UL, 0x3ccdd88607f17efeUL,
+ 0x674f1288f8e11217UL, 0x5682250f329f93d0UL,
+ /* 76 */ 0x6cf00b136d2e396eUL, 0x6e4cf86f1014debfUL,
+ 0x5930b1b5bfcc4e83UL, 0x047069b48aba16b6UL,
+ /* 77 */ 0x0d4ce4ab69b20793UL, 0xb24db91a97d0fb9eUL,
+ 0xcdfa50f54e00d01dUL, 0x221b1085368bddb5UL,
+ /* 78 */ 0xe7e59468b1e3d8d2UL, 0x53c56563bd122f93UL,
+ 0xeee8a903e0663f09UL, 0x61efa662cbbe3d42UL,
+ /* 79 */ 0x2cf8ddddde6eab2aUL, 0x9bf80ad51435f231UL,
+ 0x5deadacec9f04973UL, 0x29275b5d41d29b27UL,
+ /* 80 */ 0xcfde0f0895ebf14fUL, 0xb9aab96b054905a7UL,
+ 0xcae80dd9a1c420fdUL, 0x0a63bf2f1673bbc7UL,
+ /* 81 */ 0x092f6e11958fbc8cUL, 0x672a81e804822fadUL,
+ 0xcac8351560d52517UL, 0x6f3f7722c8f192f8UL,
+ /* 82 */ 0xf8ba90ccc2e894b7UL, 0x2c7557a438ff9f0dUL,
+ 0x894d1d855ae52359UL, 0x68e122157b743d69UL,
+ /* 83 */ 0xd87e5570cfb919f3UL, 0x3f2cdecd95798db9UL,
+ 0x2121154710c0a2ceUL, 0x3c66a115246dc5b2UL,
+ /* 84 */ 0xcbedc562294ecb72UL, 0xba7143c36a280b16UL,
+ 0x9610c2efd4078b67UL, 0x6144735d946a4b1eUL,
+ /* 85 */ 0x536f111ed75b3350UL, 0x0211db8c2041d81bUL,
+ 0xf93cb1000e10413cUL, 0x149dfd3c039e8876UL,
+ /* 86 */ 0xd479dde46b63155bUL, 0xb66e15e93c837976UL,
+ 0xdafde43b1f13e038UL, 0x5fafda1a2e4b0b35UL,
+ /* 87 */ 0x3600bbdf17197581UL, 0x3972050bbe3cd2c2UL,
+ 0x5938906dbdd5be86UL, 0x34fce5e43f9b860fUL,
+ /* 88 */ 0x75a8a4cd42d14d02UL, 0x828dabc53441df65UL,
+ 0x33dcabedd2e131d3UL, 0x3ebad76fb814d25fUL,
+ /* 89 */ 0xd4906f566f70e10fUL, 0x5d12f7aa51690f5aUL,
+ 0x45adb16e76cefcf2UL, 0x01f768aead232999UL,
+ /* 90 */ 0x2b6cc77b6248febdUL, 0x3cd30628ec3aaffdUL,
+ 0xce1c0b80d4ef486aUL, 0x4c3bff2ea6f66c23UL,
+ /* 91 */ 0x3f2ec4094aeaeb5fUL, 0x61b19b286e372ca7UL,
+ 0x5eefa966de2a701dUL, 0x23b20565de55e3efUL,
+ /* 92 */ 0xe301ca5279d58557UL, 0x07b2d4ce27c2874fUL,
+ 0xa532cd8a9dcf1d67UL, 0x2a52fee23f2bff56UL,
+ /* 93 */ 0x8624efb37cd8663dUL, 0xbbc7ac20ffbd7594UL,
+ 0x57b85e9c82d37445UL, 0x7b3052cb86a6ec66UL,
+ /* 94 */ 0x3482f0ad2525e91eUL, 0x2cb68043d28edca0UL,
+ 0xaf4f6d052e1b003aUL, 0x185f8c2529781b0aUL,
+ /* 95 */ 0xaa41de5bd80ce0d6UL, 0x9407b2416853e9d6UL,
+ 0x563ec36e357f4c3aUL, 0x4cc4b8dd0e297bceUL,
+ /* 96 */ 0xa2fc1a52ffb8730eUL, 0x1811f16e67058e37UL,
+ 0x10f9a366cddf4ee1UL, 0x72f4a0c4a0b9f099UL,
+ /* 97 */ 0x8c16c06f663f4ea7UL, 0x693b3af74e970fbaUL,
+ 0x2102e7f1d69ec345UL, 0x0ba53cbc968a8089UL,
+ /* 98 */ 0xca3d9dc7fea15537UL, 0x4c6824bb51536493UL,
+ 0xb9886314844006b1UL, 0x40d2a72ab454cc60UL,
+ /* 99 */ 0x5936a1b712570975UL, 0x91b9d648debda657UL,
+ 0x3344094bb64330eaUL, 0x006ba10d12ee51d0UL,
+ /* 100 */ 0x19228468f5de5d58UL, 0x0eb12f4c38cc05b0UL,
+ 0xa1039f9dd5601990UL, 0x4502d4ce4fff0e0bUL,
+ /* 101 */ 0xeb2054106837c189UL, 0xd0f6544c6dd3b93cUL,
+ 0x40727064c416d74fUL, 0x6e15c6114b502ef0UL,
+ /* 102 */ 0x4df2a398cfb1a76bUL, 0x11256c7419f2f6b1UL,
+ 0x4a497962066e6043UL, 0x705b3aab41355b44UL,
+ /* 103 */ 0x365ef536d797b1d8UL, 0x00076bd622ddf0dbUL,
+ 0x3bbf33b0e0575a88UL, 0x3777aa05c8e4ca4dUL,
+ /* 104 */ 0x392745c85578db5fUL, 0x6fda4149dbae5ae2UL,
+ 0xb1f0b00b8adc9867UL, 0x09963437d36f1da3UL,
+ /* 105 */ 0x7e824e90a5dc3853UL, 0xccb5f6641f135cbdUL,
+ 0x6736d86c87ce8fccUL, 0x625f3ce26604249fUL,
+ /* 106 */ 0xaf8ac8059502f63fUL, 0x0c05e70a2e351469UL,
+ 0x35292e9c764b6305UL, 0x1a394360c7e23ac3UL,
+ /* 107 */ 0xd5c6d53251183264UL, 0x62065abd43c2b74fUL,
+ 0xb5fbf5d03b973f9bUL, 0x13a3da3661206e5eUL,
+ /* 108 */ 0xc6bd5837725d94e5UL, 0x18e30912205016c5UL,
+ 0x2088ce1570033c68UL, 0x7fba1f495c837987UL,
+ /* 109 */ 0x5a8c7423f2f9079dUL, 0x1735157b34023fc5UL,
+ 0xe4f9b49ad2fab351UL, 0x6691ff72c878e33cUL,
+ /* 110 */ 0x122c2adedc5eff3eUL, 0xf8dd4bf1d8956cf4UL,
+ 0xeb86205d9e9e5bdaUL, 0x049b92b9d975c743UL,
+ /* 111 */ 0xa5379730b0f6c05aUL, 0x72a0ffacc6f3a553UL,
+ 0xb0032c34b20dcd6dUL, 0x470e9dbc88d5164aUL,
+ /* 112 */ 0xb19cf10ca237c047UL, 0xb65466711f6c81a2UL,
+ 0xb3321bd16dd80b43UL, 0x48c14f600c5fbe8eUL,
+ /* 113 */ 0x66451c264aa6c803UL, 0xb66e3904a4fa7da6UL,
+ 0xd45f19b0b3128395UL, 0x31602627c3c9bc10UL,
+ /* 114 */ 0x3120dc4832e4e10dUL, 0xeb20c46756c717f7UL,
+ 0x00f52e3f67280294UL, 0x566d4fc14730c509UL,
+ /* 115 */ 0x7e3a5d40fd837206UL, 0xc1e926dc7159547aUL,
+ 0x216730fba68d6095UL, 0x22e8c3843f69cea7UL,
+ /* 116 */ 0x33d074e8930e4b2bUL, 0xb6e4350e84d15816UL,
+ 0x5534c26ad6ba2365UL, 0x7773c12f89f1f3f3UL,
+ /* 117 */ 0x8cba404da57962aaUL, 0x5b9897a81999ce56UL,
+ 0x508e862f121692fcUL, 0x3a81907fa093c291UL,
+ /* 118 */ 0x0dded0ff4725a510UL, 0x10d8cc10673fc503UL,
+ 0x5b9d151c9f1f4e89UL, 0x32a5c1d5cb09a44cUL,
+ /* 119 */ 0x1e0aa442b90541fbUL, 0x5f85eb7cc1b485dbUL,
+ 0xbee595ce8a9df2e5UL, 0x25e496c722422236UL,
+ /* 120 */ 0x5edf3c46cd0fe5b9UL, 0x34e75a7ed2a43388UL,
+ 0xe488de11d761e352UL, 0x0e878a01a085545cUL,
+ /* 121 */ 0xba493c77e021bb04UL, 0x2b4d1843c7df899aUL,
+ 0x9ea37a487ae80d67UL, 0x67a9958011e41794UL,
+ /* 122 */ 0x4b58051a6697b065UL, 0x47e33f7d8d6ba6d4UL,
+ 0xbb4da8d483ca46c1UL, 0x68becaa181c2db0dUL,
+ /* 123 */ 0x8d8980e90b989aa5UL, 0xf95eb14a2c93c99bUL,
+ 0x51c6c7c4796e73a2UL, 0x6e228363b5efb569UL,
+ /* 124 */ 0xc6bbc0b02dd624c8UL, 0x777eb47dec8170eeUL,
+ 0x3cde15a004cfafa9UL, 0x1dc6bc087160bf9bUL,
+ /* 125 */ 0x2e07e043eec34002UL, 0x18e9fc677a68dc7fUL,
+ 0xd8da03188bd15b9aUL, 0x48fbc3bb00568253UL,
+ /* 126 */ 0x57547d4cfb654ce1UL, 0xd3565b82a058e2adUL,
+ 0xf63eaf0bbf154478UL, 0x47531ef114dfbb18UL,
+ /* 127 */ 0xe1ec630a4278c587UL, 0x5507d546ca8e83f3UL,
+ 0x85e135c63adc0c2bUL, 0x0aa7efa85682844eUL,
+ /* 128 */ 0x72691ba8b3e1f615UL, 0x32b4e9701fbe3ffaUL,
+ 0x97b6d92e39bb7868UL, 0x2cfe53dea02e39e8UL,
+ /* 129 */ 0x687392cd85cd52b0UL, 0x27ff66c910e29831UL,
+ 0x97134556a9832d06UL, 0x269bb0360a84f8a0UL,
+ /* 130 */ 0x706e55457643f85cUL, 0x3734a48c9b597d1bUL,
+ 0x7aee91e8c6efa472UL, 0x5cd6abc198a9d9e0UL,
+ /* 131 */ 0x0e04de06cb3ce41aUL, 0xd8c6eb893402e138UL,
+ 0x904659bb686e3772UL, 0x7215c371746ba8c8UL,
+ /* 132 */ 0xfd12a97eeae4a2d9UL, 0x9514b7516394f2c5UL,
+ 0x266fd5809208f294UL, 0x5c847085619a26b9UL,
+ /* 133 */ 0x52985410fed694eaUL, 0x3c905b934a2ed254UL,
+ 0x10bb47692d3be467UL, 0x063b3d2d69e5e9e1UL,
+ /* 134 */ 0x472726eedda57debUL, 0xefb6c4ae10f41891UL,
+ 0x2b1641917b307614UL, 0x117c554fc4f45b7cUL,
+ /* 135 */ 0xc07cf3118f9d8812UL, 0x01dbd82050017939UL,
+ 0xd7e803f4171b2827UL, 0x1015e87487d225eaUL,
+ /* 136 */ 0xc58de3fed23acc4dUL, 0x50db91c294a7be2dUL,
+ 0x0b94d43d1c9cf457UL, 0x6b1640fa6e37524aUL,
+ /* 137 */ 0x692f346c5fda0d09UL, 0x200b1c59fa4d3151UL,
+ 0xb8c46f760777a296UL, 0x4b38395f3ffdfbcfUL,
+ /* 138 */ 0x18d25e00be54d671UL, 0x60d50582bec8aba6UL,
+ 0x87ad8f263b78b982UL, 0x50fdf64e9cda0432UL,
+ /* 139 */ 0x90f567aac578dcf0UL, 0xef1e9b0ef2a3133bUL,
+ 0x0eebba9242d9de71UL, 0x15473c9bf03101c7UL,
+ /* 140 */ 0x7c77e8ae56b78095UL, 0xb678e7666e6f078eUL,
+ 0x2da0b9615348ba1fUL, 0x7cf931c1ff733f0bUL,
+ /* 141 */ 0x26b357f50a0a366cUL, 0xe9708cf42b87d732UL,
+ 0xc13aeea5f91cb2c0UL, 0x35d90c991143bb4cUL,
+ /* 142 */ 0x47c1c404a9a0d9dcUL, 0x659e58451972d251UL,
+ 0x3875a8c473b38c31UL, 0x1fbd9ed379561f24UL,
+ /* 143 */ 0x11fabc6fd41ec28dUL, 0x7ef8dfe3cd2a2dcaUL,
+ 0x72e73b5d8c404595UL, 0x6135fa4954b72f27UL,
+ /* 144 */ 0xccfc32a2de24b69cUL, 0x3f55698c1f095d88UL,
+ 0xbe3350ed5ac3f929UL, 0x5e9bf806ca477eebUL,
+ /* 145 */ 0xe9ce8fb63c309f68UL, 0x5376f63565e1f9f4UL,
+ 0xd1afcfb35a6393f1UL, 0x6632a1ede5623506UL,
+ /* 146 */ 0x0b7d6c390c2ded4cUL, 0x56cb3281df04cb1fUL,
+ 0x66305a1249ecc3c7UL, 0x5d588b60a38ca72aUL,
+ /* 147 */ 0xa6ecbf78e8e5f42dUL, 0x86eeb44b3c8a3eecUL,
+ 0xec219c48fbd21604UL, 0x1aaf1af517c36731UL,
+ /* 148 */ 0xc306a2836769bde7UL, 0x208280622b1e2adbUL,
+ 0x8027f51ffbff94a6UL, 0x76cfa1ce1124f26bUL,
+ /* 149 */ 0x18eb00562422abb6UL, 0xf377c4d58f8c29c3UL,
+ 0x4dbbc207f531561aUL, 0x0253b7f082128a27UL,
+ /* 150 */ 0x3d1f091cb62c17e0UL, 0x4860e1abd64628a9UL,
+ 0x52d17436309d4253UL, 0x356f97e13efae576UL,
+ /* 151 */ 0xd351e11aa150535bUL, 0x3e6b45bb1dd878ccUL,
+ 0x0c776128bed92c98UL, 0x1d34ae93032885b8UL,
+ /* 152 */ 0x4ba0488ca85ba4c3UL, 0x985348c33c9ce6ceUL,
+ 0x66124c6f97bda770UL, 0x0f81a0290654124aUL,
+ /* 153 */ 0x9ed09ca6569b86fdUL, 0x811009fd18af9a2dUL,
+ 0xff08d03f93d8c20aUL, 0x52a148199faef26bUL,
+ /* 154 */ 0x3e03f9dc2d8d1b73UL, 0x4205801873961a70UL,
+ 0xc0d987f041a35970UL, 0x07aa1f15a1c0d549UL,
+ /* 155 */ 0xdfd46ce08cd27224UL, 0x6d0a024f934e4239UL,
+ 0x808a7a6399897b59UL, 0x0a4556e9e13d95a2UL,
+ /* 156 */ 0xd21a991fe9c13045UL, 0x9b0e8548fe7751b8UL,
+ 0x5da643cb4bf30035UL, 0x77db28d63940f721UL,
+ /* 157 */ 0xfc5eeb614adc9011UL, 0x5229419ae8c411ebUL,
+ 0x9ec3e7787d1dcf74UL, 0x340d053e216e4cb5UL,
+ /* 158 */ 0xcac7af39b48df2b4UL, 0xc0faec2871a10a94UL,
+ 0x140a69245ca575edUL, 0x0cf1c37134273a4cUL,
+ /* 159 */ 0xc8ee306ac224b8a5UL, 0x57eaee7ccb4930b0UL,
+ 0xa1e806bdaacbe74fUL, 0x7d9a62742eeb657dUL,
+ /* 160 */ 0x9eb6b6ef546c4830UL, 0x885cca1fddb36e2eUL,
+ 0xe6b9f383ef0d7105UL, 0x58654fef9d2e0412UL,
+ /* 161 */ 0xa905c4ffbe0e8e26UL, 0x942de5df9b31816eUL,
+ 0x497d723f802e88e1UL, 0x30684dea602f408dUL,
+ /* 162 */ 0x21e5a278a3e6cb34UL, 0xaefb6e6f5b151dc4UL,
+ 0xb30b8e049d77ca15UL, 0x28c3c9cf53b98981UL,
+ /* 163 */ 0x287fb721556cdd2aUL, 0x0d317ca897022274UL,
+ 0x7468c7423a543258UL, 0x4a7f11464eb5642fUL,
+ /* 164 */ 0xa237a4774d193aa6UL, 0xd865986ea92129a1UL,
+ 0x24c515ecf87c1a88UL, 0x604003575f39f5ebUL,
+ /* 165 */ 0x47b9f189570a9b27UL, 0x2b98cede465e4b78UL,
+ 0x026df551dbb85c20UL, 0x74fcd91047e21901UL,
+ /* 166 */ 0x13e2a90a23c1bfa3UL, 0x0cb0074e478519f6UL,
+ 0x5ff1cbbe3af6cf44UL, 0x67fe5438be812dbeUL,
+ /* 167 */ 0xd13cf64fa40f05b0UL, 0x054dfb2f32283787UL,
+ 0x4173915b7f0d2aeaUL, 0x482f144f1f610d4eUL,
+ /* 168 */ 0xf6210201b47f8234UL, 0x5d0ae1929e70b990UL,
+ 0xdcd7f455b049567cUL, 0x7e93d0f1f0916f01UL,
+ /* 169 */ 0xdd79cbf18a7db4faUL, 0xbe8391bf6f74c62fUL,
+ 0x027145d14b8291bdUL, 0x585a73ea2cbf1705UL,
+ /* 170 */ 0x485ca03e928a0db2UL, 0x10fc01a5742857e7UL,
+ 0x2f482edbd6d551a7UL, 0x0f0433b5048fdb8aUL,
+ /* 171 */ 0x60da2e8dd7dc6247UL, 0x88b4c9d38cd4819aUL,
+ 0x13033ac001f66697UL, 0x273b24fe3b367d75UL,
+ /* 172 */ 0xc6e8f66a31b3b9d4UL, 0x281514a494df49d5UL,
+ 0xd1726fdfc8b23da7UL, 0x4b3ae7d103dee548UL,
+ /* 173 */ 0xc6256e19ce4b9d7eUL, 0xff5c5cf186e3c61cUL,
+ 0xacc63ca34b8ec145UL, 0x74621888fee66574UL,
+ /* 174 */ 0x956f409645290a1eUL, 0xef0bf8e3263a962eUL,
+ 0xed6a50eb5ec2647bUL, 0x0694283a9dca7502UL,
+ /* 175 */ 0x769b963643a2dcd1UL, 0x42b7c8ea09fc5353UL,
+ 0x4f002aee13397eabUL, 0x63005e2c19b7d63aUL,
+ /* 176 */ 0xca6736da63023beaUL, 0x966c7f6db12a99b7UL,
+ 0xace09390c537c5e1UL, 0x0b696063a1aa89eeUL,
+ /* 177 */ 0xebb03e97288c56e5UL, 0x432a9f9f938c8be8UL,
+ 0xa6a5a93d5b717f71UL, 0x1a5fb4c3e18f9d97UL,
+ /* 178 */ 0x1c94e7ad1c60cdceUL, 0xee202a43fc02c4a0UL,
+ 0x8dafe4d867c46a20UL, 0x0a10263c8ac27b58UL,
+ /* 179 */ 0xd0dea9dfe4432a4aUL, 0x856af87bbe9277c5UL,
+ 0xce8472acc212c71aUL, 0x6f151b6d9bbb1e91UL,
+ /* 180 */ 0x26776c527ceed56aUL, 0x7d211cb7fbf8faecUL,
+ 0x37ae66a6fd4609ccUL, 0x1f81b702d2770c42UL,
+ /* 181 */ 0x2fb0b057eac58392UL, 0xe1dd89fe29744e9dUL,
+ 0xc964f8eb17beb4f8UL, 0x29571073c9a2d41eUL,
+ /* 182 */ 0xa948a18981c0e254UL, 0x2df6369b65b22830UL,
+ 0xa33eb2d75fcfd3c6UL, 0x078cd6ec4199a01fUL,
+ /* 183 */ 0x4a584a41ad900d2fUL, 0x32142b78e2c74c52UL,
+ 0x68c4e8338431c978UL, 0x7f69ea9008689fc2UL,
+ /* 184 */ 0x52f2c81e46a38265UL, 0xfd78072d04a832fdUL,
+ 0x8cd7d5fa25359e94UL, 0x4de71b7454cc29d2UL,
+ /* 185 */ 0x42eb60ad1eda6ac9UL, 0x0aad37dfdbc09c3aUL,
+ 0x81004b71e33cc191UL, 0x44e6be345122803cUL,
+ /* 186 */ 0x03fe8388ba1920dbUL, 0xf5d57c32150db008UL,
+ 0x49c8c4281af60c29UL, 0x21edb518de701aeeUL,
+ /* 187 */ 0x7fb63e418f06dc99UL, 0xa4460d99c166d7b8UL,
+ 0x24dd5248ce520a83UL, 0x5ec3ad712b928358UL,
+ /* 188 */ 0x15022a5fbd17930fUL, 0xa4f64a77d82570e3UL,
+ 0x12bc8d6915783712UL, 0x498194c0fc620abbUL,
+ /* 189 */ 0x38a2d9d255686c82UL, 0x785c6bd9193e21f0UL,
+ 0xe4d5c81ab24a5484UL, 0x56307860b2e20989UL,
+ /* 190 */ 0x429d55f78b4d74c4UL, 0x22f1834643350131UL,
+ 0x1e60c24598c71fffUL, 0x59f2f014979983efUL,
+ /* 191 */ 0x46a47d56eb494a44UL, 0x3e22a854d636a18eUL,
+ 0xb346e15274491c3bUL, 0x2ceafd4e5390cde7UL,
+ /* 192 */ 0xba8a8538be0d6675UL, 0x4b9074bb50818e23UL,
+ 0xcbdab89085d304c3UL, 0x61a24fe0e56192c4UL,
+ /* 193 */ 0xcb7615e6db525bcbUL, 0xdd7d8c35a567e4caUL,
+ 0xe6b4153acafcdd69UL, 0x2d668e097f3c9766UL,
+ /* 194 */ 0xa57e7e265ce55ef0UL, 0x5d9f4e527cd4b967UL,
+ 0xfbc83606492fd1e5UL, 0x090d52beb7c3f7aeUL,
+ /* 195 */ 0x09b9515a1e7b4d7cUL, 0x1f266a2599da44c0UL,
+ 0xa1c49548e2c55504UL, 0x7ef04287126f15ccUL,
+ /* 196 */ 0xfed1659dbd30ef15UL, 0x8b4ab9eec4e0277bUL,
+ 0x884d6236a5df3291UL, 0x1fd96ea6bf5cf788UL,
+ /* 197 */ 0x42a161981f190d9aUL, 0x61d849507e6052c1UL,
+ 0x9fe113bf285a2cd5UL, 0x7c22d676dbad85d8UL,
+ /* 198 */ 0x82e770ed2bfbd27dUL, 0x4c05b2ece996f5a5UL,
+ 0xcd40a9c2b0900150UL, 0x5895319213d9bf64UL,
+ /* 199 */ 0xe7cc5d703fea2e08UL, 0xb50c491258e2188cUL,
+ 0xcce30baa48205bf0UL, 0x537c659ccfa32d62UL,
+ /* 200 */ 0x37b6623a98cfc088UL, 0xfe9bed1fa4d6aca4UL,
+ 0x04d29b8e56a8d1b0UL, 0x725f71c40b519575UL,
+ /* 201 */ 0x28c7f89cd0339ce6UL, 0x8367b14469ddc18bUL,
+ 0x883ada83a6a1652cUL, 0x585f1974034d6c17UL,
+ /* 202 */ 0x89cfb266f1b19188UL, 0xe63b4863e7c35217UL,
+ 0xd88c9da6b4c0526aUL, 0x3e035c9df0954635UL,
+ /* 203 */ 0xdd9d5412fb45de9dUL, 0xdd684532e4cff40dUL,
+ 0x4b5c999b151d671cUL, 0x2d8c2cc811e7f690UL,
+ /* 204 */ 0x7f54be1d90055d40UL, 0xa464c5df464aaf40UL,
+ 0x33979624f0e917beUL, 0x2c018dc527356b30UL,
+ /* 205 */ 0xa5415024e330b3d4UL, 0x73ff3d96691652d3UL,
+ 0x94ec42c4ef9b59f1UL, 0x0747201618d08e5aUL,
+ /* 206 */ 0x4d6ca48aca411c53UL, 0x66415f2fcfa66119UL,
+ 0x9c4dd40051e227ffUL, 0x59810bc09a02f7ebUL,
+ /* 207 */ 0x2a7eb171b3dc101dUL, 0x441c5ab99ffef68eUL,
+ 0x32025c9b93b359eaUL, 0x5e8ce0a71e9d112fUL,
+ /* 208 */ 0xbfcccb92429503fdUL, 0xd271ba752f095d55UL,
+ 0x345ead5e972d091eUL, 0x18c8df11a83103baUL,
+ /* 209 */ 0x90cd949a9aed0f4cUL, 0xc5d1f4cb6660e37eUL,
+ 0xb8cac52d56c52e0bUL, 0x6e42e400c5808e0dUL,
+ /* 210 */ 0xa3b46966eeaefd23UL, 0x0c4f1f0be39ecdcaUL,
+ 0x189dc8c9d683a51dUL, 0x51f27f054c09351bUL,
+ /* 211 */ 0x4c487ccd2a320682UL, 0x587ea95bb3df1c96UL,
+ 0xc8ccf79e555cb8e8UL, 0x547dc829a206d73dUL,
+ /* 212 */ 0xb822a6cd80c39b06UL, 0xe96d54732000d4c6UL,
+ 0x28535b6f91463b4dUL, 0x228f4660e2486e1dUL,
+ /* 213 */ 0x98799538de8d3abfUL, 0x8cd8330045ebca6eUL,
+ 0x79952a008221e738UL, 0x4322e1a7535cd2bbUL,
+ /* 214 */ 0xb114c11819d1801cUL, 0x2016e4d84f3f5ec7UL,
+ 0xdd0e2df409260f4cUL, 0x5ec362c0ae5f7266UL,
+ /* 215 */ 0xc0462b18b8b2b4eeUL, 0x7cc8d950274d1afbUL,
+ 0xf25f7105436b02d2UL, 0x43bbf8dcbff9ccd3UL,
+ /* 216 */ 0xb6ad1767a039e9dfUL, 0xb0714da8f69d3583UL,
+ 0x5e55fa18b42931f5UL, 0x4ed5558f33c60961UL,
+ /* 217 */ 0x1fe37901c647a5ddUL, 0x593ddf1f8081d357UL,
+ 0x0249a4fd813fd7a6UL, 0x69acca274e9caf61UL,
+ /* 218 */ 0x047ba3ea330721c9UL, 0x83423fc20e7e1ea0UL,
+ 0x1df4c0af01314a60UL, 0x09a62dab89289527UL,
+ /* 219 */ 0xa5b325a49cc6cb00UL, 0xe94b5dc654b56cb6UL,
+ 0x3be28779adc994a0UL, 0x4296e8f8ba3a4aadUL,
+ /* 220 */ 0x328689761e451eabUL, 0x2e4d598bff59594aUL,
+ 0x49b96853d7a7084aUL, 0x4980a319601420a8UL,
+ /* 221 */ 0x9565b9e12f552c42UL, 0x8a5318db7100fe96UL,
+ 0x05c90b4d43add0d7UL, 0x538b4cd66a5d4edaUL,
+ /* 222 */ 0xf4e94fc3e89f039fUL, 0x592c9af26f618045UL,
+ 0x08a36eb5fd4b9550UL, 0x25fffaf6c2ed1419UL,
+ /* 223 */ 0x34434459cc79d354UL, 0xeeecbfb4b1d5476bUL,
+ 0xddeb34a061615d99UL, 0x5129cecceb64b773UL,
+ /* 224 */ 0xee43215894993520UL, 0x772f9c7cf14c0b3bUL,
+ 0xd2e2fce306bedad5UL, 0x715f42b546f06a97UL,
+ /* 225 */ 0x434ecdceda5b5f1aUL, 0x0da17115a49741a9UL,
+ 0x680bd77c73edad2eUL, 0x487c02354edd9041UL,
+ /* 226 */ 0xb8efeff3a70ed9c4UL, 0x56a32aa3e857e302UL,
+ 0xdf3a68bd48a2a5a0UL, 0x07f650b73176c444UL,
+ /* 227 */ 0xe38b9b1626e0ccb1UL, 0x79e053c18b09fb36UL,
+ 0x56d90319c9f94964UL, 0x1ca941e7ac9ff5c4UL,
+ /* 228 */ 0x49c4df29162fa0bbUL, 0x8488cf3282b33305UL,
+ 0x95dfda14cabb437dUL, 0x3391f78264d5ad86UL,
+ /* 229 */ 0x729ae06ae2b5095dUL, 0xd58a58d73259a946UL,
+ 0xe9834262d13921edUL, 0x27fedafaa54bb592UL,
+ /* 230 */ 0xa99dc5b829ad48bbUL, 0x5f025742499ee260UL,
+ 0x802c8ecd5d7513fdUL, 0x78ceb3ef3f6dd938UL,
+ /* 231 */ 0xc342f44f8a135d94UL, 0x7b9edb44828cdda3UL,
+ 0x9436d11a0537cfe7UL, 0x5064b164ec1ab4c8UL,
+ /* 232 */ 0x7020eccfd37eb2fcUL, 0x1f31ea3ed90d25fcUL,
+ 0x1b930d7bdfa1bb34UL, 0x5344467a48113044UL,
+ /* 233 */ 0x70073170f25e6dfbUL, 0xe385dc1a50114cc8UL,
+ 0x2348698ac8fc4f00UL, 0x2a77a55284dd40d8UL,
+ /* 234 */ 0xfe06afe0c98c6ce4UL, 0xc235df96dddfd6e4UL,
+ 0x1428d01e33bf1ed3UL, 0x785768ec9300bdafUL,
+ /* 235 */ 0x9702e57a91deb63bUL, 0x61bdb8bfe5ce8b80UL,
+ 0x645b426f3d1d58acUL, 0x4804a82227a557bcUL,
+ /* 236 */ 0x8e57048ab44d2601UL, 0x68d6501a4b3a6935UL,
+ 0xc39c9ec3f9e1c293UL, 0x4172f257d4de63e2UL,
+ /* 237 */ 0xd368b450330c6401UL, 0x040d3017418f2391UL,
+ 0x2c34bb6090b7d90dUL, 0x16f649228fdfd51fUL,
+ /* 238 */ 0xbea6818e2b928ef5UL, 0xe28ccf91cdc11e72UL,
+ 0x594aaa68e77a36cdUL, 0x313034806c7ffd0fUL,
+ /* 239 */ 0x8a9d27ac2249bd65UL, 0x19a3b464018e9512UL,
+ 0xc26ccff352b37ec7UL, 0x056f68341d797b21UL,
+ /* 240 */ 0x5e79d6757efd2327UL, 0xfabdbcb6553afe15UL,
+ 0xd3e7222c6eaf5a60UL, 0x7046c76d4dae743bUL,
+ /* 241 */ 0x660be872b18d4a55UL, 0x19992518574e1496UL,
+ 0xc103053a302bdcbbUL, 0x3ed8e9800b218e8eUL,
+ /* 242 */ 0x7b0b9239fa75e03eUL, 0xefe9fb684633c083UL,
+ 0x98a35fbe391a7793UL, 0x6065510fe2d0fe34UL,
+ /* 243 */ 0x55cb668548abad0cUL, 0xb4584548da87e527UL,
+ 0x2c43ecea0107c1ddUL, 0x526028809372de35UL,
+ /* 244 */ 0x3415c56af9213b1fUL, 0x5bee1a4d017e98dbUL,
+ 0x13f6b105b5cf709bUL, 0x5ff20e3482b29ab6UL,
+ /* 245 */ 0x0aa29c75cc2e6c90UL, 0xfc7d73ca3a70e206UL,
+ 0x899fc38fc4b5c515UL, 0x250386b124ffc207UL,
+ /* 246 */ 0x54ea28d5ae3d2b56UL, 0x9913149dd6de60ceUL,
+ 0x16694fc58f06d6c1UL, 0x46b23975eb018fc7UL,
+ /* 247 */ 0x470a6a0fb4b7b4e2UL, 0x5d92475a8f7253deUL,
+ 0xabeee5b52fbd3adbUL, 0x7fa20801a0806968UL,
+ /* 248 */ 0x76f3faf19f7714d2UL, 0xb3e840c12f4660c3UL,
+ 0x0fb4cd8df212744eUL, 0x4b065a251d3a2dd2UL,
+ /* 249 */ 0x5cebde383d77cd4aUL, 0x6adf39df882c9cb1UL,
+ 0xa2dd242eb09af759UL, 0x3147c0e50e5f6422UL,
+ /* 250 */ 0x164ca5101d1350dbUL, 0xf8d13479c33fc962UL,
+ 0xe640ce4d13e5da08UL, 0x4bdee0c45061f8baUL,
+ /* 251 */ 0xd7c46dc1a4edb1c9UL, 0x5514d7b6437fd98aUL,
+ 0x58942f6bb2a1c00bUL, 0x2dffb2ab1d70710eUL,
+ /* 252 */ 0xccdfcf2fc18b6d68UL, 0xa8ebcba8b7806167UL,
+ 0x980697f95e2937e3UL, 0x02fbba1cd0126e8cUL
+};
+
+/* c is two 512-bit products: c[0:7] = a[0:3]*b[0:3], c[8:15] = a[4:7]*b[4:7]
+ * a is two 256-bit integers: a[0:3] and a[4:7]
+ * b is two 256-bit integers: b[0:3] and b[4:7]
+ */
+static void mul2_256x256_integer_adx(u64 *const c, const u64 *const a,
+ const u64 *const b)
+{
+ asm volatile(
+ "xorl %%r14d, %%r14d ;"
+ "movq (%1), %%rdx; " /* A[0] */
+ "mulx (%2), %%r8, %%r15; " /* A[0]*B[0] */
+ "xorl %%r10d, %%r10d ;"
+ "movq %%r8, (%0) ;"
+ "mulx 8(%2), %%r10, %%rax; " /* A[0]*B[1] */
+ "adox %%r10, %%r15 ;"
+ "mulx 16(%2), %%r8, %%rbx; " /* A[0]*B[2] */
+ "adox %%r8, %%rax ;"
+ "mulx 24(%2), %%r10, %%rcx; " /* A[0]*B[3] */
+ "adox %%r10, %%rbx ;"
+ /******************************************/
+ "adox %%r14, %%rcx ;"
+
+ "movq 8(%1), %%rdx; " /* A[1] */
+ "mulx (%2), %%r8, %%r9; " /* A[1]*B[0] */
+ "adox %%r15, %%r8 ;"
+ "movq %%r8, 8(%0) ;"
+ "mulx 8(%2), %%r10, %%r11; " /* A[1]*B[1] */
+ "adox %%r10, %%r9 ;"
+ "adcx %%r9, %%rax ;"
+ "mulx 16(%2), %%r8, %%r13; " /* A[1]*B[2] */
+ "adox %%r8, %%r11 ;"
+ "adcx %%r11, %%rbx ;"
+ "mulx 24(%2), %%r10, %%r15; " /* A[1]*B[3] */
+ "adox %%r10, %%r13 ;"
+ "adcx %%r13, %%rcx ;"
+ /******************************************/
+ "adox %%r14, %%r15 ;"
+ "adcx %%r14, %%r15 ;"
+
+ "movq 16(%1), %%rdx; " /* A[2] */
+ "xorl %%r10d, %%r10d ;"
+ "mulx (%2), %%r8, %%r9; " /* A[2]*B[0] */
+ "adox %%rax, %%r8 ;"
+ "movq %%r8, 16(%0) ;"
+ "mulx 8(%2), %%r10, %%r11; " /* A[2]*B[1] */
+ "adox %%r10, %%r9 ;"
+ "adcx %%r9, %%rbx ;"
+ "mulx 16(%2), %%r8, %%r13; " /* A[2]*B[2] */
+ "adox %%r8, %%r11 ;"
+ "adcx %%r11, %%rcx ;"
+ "mulx 24(%2), %%r10, %%rax; " /* A[2]*B[3] */
+ "adox %%r10, %%r13 ;"
+ "adcx %%r13, %%r15 ;"
+ /******************************************/
+ "adox %%r14, %%rax ;"
+ "adcx %%r14, %%rax ;"
+
+ "movq 24(%1), %%rdx; " /* A[3] */
+ "xorl %%r10d, %%r10d ;"
+ "mulx (%2), %%r8, %%r9; " /* A[3]*B[0] */
+ "adox %%rbx, %%r8 ;"
+ "movq %%r8, 24(%0) ;"
+ "mulx 8(%2), %%r10, %%r11; " /* A[3]*B[1] */
+ "adox %%r10, %%r9 ;"
+ "adcx %%r9, %%rcx ;"
+ "movq %%rcx, 32(%0) ;"
+ "mulx 16(%2), %%r8, %%r13; " /* A[3]*B[2] */
+ "adox %%r8, %%r11 ;"
+ "adcx %%r11, %%r15 ;"
+ "movq %%r15, 40(%0) ;"
+ "mulx 24(%2), %%r10, %%rbx; " /* A[3]*B[3] */
+ "adox %%r10, %%r13 ;"
+ "adcx %%r13, %%rax ;"
+ "movq %%rax, 48(%0) ;"
+ /******************************************/
+ "adox %%r14, %%rbx ;"
+ "adcx %%r14, %%rbx ;"
+ "movq %%rbx, 56(%0) ;"
+
+ "movq 32(%1), %%rdx; " /* C[0] */
+ "mulx 32(%2), %%r8, %%r15; " /* C[0]*D[0] */
+ "xorl %%r10d, %%r10d ;"
+ "movq %%r8, 64(%0);"
+ "mulx 40(%2), %%r10, %%rax; " /* C[0]*D[1] */
+ "adox %%r10, %%r15 ;"
+ "mulx 48(%2), %%r8, %%rbx; " /* C[0]*D[2] */
+ "adox %%r8, %%rax ;"
+ "mulx 56(%2), %%r10, %%rcx; " /* C[0]*D[3] */
+ "adox %%r10, %%rbx ;"
+ /******************************************/
+ "adox %%r14, %%rcx ;"
+
+ "movq 40(%1), %%rdx; " /* C[1] */
+ "xorl %%r10d, %%r10d ;"
+ "mulx 32(%2), %%r8, %%r9; " /* C[1]*D[0] */
+ "adox %%r15, %%r8 ;"
+ "movq %%r8, 72(%0);"
+ "mulx 40(%2), %%r10, %%r11; " /* C[1]*D[1] */
+ "adox %%r10, %%r9 ;"
+ "adcx %%r9, %%rax ;"
+ "mulx 48(%2), %%r8, %%r13; " /* C[1]*D[2] */
+ "adox %%r8, %%r11 ;"
+ "adcx %%r11, %%rbx ;"
+ "mulx 56(%2), %%r10, %%r15; " /* C[1]*D[3] */
+ "adox %%r10, %%r13 ;"
+ "adcx %%r13, %%rcx ;"
+ /******************************************/
+ "adox %%r14, %%r15 ;"
+ "adcx %%r14, %%r15 ;"
+
+ "movq 48(%1), %%rdx; " /* C[2] */
+ "xorl %%r10d, %%r10d ;"
+ "mulx 32(%2), %%r8, %%r9; " /* C[2]*D[0] */
+ "adox %%rax, %%r8 ;"
+ "movq %%r8, 80(%0);"
+ "mulx 40(%2), %%r10, %%r11; " /* C[2]*D[1] */
+ "adox %%r10, %%r9 ;"
+ "adcx %%r9, %%rbx ;"
+ "mulx 48(%2), %%r8, %%r13; " /* C[2]*D[2] */
+ "adox %%r8, %%r11 ;"
+ "adcx %%r11, %%rcx ;"
+ "mulx 56(%2), %%r10, %%rax; " /* C[2]*D[3] */
+ "adox %%r10, %%r13 ;"
+ "adcx %%r13, %%r15 ;"
+ /******************************************/
+ "adox %%r14, %%rax ;"
+ "adcx %%r14, %%rax ;"
+
+ "movq 56(%1), %%rdx; " /* C[3] */
+ "xorl %%r10d, %%r10d ;"
+ "mulx 32(%2), %%r8, %%r9; " /* C[3]*D[0] */
+ "adox %%rbx, %%r8 ;"
+ "movq %%r8, 88(%0);"
+ "mulx 40(%2), %%r10, %%r11; " /* C[3]*D[1] */
+ "adox %%r10, %%r9 ;"
+ "adcx %%r9, %%rcx ;"
+ "movq %%rcx, 96(%0) ;"
+ "mulx 48(%2), %%r8, %%r13; " /* C[3]*D[2] */
+ "adox %%r8, %%r11 ;"
+ "adcx %%r11, %%r15 ;"
+ "movq %%r15, 104(%0) ;"
+ "mulx 56(%2), %%r10, %%rbx; " /* C[3]*D[3] */
+ "adox %%r10, %%r13 ;"
+ "adcx %%r13, %%rax ;"
+ "movq %%rax, 112(%0) ;"
+ /******************************************/
+ "adox %%r14, %%rbx ;"
+ "adcx %%r14, %%rbx ;"
+ "movq %%rbx, 120(%0) ;"
+ :
+ : "r"(c), "r"(a), "r"(b)
+ : "memory", "cc", "%rax", "%rbx", "%rcx", "%rdx", "%r8", "%r9",
+ "%r10", "%r11", "%r13", "%r14", "%r15");
+}
+
+static void mul2_256x256_integer_bmi2(u64 *const c, const u64 *const a,
+ const u64 *const b)
+{
+ asm volatile(
+ "movq (%1), %%rdx; " /* A[0] */
+ "mulx (%2), %%r8, %%r15; " /* A[0]*B[0] */
+ "movq %%r8, (%0) ;"
+ "mulx 8(%2), %%r10, %%rax; " /* A[0]*B[1] */
+ "addq %%r10, %%r15 ;"
+ "mulx 16(%2), %%r8, %%rbx; " /* A[0]*B[2] */
+ "adcq %%r8, %%rax ;"
+ "mulx 24(%2), %%r10, %%rcx; " /* A[0]*B[3] */
+ "adcq %%r10, %%rbx ;"
+ /******************************************/
+ "adcq $0, %%rcx ;"
+
+ "movq 8(%1), %%rdx; " /* A[1] */
+ "mulx (%2), %%r8, %%r9; " /* A[1]*B[0] */
+ "addq %%r15, %%r8 ;"
+ "movq %%r8, 8(%0) ;"
+ "mulx 8(%2), %%r10, %%r11; " /* A[1]*B[1] */
+ "adcq %%r10, %%r9 ;"
+ "mulx 16(%2), %%r8, %%r13; " /* A[1]*B[2] */
+ "adcq %%r8, %%r11 ;"
+ "mulx 24(%2), %%r10, %%r15; " /* A[1]*B[3] */
+ "adcq %%r10, %%r13 ;"
+ /******************************************/
+ "adcq $0, %%r15 ;"
+
+ "addq %%r9, %%rax ;"
+ "adcq %%r11, %%rbx ;"
+ "adcq %%r13, %%rcx ;"
+ "adcq $0, %%r15 ;"
+
+ "movq 16(%1), %%rdx; " /* A[2] */
+ "mulx (%2), %%r8, %%r9; " /* A[2]*B[0] */
+ "addq %%rax, %%r8 ;"
+ "movq %%r8, 16(%0) ;"
+ "mulx 8(%2), %%r10, %%r11; " /* A[2]*B[1] */
+ "adcq %%r10, %%r9 ;"
+ "mulx 16(%2), %%r8, %%r13; " /* A[2]*B[2] */
+ "adcq %%r8, %%r11 ;"
+ "mulx 24(%2), %%r10, %%rax; " /* A[2]*B[3] */
+ "adcq %%r10, %%r13 ;"
+ /******************************************/
+ "adcq $0, %%rax ;"
+
+ "addq %%r9, %%rbx ;"
+ "adcq %%r11, %%rcx ;"
+ "adcq %%r13, %%r15 ;"
+ "adcq $0, %%rax ;"
+
+ "movq 24(%1), %%rdx; " /* A[3] */
+ "mulx (%2), %%r8, %%r9; " /* A[3]*B[0] */
+ "addq %%rbx, %%r8 ;"
+ "movq %%r8, 24(%0) ;"
+ "mulx 8(%2), %%r10, %%r11; " /* A[3]*B[1] */
+ "adcq %%r10, %%r9 ;"
+ "mulx 16(%2), %%r8, %%r13; " /* A[3]*B[2] */
+ "adcq %%r8, %%r11 ;"
+ "mulx 24(%2), %%r10, %%rbx; " /* A[3]*B[3] */
+ "adcq %%r10, %%r13 ;"
+ /******************************************/
+ "adcq $0, %%rbx ;"
+
+ "addq %%r9, %%rcx ;"
+ "movq %%rcx, 32(%0) ;"
+ "adcq %%r11, %%r15 ;"
+ "movq %%r15, 40(%0) ;"
+ "adcq %%r13, %%rax ;"
+ "movq %%rax, 48(%0) ;"
+ "adcq $0, %%rbx ;"
+ "movq %%rbx, 56(%0) ;"
+
+ "movq 32(%1), %%rdx; " /* C[0] */
+ "mulx 32(%2), %%r8, %%r15; " /* C[0]*D[0] */
+ "movq %%r8, 64(%0) ;"
+ "mulx 40(%2), %%r10, %%rax; " /* C[0]*D[1] */
+ "addq %%r10, %%r15 ;"
+ "mulx 48(%2), %%r8, %%rbx; " /* C[0]*D[2] */
+ "adcq %%r8, %%rax ;"
+ "mulx 56(%2), %%r10, %%rcx; " /* C[0]*D[3] */
+ "adcq %%r10, %%rbx ;"
+ /******************************************/
+ "adcq $0, %%rcx ;"
+
+ "movq 40(%1), %%rdx; " /* C[1] */
+ "mulx 32(%2), %%r8, %%r9; " /* C[1]*D[0] */
+ "addq %%r15, %%r8 ;"
+ "movq %%r8, 72(%0) ;"
+ "mulx 40(%2), %%r10, %%r11; " /* C[1]*D[1] */
+ "adcq %%r10, %%r9 ;"
+ "mulx 48(%2), %%r8, %%r13; " /* C[1]*D[2] */
+ "adcq %%r8, %%r11 ;"
+ "mulx 56(%2), %%r10, %%r15; " /* C[1]*D[3] */
+ "adcq %%r10, %%r13 ;"
+ /******************************************/
+ "adcq $0, %%r15 ;"
+
+ "addq %%r9, %%rax ;"
+ "adcq %%r11, %%rbx ;"
+ "adcq %%r13, %%rcx ;"
+ "adcq $0, %%r15 ;"
+
+ "movq 48(%1), %%rdx; " /* C[2] */
+ "mulx 32(%2), %%r8, %%r9; " /* C[2]*D[0] */
+ "addq %%rax, %%r8 ;"
+ "movq %%r8, 80(%0) ;"
+ "mulx 40(%2), %%r10, %%r11; " /* C[2]*D[1] */
+ "adcq %%r10, %%r9 ;"
+ "mulx 48(%2), %%r8, %%r13; " /* C[2]*D[2] */
+ "adcq %%r8, %%r11 ;"
+ "mulx 56(%2), %%r10, %%rax; " /* C[2]*D[3] */
+ "adcq %%r10, %%r13 ;"
+ /******************************************/
+ "adcq $0, %%rax ;"
+
+ "addq %%r9, %%rbx ;"
+ "adcq %%r11, %%rcx ;"
+ "adcq %%r13, %%r15 ;"
+ "adcq $0, %%rax ;"
+
+ "movq 56(%1), %%rdx; " /* C[3] */
+ "mulx 32(%2), %%r8, %%r9; " /* C[3]*D[0] */
+ "addq %%rbx, %%r8 ;"
+ "movq %%r8, 88(%0) ;"
+ "mulx 40(%2), %%r10, %%r11; " /* C[3]*D[1] */
+ "adcq %%r10, %%r9 ;"
+ "mulx 48(%2), %%r8, %%r13; " /* C[3]*D[2] */
+ "adcq %%r8, %%r11 ;"
+ "mulx 56(%2), %%r10, %%rbx; " /* C[3]*D[3] */
+ "adcq %%r10, %%r13 ;"
+ /******************************************/
+ "adcq $0, %%rbx ;"
+
+ "addq %%r9, %%rcx ;"
+ "movq %%rcx, 96(%0) ;"
+ "adcq %%r11, %%r15 ;"
+ "movq %%r15, 104(%0) ;"
+ "adcq %%r13, %%rax ;"
+ "movq %%rax, 112(%0) ;"
+ "adcq $0, %%rbx ;"
+ "movq %%rbx, 120(%0) ;"
+ :
+ : "r"(c), "r"(a), "r"(b)
+ : "memory", "cc", "%rax", "%rbx", "%rcx", "%rdx", "%r8", "%r9",
+ "%r10", "%r11", "%r13", "%r15");
+}
+
+static void sqr2_256x256_integer_adx(u64 *const c, const u64 *const a)
+{
+ asm volatile(
+ "movq (%1), %%rdx ;" /* A[0] */
+ "mulx 8(%1), %%r8, %%r14 ;" /* A[1]*A[0] */
+ "xorl %%r15d, %%r15d;"
+ "mulx 16(%1), %%r9, %%r10 ;" /* A[2]*A[0] */
+ "adcx %%r14, %%r9 ;"
+ "mulx 24(%1), %%rax, %%rcx ;" /* A[3]*A[0] */
+ "adcx %%rax, %%r10 ;"
+ "movq 24(%1), %%rdx ;" /* A[3] */
+ "mulx 8(%1), %%r11, %%rbx ;" /* A[1]*A[3] */
+ "adcx %%rcx, %%r11 ;"
+ "mulx 16(%1), %%rax, %%r13 ;" /* A[2]*A[3] */
+ "adcx %%rax, %%rbx ;"
+ "movq 8(%1), %%rdx ;" /* A[1] */
+ "adcx %%r15, %%r13 ;"
+ "mulx 16(%1), %%rax, %%rcx ;" /* A[2]*A[1] */
+ "movq $0, %%r14 ;"
+ /******************************************/
+ "adcx %%r15, %%r14 ;"
+
+ "xorl %%r15d, %%r15d;"
+ "adox %%rax, %%r10 ;"
+ "adcx %%r8, %%r8 ;"
+ "adox %%rcx, %%r11 ;"
+ "adcx %%r9, %%r9 ;"
+ "adox %%r15, %%rbx ;"
+ "adcx %%r10, %%r10 ;"
+ "adox %%r15, %%r13 ;"
+ "adcx %%r11, %%r11 ;"
+ "adox %%r15, %%r14 ;"
+ "adcx %%rbx, %%rbx ;"
+ "adcx %%r13, %%r13 ;"
+ "adcx %%r14, %%r14 ;"
+
+ "movq (%1), %%rdx ;"
+ "mulx %%rdx, %%rax, %%rcx ;" /* A[0]^2 */
+ /*******************/
+ "movq %%rax, 0(%0) ;"
+ "addq %%rcx, %%r8 ;"
+ "movq %%r8, 8(%0) ;"
+ "movq 8(%1), %%rdx ;"
+ "mulx %%rdx, %%rax, %%rcx ;" /* A[1]^2 */
+ "adcq %%rax, %%r9 ;"
+ "movq %%r9, 16(%0) ;"
+ "adcq %%rcx, %%r10 ;"
+ "movq %%r10, 24(%0) ;"
+ "movq 16(%1), %%rdx ;"
+ "mulx %%rdx, %%rax, %%rcx ;" /* A[2]^2 */
+ "adcq %%rax, %%r11 ;"
+ "movq %%r11, 32(%0) ;"
+ "adcq %%rcx, %%rbx ;"
+ "movq %%rbx, 40(%0) ;"
+ "movq 24(%1), %%rdx ;"
+ "mulx %%rdx, %%rax, %%rcx ;" /* A[3]^2 */
+ "adcq %%rax, %%r13 ;"
+ "movq %%r13, 48(%0) ;"
+ "adcq %%rcx, %%r14 ;"
+ "movq %%r14, 56(%0) ;"
+
+
+ "movq 32(%1), %%rdx ;" /* B[0] */
+ "mulx 40(%1), %%r8, %%r14 ;" /* B[1]*B[0] */
+ "xorl %%r15d, %%r15d;"
+ "mulx 48(%1), %%r9, %%r10 ;" /* B[2]*B[0] */
+ "adcx %%r14, %%r9 ;"
+ "mulx 56(%1), %%rax, %%rcx ;" /* B[3]*B[0] */
+ "adcx %%rax, %%r10 ;"
+ "movq 56(%1), %%rdx ;" /* B[3] */
+ "mulx 40(%1), %%r11, %%rbx ;" /* B[1]*B[3] */
+ "adcx %%rcx, %%r11 ;"
+ "mulx 48(%1), %%rax, %%r13 ;" /* B[2]*B[3] */
+ "adcx %%rax, %%rbx ;"
+ "movq 40(%1), %%rdx ;" /* B[1] */
+ "adcx %%r15, %%r13 ;"
+ "mulx 48(%1), %%rax, %%rcx ;" /* B[2]*B[1] */
+ "movq $0, %%r14 ;"
+ /******************************************/
+ "adcx %%r15, %%r14 ;"
+
+ "xorl %%r15d, %%r15d;"
+ "adox %%rax, %%r10 ;"
+ "adcx %%r8, %%r8 ;"
+ "adox %%rcx, %%r11 ;"
+ "adcx %%r9, %%r9 ;"
+ "adox %%r15, %%rbx ;"
+ "adcx %%r10, %%r10 ;"
+ "adox %%r15, %%r13 ;"
+ "adcx %%r11, %%r11 ;"
+ "adox %%r15, %%r14 ;"
+ "adcx %%rbx, %%rbx ;"
+ "adcx %%r13, %%r13 ;"
+ "adcx %%r14, %%r14 ;"
+
+ "movq 32(%1), %%rdx ;"
+ "mulx %%rdx, %%rax, %%rcx ;" /* B[0]^2 */
+ /*******************/
+ "movq %%rax, 64(%0) ;"
+ "addq %%rcx, %%r8 ;"
+ "movq %%r8, 72(%0) ;"
+ "movq 40(%1), %%rdx ;"
+ "mulx %%rdx, %%rax, %%rcx ;" /* B[1]^2 */
+ "adcq %%rax, %%r9 ;"
+ "movq %%r9, 80(%0) ;"
+ "adcq %%rcx, %%r10 ;"
+ "movq %%r10, 88(%0) ;"
+ "movq 48(%1), %%rdx ;"
+ "mulx %%rdx, %%rax, %%rcx ;" /* B[2]^2 */
+ "adcq %%rax, %%r11 ;"
+ "movq %%r11, 96(%0) ;"
+ "adcq %%rcx, %%rbx ;"
+ "movq %%rbx, 104(%0) ;"
+ "movq 56(%1), %%rdx ;"
+ "mulx %%rdx, %%rax, %%rcx ;" /* B[3]^2 */
+ "adcq %%rax, %%r13 ;"
+ "movq %%r13, 112(%0) ;"
+ "adcq %%rcx, %%r14 ;"
+ "movq %%r14, 120(%0) ;"
+ :
+ : "r"(c), "r"(a)
+ : "memory", "cc", "%rax", "%rbx", "%rcx", "%rdx", "%r8", "%r9",
+ "%r10", "%r11", "%r13", "%r14", "%r15");
+}
+
+static void sqr2_256x256_integer_bmi2(u64 *const c, const u64 *const a)
+{
+ asm volatile(
+ "movq 8(%1), %%rdx ;" /* A[1] */
+ "mulx (%1), %%r8, %%r9 ;" /* A[0]*A[1] */
+ "mulx 16(%1), %%r10, %%r11 ;" /* A[2]*A[1] */
+ "mulx 24(%1), %%rcx, %%r14 ;" /* A[3]*A[1] */
+
+ "movq 16(%1), %%rdx ;" /* A[2] */
+ "mulx 24(%1), %%r15, %%r13 ;" /* A[3]*A[2] */
+ "mulx (%1), %%rax, %%rdx ;" /* A[0]*A[2] */
+
+ "addq %%rax, %%r9 ;"
+ "adcq %%rdx, %%r10 ;"
+ "adcq %%rcx, %%r11 ;"
+ "adcq %%r14, %%r15 ;"
+ "adcq $0, %%r13 ;"
+ "movq $0, %%r14 ;"
+ "adcq $0, %%r14 ;"
+
+ "movq (%1), %%rdx ;" /* A[0] */
+ "mulx 24(%1), %%rax, %%rcx ;" /* A[0]*A[3] */
+
+ "addq %%rax, %%r10 ;"
+ "adcq %%rcx, %%r11 ;"
+ "adcq $0, %%r15 ;"
+ "adcq $0, %%r13 ;"
+ "adcq $0, %%r14 ;"
+
+ "shldq $1, %%r13, %%r14 ;"
+ "shldq $1, %%r15, %%r13 ;"
+ "shldq $1, %%r11, %%r15 ;"
+ "shldq $1, %%r10, %%r11 ;"
+ "shldq $1, %%r9, %%r10 ;"
+ "shldq $1, %%r8, %%r9 ;"
+ "shlq $1, %%r8 ;"
+
+ /*******************/
+ "mulx %%rdx, %%rax, %%rcx ; " /* A[0]^2 */
+ /*******************/
+ "movq %%rax, 0(%0) ;"
+ "addq %%rcx, %%r8 ;"
+ "movq %%r8, 8(%0) ;"
+ "movq 8(%1), %%rdx ;"
+ "mulx %%rdx, %%rax, %%rcx ; " /* A[1]^2 */
+ "adcq %%rax, %%r9 ;"
+ "movq %%r9, 16(%0) ;"
+ "adcq %%rcx, %%r10 ;"
+ "movq %%r10, 24(%0) ;"
+ "movq 16(%1), %%rdx ;"
+ "mulx %%rdx, %%rax, %%rcx ; " /* A[2]^2 */
+ "adcq %%rax, %%r11 ;"
+ "movq %%r11, 32(%0) ;"
+ "adcq %%rcx, %%r15 ;"
+ "movq %%r15, 40(%0) ;"
+ "movq 24(%1), %%rdx ;"
+ "mulx %%rdx, %%rax, %%rcx ; " /* A[3]^2 */
+ "adcq %%rax, %%r13 ;"
+ "movq %%r13, 48(%0) ;"
+ "adcq %%rcx, %%r14 ;"
+ "movq %%r14, 56(%0) ;"
+
+ "movq 40(%1), %%rdx ;" /* B[1] */
+ "mulx 32(%1), %%r8, %%r9 ;" /* B[0]*B[1] */
+ "mulx 48(%1), %%r10, %%r11 ;" /* B[2]*B[1] */
+ "mulx 56(%1), %%rcx, %%r14 ;" /* B[3]*B[1] */
+
+ "movq 48(%1), %%rdx ;" /* B[2] */
+ "mulx 56(%1), %%r15, %%r13 ;" /* B[3]*B[2] */
+ "mulx 32(%1), %%rax, %%rdx ;" /* B[0]*B[2] */
+
+ "addq %%rax, %%r9 ;"
+ "adcq %%rdx, %%r10 ;"
+ "adcq %%rcx, %%r11 ;"
+ "adcq %%r14, %%r15 ;"
+ "adcq $0, %%r13 ;"
+ "movq $0, %%r14 ;"
+ "adcq $0, %%r14 ;"
+
+ "movq 32(%1), %%rdx ;" /* B[0] */
+ "mulx 56(%1), %%rax, %%rcx ;" /* B[0]*B[3] */
+
+ "addq %%rax, %%r10 ;"
+ "adcq %%rcx, %%r11 ;"
+ "adcq $0, %%r15 ;"
+ "adcq $0, %%r13 ;"
+ "adcq $0, %%r14 ;"
+
+ "shldq $1, %%r13, %%r14 ;"
+ "shldq $1, %%r15, %%r13 ;"
+ "shldq $1, %%r11, %%r15 ;"
+ "shldq $1, %%r10, %%r11 ;"
+ "shldq $1, %%r9, %%r10 ;"
+ "shldq $1, %%r8, %%r9 ;"
+ "shlq $1, %%r8 ;"
+
+ /*******************/
+ "mulx %%rdx, %%rax, %%rcx ; " /* B[0]^2 */
+ /*******************/
+ "movq %%rax, 64(%0) ;"
+ "addq %%rcx, %%r8 ;"
+ "movq %%r8, 72(%0) ;"
+ "movq 40(%1), %%rdx ;"
+ "mulx %%rdx, %%rax, %%rcx ; " /* B[1]^2 */
+ "adcq %%rax, %%r9 ;"
+ "movq %%r9, 80(%0) ;"
+ "adcq %%rcx, %%r10 ;"
+ "movq %%r10, 88(%0) ;"
+ "movq 48(%1), %%rdx ;"
+ "mulx %%rdx, %%rax, %%rcx ; " /* B[2]^2 */
+ "adcq %%rax, %%r11 ;"
+ "movq %%r11, 96(%0) ;"
+ "adcq %%rcx, %%r15 ;"
+ "movq %%r15, 104(%0) ;"
+ "movq 56(%1), %%rdx ;"
+ "mulx %%rdx, %%rax, %%rcx ; " /* B[3]^2 */
+ "adcq %%rax, %%r13 ;"
+ "movq %%r13, 112(%0) ;"
+ "adcq %%rcx, %%r14 ;"
+ "movq %%r14, 120(%0) ;"
+ :
+ : "r"(c), "r"(a)
+ : "memory", "cc", "%rax", "%rcx", "%rdx", "%r8", "%r9", "%r10",
+ "%r11", "%r13", "%r14", "%r15");
+}
+
+static void red_eltfp25519_2w_adx(u64 *const c, const u64 *const a)
+{
+ asm volatile(
+ "movl $38, %%edx; " /* 2*c = 38 = 2^256 mod (2^255 - 19) */
+ "mulx 32(%1), %%r8, %%r10; " /* c*C[4] */
+ "xorl %%ebx, %%ebx ;"
+ "adox (%1), %%r8 ;"
+ "mulx 40(%1), %%r9, %%r11; " /* c*C[5] */
+ "adcx %%r10, %%r9 ;"
+ "adox 8(%1), %%r9 ;"
+ "mulx 48(%1), %%r10, %%rax; " /* c*C[6] */
+ "adcx %%r11, %%r10 ;"
+ "adox 16(%1), %%r10 ;"
+ "mulx 56(%1), %%r11, %%rcx; " /* c*C[7] */
+ "adcx %%rax, %%r11 ;"
+ "adox 24(%1), %%r11 ;"
+ /***************************************/
+ "adcx %%rbx, %%rcx ;"
+ "adox %%rbx, %%rcx ;"
+ "imul %%rdx, %%rcx ;" /* c*(carry limb in rcx), cf=0, of=0 */
+ "adcx %%rcx, %%r8 ;"
+ "adcx %%rbx, %%r9 ;"
+ "movq %%r9, 8(%0) ;"
+ "adcx %%rbx, %%r10 ;"
+ "movq %%r10, 16(%0) ;"
+ "adcx %%rbx, %%r11 ;"
+ "movq %%r11, 24(%0) ;"
+ "mov $0, %%ecx ;"
+ "cmovc %%edx, %%ecx ;"
+ "addq %%rcx, %%r8 ;"
+ "movq %%r8, (%0) ;"
+
+ "mulx 96(%1), %%r8, %%r10; " /* c*C[4] */
+ "xorl %%ebx, %%ebx ;"
+ "adox 64(%1), %%r8 ;"
+ "mulx 104(%1), %%r9, %%r11; " /* c*C[5] */
+ "adcx %%r10, %%r9 ;"
+ "adox 72(%1), %%r9 ;"
+ "mulx 112(%1), %%r10, %%rax; " /* c*C[6] */
+ "adcx %%r11, %%r10 ;"
+ "adox 80(%1), %%r10 ;"
+ "mulx 120(%1), %%r11, %%rcx; " /* c*C[7] */
+ "adcx %%rax, %%r11 ;"
+ "adox 88(%1), %%r11 ;"
+ /****************************************/
+ "adcx %%rbx, %%rcx ;"
+ "adox %%rbx, %%rcx ;"
+ "imul %%rdx, %%rcx ;" /* c*(carry limb in rcx), cf=0, of=0 */
+ "adcx %%rcx, %%r8 ;"
+ "adcx %%rbx, %%r9 ;"
+ "movq %%r9, 40(%0) ;"
+ "adcx %%rbx, %%r10 ;"
+ "movq %%r10, 48(%0) ;"
+ "adcx %%rbx, %%r11 ;"
+ "movq %%r11, 56(%0) ;"
+ "mov $0, %%ecx ;"
+ "cmovc %%edx, %%ecx ;"
+ "addq %%rcx, %%r8 ;"
+ "movq %%r8, 32(%0) ;"
+ :
+ : "r"(c), "r"(a)
+ : "memory", "cc", "%rax", "%rbx", "%rcx", "%rdx", "%r8", "%r9",
+ "%r10", "%r11");
+}
+
+static void red_eltfp25519_2w_bmi2(u64 *const c, const u64 *const a)
+{
+ asm volatile(
+ "movl $38, %%edx ; " /* 2*c = 38 = 2^256 mod (2^255 - 19) */
+ "mulx 32(%1), %%r8, %%r10 ;" /* c*C[4] */
+ "mulx 40(%1), %%r9, %%r11 ;" /* c*C[5] */
+ "addq %%r10, %%r9 ;"
+ "mulx 48(%1), %%r10, %%rax ;" /* c*C[6] */
+ "adcq %%r11, %%r10 ;"
+ "mulx 56(%1), %%r11, %%rcx ;" /* c*C[7] */
+ "adcq %%rax, %%r11 ;"
+ /***************************************/
+ "adcq $0, %%rcx ;"
+ "addq (%1), %%r8 ;"
+ "adcq 8(%1), %%r9 ;"
+ "adcq 16(%1), %%r10 ;"
+ "adcq 24(%1), %%r11 ;"
+ "adcq $0, %%rcx ;"
+ "imul %%rdx, %%rcx ;" /* c*(carry limb in rcx), cf=0 */
+ "addq %%rcx, %%r8 ;"
+ "adcq $0, %%r9 ;"
+ "movq %%r9, 8(%0) ;"
+ "adcq $0, %%r10 ;"
+ "movq %%r10, 16(%0) ;"
+ "adcq $0, %%r11 ;"
+ "movq %%r11, 24(%0) ;"
+ "mov $0, %%ecx ;"
+ "cmovc %%edx, %%ecx ;"
+ "addq %%rcx, %%r8 ;"
+ "movq %%r8, (%0) ;"
+
+ "mulx 96(%1), %%r8, %%r10 ;" /* c*C[4] */
+ "mulx 104(%1), %%r9, %%r11 ;" /* c*C[5] */
+ "addq %%r10, %%r9 ;"
+ "mulx 112(%1), %%r10, %%rax ;" /* c*C[6] */
+ "adcq %%r11, %%r10 ;"
+ "mulx 120(%1), %%r11, %%rcx ;" /* c*C[7] */
+ "adcq %%rax, %%r11 ;"
+ /****************************************/
+ "adcq $0, %%rcx ;"
+ "addq 64(%1), %%r8 ;"
+ "adcq 72(%1), %%r9 ;"
+ "adcq 80(%1), %%r10 ;"
+ "adcq 88(%1), %%r11 ;"
+ "adcq $0, %%rcx ;"
+ "imul %%rdx, %%rcx ;" /* c*(carry limb in rcx), cf=0 */
+ "addq %%rcx, %%r8 ;"
+ "adcq $0, %%r9 ;"
+ "movq %%r9, 40(%0) ;"
+ "adcq $0, %%r10 ;"
+ "movq %%r10, 48(%0) ;"
+ "adcq $0, %%r11 ;"
+ "movq %%r11, 56(%0) ;"
+ "mov $0, %%ecx ;"
+ "cmovc %%edx, %%ecx ;"
+ "addq %%rcx, %%r8 ;"
+ "movq %%r8, 32(%0) ;"
+ :
+ : "r"(c), "r"(a)
+ : "memory", "cc", "%rax", "%rcx", "%rdx", "%r8", "%r9", "%r10",
+ "%r11");
+}
+
+static void mul_256x256_integer_adx(u64 *const c, const u64 *const a,
+ const u64 *const b)
+{
+ asm volatile(
+ "movq (%1), %%rdx; " /* A[0] */
+ "mulx (%2), %%r8, %%r9; " /* A[0]*B[0] */
+ "xorl %%r10d, %%r10d ;"
+ "movq %%r8, (%0) ;"
+ "mulx 8(%2), %%r10, %%r11; " /* A[0]*B[1] */
+ "adox %%r9, %%r10 ;"
+ "movq %%r10, 8(%0) ;"
+ "mulx 16(%2), %%r15, %%r13; " /* A[0]*B[2] */
+ "adox %%r11, %%r15 ;"
+ "mulx 24(%2), %%r14, %%rdx; " /* A[0]*B[3] */
+ "adox %%r13, %%r14 ;"
+ "movq $0, %%rax ;"
+ /******************************************/
+ "adox %%rdx, %%rax ;"
+
+ "movq 8(%1), %%rdx; " /* A[1] */
+ "mulx (%2), %%r8, %%r9; " /* A[1]*B[0] */
+ "xorl %%r10d, %%r10d ;"
+ "adcx 8(%0), %%r8 ;"
+ "movq %%r8, 8(%0) ;"
+ "mulx 8(%2), %%r10, %%r11; " /* A[1]*B[1] */
+ "adox %%r9, %%r10 ;"
+ "adcx %%r15, %%r10 ;"
+ "movq %%r10, 16(%0) ;"
+ "mulx 16(%2), %%r15, %%r13; " /* A[1]*B[2] */
+ "adox %%r11, %%r15 ;"
+ "adcx %%r14, %%r15 ;"
+ "movq $0, %%r8 ;"
+ "mulx 24(%2), %%r14, %%rdx; " /* A[1]*B[3] */
+ "adox %%r13, %%r14 ;"
+ "adcx %%rax, %%r14 ;"
+ "movq $0, %%rax ;"
+ /******************************************/
+ "adox %%rdx, %%rax ;"
+ "adcx %%r8, %%rax ;"
+
+ "movq 16(%1), %%rdx; " /* A[2] */
+ "mulx (%2), %%r8, %%r9; " /* A[2]*B[0] */
+ "xorl %%r10d, %%r10d ;"
+ "adcx 16(%0), %%r8 ;"
+ "movq %%r8, 16(%0) ;"
+ "mulx 8(%2), %%r10, %%r11; " /* A[2]*B[1] */
+ "adox %%r9, %%r10 ;"
+ "adcx %%r15, %%r10 ;"
+ "movq %%r10, 24(%0) ;"
+ "mulx 16(%2), %%r15, %%r13; " /* A[2]*B[2] */
+ "adox %%r11, %%r15 ;"
+ "adcx %%r14, %%r15 ;"
+ "movq $0, %%r8 ;"
+ "mulx 24(%2), %%r14, %%rdx; " /* A[2]*B[3] */
+ "adox %%r13, %%r14 ;"
+ "adcx %%rax, %%r14 ;"
+ "movq $0, %%rax ;"
+ /******************************************/
+ "adox %%rdx, %%rax ;"
+ "adcx %%r8, %%rax ;"
+
+ "movq 24(%1), %%rdx; " /* A[3] */
+ "mulx (%2), %%r8, %%r9; " /* A[3]*B[0] */
+ "xorl %%r10d, %%r10d ;"
+ "adcx 24(%0), %%r8 ;"
+ "movq %%r8, 24(%0) ;"
+ "mulx 8(%2), %%r10, %%r11; " /* A[3]*B[1] */
+ "adox %%r9, %%r10 ;"
+ "adcx %%r15, %%r10 ;"
+ "movq %%r10, 32(%0) ;"
+ "mulx 16(%2), %%r15, %%r13; " /* A[3]*B[2] */
+ "adox %%r11, %%r15 ;"
+ "adcx %%r14, %%r15 ;"
+ "movq %%r15, 40(%0) ;"
+ "movq $0, %%r8 ;"
+ "mulx 24(%2), %%r14, %%rdx; " /* A[3]*B[3] */
+ "adox %%r13, %%r14 ;"
+ "adcx %%rax, %%r14 ;"
+ "movq %%r14, 48(%0) ;"
+ "movq $0, %%rax ;"
+ /******************************************/
+ "adox %%rdx, %%rax ;"
+ "adcx %%r8, %%rax ;"
+ "movq %%rax, 56(%0) ;"
+ :
+ : "r"(c), "r"(a), "r"(b)
+ : "memory", "cc", "%rax", "%rdx", "%r8", "%r9", "%r10", "%r11",
+ "%r13", "%r14", "%r15");
+}
+
+static void mul_256x256_integer_bmi2(u64 *const c, const u64 *const a,
+ const u64 *const b)
+{
+ asm volatile(
+ "movq (%1), %%rdx; " /* A[0] */
+ "mulx (%2), %%r8, %%r15; " /* A[0]*B[0] */
+ "movq %%r8, (%0) ;"
+ "mulx 8(%2), %%r10, %%rax; " /* A[0]*B[1] */
+ "addq %%r10, %%r15 ;"
+ "mulx 16(%2), %%r8, %%rbx; " /* A[0]*B[2] */
+ "adcq %%r8, %%rax ;"
+ "mulx 24(%2), %%r10, %%rcx; " /* A[0]*B[3] */
+ "adcq %%r10, %%rbx ;"
+ /******************************************/
+ "adcq $0, %%rcx ;"
+
+ "movq 8(%1), %%rdx; " /* A[1] */
+ "mulx (%2), %%r8, %%r9; " /* A[1]*B[0] */
+ "addq %%r15, %%r8 ;"
+ "movq %%r8, 8(%0) ;"
+ "mulx 8(%2), %%r10, %%r11; " /* A[1]*B[1] */
+ "adcq %%r10, %%r9 ;"
+ "mulx 16(%2), %%r8, %%r13; " /* A[1]*B[2] */
+ "adcq %%r8, %%r11 ;"
+ "mulx 24(%2), %%r10, %%r15; " /* A[1]*B[3] */
+ "adcq %%r10, %%r13 ;"
+ /******************************************/
+ "adcq $0, %%r15 ;"
+
+ "addq %%r9, %%rax ;"
+ "adcq %%r11, %%rbx ;"
+ "adcq %%r13, %%rcx ;"
+ "adcq $0, %%r15 ;"
+
+ "movq 16(%1), %%rdx; " /* A[2] */
+ "mulx (%2), %%r8, %%r9; " /* A[2]*B[0] */
+ "addq %%rax, %%r8 ;"
+ "movq %%r8, 16(%0) ;"
+ "mulx 8(%2), %%r10, %%r11; " /* A[2]*B[1] */
+ "adcq %%r10, %%r9 ;"
+ "mulx 16(%2), %%r8, %%r13; " /* A[2]*B[2] */
+ "adcq %%r8, %%r11 ;"
+ "mulx 24(%2), %%r10, %%rax; " /* A[2]*B[3] */
+ "adcq %%r10, %%r13 ;"
+ /******************************************/
+ "adcq $0, %%rax ;"
+
+ "addq %%r9, %%rbx ;"
+ "adcq %%r11, %%rcx ;"
+ "adcq %%r13, %%r15 ;"
+ "adcq $0, %%rax ;"
+
+ "movq 24(%1), %%rdx; " /* A[3] */
+ "mulx (%2), %%r8, %%r9; " /* A[3]*B[0] */
+ "addq %%rbx, %%r8 ;"
+ "movq %%r8, 24(%0) ;"
+ "mulx 8(%2), %%r10, %%r11; " /* A[3]*B[1] */
+ "adcq %%r10, %%r9 ;"
+ "mulx 16(%2), %%r8, %%r13; " /* A[3]*B[2] */
+ "adcq %%r8, %%r11 ;"
+ "mulx 24(%2), %%r10, %%rbx; " /* A[3]*B[3] */
+ "adcq %%r10, %%r13 ;"
+ /******************************************/
+ "adcq $0, %%rbx ;"
+
+ "addq %%r9, %%rcx ;"
+ "movq %%rcx, 32(%0) ;"
+ "adcq %%r11, %%r15 ;"
+ "movq %%r15, 40(%0) ;"
+ "adcq %%r13, %%rax ;"
+ "movq %%rax, 48(%0) ;"
+ "adcq $0, %%rbx ;"
+ "movq %%rbx, 56(%0) ;"
+ :
+ : "r"(c), "r"(a), "r"(b)
+ : "memory", "cc", "%rax", "%rbx", "%rcx", "%rdx", "%r8", "%r9",
+ "%r10", "%r11", "%r13", "%r15");
+}
+
+static void sqr_256x256_integer_adx(u64 *const c, const u64 *const a)
+{
+ asm volatile(
+ "movq (%1), %%rdx ;" /* A[0] */
+ "mulx 8(%1), %%r8, %%r14 ;" /* A[1]*A[0] */
+ "xorl %%r15d, %%r15d;"
+ "mulx 16(%1), %%r9, %%r10 ;" /* A[2]*A[0] */
+ "adcx %%r14, %%r9 ;"
+ "mulx 24(%1), %%rax, %%rcx ;" /* A[3]*A[0] */
+ "adcx %%rax, %%r10 ;"
+ "movq 24(%1), %%rdx ;" /* A[3] */
+ "mulx 8(%1), %%r11, %%rbx ;" /* A[1]*A[3] */
+ "adcx %%rcx, %%r11 ;"
+ "mulx 16(%1), %%rax, %%r13 ;" /* A[2]*A[3] */
+ "adcx %%rax, %%rbx ;"
+ "movq 8(%1), %%rdx ;" /* A[1] */
+ "adcx %%r15, %%r13 ;"
+ "mulx 16(%1), %%rax, %%rcx ;" /* A[2]*A[1] */
+ "movq $0, %%r14 ;"
+ /******************************************/
+ "adcx %%r15, %%r14 ;"
+
+ "xorl %%r15d, %%r15d;"
+ "adox %%rax, %%r10 ;"
+ "adcx %%r8, %%r8 ;"
+ "adox %%rcx, %%r11 ;"
+ "adcx %%r9, %%r9 ;"
+ "adox %%r15, %%rbx ;"
+ "adcx %%r10, %%r10 ;"
+ "adox %%r15, %%r13 ;"
+ "adcx %%r11, %%r11 ;"
+ "adox %%r15, %%r14 ;"
+ "adcx %%rbx, %%rbx ;"
+ "adcx %%r13, %%r13 ;"
+ "adcx %%r14, %%r14 ;"
+
+ "movq (%1), %%rdx ;"
+ "mulx %%rdx, %%rax, %%rcx ;" /* A[0]^2 */
+ /*******************/
+ "movq %%rax, 0(%0) ;"
+ "addq %%rcx, %%r8 ;"
+ "movq %%r8, 8(%0) ;"
+ "movq 8(%1), %%rdx ;"
+ "mulx %%rdx, %%rax, %%rcx ;" /* A[1]^2 */
+ "adcq %%rax, %%r9 ;"
+ "movq %%r9, 16(%0) ;"
+ "adcq %%rcx, %%r10 ;"
+ "movq %%r10, 24(%0) ;"
+ "movq 16(%1), %%rdx ;"
+ "mulx %%rdx, %%rax, %%rcx ;" /* A[2]^2 */
+ "adcq %%rax, %%r11 ;"
+ "movq %%r11, 32(%0) ;"
+ "adcq %%rcx, %%rbx ;"
+ "movq %%rbx, 40(%0) ;"
+ "movq 24(%1), %%rdx ;"
+ "mulx %%rdx, %%rax, %%rcx ;" /* A[3]^2 */
+ "adcq %%rax, %%r13 ;"
+ "movq %%r13, 48(%0) ;"
+ "adcq %%rcx, %%r14 ;"
+ "movq %%r14, 56(%0) ;"
+ :
+ : "r"(c), "r"(a)
+ : "memory", "cc", "%rax", "%rbx", "%rcx", "%rdx", "%r8", "%r9",
+ "%r10", "%r11", "%r13", "%r14", "%r15");
+}
+
+static void sqr_256x256_integer_bmi2(u64 *const c, const u64 *const a)
+{
+ asm volatile(
+ "movq 8(%1), %%rdx ;" /* A[1] */
+ "mulx (%1), %%r8, %%r9 ;" /* A[0]*A[1] */
+ "mulx 16(%1), %%r10, %%r11 ;" /* A[2]*A[1] */
+ "mulx 24(%1), %%rcx, %%r14 ;" /* A[3]*A[1] */
+
+ "movq 16(%1), %%rdx ;" /* A[2] */
+ "mulx 24(%1), %%r15, %%r13 ;" /* A[3]*A[2] */
+ "mulx (%1), %%rax, %%rdx ;" /* A[0]*A[2] */
+
+ "addq %%rax, %%r9 ;"
+ "adcq %%rdx, %%r10 ;"
+ "adcq %%rcx, %%r11 ;"
+ "adcq %%r14, %%r15 ;"
+ "adcq $0, %%r13 ;"
+ "movq $0, %%r14 ;"
+ "adcq $0, %%r14 ;"
+
+ "movq (%1), %%rdx ;" /* A[0] */
+ "mulx 24(%1), %%rax, %%rcx ;" /* A[0]*A[3] */
+
+ "addq %%rax, %%r10 ;"
+ "adcq %%rcx, %%r11 ;"
+ "adcq $0, %%r15 ;"
+ "adcq $0, %%r13 ;"
+ "adcq $0, %%r14 ;"
+
+ "shldq $1, %%r13, %%r14 ;"
+ "shldq $1, %%r15, %%r13 ;"
+ "shldq $1, %%r11, %%r15 ;"
+ "shldq $1, %%r10, %%r11 ;"
+ "shldq $1, %%r9, %%r10 ;"
+ "shldq $1, %%r8, %%r9 ;"
+ "shlq $1, %%r8 ;"
+
+ /*******************/
+ "mulx %%rdx, %%rax, %%rcx ;" /* A[0]^2 */
+ /*******************/
+ "movq %%rax, 0(%0) ;"
+ "addq %%rcx, %%r8 ;"
+ "movq %%r8, 8(%0) ;"
+ "movq 8(%1), %%rdx ;"
+ "mulx %%rdx, %%rax, %%rcx ;" /* A[1]^2 */
+ "adcq %%rax, %%r9 ;"
+ "movq %%r9, 16(%0) ;"
+ "adcq %%rcx, %%r10 ;"
+ "movq %%r10, 24(%0) ;"
+ "movq 16(%1), %%rdx ;"
+ "mulx %%rdx, %%rax, %%rcx ;" /* A[2]^2 */
+ "adcq %%rax, %%r11 ;"
+ "movq %%r11, 32(%0) ;"
+ "adcq %%rcx, %%r15 ;"
+ "movq %%r15, 40(%0) ;"
+ "movq 24(%1), %%rdx ;"
+ "mulx %%rdx, %%rax, %%rcx ;" /* A[3]^2 */
+ "adcq %%rax, %%r13 ;"
+ "movq %%r13, 48(%0) ;"
+ "adcq %%rcx, %%r14 ;"
+ "movq %%r14, 56(%0) ;"
+ :
+ : "r"(c), "r"(a)
+ : "memory", "cc", "%rax", "%rcx", "%rdx", "%r8", "%r9", "%r10",
+ "%r11", "%r13", "%r14", "%r15");
+}
+
+static void red_eltfp25519_1w_adx(u64 *const c, const u64 *const a)
+{
+ asm volatile(
+ "movl $38, %%edx ;" /* 2*c = 38 = 2^256 mod 2^255-19 */
+ "mulx 32(%1), %%r8, %%r10 ;" /* c*C[4] */
+ "xorl %%ebx, %%ebx ;"
+ "adox (%1), %%r8 ;"
+ "mulx 40(%1), %%r9, %%r11 ;" /* c*C[5] */
+ "adcx %%r10, %%r9 ;"
+ "adox 8(%1), %%r9 ;"
+ "mulx 48(%1), %%r10, %%rax ;" /* c*C[6] */
+ "adcx %%r11, %%r10 ;"
+ "adox 16(%1), %%r10 ;"
+ "mulx 56(%1), %%r11, %%rcx ;" /* c*C[7] */
+ "adcx %%rax, %%r11 ;"
+ "adox 24(%1), %%r11 ;"
+ /***************************************/
+ "adcx %%rbx, %%rcx ;"
+ "adox %%rbx, %%rcx ;"
+ "imul %%rdx, %%rcx ;" /* c*C[4], cf=0, of=0 */
+ "adcx %%rcx, %%r8 ;"
+ "adcx %%rbx, %%r9 ;"
+ "movq %%r9, 8(%0) ;"
+ "adcx %%rbx, %%r10 ;"
+ "movq %%r10, 16(%0) ;"
+ "adcx %%rbx, %%r11 ;"
+ "movq %%r11, 24(%0) ;"
+ "mov $0, %%ecx ;"
+ "cmovc %%edx, %%ecx ;"
+ "addq %%rcx, %%r8 ;"
+ "movq %%r8, (%0) ;"
+ :
+ : "r"(c), "r"(a)
+ : "memory", "cc", "%rax", "%rbx", "%rcx", "%rdx", "%r8", "%r9",
+ "%r10", "%r11");
+}
+
+static void red_eltfp25519_1w_bmi2(u64 *const c, const u64 *const a)
+{
+ asm volatile(
+ "movl $38, %%edx ;" /* 2*c = 38 = 2^256 mod 2^255-19 */
+ "mulx 32(%1), %%r8, %%r10 ;" /* c*C[4] */
+ "mulx 40(%1), %%r9, %%r11 ;" /* c*C[5] */
+ "addq %%r10, %%r9 ;"
+ "mulx 48(%1), %%r10, %%rax ;" /* c*C[6] */
+ "adcq %%r11, %%r10 ;"
+ "mulx 56(%1), %%r11, %%rcx ;" /* c*C[7] */
+ "adcq %%rax, %%r11 ;"
+ /***************************************/
+ "adcq $0, %%rcx ;"
+ "addq (%1), %%r8 ;"
+ "adcq 8(%1), %%r9 ;"
+ "adcq 16(%1), %%r10 ;"
+ "adcq 24(%1), %%r11 ;"
+ "adcq $0, %%rcx ;"
+ "imul %%rdx, %%rcx ;" /* c*C[4], cf=0 */
+ "addq %%rcx, %%r8 ;"
+ "adcq $0, %%r9 ;"
+ "movq %%r9, 8(%0) ;"
+ "adcq $0, %%r10 ;"
+ "movq %%r10, 16(%0) ;"
+ "adcq $0, %%r11 ;"
+ "movq %%r11, 24(%0) ;"
+ "mov $0, %%ecx ;"
+ "cmovc %%edx, %%ecx ;"
+ "addq %%rcx, %%r8 ;"
+ "movq %%r8, (%0) ;"
+ :
+ : "r"(c), "r"(a)
+ : "memory", "cc", "%rax", "%rcx", "%rdx", "%r8", "%r9", "%r10",
+ "%r11");
+}
+
+static __always_inline void
+add_eltfp25519_1w_adx(u64 *const c, const u64 *const a, const u64 *const b)
+{
+ asm volatile(
+ "mov $38, %%eax ;"
+ "xorl %%ecx, %%ecx ;"
+ "movq (%2), %%r8 ;"
+ "adcx (%1), %%r8 ;"
+ "movq 8(%2), %%r9 ;"
+ "adcx 8(%1), %%r9 ;"
+ "movq 16(%2), %%r10 ;"
+ "adcx 16(%1), %%r10 ;"
+ "movq 24(%2), %%r11 ;"
+ "adcx 24(%1), %%r11 ;"
+ "cmovc %%eax, %%ecx ;"
+ "xorl %%eax, %%eax ;"
+ "adcx %%rcx, %%r8 ;"
+ "adcx %%rax, %%r9 ;"
+ "movq %%r9, 8(%0) ;"
+ "adcx %%rax, %%r10 ;"
+ "movq %%r10, 16(%0) ;"
+ "adcx %%rax, %%r11 ;"
+ "movq %%r11, 24(%0) ;"
+ "mov $38, %%ecx ;"
+ "cmovc %%ecx, %%eax ;"
+ "addq %%rax, %%r8 ;"
+ "movq %%r8, (%0) ;"
+ :
+ : "r"(c), "r"(a), "r"(b)
+ : "memory", "cc", "%rax", "%rcx", "%r8", "%r9", "%r10", "%r11");
+}
+
+static __always_inline void
+add_eltfp25519_1w_bmi2(u64 *const c, const u64 *const a, const u64 *const b)
+{
+ asm volatile(
+ "mov $38, %%eax ;"
+ "movq (%2), %%r8 ;"
+ "addq (%1), %%r8 ;"
+ "movq 8(%2), %%r9 ;"
+ "adcq 8(%1), %%r9 ;"
+ "movq 16(%2), %%r10 ;"
+ "adcq 16(%1), %%r10 ;"
+ "movq 24(%2), %%r11 ;"
+ "adcq 24(%1), %%r11 ;"
+ "mov $0, %%ecx ;"
+ "cmovc %%eax, %%ecx ;"
+ "addq %%rcx, %%r8 ;"
+ "adcq $0, %%r9 ;"
+ "movq %%r9, 8(%0) ;"
+ "adcq $0, %%r10 ;"
+ "movq %%r10, 16(%0) ;"
+ "adcq $0, %%r11 ;"
+ "movq %%r11, 24(%0) ;"
+ "mov $0, %%ecx ;"
+ "cmovc %%eax, %%ecx ;"
+ "addq %%rcx, %%r8 ;"
+ "movq %%r8, (%0) ;"
+ :
+ : "r"(c), "r"(a), "r"(b)
+ : "memory", "cc", "%rax", "%rcx", "%r8", "%r9", "%r10", "%r11");
+}
+
+static __always_inline void
+sub_eltfp25519_1w(u64 *const c, const u64 *const a, const u64 *const b)
+{
+ asm volatile(
+ "mov $38, %%eax ;"
+ "movq (%1), %%r8 ;"
+ "subq (%2), %%r8 ;"
+ "movq 8(%1), %%r9 ;"
+ "sbbq 8(%2), %%r9 ;"
+ "movq 16(%1), %%r10 ;"
+ "sbbq 16(%2), %%r10 ;"
+ "movq 24(%1), %%r11 ;"
+ "sbbq 24(%2), %%r11 ;"
+ "mov $0, %%ecx ;"
+ "cmovc %%eax, %%ecx ;"
+ "subq %%rcx, %%r8 ;"
+ "sbbq $0, %%r9 ;"
+ "movq %%r9, 8(%0) ;"
+ "sbbq $0, %%r10 ;"
+ "movq %%r10, 16(%0) ;"
+ "sbbq $0, %%r11 ;"
+ "movq %%r11, 24(%0) ;"
+ "mov $0, %%ecx ;"
+ "cmovc %%eax, %%ecx ;"
+ "subq %%rcx, %%r8 ;"
+ "movq %%r8, (%0) ;"
+ :
+ : "r"(c), "r"(a), "r"(b)
+ : "memory", "cc", "%rax", "%rcx", "%r8", "%r9", "%r10", "%r11");
+}
+
+/* Multiplication by a24 = (A+2)/4 = (486662+2)/4 = 121666 */
+static __always_inline void
+mul_a24_eltfp25519_1w(u64 *const c, const u64 *const a)
+{
+ const u64 a24 = 121666;
+ asm volatile(
+ "movq %2, %%rdx ;"
+ "mulx (%1), %%r8, %%r10 ;"
+ "mulx 8(%1), %%r9, %%r11 ;"
+ "addq %%r10, %%r9 ;"
+ "mulx 16(%1), %%r10, %%rax ;"
+ "adcq %%r11, %%r10 ;"
+ "mulx 24(%1), %%r11, %%rcx ;"
+ "adcq %%rax, %%r11 ;"
+ /**************************/
+ "adcq $0, %%rcx ;"
+ "movl $38, %%edx ;" /* 2*c = 38 = 2^256 mod 2^255-19*/
+ "imul %%rdx, %%rcx ;"
+ "addq %%rcx, %%r8 ;"
+ "adcq $0, %%r9 ;"
+ "movq %%r9, 8(%0) ;"
+ "adcq $0, %%r10 ;"
+ "movq %%r10, 16(%0) ;"
+ "adcq $0, %%r11 ;"
+ "movq %%r11, 24(%0) ;"
+ "mov $0, %%ecx ;"
+ "cmovc %%edx, %%ecx ;"
+ "addq %%rcx, %%r8 ;"
+ "movq %%r8, (%0) ;"
+ :
+ : "r"(c), "r"(a), "r"(a24)
+ : "memory", "cc", "%rax", "%rcx", "%rdx", "%r8", "%r9", "%r10",
+ "%r11");
+}
+
+static void inv_eltfp25519_1w_adx(u64 *const c, const u64 *const a)
+{
+ struct {
+ eltfp25519_1w_buffer buffer;
+ eltfp25519_1w x0, x1, x2;
+ } __aligned(32) m;
+ u64 *T[4];
+
+ T[0] = m.x0;
+ T[1] = c; /* x^(-1) */
+ T[2] = m.x1;
+ T[3] = m.x2;
+
+ copy_eltfp25519_1w(T[1], a);
+ sqrn_eltfp25519_1w_adx(T[1], 1);
+ copy_eltfp25519_1w(T[2], T[1]);
+ sqrn_eltfp25519_1w_adx(T[2], 2);
+ mul_eltfp25519_1w_adx(T[0], a, T[2]);
+ mul_eltfp25519_1w_adx(T[1], T[1], T[0]);
+ copy_eltfp25519_1w(T[2], T[1]);
+ sqrn_eltfp25519_1w_adx(T[2], 1);
+ mul_eltfp25519_1w_adx(T[0], T[0], T[2]);
+ copy_eltfp25519_1w(T[2], T[0]);
+ sqrn_eltfp25519_1w_adx(T[2], 5);
+ mul_eltfp25519_1w_adx(T[0], T[0], T[2]);
+ copy_eltfp25519_1w(T[2], T[0]);
+ sqrn_eltfp25519_1w_adx(T[2], 10);
+ mul_eltfp25519_1w_adx(T[2], T[2], T[0]);
+ copy_eltfp25519_1w(T[3], T[2]);
+ sqrn_eltfp25519_1w_adx(T[3], 20);
+ mul_eltfp25519_1w_adx(T[3], T[3], T[2]);
+ sqrn_eltfp25519_1w_adx(T[3], 10);
+ mul_eltfp25519_1w_adx(T[3], T[3], T[0]);
+ copy_eltfp25519_1w(T[0], T[3]);
+ sqrn_eltfp25519_1w_adx(T[0], 50);
+ mul_eltfp25519_1w_adx(T[0], T[0], T[3]);
+ copy_eltfp25519_1w(T[2], T[0]);
+ sqrn_eltfp25519_1w_adx(T[2], 100);
+ mul_eltfp25519_1w_adx(T[2], T[2], T[0]);
+ sqrn_eltfp25519_1w_adx(T[2], 50);
+ mul_eltfp25519_1w_adx(T[2], T[2], T[3]);
+ sqrn_eltfp25519_1w_adx(T[2], 5);
+ mul_eltfp25519_1w_adx(T[1], T[1], T[2]);
+
+ memzero_explicit(&m, sizeof(m));
+}
+
+static void inv_eltfp25519_1w_bmi2(u64 *const c, const u64 *const a)
+{
+ struct {
+ eltfp25519_1w_buffer buffer;
+ eltfp25519_1w x0, x1, x2;
+ } __aligned(32) m;
+ u64 *T[4];
+
+ T[0] = m.x0;
+ T[1] = c; /* x^(-1) */
+ T[2] = m.x1;
+ T[3] = m.x2;
+
+ copy_eltfp25519_1w(T[1], a);
+ sqrn_eltfp25519_1w_bmi2(T[1], 1);
+ copy_eltfp25519_1w(T[2], T[1]);
+ sqrn_eltfp25519_1w_bmi2(T[2], 2);
+ mul_eltfp25519_1w_bmi2(T[0], a, T[2]);
+ mul_eltfp25519_1w_bmi2(T[1], T[1], T[0]);
+ copy_eltfp25519_1w(T[2], T[1]);
+ sqrn_eltfp25519_1w_bmi2(T[2], 1);
+ mul_eltfp25519_1w_bmi2(T[0], T[0], T[2]);
+ copy_eltfp25519_1w(T[2], T[0]);
+ sqrn_eltfp25519_1w_bmi2(T[2], 5);
+ mul_eltfp25519_1w_bmi2(T[0], T[0], T[2]);
+ copy_eltfp25519_1w(T[2], T[0]);
+ sqrn_eltfp25519_1w_bmi2(T[2], 10);
+ mul_eltfp25519_1w_bmi2(T[2], T[2], T[0]);
+ copy_eltfp25519_1w(T[3], T[2]);
+ sqrn_eltfp25519_1w_bmi2(T[3], 20);
+ mul_eltfp25519_1w_bmi2(T[3], T[3], T[2]);
+ sqrn_eltfp25519_1w_bmi2(T[3], 10);
+ mul_eltfp25519_1w_bmi2(T[3], T[3], T[0]);
+ copy_eltfp25519_1w(T[0], T[3]);
+ sqrn_eltfp25519_1w_bmi2(T[0], 50);
+ mul_eltfp25519_1w_bmi2(T[0], T[0], T[3]);
+ copy_eltfp25519_1w(T[2], T[0]);
+ sqrn_eltfp25519_1w_bmi2(T[2], 100);
+ mul_eltfp25519_1w_bmi2(T[2], T[2], T[0]);
+ sqrn_eltfp25519_1w_bmi2(T[2], 50);
+ mul_eltfp25519_1w_bmi2(T[2], T[2], T[3]);
+ sqrn_eltfp25519_1w_bmi2(T[2], 5);
+ mul_eltfp25519_1w_bmi2(T[1], T[1], T[2]);
+
+ memzero_explicit(&m, sizeof(m));
+}
+
+/* Given c, a 256-bit number, fred_eltfp25519_1w updates c
+ * with a number such that 0 <= C < 2**255-19.
+ */
+static __always_inline void fred_eltfp25519_1w(u64 *const c)
+{
+ u64 tmp0 = 38, tmp1 = 19;
+ asm volatile(
+ "btrq $63, %3 ;" /* Put bit 255 in carry flag and clear */
+ "cmovncl %k5, %k4 ;" /* c[255] ? 38 : 19 */
+
+ /* Add either 19 or 38 to c */
+ "addq %4, %0 ;"
+ "adcq $0, %1 ;"
+ "adcq $0, %2 ;"
+ "adcq $0, %3 ;"
+
+ /* Test for bit 255 again; only triggered on overflow modulo 2^255-19 */
+ "movl $0, %k4 ;"
+ "cmovnsl %k5, %k4 ;" /* c[255] ? 0 : 19 */
+ "btrq $63, %3 ;" /* Clear bit 255 */
+
+ /* Subtract 19 if necessary */
+ "subq %4, %0 ;"
+ "sbbq $0, %1 ;"
+ "sbbq $0, %2 ;"
+ "sbbq $0, %3 ;"
+
+ : "+r"(c[0]), "+r"(c[1]), "+r"(c[2]), "+r"(c[3]), "+r"(tmp0),
+ "+r"(tmp1)
+ :
+ : "memory", "cc");
+}
+
+static __always_inline void cswap(u8 bit, u64 *const px, u64 *const py)
+{
+ u64 temp;
+ asm volatile(
+ "test %9, %9 ;"
+ "movq %0, %8 ;"
+ "cmovnzq %4, %0 ;"
+ "cmovnzq %8, %4 ;"
+ "movq %1, %8 ;"
+ "cmovnzq %5, %1 ;"
+ "cmovnzq %8, %5 ;"
+ "movq %2, %8 ;"
+ "cmovnzq %6, %2 ;"
+ "cmovnzq %8, %6 ;"
+ "movq %3, %8 ;"
+ "cmovnzq %7, %3 ;"
+ "cmovnzq %8, %7 ;"
+ : "+r"(px[0]), "+r"(px[1]), "+r"(px[2]), "+r"(px[3]),
+ "+r"(py[0]), "+r"(py[1]), "+r"(py[2]), "+r"(py[3]),
+ "=r"(temp)
+ : "r"(bit)
+ : "cc"
+ );
+}
+
+static __always_inline void cselect(u8 bit, u64 *const px, const u64 *const py)
+{
+ asm volatile(
+ "test %4, %4 ;"
+ "cmovnzq %5, %0 ;"
+ "cmovnzq %6, %1 ;"
+ "cmovnzq %7, %2 ;"
+ "cmovnzq %8, %3 ;"
+ : "+r"(px[0]), "+r"(px[1]), "+r"(px[2]), "+r"(px[3])
+ : "r"(bit), "rm"(py[0]), "rm"(py[1]), "rm"(py[2]), "rm"(py[3])
+ : "cc"
+ );
+}
+
+static __always_inline void clamp_secret(u8 secret[CURVE25519_KEY_SIZE])
+{
+ secret[0] &= 248;
+ secret[31] &= 127;
+ secret[31] |= 64;
+}
+
+static void curve25519_adx(u8 shared[CURVE25519_KEY_SIZE],
+ const u8 private_key[CURVE25519_KEY_SIZE],
+ const u8 session_key[CURVE25519_KEY_SIZE])
+{
+ struct {
+ u64 buffer[4 * NUM_WORDS_ELTFP25519];
+ u64 coordinates[4 * NUM_WORDS_ELTFP25519];
+ u64 workspace[6 * NUM_WORDS_ELTFP25519];
+ u8 session[CURVE25519_KEY_SIZE];
+ u8 private[CURVE25519_KEY_SIZE];
+ } __aligned(32) m;
+
+ int i = 0, j = 0;
+ u64 prev = 0;
+ u64 *const X1 = (u64 *)m.session;
+ u64 *const key = (u64 *)m.private;
+ u64 *const Px = m.coordinates + 0;
+ u64 *const Pz = m.coordinates + 4;
+ u64 *const Qx = m.coordinates + 8;
+ u64 *const Qz = m.coordinates + 12;
+ u64 *const X2 = Qx;
+ u64 *const Z2 = Qz;
+ u64 *const X3 = Px;
+ u64 *const Z3 = Pz;
+ u64 *const X2Z2 = Qx;
+ u64 *const X3Z3 = Px;
+
+ u64 *const A = m.workspace + 0;
+ u64 *const B = m.workspace + 4;
+ u64 *const D = m.workspace + 8;
+ u64 *const C = m.workspace + 12;
+ u64 *const DA = m.workspace + 16;
+ u64 *const CB = m.workspace + 20;
+ u64 *const AB = A;
+ u64 *const DC = D;
+ u64 *const DACB = DA;
+
+ memcpy(m.private, private_key, sizeof(m.private));
+ memcpy(m.session, session_key, sizeof(m.session));
+
+ clamp_secret(m.private);
+
+ /* As in the draft:
+ * When receiving such an array, implementations of curve25519
+ * MUST mask the most-significant bit in the final byte. This
+ * is done to preserve compatibility with point formats which
+ * reserve the sign bit for use in other protocols and to
+ * increase resistance to implementation fingerprinting
+ */
+ m.session[CURVE25519_KEY_SIZE - 1] &= (1 << (255 % 8)) - 1;
+
+ copy_eltfp25519_1w(Px, X1);
+ setzero_eltfp25519_1w(Pz);
+ setzero_eltfp25519_1w(Qx);
+ setzero_eltfp25519_1w(Qz);
+
+ Pz[0] = 1;
+ Qx[0] = 1;
+
+ /* main-loop */
+ prev = 0;
+ j = 62;
+ for (i = 3; i >= 0; --i) {
+ while (j >= 0) {
+ u64 bit = (key[i] >> j) & 0x1;
+ u64 swap = bit ^ prev;
+ prev = bit;
+
+ add_eltfp25519_1w_adx(A, X2, Z2); /* A = (X2+Z2) */
+ sub_eltfp25519_1w(B, X2, Z2); /* B = (X2-Z2) */
+ add_eltfp25519_1w_adx(C, X3, Z3); /* C = (X3+Z3) */
+ sub_eltfp25519_1w(D, X3, Z3); /* D = (X3-Z3) */
+ mul_eltfp25519_2w_adx(DACB, AB, DC); /* [DA|CB] = [A|B]*[D|C] */
+
+ cselect(swap, A, C);
+ cselect(swap, B, D);
+
+ sqr_eltfp25519_2w_adx(AB); /* [AA|BB] = [A^2|B^2] */
+ add_eltfp25519_1w_adx(X3, DA, CB); /* X3 = (DA+CB) */
+ sub_eltfp25519_1w(Z3, DA, CB); /* Z3 = (DA-CB) */
+ sqr_eltfp25519_2w_adx(X3Z3); /* [X3|Z3] = [(DA+CB)|(DA-CB)]^2 */
+
+ copy_eltfp25519_1w(X2, B); /* X2 = B^2 */
+ sub_eltfp25519_1w(Z2, A, B); /* Z2 = E = AA-BB */
+
+ mul_a24_eltfp25519_1w(B, Z2); /* B = a24*E */
+ add_eltfp25519_1w_adx(B, B, X2); /* B = a24*E+B */
+ mul_eltfp25519_2w_adx(X2Z2, X2Z2, AB); /* [X2|Z2] = [B|E]*[A|a24*E+B] */
+ mul_eltfp25519_1w_adx(Z3, Z3, X1); /* Z3 = Z3*X1 */
+ --j;
+ }
+ j = 63;
+ }
+
+ inv_eltfp25519_1w_adx(A, Qz);
+ mul_eltfp25519_1w_adx((u64 *)shared, Qx, A);
+ fred_eltfp25519_1w((u64 *)shared);
+
+ memzero_explicit(&m, sizeof(m));
+}
+
+static void curve25519_adx_base(u8 session_key[CURVE25519_KEY_SIZE],
+ const u8 private_key[CURVE25519_KEY_SIZE])
+{
+ struct {
+ u64 buffer[4 * NUM_WORDS_ELTFP25519];
+ u64 coordinates[4 * NUM_WORDS_ELTFP25519];
+ u64 workspace[4 * NUM_WORDS_ELTFP25519];
+ u8 private[CURVE25519_KEY_SIZE];
+ } __aligned(32) m;
+
+ const int ite[4] = { 64, 64, 64, 63 };
+ const int q = 3;
+ u64 swap = 1;
+
+ int i = 0, j = 0, k = 0;
+ u64 *const key = (u64 *)m.private;
+ u64 *const Ur1 = m.coordinates + 0;
+ u64 *const Zr1 = m.coordinates + 4;
+ u64 *const Ur2 = m.coordinates + 8;
+ u64 *const Zr2 = m.coordinates + 12;
+
+ u64 *const UZr1 = m.coordinates + 0;
+ u64 *const ZUr2 = m.coordinates + 8;
+
+ u64 *const A = m.workspace + 0;
+ u64 *const B = m.workspace + 4;
+ u64 *const C = m.workspace + 8;
+ u64 *const D = m.workspace + 12;
+
+ u64 *const AB = m.workspace + 0;
+ u64 *const CD = m.workspace + 8;
+
+ const u64 *const P = table_ladder_8k;
+
+ memcpy(m.private, private_key, sizeof(m.private));
+
+ clamp_secret(m.private);
+
+ setzero_eltfp25519_1w(Ur1);
+ setzero_eltfp25519_1w(Zr1);
+ setzero_eltfp25519_1w(Zr2);
+ Ur1[0] = 1;
+ Zr1[0] = 1;
+ Zr2[0] = 1;
+
+ /* G-S */
+ Ur2[3] = 0x1eaecdeee27cab34UL;
+ Ur2[2] = 0xadc7a0b9235d48e2UL;
+ Ur2[1] = 0xbbf095ae14b2edf8UL;
+ Ur2[0] = 0x7e94e1fec82faabdUL;
+
+ /* main-loop */
+ j = q;
+ for (i = 0; i < NUM_WORDS_ELTFP25519; ++i) {
+ while (j < ite[i]) {
+ u64 bit = (key[i] >> j) & 0x1;
+ k = (64 * i + j - q);
+ swap = swap ^ bit;
+ cswap(swap, Ur1, Ur2);
+ cswap(swap, Zr1, Zr2);
+ swap = bit;
+ /* Addition */
+ sub_eltfp25519_1w(B, Ur1, Zr1); /* B = Ur1-Zr1 */
+ add_eltfp25519_1w_adx(A, Ur1, Zr1); /* A = Ur1+Zr1 */
+ mul_eltfp25519_1w_adx(C, &P[4 * k], B); /* C = M*B */
+ sub_eltfp25519_1w(B, A, C); /* B = (Ur1+Zr1) - M*(Ur1-Zr1) */
+ add_eltfp25519_1w_adx(A, A, C); /* A = (Ur1+Zr1) + M*(Ur1-Zr1) */
+ sqr_eltfp25519_2w_adx(AB); /* A = A^2 | B = B^2 */
+ mul_eltfp25519_2w_adx(UZr1, ZUr2, AB); /* Ur1 = Zr2*A | Zr1 = Ur2*B */
+ ++j;
+ }
+ j = 0;
+ }
+
+ /* Doubling */
+ for (i = 0; i < q; ++i) {
+ add_eltfp25519_1w_adx(A, Ur1, Zr1); /* A = Ur1+Zr1 */
+ sub_eltfp25519_1w(B, Ur1, Zr1); /* B = Ur1-Zr1 */
+ sqr_eltfp25519_2w_adx(AB); /* A = A**2 B = B**2 */
+ copy_eltfp25519_1w(C, B); /* C = B */
+ sub_eltfp25519_1w(B, A, B); /* B = A-B */
+ mul_a24_eltfp25519_1w(D, B); /* D = my_a24*B */
+ add_eltfp25519_1w_adx(D, D, C); /* D = D+C */
+ mul_eltfp25519_2w_adx(UZr1, AB, CD); /* Ur1 = A*C Zr1 = B*D */
+ }
+
+ /* Convert to affine coordinates */
+ inv_eltfp25519_1w_adx(A, Zr1);
+ mul_eltfp25519_1w_adx((u64 *)session_key, Ur1, A);
+ fred_eltfp25519_1w((u64 *)session_key);
+
+ memzero_explicit(&m, sizeof(m));
+}
+
+static void curve25519_bmi2(u8 shared[CURVE25519_KEY_SIZE],
+ const u8 private_key[CURVE25519_KEY_SIZE],
+ const u8 session_key[CURVE25519_KEY_SIZE])
+{
+ struct {
+ u64 buffer[4 * NUM_WORDS_ELTFP25519];
+ u64 coordinates[4 * NUM_WORDS_ELTFP25519];
+ u64 workspace[6 * NUM_WORDS_ELTFP25519];
+ u8 session[CURVE25519_KEY_SIZE];
+ u8 private[CURVE25519_KEY_SIZE];
+ } __aligned(32) m;
+
+ int i = 0, j = 0;
+ u64 prev = 0;
+ u64 *const X1 = (u64 *)m.session;
+ u64 *const key = (u64 *)m.private;
+ u64 *const Px = m.coordinates + 0;
+ u64 *const Pz = m.coordinates + 4;
+ u64 *const Qx = m.coordinates + 8;
+ u64 *const Qz = m.coordinates + 12;
+ u64 *const X2 = Qx;
+ u64 *const Z2 = Qz;
+ u64 *const X3 = Px;
+ u64 *const Z3 = Pz;
+ u64 *const X2Z2 = Qx;
+ u64 *const X3Z3 = Px;
+
+ u64 *const A = m.workspace + 0;
+ u64 *const B = m.workspace + 4;
+ u64 *const D = m.workspace + 8;
+ u64 *const C = m.workspace + 12;
+ u64 *const DA = m.workspace + 16;
+ u64 *const CB = m.workspace + 20;
+ u64 *const AB = A;
+ u64 *const DC = D;
+ u64 *const DACB = DA;
+
+ memcpy(m.private, private_key, sizeof(m.private));
+ memcpy(m.session, session_key, sizeof(m.session));
+
+ clamp_secret(m.private);
+
+ /* As in the draft:
+ * When receiving such an array, implementations of curve25519
+ * MUST mask the most-significant bit in the final byte. This
+ * is done to preserve compatibility with point formats which
+ * reserve the sign bit for use in other protocols and to
+ * increase resistance to implementation fingerprinting
+ */
+ m.session[CURVE25519_KEY_SIZE - 1] &= (1 << (255 % 8)) - 1;
+
+ copy_eltfp25519_1w(Px, X1);
+ setzero_eltfp25519_1w(Pz);
+ setzero_eltfp25519_1w(Qx);
+ setzero_eltfp25519_1w(Qz);
+
+ Pz[0] = 1;
+ Qx[0] = 1;
+
+ /* main-loop */
+ prev = 0;
+ j = 62;
+ for (i = 3; i >= 0; --i) {
+ while (j >= 0) {
+ u64 bit = (key[i] >> j) & 0x1;
+ u64 swap = bit ^ prev;
+ prev = bit;
+
+ add_eltfp25519_1w_bmi2(A, X2, Z2); /* A = (X2+Z2) */
+ sub_eltfp25519_1w(B, X2, Z2); /* B = (X2-Z2) */
+ add_eltfp25519_1w_bmi2(C, X3, Z3); /* C = (X3+Z3) */
+ sub_eltfp25519_1w(D, X3, Z3); /* D = (X3-Z3) */
+ mul_eltfp25519_2w_bmi2(DACB, AB, DC); /* [DA|CB] = [A|B]*[D|C] */
+
+ cselect(swap, A, C);
+ cselect(swap, B, D);
+
+ sqr_eltfp25519_2w_bmi2(AB); /* [AA|BB] = [A^2|B^2] */
+ add_eltfp25519_1w_bmi2(X3, DA, CB); /* X3 = (DA+CB) */
+ sub_eltfp25519_1w(Z3, DA, CB); /* Z3 = (DA-CB) */
+ sqr_eltfp25519_2w_bmi2(X3Z3); /* [X3|Z3] = [(DA+CB)|(DA-CB)]^2 */
+
+ copy_eltfp25519_1w(X2, B); /* X2 = B^2 */
+ sub_eltfp25519_1w(Z2, A, B); /* Z2 = E = AA-BB */
+
+ mul_a24_eltfp25519_1w(B, Z2); /* B = a24*E */
+ add_eltfp25519_1w_bmi2(B, B, X2); /* B = a24*E+B */
+ mul_eltfp25519_2w_bmi2(X2Z2, X2Z2, AB); /* [X2|Z2] = [B|E]*[A|a24*E+B] */
+ mul_eltfp25519_1w_bmi2(Z3, Z3, X1); /* Z3 = Z3*X1 */
+ --j;
+ }
+ j = 63;
+ }
+
+ inv_eltfp25519_1w_bmi2(A, Qz);
+ mul_eltfp25519_1w_bmi2((u64 *)shared, Qx, A);
+ fred_eltfp25519_1w((u64 *)shared);
+
+ memzero_explicit(&m, sizeof(m));
+}
+
+static void curve25519_bmi2_base(u8 session_key[CURVE25519_KEY_SIZE],
+ const u8 private_key[CURVE25519_KEY_SIZE])
+{
+ struct {
+ u64 buffer[4 * NUM_WORDS_ELTFP25519];
+ u64 coordinates[4 * NUM_WORDS_ELTFP25519];
+ u64 workspace[4 * NUM_WORDS_ELTFP25519];
+ u8 private[CURVE25519_KEY_SIZE];
+ } __aligned(32) m;
+
+ const int ite[4] = { 64, 64, 64, 63 };
+ const int q = 3;
+ u64 swap = 1;
+
+ int i = 0, j = 0, k = 0;
+ u64 *const key = (u64 *)m.private;
+ u64 *const Ur1 = m.coordinates + 0;
+ u64 *const Zr1 = m.coordinates + 4;
+ u64 *const Ur2 = m.coordinates + 8;
+ u64 *const Zr2 = m.coordinates + 12;
+
+ u64 *const UZr1 = m.coordinates + 0;
+ u64 *const ZUr2 = m.coordinates + 8;
+
+ u64 *const A = m.workspace + 0;
+ u64 *const B = m.workspace + 4;
+ u64 *const C = m.workspace + 8;
+ u64 *const D = m.workspace + 12;
+
+ u64 *const AB = m.workspace + 0;
+ u64 *const CD = m.workspace + 8;
+
+ const u64 *const P = table_ladder_8k;
+
+ memcpy(m.private, private_key, sizeof(m.private));
+
+ clamp_secret(m.private);
+
+ setzero_eltfp25519_1w(Ur1);
+ setzero_eltfp25519_1w(Zr1);
+ setzero_eltfp25519_1w(Zr2);
+ Ur1[0] = 1;
+ Zr1[0] = 1;
+ Zr2[0] = 1;
+
+ /* G-S */
+ Ur2[3] = 0x1eaecdeee27cab34UL;
+ Ur2[2] = 0xadc7a0b9235d48e2UL;
+ Ur2[1] = 0xbbf095ae14b2edf8UL;
+ Ur2[0] = 0x7e94e1fec82faabdUL;
+
+ /* main-loop */
+ j = q;
+ for (i = 0; i < NUM_WORDS_ELTFP25519; ++i) {
+ while (j < ite[i]) {
+ u64 bit = (key[i] >> j) & 0x1;
+ k = (64 * i + j - q);
+ swap = swap ^ bit;
+ cswap(swap, Ur1, Ur2);
+ cswap(swap, Zr1, Zr2);
+ swap = bit;
+ /* Addition */
+ sub_eltfp25519_1w(B, Ur1, Zr1); /* B = Ur1-Zr1 */
+ add_eltfp25519_1w_bmi2(A, Ur1, Zr1); /* A = Ur1+Zr1 */
+ mul_eltfp25519_1w_bmi2(C, &P[4 * k], B); /* C = M*B */
+ sub_eltfp25519_1w(B, A, C); /* B = (Ur1+Zr1) - M*(Ur1-Zr1) */
+ add_eltfp25519_1w_bmi2(A, A, C); /* A = (Ur1+Zr1) + M*(Ur1-Zr1) */
+ sqr_eltfp25519_2w_bmi2(AB); /* A = A^2 | B = B^2 */
+ mul_eltfp25519_2w_bmi2(UZr1, ZUr2, AB); /* Ur1 = Zr2*A | Zr1 = Ur2*B */
+ ++j;
+ }
+ j = 0;
+ }
+
+ /* Doubling */
+ for (i = 0; i < q; ++i) {
+ add_eltfp25519_1w_bmi2(A, Ur1, Zr1); /* A = Ur1+Zr1 */
+ sub_eltfp25519_1w(B, Ur1, Zr1); /* B = Ur1-Zr1 */
+ sqr_eltfp25519_2w_bmi2(AB); /* A = A**2 B = B**2 */
+ copy_eltfp25519_1w(C, B); /* C = B */
+ sub_eltfp25519_1w(B, A, B); /* B = A-B */
+ mul_a24_eltfp25519_1w(D, B); /* D = my_a24*B */
+ add_eltfp25519_1w_bmi2(D, D, C); /* D = D+C */
+ mul_eltfp25519_2w_bmi2(UZr1, AB, CD); /* Ur1 = A*C Zr1 = B*D */
+ }
+
+ /* Convert to affine coordinates */
+ inv_eltfp25519_1w_bmi2(A, Zr1);
+ mul_eltfp25519_1w_bmi2((u64 *)session_key, Ur1, A);
+ fred_eltfp25519_1w((u64 *)session_key);
+
+ memzero_explicit(&m, sizeof(m));
+}
diff --git a/lib/zinc/curve25519/curve25519.c b/lib/zinc/curve25519/curve25519.c
index 2f613d2a7519..4f9c45ba126d 100644
--- a/lib/zinc/curve25519/curve25519.c
+++ b/lib/zinc/curve25519/curve25519.c
@@ -20,6 +20,9 @@
#include <linux/init.h>
#include <crypto/algapi.h> // For crypto_memneq.
+#if defined(CONFIG_ZINC_ARCH_X86_64)
+#include "curve25519-x86_64-glue.c"
+#else
static bool *const curve25519_nobs[] __initconst = { };
static void __init curve25519_fpu_init(void)
{
@@ -35,6 +38,7 @@ static inline bool curve25519_base_arch(u8 pub[CURVE25519_KEY_SIZE],
{
return false;
}
+#endif
static __always_inline void normalize_secret(u8 secret[CURVE25519_KEY_SIZE])
{
--
2.19.0
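For readers following the x86_64 field arithmetic in the patch above, the
carry folding done by add_eltfp25519_1w_bmi2 and the conditional swap done
by cswap can be modelled in portable C. What follows is only a rough
sketch, not code from this series: the names add_model, cswap_model and
model_selftest are hypothetical, unsigned __int128 is assumed to be
available (a GCC/Clang extension), and the swap bit is assumed to be 0 or
1. It relies on the same identity as the assembly, 2^256 = 38 mod 2^255-19,
so a carry out of the top limb is folded back in as 38.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef unsigned __int128 u128; /* assumption: GCC/Clang extension */

/*
 * Hypothetical model of add_eltfp25519_1w_bmi2: c = a + b, kept below
 * 2^256 by folding each carry out of the top limb back in as 38.
 * The temporary t[] lets c alias a or b, as the callers in the patch do.
 */
static void add_model(uint64_t c[4], const uint64_t a[4], const uint64_t b[4])
{
	uint64_t t[4], carry;
	u128 acc = 0;
	int i;

	for (i = 0; i < 4; ++i) {	/* 4-limb add with a carry chain */
		acc += (u128)a[i] + b[i];
		t[i] = (uint64_t)acc;
		acc >>= 64;
	}
	carry = (uint64_t)acc;		/* 0 or 1 */

	acc = 38 * carry;		/* first fold, may carry out again */
	for (i = 0; i < 4; ++i) {
		acc += t[i];
		t[i] = (uint64_t)acc;
		acc >>= 64;
	}
	carry = (uint64_t)acc;

	/* Second fold: if the first fold carried, t[0] < 38, so no overflow. */
	t[0] += 38 * carry;
	memcpy(c, t, sizeof(t));
}

/* Hypothetical branch-free model of cswap: exchange x and y iff bit == 1. */
static void cswap_model(uint64_t bit, uint64_t x[4], uint64_t y[4])
{
	uint64_t mask = 0 - (bit & 1);	/* all ones if bit is set, else zero */
	int i;

	for (i = 0; i < 4; ++i) {
		uint64_t d = mask & (x[i] ^ y[i]);

		x[i] ^= d;
		y[i] ^= d;
	}
}

static void model_selftest(void)
{
	const uint64_t ones[4] = { ~0ULL, ~0ULL, ~0ULL, ~0ULL }; /* 2^256 - 1 */
	const uint64_t one[4] = { 1, 0, 0, 0 };
	uint64_t r[4], s[4] = { 5, 6, 7, 8 };

	add_model(r, ones, one);	/* 2^256 folds to 38 */
	assert(r[0] == 38 && !r[1] && !r[2] && !r[3]);

	cswap_model(1, r, s);		/* bit set: contents are exchanged */
	assert(r[0] == 5 && s[0] == 38);
	cswap_model(0, r, s);		/* bit clear: nothing moves */
	assert(r[0] == 5 && s[0] == 38);
}
```

Like the assembly, add_model only keeps the result below 2^256; it does not
produce the canonical representative, which is the job of
fred_eltfp25519_1w. The mask-based swap is one common way to avoid a
secret-dependent branch in C; the patch itself uses cmovnz, and a compiler
is free to lower the C version differently.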
* [PATCH net-next v7 23/28] zinc: import Bernstein and Schwabe's Curve25519 ARM implementation
From: Jason A. Donenfeld @ 2018-10-06 2:57 UTC (permalink / raw)
To: linux-kernel, netdev, davem, gregkh
Cc: Jason A. Donenfeld, Russell King, linux-arm-kernel, Samuel Neves,
Jean-Philippe Aumasson, Andy Lutomirski, Andrew Morton,
Linus Torvalds, kernel-hardening, linux-crypto
This comes from Dan Bernstein and Peter Schwabe's public domain NEON
code, and is included here in raw form so that the subsequent commit,
which fixes it up for the kernel, shows exactly what has changed. The
code here does have some entirely cosmetic formatting differences from
upstream, such as added indentation, so that when it is actually ported
for use in the kernel, the resulting diff makes the real changes obvious.
This code originates from SUPERCOP 20180818, available at
<https://bench.cr.yp.to/supercop.html>.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Samuel Neves <sneves@dei.uc.pt>
Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: kernel-hardening@lists.openwall.com
Cc: linux-crypto@vger.kernel.org
---
lib/zinc/curve25519/curve25519-arm-supercop.S | 2105 +++++++++++++++++
1 file changed, 2105 insertions(+)
create mode 100644 lib/zinc/curve25519/curve25519-arm-supercop.S
diff --git a/lib/zinc/curve25519/curve25519-arm-supercop.S b/lib/zinc/curve25519/curve25519-arm-supercop.S
new file mode 100644
index 000000000000..f33b85fef382
--- /dev/null
+++ b/lib/zinc/curve25519/curve25519-arm-supercop.S
@@ -0,0 +1,2105 @@
+/*
+ * Public domain code from Daniel J. Bernstein and Peter Schwabe, from
+ * SUPERCOP's curve25519/neon2/scalarmult.s.
+ */
+
+.fpu neon
+.text
+.align 4
+.global _crypto_scalarmult_curve25519_neon2
+.global crypto_scalarmult_curve25519_neon2
+.type _crypto_scalarmult_curve25519_neon2 STT_FUNC
+.type crypto_scalarmult_curve25519_neon2 STT_FUNC
+ _crypto_scalarmult_curve25519_neon2:
+ crypto_scalarmult_curve25519_neon2:
+ vpush {q4, q5, q6, q7}
+ mov r12, sp
+ sub sp, sp, #736
+ and sp, sp, #0xffffffe0
+ strd r4, [sp, #0]
+ strd r6, [sp, #8]
+ strd r8, [sp, #16]
+ strd r10, [sp, #24]
+ str r12, [sp, #480]
+ str r14, [sp, #484]
+ mov r0, r0
+ mov r1, r1
+ mov r2, r2
+ add r3, sp, #32
+ ldr r4, =0
+ ldr r5, =254
+ vmov.i32 q0, #1
+ vshr.u64 q1, q0, #7
+ vshr.u64 q0, q0, #8
+ vmov.i32 d4, #19
+ vmov.i32 d5, #38
+ add r6, sp, #512
+ vst1.8 {d2-d3}, [r6, : 128]
+ add r6, sp, #528
+ vst1.8 {d0-d1}, [r6, : 128]
+ add r6, sp, #544
+ vst1.8 {d4-d5}, [r6, : 128]
+ add r6, r3, #0
+ vmov.i32 q2, #0
+ vst1.8 {d4-d5}, [r6, : 128]!
+ vst1.8 {d4-d5}, [r6, : 128]!
+ vst1.8 d4, [r6, : 64]
+ add r6, r3, #0
+ ldr r7, =960
+ sub r7, r7, #2
+ neg r7, r7
+ sub r7, r7, r7, LSL #7 /* r7 = -958 - (-958 << 7) = 121666 = (486662 + 2) / 4 */
+ str r7, [r6]
+ add r6, sp, #704
+ vld1.8 {d4-d5}, [r1]!
+ vld1.8 {d6-d7}, [r1]
+ vst1.8 {d4-d5}, [r6, : 128]!
+ vst1.8 {d6-d7}, [r6, : 128]
+ sub r1, r6, #16
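+ ldrb r6, [r1] /* clamp the stack copy of the scalar: byte 0 */
+ and r6, r6, #248 /* clear the low three bits */
+ strb r6, [r1]
+ ldrb r6, [r1, #31] /* byte 31 */
+ and r6, r6, #127 /* clear bit 255 */
+ orr r6, r6, #64 /* set bit 254 */
+ strb r6, [r1, #31]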
+ vmov.i64 q2, #0xffffffff
+ vshr.u64 q3, q2, #7
+ vshr.u64 q2, q2, #6
+ vld1.8 {d8}, [r2]
+ vld1.8 {d10}, [r2]
+ add r2, r2, #6
+ vld1.8 {d12}, [r2]
+ vld1.8 {d14}, [r2]
+ add r2, r2, #6
+ vld1.8 {d16}, [r2]
+ add r2, r2, #4
+ vld1.8 {d18}, [r2]
+ vld1.8 {d20}, [r2]
+ add r2, r2, #6
+ vld1.8 {d22}, [r2]
+ add r2, r2, #2
+ vld1.8 {d24}, [r2]
+ vld1.8 {d26}, [r2]
+ vshr.u64 q5, q5, #26
+ vshr.u64 q6, q6, #3
+ vshr.u64 q7, q7, #29
+ vshr.u64 q8, q8, #6
+ vshr.u64 q10, q10, #25
+ vshr.u64 q11, q11, #3
+ vshr.u64 q12, q12, #12
+ vshr.u64 q13, q13, #38
+ vand q4, q4, q2
+ vand q6, q6, q2
+ vand q8, q8, q2
+ vand q10, q10, q2
+ vand q2, q12, q2
+ vand q5, q5, q3
+ vand q7, q7, q3
+ vand q9, q9, q3
+ vand q11, q11, q3
+ vand q3, q13, q3
+ add r2, r3, #48
+ vadd.i64 q12, q4, q1
+ vadd.i64 q13, q10, q1
+ vshr.s64 q12, q12, #26
+ vshr.s64 q13, q13, #26
+ vadd.i64 q5, q5, q12
+ vshl.i64 q12, q12, #26
+ vadd.i64 q14, q5, q0
+ vadd.i64 q11, q11, q13
+ vshl.i64 q13, q13, #26
+ vadd.i64 q15, q11, q0
+ vsub.i64 q4, q4, q12
+ vshr.s64 q12, q14, #25
+ vsub.i64 q10, q10, q13
+ vshr.s64 q13, q15, #25
+ vadd.i64 q6, q6, q12
+ vshl.i64 q12, q12, #25
+ vadd.i64 q14, q6, q1
+ vadd.i64 q2, q2, q13
+ vsub.i64 q5, q5, q12
+ vshr.s64 q12, q14, #26
+ vshl.i64 q13, q13, #25
+ vadd.i64 q14, q2, q1
+ vadd.i64 q7, q7, q12
+ vshl.i64 q12, q12, #26
+ vadd.i64 q15, q7, q0
+ vsub.i64 q11, q11, q13
+ vshr.s64 q13, q14, #26
+ vsub.i64 q6, q6, q12
+ vshr.s64 q12, q15, #25
+ vadd.i64 q3, q3, q13
+ vshl.i64 q13, q13, #26
+ vadd.i64 q14, q3, q0
+ vadd.i64 q8, q8, q12
+ vshl.i64 q12, q12, #25
+ vadd.i64 q15, q8, q1
+ add r2, r2, #8
+ vsub.i64 q2, q2, q13
+ vshr.s64 q13, q14, #25
+ vsub.i64 q7, q7, q12
+ vshr.s64 q12, q15, #26
+ vadd.i64 q14, q13, q13
+ vadd.i64 q9, q9, q12
+ vtrn.32 d12, d14
+ vshl.i64 q12, q12, #26
+ vtrn.32 d13, d15
+ vadd.i64 q0, q9, q0
+ vadd.i64 q4, q4, q14
+ vst1.8 d12, [r2, : 64]!
+ vshl.i64 q6, q13, #4
+ vsub.i64 q7, q8, q12
+ vshr.s64 q0, q0, #25
+ vadd.i64 q4, q4, q6
+ vadd.i64 q6, q10, q0
+ vshl.i64 q0, q0, #25
+ vadd.i64 q8, q6, q1
+ vadd.i64 q4, q4, q13
+ vshl.i64 q10, q13, #25
+ vadd.i64 q1, q4, q1
+ vsub.i64 q0, q9, q0
+ vshr.s64 q8, q8, #26
+ vsub.i64 q3, q3, q10
+ vtrn.32 d14, d0
+ vshr.s64 q1, q1, #26
+ vtrn.32 d15, d1
+ vadd.i64 q0, q11, q8
+ vst1.8 d14, [r2, : 64]
+ vshl.i64 q7, q8, #26
+ vadd.i64 q5, q5, q1
+ vtrn.32 d4, d6
+ vshl.i64 q1, q1, #26
+ vtrn.32 d5, d7
+ vsub.i64 q3, q6, q7
+ add r2, r2, #16
+ vsub.i64 q1, q4, q1
+ vst1.8 d4, [r2, : 64]
+ vtrn.32 d6, d0
+ vtrn.32 d7, d1
+ sub r2, r2, #8
+ vtrn.32 d2, d10
+ vtrn.32 d3, d11
+ vst1.8 d6, [r2, : 64]
+ sub r2, r2, #24
+ vst1.8 d2, [r2, : 64]
+ add r2, r3, #96
+ vmov.i32 q0, #0
+ vmov.i64 d2, #0xff
+ vmov.i64 d3, #0
+ vshr.u32 q1, q1, #7
+ vst1.8 {d2-d3}, [r2, : 128]!
+ vst1.8 {d0-d1}, [r2, : 128]!
+ vst1.8 d0, [r2, : 64]
+ add r2, r3, #144
+ vmov.i32 q0, #0
+ vst1.8 {d0-d1}, [r2, : 128]!
+ vst1.8 {d0-d1}, [r2, : 128]!
+ vst1.8 d0, [r2, : 64]
+ add r2, r3, #240
+ vmov.i32 q0, #0
+ vmov.i64 d2, #0xff
+ vmov.i64 d3, #0
+ vshr.u32 q1, q1, #7
+ vst1.8 {d2-d3}, [r2, : 128]!
+ vst1.8 {d0-d1}, [r2, : 128]!
+ vst1.8 d0, [r2, : 64]
+ add r2, r3, #48
+ add r6, r3, #192
+ vld1.8 {d0-d1}, [r2, : 128]!
+ vld1.8 {d2-d3}, [r2, : 128]!
+ vld1.8 {d4}, [r2, : 64]
+ vst1.8 {d0-d1}, [r6, : 128]!
+ vst1.8 {d2-d3}, [r6, : 128]!
+ vst1.8 d4, [r6, : 64]
+._mainloop:
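+ mov r2, r5, LSR #3 /* r2 = pos / 8, byte index into the scalar */
+ and r6, r5, #7 /* r6 = pos % 8, bit index within that byte */
+ ldrb r2, [r1, r2]
+ mov r2, r2, LSR r6
+ and r2, r2, #1 /* r2 = current scalar bit */
+ str r5, [sp, #488] /* save pos */
+ eor r4, r4, r2 /* r4 = previous bit ^ current bit */
+ str r2, [sp, #492] /* save the current bit */
+ neg r2, r4 /* r2 = swap mask: 0 or 0xffffffff */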
+ add r4, r3, #96
+ add r5, r3, #192
+ add r6, r3, #144
+ vld1.8 {d8-d9}, [r4, : 128]!
+ add r7, r3, #240
+ vld1.8 {d10-d11}, [r5, : 128]!
+ veor q6, q4, q5
+ vld1.8 {d14-d15}, [r6, : 128]!
+ vdup.i32 q8, r2 /* broadcast the swap mask to all four lanes */
+ vld1.8 {d18-d19}, [r7, : 128]!
+ veor q10, q7, q9
+ vld1.8 {d22-d23}, [r4, : 128]!
+ vand q6, q6, q8
+ vld1.8 {d24-d25}, [r5, : 128]!
+ vand q10, q10, q8
+ vld1.8 {d26-d27}, [r6, : 128]!
+ veor q4, q4, q6
+ vld1.8 {d28-d29}, [r7, : 128]!
+ veor q5, q5, q6
+ vld1.8 {d0}, [r4, : 64]
+ veor q6, q7, q10
+ vld1.8 {d2}, [r5, : 64]
+ veor q7, q9, q10
+ vld1.8 {d4}, [r6, : 64]
+ veor q9, q11, q12
+ vld1.8 {d6}, [r7, : 64]
+ veor q10, q0, q1
+ sub r2, r4, #32
+ vand q9, q9, q8
+ sub r4, r5, #32
+ vand q10, q10, q8
+ sub r5, r6, #32
+ veor q11, q11, q9
+ sub r6, r7, #32
+ veor q0, q0, q10
+ veor q9, q12, q9
+ veor q1, q1, q10
+ veor q10, q13, q14
+ veor q12, q2, q3
+ vand q10, q10, q8
+ vand q8, q12, q8
+ veor q12, q13, q10
+ veor q2, q2, q8
+ veor q10, q14, q10
+ veor q3, q3, q8
+ vadd.i32 q8, q4, q6
+ vsub.i32 q4, q4, q6
+ vst1.8 {d16-d17}, [r2, : 128]!
+ vadd.i32 q6, q11, q12
+ vst1.8 {d8-d9}, [r5, : 128]!
+ vsub.i32 q4, q11, q12
+ vst1.8 {d12-d13}, [r2, : 128]!
+ vadd.i32 q6, q0, q2
+ vst1.8 {d8-d9}, [r5, : 128]!
+ vsub.i32 q0, q0, q2
+ vst1.8 d12, [r2, : 64]
+ vadd.i32 q2, q5, q7
+ vst1.8 d0, [r5, : 64]
+ vsub.i32 q0, q5, q7
+ vst1.8 {d4-d5}, [r4, : 128]!
+ vadd.i32 q2, q9, q10
+ vst1.8 {d0-d1}, [r6, : 128]!
+ vsub.i32 q0, q9, q10
+ vst1.8 {d4-d5}, [r4, : 128]!
+ vadd.i32 q2, q1, q3
+ vst1.8 {d0-d1}, [r6, : 128]!
+ vsub.i32 q0, q1, q3
+ vst1.8 d4, [r4, : 64]
+ vst1.8 d0, [r6, : 64]
+ add r2, sp, #544
+ add r4, r3, #96
+ add r5, r3, #144
+ vld1.8 {d0-d1}, [r2, : 128]
+ vld1.8 {d2-d3}, [r4, : 128]!
+ vld1.8 {d4-d5}, [r5, : 128]!
+ vzip.i32 q1, q2
+ vld1.8 {d6-d7}, [r4, : 128]!
+ vld1.8 {d8-d9}, [r5, : 128]!
+ vshl.i32 q5, q1, #1
+ vzip.i32 q3, q4
+ vshl.i32 q6, q2, #1
+ vld1.8 {d14}, [r4, : 64]
+ vshl.i32 q8, q3, #1
+ vld1.8 {d15}, [r5, : 64]
+ vshl.i32 q9, q4, #1
+ vmul.i32 d21, d7, d1
+ vtrn.32 d14, d15
+ vmul.i32 q11, q4, q0
+ vmul.i32 q0, q7, q0
+ vmull.s32 q12, d2, d2
+ vmlal.s32 q12, d11, d1
+ vmlal.s32 q12, d12, d0
+ vmlal.s32 q12, d13, d23
+ vmlal.s32 q12, d16, d22
+ vmlal.s32 q12, d7, d21
+ vmull.s32 q10, d2, d11
+ vmlal.s32 q10, d4, d1
+ vmlal.s32 q10, d13, d0
+ vmlal.s32 q10, d6, d23
+ vmlal.s32 q10, d17, d22
+ vmull.s32 q13, d10, d4
+ vmlal.s32 q13, d11, d3
+ vmlal.s32 q13, d13, d1
+ vmlal.s32 q13, d16, d0
+ vmlal.s32 q13, d17, d23
+ vmlal.s32 q13, d8, d22
+ vmull.s32 q1, d10, d5
+ vmlal.s32 q1, d11, d4
+ vmlal.s32 q1, d6, d1
+ vmlal.s32 q1, d17, d0
+ vmlal.s32 q1, d8, d23
+ vmull.s32 q14, d10, d6
+ vmlal.s32 q14, d11, d13
+ vmlal.s32 q14, d4, d4
+ vmlal.s32 q14, d17, d1
+ vmlal.s32 q14, d18, d0
+ vmlal.s32 q14, d9, d23
+ vmull.s32 q11, d10, d7
+ vmlal.s32 q11, d11, d6
+ vmlal.s32 q11, d12, d5
+ vmlal.s32 q11, d8, d1
+ vmlal.s32 q11, d19, d0
+ vmull.s32 q15, d10, d8
+ vmlal.s32 q15, d11, d17
+ vmlal.s32 q15, d12, d6
+ vmlal.s32 q15, d13, d5
+ vmlal.s32 q15, d19, d1
+ vmlal.s32 q15, d14, d0
+ vmull.s32 q2, d10, d9
+ vmlal.s32 q2, d11, d8
+ vmlal.s32 q2, d12, d7
+ vmlal.s32 q2, d13, d6
+ vmlal.s32 q2, d14, d1
+ vmull.s32 q0, d15, d1
+ vmlal.s32 q0, d10, d14
+ vmlal.s32 q0, d11, d19
+ vmlal.s32 q0, d12, d8
+ vmlal.s32 q0, d13, d17
+ vmlal.s32 q0, d6, d6
+ add r2, sp, #512
+ vld1.8 {d18-d19}, [r2, : 128]
+ vmull.s32 q3, d16, d7
+ vmlal.s32 q3, d10, d15
+ vmlal.s32 q3, d11, d14
+ vmlal.s32 q3, d12, d9
+ vmlal.s32 q3, d13, d8
+ add r2, sp, #528
+ vld1.8 {d8-d9}, [r2, : 128]
+ vadd.i64 q5, q12, q9
+ vadd.i64 q6, q15, q9
+ vshr.s64 q5, q5, #26
+ vshr.s64 q6, q6, #26
+ vadd.i64 q7, q10, q5
+ vshl.i64 q5, q5, #26
+ vadd.i64 q8, q7, q4
+ vadd.i64 q2, q2, q6
+ vshl.i64 q6, q6, #26
+ vadd.i64 q10, q2, q4
+ vsub.i64 q5, q12, q5
+ vshr.s64 q8, q8, #25
+ vsub.i64 q6, q15, q6
+ vshr.s64 q10, q10, #25
+ vadd.i64 q12, q13, q8
+ vshl.i64 q8, q8, #25
+ vadd.i64 q13, q12, q9
+ vadd.i64 q0, q0, q10
+ vsub.i64 q7, q7, q8
+ vshr.s64 q8, q13, #26
+ vshl.i64 q10, q10, #25
+ vadd.i64 q13, q0, q9
+ vadd.i64 q1, q1, q8
+ vshl.i64 q8, q8, #26
+ vadd.i64 q15, q1, q4
+ vsub.i64 q2, q2, q10
+ vshr.s64 q10, q13, #26
+ vsub.i64 q8, q12, q8
+ vshr.s64 q12, q15, #25
+ vadd.i64 q3, q3, q10
+ vshl.i64 q10, q10, #26
+ vadd.i64 q13, q3, q4
+ vadd.i64 q14, q14, q12
+ add r2, r3, #288
+ vshl.i64 q12, q12, #25
+ add r4, r3, #336
+ vadd.i64 q15, q14, q9
+ add r2, r2, #8
+ vsub.i64 q0, q0, q10
+ add r4, r4, #8
+ vshr.s64 q10, q13, #25
+ vsub.i64 q1, q1, q12
+ vshr.s64 q12, q15, #26
+ vadd.i64 q13, q10, q10
+ vadd.i64 q11, q11, q12
+ vtrn.32 d16, d2
+ vshl.i64 q12, q12, #26
+ vtrn.32 d17, d3
+ vadd.i64 q1, q11, q4
+ vadd.i64 q4, q5, q13
+ vst1.8 d16, [r2, : 64]!
+ vshl.i64 q5, q10, #4
+ vst1.8 d17, [r4, : 64]!
+ vsub.i64 q8, q14, q12
+ vshr.s64 q1, q1, #25
+ vadd.i64 q4, q4, q5
+ vadd.i64 q5, q6, q1
+ vshl.i64 q1, q1, #25
+ vadd.i64 q6, q5, q9
+ vadd.i64 q4, q4, q10
+ vshl.i64 q10, q10, #25
+ vadd.i64 q9, q4, q9
+ vsub.i64 q1, q11, q1
+ vshr.s64 q6, q6, #26
+ vsub.i64 q3, q3, q10
+ vtrn.32 d16, d2
+ vshr.s64 q9, q9, #26
+ vtrn.32 d17, d3
+ vadd.i64 q1, q2, q6
+ vst1.8 d16, [r2, : 64]
+ vshl.i64 q2, q6, #26
+ vst1.8 d17, [r4, : 64]
+ vadd.i64 q6, q7, q9
+ vtrn.32 d0, d6
+ vshl.i64 q7, q9, #26
+ vtrn.32 d1, d7
+ vsub.i64 q2, q5, q2
+ add r2, r2, #16
+ vsub.i64 q3, q4, q7
+ vst1.8 d0, [r2, : 64]
+ add r4, r4, #16
+ vst1.8 d1, [r4, : 64]
+ vtrn.32 d4, d2
+ vtrn.32 d5, d3
+ sub r2, r2, #8
+ sub r4, r4, #8
+ vtrn.32 d6, d12
+ vtrn.32 d7, d13
+ vst1.8 d4, [r2, : 64]
+ vst1.8 d5, [r4, : 64]
+ sub r2, r2, #24
+ sub r4, r4, #24
+ vst1.8 d6, [r2, : 64]
+ vst1.8 d7, [r4, : 64]
+ add r2, r3, #240
+ add r4, r3, #96
+ vld1.8 {d0-d1}, [r4, : 128]!
+ vld1.8 {d2-d3}, [r4, : 128]!
+ vld1.8 {d4}, [r4, : 64]
+ add r4, r3, #144
+ vld1.8 {d6-d7}, [r4, : 128]!
+ vtrn.32 q0, q3
+ vld1.8 {d8-d9}, [r4, : 128]!
+ vshl.i32 q5, q0, #4
+ vtrn.32 q1, q4
+ vshl.i32 q6, q3, #4
+ vadd.i32 q5, q5, q0
+ vadd.i32 q6, q6, q3
+ vshl.i32 q7, q1, #4
+ vld1.8 {d5}, [r4, : 64]
+ vshl.i32 q8, q4, #4
+ vtrn.32 d4, d5
+ vadd.i32 q7, q7, q1
+ vadd.i32 q8, q8, q4
+ vld1.8 {d18-d19}, [r2, : 128]!
+ vshl.i32 q10, q2, #4
+ vld1.8 {d22-d23}, [r2, : 128]!
+ vadd.i32 q10, q10, q2
+ vld1.8 {d24}, [r2, : 64]
+ vadd.i32 q5, q5, q0
+ add r2, r3, #192
+ vld1.8 {d26-d27}, [r2, : 128]!
+ vadd.i32 q6, q6, q3
+ vld1.8 {d28-d29}, [r2, : 128]!
+ vadd.i32 q8, q8, q4
+ vld1.8 {d25}, [r2, : 64]
+ vadd.i32 q10, q10, q2
+ vtrn.32 q9, q13
+ vadd.i32 q7, q7, q1
+ vadd.i32 q5, q5, q0
+ vtrn.32 q11, q14
+ vadd.i32 q6, q6, q3
+ add r2, sp, #560
+ vadd.i32 q10, q10, q2
+ vtrn.32 d24, d25
+ vst1.8 {d12-d13}, [r2, : 128]
+ vshl.i32 q6, q13, #1
+ add r2, sp, #576
+ vst1.8 {d20-d21}, [r2, : 128]
+ vshl.i32 q10, q14, #1
+ add r2, sp, #592
+ vst1.8 {d12-d13}, [r2, : 128]
+ vshl.i32 q15, q12, #1
+ vadd.i32 q8, q8, q4
+ vext.32 d10, d31, d30, #0
+ vadd.i32 q7, q7, q1
+ add r2, sp, #608
+ vst1.8 {d16-d17}, [r2, : 128]
+ vmull.s32 q8, d18, d5
+ vmlal.s32 q8, d26, d4
+ vmlal.s32 q8, d19, d9
+ vmlal.s32 q8, d27, d3
+ vmlal.s32 q8, d22, d8
+ vmlal.s32 q8, d28, d2
+ vmlal.s32 q8, d23, d7
+ vmlal.s32 q8, d29, d1
+ vmlal.s32 q8, d24, d6
+ vmlal.s32 q8, d25, d0
+ add r2, sp, #624
+ vst1.8 {d14-d15}, [r2, : 128]
+ vmull.s32 q2, d18, d4
+ vmlal.s32 q2, d12, d9
+ vmlal.s32 q2, d13, d8
+ vmlal.s32 q2, d19, d3
+ vmlal.s32 q2, d22, d2
+ vmlal.s32 q2, d23, d1
+ vmlal.s32 q2, d24, d0
+ add r2, sp, #640
+ vst1.8 {d20-d21}, [r2, : 128]
+ vmull.s32 q7, d18, d9
+ vmlal.s32 q7, d26, d3
+ vmlal.s32 q7, d19, d8
+ vmlal.s32 q7, d27, d2
+ vmlal.s32 q7, d22, d7
+ vmlal.s32 q7, d28, d1
+ vmlal.s32 q7, d23, d6
+ vmlal.s32 q7, d29, d0
+ add r2, sp, #656
+ vst1.8 {d10-d11}, [r2, : 128]
+ vmull.s32 q5, d18, d3
+ vmlal.s32 q5, d19, d2
+ vmlal.s32 q5, d22, d1
+ vmlal.s32 q5, d23, d0
+ vmlal.s32 q5, d12, d8
+ add r2, sp, #672
+ vst1.8 {d16-d17}, [r2, : 128]
+ vmull.s32 q4, d18, d8
+ vmlal.s32 q4, d26, d2
+ vmlal.s32 q4, d19, d7
+ vmlal.s32 q4, d27, d1
+ vmlal.s32 q4, d22, d6
+ vmlal.s32 q4, d28, d0
+ vmull.s32 q8, d18, d7
+ vmlal.s32 q8, d26, d1
+ vmlal.s32 q8, d19, d6
+ vmlal.s32 q8, d27, d0
+ add r2, sp, #576
+ vld1.8 {d20-d21}, [r2, : 128]
+ vmlal.s32 q7, d24, d21
+ vmlal.s32 q7, d25, d20
+ vmlal.s32 q4, d23, d21
+ vmlal.s32 q4, d29, d20
+ vmlal.s32 q8, d22, d21
+ vmlal.s32 q8, d28, d20
+ vmlal.s32 q5, d24, d20
+ add r2, sp, #576
+ vst1.8 {d14-d15}, [r2, : 128]
+ vmull.s32 q7, d18, d6
+ vmlal.s32 q7, d26, d0
+ add r2, sp, #656
+ vld1.8 {d30-d31}, [r2, : 128]
+ vmlal.s32 q2, d30, d21
+ vmlal.s32 q7, d19, d21
+ vmlal.s32 q7, d27, d20
+ add r2, sp, #624
+ vld1.8 {d26-d27}, [r2, : 128]
+ vmlal.s32 q4, d25, d27
+ vmlal.s32 q8, d29, d27
+ vmlal.s32 q8, d25, d26
+ vmlal.s32 q7, d28, d27
+ vmlal.s32 q7, d29, d26
+ add r2, sp, #608
+ vld1.8 {d28-d29}, [r2, : 128]
+ vmlal.s32 q4, d24, d29
+ vmlal.s32 q8, d23, d29
+ vmlal.s32 q8, d24, d28
+ vmlal.s32 q7, d22, d29
+ vmlal.s32 q7, d23, d28
+ add r2, sp, #608
+ vst1.8 {d8-d9}, [r2, : 128]
+ add r2, sp, #560
+ vld1.8 {d8-d9}, [r2, : 128]
+ vmlal.s32 q7, d24, d9
+ vmlal.s32 q7, d25, d31
+ vmull.s32 q1, d18, d2
+ vmlal.s32 q1, d19, d1
+ vmlal.s32 q1, d22, d0
+ vmlal.s32 q1, d24, d27
+ vmlal.s32 q1, d23, d20
+ vmlal.s32 q1, d12, d7
+ vmlal.s32 q1, d13, d6
+ vmull.s32 q6, d18, d1
+ vmlal.s32 q6, d19, d0
+ vmlal.s32 q6, d23, d27
+ vmlal.s32 q6, d22, d20
+ vmlal.s32 q6, d24, d26
+ vmull.s32 q0, d18, d0
+ vmlal.s32 q0, d22, d27
+ vmlal.s32 q0, d23, d26
+ vmlal.s32 q0, d24, d31
+ vmlal.s32 q0, d19, d20
+ add r2, sp, #640
+ vld1.8 {d18-d19}, [r2, : 128]
+ vmlal.s32 q2, d18, d7
+ vmlal.s32 q2, d19, d6
+ vmlal.s32 q5, d18, d6
+ vmlal.s32 q5, d19, d21
+ vmlal.s32 q1, d18, d21
+ vmlal.s32 q1, d19, d29
+ vmlal.s32 q0, d18, d28
+ vmlal.s32 q0, d19, d9
+ vmlal.s32 q6, d18, d29
+ vmlal.s32 q6, d19, d28
+ add r2, sp, #592
+ vld1.8 {d18-d19}, [r2, : 128]
+ add r2, sp, #512
+ vld1.8 {d22-d23}, [r2, : 128]
+ vmlal.s32 q5, d19, d7
+ vmlal.s32 q0, d18, d21
+ vmlal.s32 q0, d19, d29
+ vmlal.s32 q6, d18, d6
+ add r2, sp, #528
+ vld1.8 {d6-d7}, [r2, : 128]
+ vmlal.s32 q6, d19, d21
+ add r2, sp, #576
+ vld1.8 {d18-d19}, [r2, : 128]
+ vmlal.s32 q0, d30, d8
+ add r2, sp, #672
+ vld1.8 {d20-d21}, [r2, : 128]
+ vmlal.s32 q5, d30, d29
+ add r2, sp, #608
+ vld1.8 {d24-d25}, [r2, : 128]
+ vmlal.s32 q1, d30, d28
+ vadd.i64 q13, q0, q11
+ vadd.i64 q14, q5, q11
+ vmlal.s32 q6, d30, d9
+ vshr.s64 q4, q13, #26
+ vshr.s64 q13, q14, #26
+ vadd.i64 q7, q7, q4
+ vshl.i64 q4, q4, #26
+ vadd.i64 q14, q7, q3
+ vadd.i64 q9, q9, q13
+ vshl.i64 q13, q13, #26
+ vadd.i64 q15, q9, q3
+ vsub.i64 q0, q0, q4
+ vshr.s64 q4, q14, #25
+ vsub.i64 q5, q5, q13
+ vshr.s64 q13, q15, #25
+ vadd.i64 q6, q6, q4
+ vshl.i64 q4, q4, #25
+ vadd.i64 q14, q6, q11
+ vadd.i64 q2, q2, q13
+ vsub.i64 q4, q7, q4
+ vshr.s64 q7, q14, #26
+ vshl.i64 q13, q13, #25
+ vadd.i64 q14, q2, q11
+ vadd.i64 q8, q8, q7
+ vshl.i64 q7, q7, #26
+ vadd.i64 q15, q8, q3
+ vsub.i64 q9, q9, q13
+ vshr.s64 q13, q14, #26
+ vsub.i64 q6, q6, q7
+ vshr.s64 q7, q15, #25
+ vadd.i64 q10, q10, q13
+ vshl.i64 q13, q13, #26
+ vadd.i64 q14, q10, q3
+ vadd.i64 q1, q1, q7
+ add r2, r3, #144
+ vshl.i64 q7, q7, #25
+ add r4, r3, #96
+ vadd.i64 q15, q1, q11
+ add r2, r2, #8
+ vsub.i64 q2, q2, q13
+ add r4, r4, #8
+ vshr.s64 q13, q14, #25
+ vsub.i64 q7, q8, q7
+ vshr.s64 q8, q15, #26
+ vadd.i64 q14, q13, q13
+ vadd.i64 q12, q12, q8
+ vtrn.32 d12, d14
+ vshl.i64 q8, q8, #26
+ vtrn.32 d13, d15
+ vadd.i64 q3, q12, q3
+ vadd.i64 q0, q0, q14
+ vst1.8 d12, [r2, : 64]!
+ vshl.i64 q7, q13, #4
+ vst1.8 d13, [r4, : 64]!
+ vsub.i64 q1, q1, q8
+ vshr.s64 q3, q3, #25
+ vadd.i64 q0, q0, q7
+ vadd.i64 q5, q5, q3
+ vshl.i64 q3, q3, #25
+ vadd.i64 q6, q5, q11
+ vadd.i64 q0, q0, q13
+ vshl.i64 q7, q13, #25
+ vadd.i64 q8, q0, q11
+ vsub.i64 q3, q12, q3
+ vshr.s64 q6, q6, #26
+ vsub.i64 q7, q10, q7
+ vtrn.32 d2, d6
+ vshr.s64 q8, q8, #26
+ vtrn.32 d3, d7
+ vadd.i64 q3, q9, q6
+ vst1.8 d2, [r2, : 64]
+ vshl.i64 q6, q6, #26
+ vst1.8 d3, [r4, : 64]
+ vadd.i64 q1, q4, q8
+ vtrn.32 d4, d14
+ vshl.i64 q4, q8, #26
+ vtrn.32 d5, d15
+ vsub.i64 q5, q5, q6
+ add r2, r2, #16
+ vsub.i64 q0, q0, q4
+ vst1.8 d4, [r2, : 64]
+ add r4, r4, #16
+ vst1.8 d5, [r4, : 64]
+ vtrn.32 d10, d6
+ vtrn.32 d11, d7
+ sub r2, r2, #8
+ sub r4, r4, #8
+ vtrn.32 d0, d2
+ vtrn.32 d1, d3
+ vst1.8 d10, [r2, : 64]
+ vst1.8 d11, [r4, : 64]
+ sub r2, r2, #24
+ sub r4, r4, #24
+ vst1.8 d0, [r2, : 64]
+ vst1.8 d1, [r4, : 64]
+ add r2, r3, #288
+ add r4, r3, #336
+ vld1.8 {d0-d1}, [r2, : 128]!
+ vld1.8 {d2-d3}, [r4, : 128]!
+ vsub.i32 q0, q0, q1
+ vld1.8 {d2-d3}, [r2, : 128]!
+ vld1.8 {d4-d5}, [r4, : 128]!
+ vsub.i32 q1, q1, q2
+ add r5, r3, #240
+ vld1.8 {d4}, [r2, : 64]
+ vld1.8 {d6}, [r4, : 64]
+ vsub.i32 q2, q2, q3
+ vst1.8 {d0-d1}, [r5, : 128]!
+ vst1.8 {d2-d3}, [r5, : 128]!
+ vst1.8 d4, [r5, : 64]
+ add r2, r3, #144
+ add r4, r3, #96
+ add r5, r3, #144
+ add r6, r3, #192
+ vld1.8 {d0-d1}, [r2, : 128]!
+ vld1.8 {d2-d3}, [r4, : 128]!
+ vsub.i32 q2, q0, q1
+ vadd.i32 q0, q0, q1
+ vld1.8 {d2-d3}, [r2, : 128]!
+ vld1.8 {d6-d7}, [r4, : 128]!
+ vsub.i32 q4, q1, q3
+ vadd.i32 q1, q1, q3
+ vld1.8 {d6}, [r2, : 64]
+ vld1.8 {d10}, [r4, : 64]
+ vsub.i32 q6, q3, q5
+ vadd.i32 q3, q3, q5
+ vst1.8 {d4-d5}, [r5, : 128]!
+ vst1.8 {d0-d1}, [r6, : 128]!
+ vst1.8 {d8-d9}, [r5, : 128]!
+ vst1.8 {d2-d3}, [r6, : 128]!
+ vst1.8 d12, [r5, : 64]
+ vst1.8 d6, [r6, : 64]
+ add r2, r3, #0
+ add r4, r3, #240
+ vld1.8 {d0-d1}, [r4, : 128]!
+ vld1.8 {d2-d3}, [r4, : 128]!
+ vld1.8 {d4}, [r4, : 64]
+ add r4, r3, #336
+ vld1.8 {d6-d7}, [r4, : 128]!
+ vtrn.32 q0, q3
+ vld1.8 {d8-d9}, [r4, : 128]!
+ vshl.i32 q5, q0, #4
+ vtrn.32 q1, q4
+ vshl.i32 q6, q3, #4
+ vadd.i32 q5, q5, q0
+ vadd.i32 q6, q6, q3
+ vshl.i32 q7, q1, #4
+ vld1.8 {d5}, [r4, : 64]
+ vshl.i32 q8, q4, #4
+ vtrn.32 d4, d5
+ vadd.i32 q7, q7, q1
+ vadd.i32 q8, q8, q4
+ vld1.8 {d18-d19}, [r2, : 128]!
+ vshl.i32 q10, q2, #4
+ vld1.8 {d22-d23}, [r2, : 128]!
+ vadd.i32 q10, q10, q2
+ vld1.8 {d24}, [r2, : 64]
+ vadd.i32 q5, q5, q0
+ add r2, r3, #288
+ vld1.8 {d26-d27}, [r2, : 128]!
+ vadd.i32 q6, q6, q3
+ vld1.8 {d28-d29}, [r2, : 128]!
+ vadd.i32 q8, q8, q4
+ vld1.8 {d25}, [r2, : 64]
+ vadd.i32 q10, q10, q2
+ vtrn.32 q9, q13
+ vadd.i32 q7, q7, q1
+ vadd.i32 q5, q5, q0
+ vtrn.32 q11, q14
+ vadd.i32 q6, q6, q3
+ add r2, sp, #560
+ vadd.i32 q10, q10, q2
+ vtrn.32 d24, d25
+ vst1.8 {d12-d13}, [r2, : 128]
+ vshl.i32 q6, q13, #1
+ add r2, sp, #576
+ vst1.8 {d20-d21}, [r2, : 128]
+ vshl.i32 q10, q14, #1
+ add r2, sp, #592
+ vst1.8 {d12-d13}, [r2, : 128]
+ vshl.i32 q15, q12, #1
+ vadd.i32 q8, q8, q4
+ vext.32 d10, d31, d30, #0
+ vadd.i32 q7, q7, q1
+ add r2, sp, #608
+ vst1.8 {d16-d17}, [r2, : 128]
+ vmull.s32 q8, d18, d5
+ vmlal.s32 q8, d26, d4
+ vmlal.s32 q8, d19, d9
+ vmlal.s32 q8, d27, d3
+ vmlal.s32 q8, d22, d8
+ vmlal.s32 q8, d28, d2
+ vmlal.s32 q8, d23, d7
+ vmlal.s32 q8, d29, d1
+ vmlal.s32 q8, d24, d6
+ vmlal.s32 q8, d25, d0
+ add r2, sp, #624
+ vst1.8 {d14-d15}, [r2, : 128]
+ vmull.s32 q2, d18, d4
+ vmlal.s32 q2, d12, d9
+ vmlal.s32 q2, d13, d8
+ vmlal.s32 q2, d19, d3
+ vmlal.s32 q2, d22, d2
+ vmlal.s32 q2, d23, d1
+ vmlal.s32 q2, d24, d0
+ add r2, sp, #640
+ vst1.8 {d20-d21}, [r2, : 128]
+ vmull.s32 q7, d18, d9
+ vmlal.s32 q7, d26, d3
+ vmlal.s32 q7, d19, d8
+ vmlal.s32 q7, d27, d2
+ vmlal.s32 q7, d22, d7
+ vmlal.s32 q7, d28, d1
+ vmlal.s32 q7, d23, d6
+ vmlal.s32 q7, d29, d0
+ add r2, sp, #656
+ vst1.8 {d10-d11}, [r2, : 128]
+ vmull.s32 q5, d18, d3
+ vmlal.s32 q5, d19, d2
+ vmlal.s32 q5, d22, d1
+ vmlal.s32 q5, d23, d0
+ vmlal.s32 q5, d12, d8
+ add r2, sp, #672
+ vst1.8 {d16-d17}, [r2, : 128]
+ vmull.s32 q4, d18, d8
+ vmlal.s32 q4, d26, d2
+ vmlal.s32 q4, d19, d7
+ vmlal.s32 q4, d27, d1
+ vmlal.s32 q4, d22, d6
+ vmlal.s32 q4, d28, d0
+ vmull.s32 q8, d18, d7
+ vmlal.s32 q8, d26, d1
+ vmlal.s32 q8, d19, d6
+ vmlal.s32 q8, d27, d0
+ add r2, sp, #576
+ vld1.8 {d20-d21}, [r2, : 128]
+ vmlal.s32 q7, d24, d21
+ vmlal.s32 q7, d25, d20
+ vmlal.s32 q4, d23, d21
+ vmlal.s32 q4, d29, d20
+ vmlal.s32 q8, d22, d21
+ vmlal.s32 q8, d28, d20
+ vmlal.s32 q5, d24, d20
+ add r2, sp, #576
+ vst1.8 {d14-d15}, [r2, : 128]
+ vmull.s32 q7, d18, d6
+ vmlal.s32 q7, d26, d0
+ add r2, sp, #656
+ vld1.8 {d30-d31}, [r2, : 128]
+ vmlal.s32 q2, d30, d21
+ vmlal.s32 q7, d19, d21
+ vmlal.s32 q7, d27, d20
+ add r2, sp, #624
+ vld1.8 {d26-d27}, [r2, : 128]
+ vmlal.s32 q4, d25, d27
+ vmlal.s32 q8, d29, d27
+ vmlal.s32 q8, d25, d26
+ vmlal.s32 q7, d28, d27
+ vmlal.s32 q7, d29, d26
+ add r2, sp, #608
+ vld1.8 {d28-d29}, [r2, : 128]
+ vmlal.s32 q4, d24, d29
+ vmlal.s32 q8, d23, d29
+ vmlal.s32 q8, d24, d28
+ vmlal.s32 q7, d22, d29
+ vmlal.s32 q7, d23, d28
+ add r2, sp, #608
+ vst1.8 {d8-d9}, [r2, : 128]
+ add r2, sp, #560
+ vld1.8 {d8-d9}, [r2, : 128]
+ vmlal.s32 q7, d24, d9
+ vmlal.s32 q7, d25, d31
+ vmull.s32 q1, d18, d2
+ vmlal.s32 q1, d19, d1
+ vmlal.s32 q1, d22, d0
+ vmlal.s32 q1, d24, d27
+ vmlal.s32 q1, d23, d20
+ vmlal.s32 q1, d12, d7
+ vmlal.s32 q1, d13, d6
+ vmull.s32 q6, d18, d1
+ vmlal.s32 q6, d19, d0
+ vmlal.s32 q6, d23, d27
+ vmlal.s32 q6, d22, d20
+ vmlal.s32 q6, d24, d26
+ vmull.s32 q0, d18, d0
+ vmlal.s32 q0, d22, d27
+ vmlal.s32 q0, d23, d26
+ vmlal.s32 q0, d24, d31
+ vmlal.s32 q0, d19, d20
+ add r2, sp, #640
+ vld1.8 {d18-d19}, [r2, : 128]
+ vmlal.s32 q2, d18, d7
+ vmlal.s32 q2, d19, d6
+ vmlal.s32 q5, d18, d6
+ vmlal.s32 q5, d19, d21
+ vmlal.s32 q1, d18, d21
+ vmlal.s32 q1, d19, d29
+ vmlal.s32 q0, d18, d28
+ vmlal.s32 q0, d19, d9
+ vmlal.s32 q6, d18, d29
+ vmlal.s32 q6, d19, d28
+ add r2, sp, #592
+ vld1.8 {d18-d19}, [r2, : 128]
+ add r2, sp, #512
+ vld1.8 {d22-d23}, [r2, : 128]
+ vmlal.s32 q5, d19, d7
+ vmlal.s32 q0, d18, d21
+ vmlal.s32 q0, d19, d29
+ vmlal.s32 q6, d18, d6
+ add r2, sp, #528
+ vld1.8 {d6-d7}, [r2, : 128]
+ vmlal.s32 q6, d19, d21
+ add r2, sp, #576
+ vld1.8 {d18-d19}, [r2, : 128]
+ vmlal.s32 q0, d30, d8
+ add r2, sp, #672
+ vld1.8 {d20-d21}, [r2, : 128]
+ vmlal.s32 q5, d30, d29
+ add r2, sp, #608
+ vld1.8 {d24-d25}, [r2, : 128]
+ vmlal.s32 q1, d30, d28
+ vadd.i64 q13, q0, q11
+ vadd.i64 q14, q5, q11
+ vmlal.s32 q6, d30, d9
+ vshr.s64 q4, q13, #26
+ vshr.s64 q13, q14, #26
+ vadd.i64 q7, q7, q4
+ vshl.i64 q4, q4, #26
+ vadd.i64 q14, q7, q3
+ vadd.i64 q9, q9, q13
+ vshl.i64 q13, q13, #26
+ vadd.i64 q15, q9, q3
+ vsub.i64 q0, q0, q4
+ vshr.s64 q4, q14, #25
+ vsub.i64 q5, q5, q13
+ vshr.s64 q13, q15, #25
+ vadd.i64 q6, q6, q4
+ vshl.i64 q4, q4, #25
+ vadd.i64 q14, q6, q11
+ vadd.i64 q2, q2, q13
+ vsub.i64 q4, q7, q4
+ vshr.s64 q7, q14, #26
+ vshl.i64 q13, q13, #25
+ vadd.i64 q14, q2, q11
+ vadd.i64 q8, q8, q7
+ vshl.i64 q7, q7, #26
+ vadd.i64 q15, q8, q3
+ vsub.i64 q9, q9, q13
+ vshr.s64 q13, q14, #26
+ vsub.i64 q6, q6, q7
+ vshr.s64 q7, q15, #25
+ vadd.i64 q10, q10, q13
+ vshl.i64 q13, q13, #26
+ vadd.i64 q14, q10, q3
+ vadd.i64 q1, q1, q7
+ add r2, r3, #288
+ vshl.i64 q7, q7, #25
+ add r4, r3, #96
+ vadd.i64 q15, q1, q11
+ add r2, r2, #8
+ vsub.i64 q2, q2, q13
+ add r4, r4, #8
+ vshr.s64 q13, q14, #25
+ vsub.i64 q7, q8, q7
+ vshr.s64 q8, q15, #26
+ vadd.i64 q14, q13, q13
+ vadd.i64 q12, q12, q8
+ vtrn.32 d12, d14
+ vshl.i64 q8, q8, #26
+ vtrn.32 d13, d15
+ vadd.i64 q3, q12, q3
+ vadd.i64 q0, q0, q14
+ vst1.8 d12, [r2, : 64]!
+ vshl.i64 q7, q13, #4
+ vst1.8 d13, [r4, : 64]!
+ vsub.i64 q1, q1, q8
+ vshr.s64 q3, q3, #25
+ vadd.i64 q0, q0, q7
+ vadd.i64 q5, q5, q3
+ vshl.i64 q3, q3, #25
+ vadd.i64 q6, q5, q11
+ vadd.i64 q0, q0, q13
+ vshl.i64 q7, q13, #25
+ vadd.i64 q8, q0, q11
+ vsub.i64 q3, q12, q3
+ vshr.s64 q6, q6, #26
+ vsub.i64 q7, q10, q7
+ vtrn.32 d2, d6
+ vshr.s64 q8, q8, #26
+ vtrn.32 d3, d7
+ vadd.i64 q3, q9, q6
+ vst1.8 d2, [r2, : 64]
+ vshl.i64 q6, q6, #26
+ vst1.8 d3, [r4, : 64]
+ vadd.i64 q1, q4, q8
+ vtrn.32 d4, d14
+ vshl.i64 q4, q8, #26
+ vtrn.32 d5, d15
+ vsub.i64 q5, q5, q6
+ add r2, r2, #16
+ vsub.i64 q0, q0, q4
+ vst1.8 d4, [r2, : 64]
+ add r4, r4, #16
+ vst1.8 d5, [r4, : 64]
+ vtrn.32 d10, d6
+ vtrn.32 d11, d7
+ sub r2, r2, #8
+ sub r4, r4, #8
+ vtrn.32 d0, d2
+ vtrn.32 d1, d3
+ vst1.8 d10, [r2, : 64]
+ vst1.8 d11, [r4, : 64]
+ sub r2, r2, #24
+ sub r4, r4, #24
+ vst1.8 d0, [r2, : 64]
+ vst1.8 d1, [r4, : 64]
+ add r2, sp, #544
+ add r4, r3, #144
+ add r5, r3, #192
+ vld1.8 {d0-d1}, [r2, : 128]
+ vld1.8 {d2-d3}, [r4, : 128]!
+ vld1.8 {d4-d5}, [r5, : 128]!
+ vzip.i32 q1, q2
+ vld1.8 {d6-d7}, [r4, : 128]!
+ vld1.8 {d8-d9}, [r5, : 128]!
+ vshl.i32 q5, q1, #1
+ vzip.i32 q3, q4
+ vshl.i32 q6, q2, #1
+ vld1.8 {d14}, [r4, : 64]
+ vshl.i32 q8, q3, #1
+ vld1.8 {d15}, [r5, : 64]
+ vshl.i32 q9, q4, #1
+ vmul.i32 d21, d7, d1
+ vtrn.32 d14, d15
+ vmul.i32 q11, q4, q0
+ vmul.i32 q0, q7, q0
+ vmull.s32 q12, d2, d2
+ vmlal.s32 q12, d11, d1
+ vmlal.s32 q12, d12, d0
+ vmlal.s32 q12, d13, d23
+ vmlal.s32 q12, d16, d22
+ vmlal.s32 q12, d7, d21
+ vmull.s32 q10, d2, d11
+ vmlal.s32 q10, d4, d1
+ vmlal.s32 q10, d13, d0
+ vmlal.s32 q10, d6, d23
+ vmlal.s32 q10, d17, d22
+ vmull.s32 q13, d10, d4
+ vmlal.s32 q13, d11, d3
+ vmlal.s32 q13, d13, d1
+ vmlal.s32 q13, d16, d0
+ vmlal.s32 q13, d17, d23
+ vmlal.s32 q13, d8, d22
+ vmull.s32 q1, d10, d5
+ vmlal.s32 q1, d11, d4
+ vmlal.s32 q1, d6, d1
+ vmlal.s32 q1, d17, d0
+ vmlal.s32 q1, d8, d23
+ vmull.s32 q14, d10, d6
+ vmlal.s32 q14, d11, d13
+ vmlal.s32 q14, d4, d4
+ vmlal.s32 q14, d17, d1
+ vmlal.s32 q14, d18, d0
+ vmlal.s32 q14, d9, d23
+ vmull.s32 q11, d10, d7
+ vmlal.s32 q11, d11, d6
+ vmlal.s32 q11, d12, d5
+ vmlal.s32 q11, d8, d1
+ vmlal.s32 q11, d19, d0
+ vmull.s32 q15, d10, d8
+ vmlal.s32 q15, d11, d17
+ vmlal.s32 q15, d12, d6
+ vmlal.s32 q15, d13, d5
+ vmlal.s32 q15, d19, d1
+ vmlal.s32 q15, d14, d0
+ vmull.s32 q2, d10, d9
+ vmlal.s32 q2, d11, d8
+ vmlal.s32 q2, d12, d7
+ vmlal.s32 q2, d13, d6
+ vmlal.s32 q2, d14, d1
+ vmull.s32 q0, d15, d1
+ vmlal.s32 q0, d10, d14
+ vmlal.s32 q0, d11, d19
+ vmlal.s32 q0, d12, d8
+ vmlal.s32 q0, d13, d17
+ vmlal.s32 q0, d6, d6
+ add r2, sp, #512
+ vld1.8 {d18-d19}, [r2, : 128]
+ vmull.s32 q3, d16, d7
+ vmlal.s32 q3, d10, d15
+ vmlal.s32 q3, d11, d14
+ vmlal.s32 q3, d12, d9
+ vmlal.s32 q3, d13, d8
+ add r2, sp, #528
+ vld1.8 {d8-d9}, [r2, : 128]
+ vadd.i64 q5, q12, q9
+ vadd.i64 q6, q15, q9
+ vshr.s64 q5, q5, #26
+ vshr.s64 q6, q6, #26
+ vadd.i64 q7, q10, q5
+ vshl.i64 q5, q5, #26
+ vadd.i64 q8, q7, q4
+ vadd.i64 q2, q2, q6
+ vshl.i64 q6, q6, #26
+ vadd.i64 q10, q2, q4
+ vsub.i64 q5, q12, q5
+ vshr.s64 q8, q8, #25
+ vsub.i64 q6, q15, q6
+ vshr.s64 q10, q10, #25
+ vadd.i64 q12, q13, q8
+ vshl.i64 q8, q8, #25
+ vadd.i64 q13, q12, q9
+ vadd.i64 q0, q0, q10
+ vsub.i64 q7, q7, q8
+ vshr.s64 q8, q13, #26
+ vshl.i64 q10, q10, #25
+ vadd.i64 q13, q0, q9
+ vadd.i64 q1, q1, q8
+ vshl.i64 q8, q8, #26
+ vadd.i64 q15, q1, q4
+ vsub.i64 q2, q2, q10
+ vshr.s64 q10, q13, #26
+ vsub.i64 q8, q12, q8
+ vshr.s64 q12, q15, #25
+ vadd.i64 q3, q3, q10
+ vshl.i64 q10, q10, #26
+ vadd.i64 q13, q3, q4
+ vadd.i64 q14, q14, q12
+ add r2, r3, #144
+ vshl.i64 q12, q12, #25
+ add r4, r3, #192
+ vadd.i64 q15, q14, q9
+ add r2, r2, #8
+ vsub.i64 q0, q0, q10
+ add r4, r4, #8
+ vshr.s64 q10, q13, #25
+ vsub.i64 q1, q1, q12
+ vshr.s64 q12, q15, #26
+ vadd.i64 q13, q10, q10
+ vadd.i64 q11, q11, q12
+ vtrn.32 d16, d2
+ vshl.i64 q12, q12, #26
+ vtrn.32 d17, d3
+ vadd.i64 q1, q11, q4
+ vadd.i64 q4, q5, q13
+ vst1.8 d16, [r2, : 64]!
+ vshl.i64 q5, q10, #4
+ vst1.8 d17, [r4, : 64]!
+ vsub.i64 q8, q14, q12
+ vshr.s64 q1, q1, #25
+ vadd.i64 q4, q4, q5
+ vadd.i64 q5, q6, q1
+ vshl.i64 q1, q1, #25
+ vadd.i64 q6, q5, q9
+ vadd.i64 q4, q4, q10
+ vshl.i64 q10, q10, #25
+ vadd.i64 q9, q4, q9
+ vsub.i64 q1, q11, q1
+ vshr.s64 q6, q6, #26
+ vsub.i64 q3, q3, q10
+ vtrn.32 d16, d2
+ vshr.s64 q9, q9, #26
+ vtrn.32 d17, d3
+ vadd.i64 q1, q2, q6
+ vst1.8 d16, [r2, : 64]
+ vshl.i64 q2, q6, #26
+ vst1.8 d17, [r4, : 64]
+ vadd.i64 q6, q7, q9
+ vtrn.32 d0, d6
+ vshl.i64 q7, q9, #26
+ vtrn.32 d1, d7
+ vsub.i64 q2, q5, q2
+ add r2, r2, #16
+ vsub.i64 q3, q4, q7
+ vst1.8 d0, [r2, : 64]
+ add r4, r4, #16
+ vst1.8 d1, [r4, : 64]
+ vtrn.32 d4, d2
+ vtrn.32 d5, d3
+ sub r2, r2, #8
+ sub r4, r4, #8
+ vtrn.32 d6, d12
+ vtrn.32 d7, d13
+ vst1.8 d4, [r2, : 64]
+ vst1.8 d5, [r4, : 64]
+ sub r2, r2, #24
+ sub r4, r4, #24
+ vst1.8 d6, [r2, : 64]
+ vst1.8 d7, [r4, : 64]
+ add r2, r3, #336
+ add r4, r3, #288
+ vld1.8 {d0-d1}, [r2, : 128]!
+ vld1.8 {d2-d3}, [r4, : 128]!
+ vadd.i32 q0, q0, q1
+ vld1.8 {d2-d3}, [r2, : 128]!
+ vld1.8 {d4-d5}, [r4, : 128]!
+ vadd.i32 q1, q1, q2
+ add r5, r3, #288
+ vld1.8 {d4}, [r2, : 64]
+ vld1.8 {d6}, [r4, : 64]
+ vadd.i32 q2, q2, q3
+ vst1.8 {d0-d1}, [r5, : 128]!
+ vst1.8 {d2-d3}, [r5, : 128]!
+ vst1.8 d4, [r5, : 64]
+ add r2, r3, #48
+ add r4, r3, #144
+ vld1.8 {d0-d1}, [r4, : 128]!
+ vld1.8 {d2-d3}, [r4, : 128]!
+ vld1.8 {d4}, [r4, : 64]
+ add r4, r3, #288
+ vld1.8 {d6-d7}, [r4, : 128]!
+ vtrn.32 q0, q3
+ vld1.8 {d8-d9}, [r4, : 128]!
+ vshl.i32 q5, q0, #4
+ vtrn.32 q1, q4
+ vshl.i32 q6, q3, #4
+ vadd.i32 q5, q5, q0
+ vadd.i32 q6, q6, q3
+ vshl.i32 q7, q1, #4
+ vld1.8 {d5}, [r4, : 64]
+ vshl.i32 q8, q4, #4
+ vtrn.32 d4, d5
+ vadd.i32 q7, q7, q1
+ vadd.i32 q8, q8, q4
+ vld1.8 {d18-d19}, [r2, : 128]!
+ vshl.i32 q10, q2, #4
+ vld1.8 {d22-d23}, [r2, : 128]!
+ vadd.i32 q10, q10, q2
+ vld1.8 {d24}, [r2, : 64]
+ vadd.i32 q5, q5, q0
+ add r2, r3, #240
+ vld1.8 {d26-d27}, [r2, : 128]!
+ vadd.i32 q6, q6, q3
+ vld1.8 {d28-d29}, [r2, : 128]!
+ vadd.i32 q8, q8, q4
+ vld1.8 {d25}, [r2, : 64]
+ vadd.i32 q10, q10, q2
+ vtrn.32 q9, q13
+ vadd.i32 q7, q7, q1
+ vadd.i32 q5, q5, q0
+ vtrn.32 q11, q14
+ vadd.i32 q6, q6, q3
+ add r2, sp, #560
+ vadd.i32 q10, q10, q2
+ vtrn.32 d24, d25
+ vst1.8 {d12-d13}, [r2, : 128]
+ vshl.i32 q6, q13, #1
+ add r2, sp, #576
+ vst1.8 {d20-d21}, [r2, : 128]
+ vshl.i32 q10, q14, #1
+ add r2, sp, #592
+ vst1.8 {d12-d13}, [r2, : 128]
+ vshl.i32 q15, q12, #1
+ vadd.i32 q8, q8, q4
+ vext.32 d10, d31, d30, #0
+ vadd.i32 q7, q7, q1
+ add r2, sp, #608
+ vst1.8 {d16-d17}, [r2, : 128]
+ vmull.s32 q8, d18, d5
+ vmlal.s32 q8, d26, d4
+ vmlal.s32 q8, d19, d9
+ vmlal.s32 q8, d27, d3
+ vmlal.s32 q8, d22, d8
+ vmlal.s32 q8, d28, d2
+ vmlal.s32 q8, d23, d7
+ vmlal.s32 q8, d29, d1
+ vmlal.s32 q8, d24, d6
+ vmlal.s32 q8, d25, d0
+ add r2, sp, #624
+ vst1.8 {d14-d15}, [r2, : 128]
+ vmull.s32 q2, d18, d4
+ vmlal.s32 q2, d12, d9
+ vmlal.s32 q2, d13, d8
+ vmlal.s32 q2, d19, d3
+ vmlal.s32 q2, d22, d2
+ vmlal.s32 q2, d23, d1
+ vmlal.s32 q2, d24, d0
+ add r2, sp, #640
+ vst1.8 {d20-d21}, [r2, : 128]
+ vmull.s32 q7, d18, d9
+ vmlal.s32 q7, d26, d3
+ vmlal.s32 q7, d19, d8
+ vmlal.s32 q7, d27, d2
+ vmlal.s32 q7, d22, d7
+ vmlal.s32 q7, d28, d1
+ vmlal.s32 q7, d23, d6
+ vmlal.s32 q7, d29, d0
+ add r2, sp, #656
+ vst1.8 {d10-d11}, [r2, : 128]
+ vmull.s32 q5, d18, d3
+ vmlal.s32 q5, d19, d2
+ vmlal.s32 q5, d22, d1
+ vmlal.s32 q5, d23, d0
+ vmlal.s32 q5, d12, d8
+ add r2, sp, #672
+ vst1.8 {d16-d17}, [r2, : 128]
+ vmull.s32 q4, d18, d8
+ vmlal.s32 q4, d26, d2
+ vmlal.s32 q4, d19, d7
+ vmlal.s32 q4, d27, d1
+ vmlal.s32 q4, d22, d6
+ vmlal.s32 q4, d28, d0
+ vmull.s32 q8, d18, d7
+ vmlal.s32 q8, d26, d1
+ vmlal.s32 q8, d19, d6
+ vmlal.s32 q8, d27, d0
+ add r2, sp, #576
+ vld1.8 {d20-d21}, [r2, : 128]
+ vmlal.s32 q7, d24, d21
+ vmlal.s32 q7, d25, d20
+ vmlal.s32 q4, d23, d21
+ vmlal.s32 q4, d29, d20
+ vmlal.s32 q8, d22, d21
+ vmlal.s32 q8, d28, d20
+ vmlal.s32 q5, d24, d20
+ add r2, sp, #576
+ vst1.8 {d14-d15}, [r2, : 128]
+ vmull.s32 q7, d18, d6
+ vmlal.s32 q7, d26, d0
+ add r2, sp, #656
+ vld1.8 {d30-d31}, [r2, : 128]
+ vmlal.s32 q2, d30, d21
+ vmlal.s32 q7, d19, d21
+ vmlal.s32 q7, d27, d20
+ add r2, sp, #624
+ vld1.8 {d26-d27}, [r2, : 128]
+ vmlal.s32 q4, d25, d27
+ vmlal.s32 q8, d29, d27
+ vmlal.s32 q8, d25, d26
+ vmlal.s32 q7, d28, d27
+ vmlal.s32 q7, d29, d26
+ add r2, sp, #608
+ vld1.8 {d28-d29}, [r2, : 128]
+ vmlal.s32 q4, d24, d29
+ vmlal.s32 q8, d23, d29
+ vmlal.s32 q8, d24, d28
+ vmlal.s32 q7, d22, d29
+ vmlal.s32 q7, d23, d28
+ add r2, sp, #608
+ vst1.8 {d8-d9}, [r2, : 128]
+ add r2, sp, #560
+ vld1.8 {d8-d9}, [r2, : 128]
+ vmlal.s32 q7, d24, d9
+ vmlal.s32 q7, d25, d31
+ vmull.s32 q1, d18, d2
+ vmlal.s32 q1, d19, d1
+ vmlal.s32 q1, d22, d0
+ vmlal.s32 q1, d24, d27
+ vmlal.s32 q1, d23, d20
+ vmlal.s32 q1, d12, d7
+ vmlal.s32 q1, d13, d6
+ vmull.s32 q6, d18, d1
+ vmlal.s32 q6, d19, d0
+ vmlal.s32 q6, d23, d27
+ vmlal.s32 q6, d22, d20
+ vmlal.s32 q6, d24, d26
+ vmull.s32 q0, d18, d0
+ vmlal.s32 q0, d22, d27
+ vmlal.s32 q0, d23, d26
+ vmlal.s32 q0, d24, d31
+ vmlal.s32 q0, d19, d20
+ add r2, sp, #640
+ vld1.8 {d18-d19}, [r2, : 128]
+ vmlal.s32 q2, d18, d7
+ vmlal.s32 q2, d19, d6
+ vmlal.s32 q5, d18, d6
+ vmlal.s32 q5, d19, d21
+ vmlal.s32 q1, d18, d21
+ vmlal.s32 q1, d19, d29
+ vmlal.s32 q0, d18, d28
+ vmlal.s32 q0, d19, d9
+ vmlal.s32 q6, d18, d29
+ vmlal.s32 q6, d19, d28
+ add r2, sp, #592
+ vld1.8 {d18-d19}, [r2, : 128]
+ add r2, sp, #512
+ vld1.8 {d22-d23}, [r2, : 128]
+ vmlal.s32 q5, d19, d7
+ vmlal.s32 q0, d18, d21
+ vmlal.s32 q0, d19, d29
+ vmlal.s32 q6, d18, d6
+ add r2, sp, #528
+ vld1.8 {d6-d7}, [r2, : 128]
+ vmlal.s32 q6, d19, d21
+ add r2, sp, #576
+ vld1.8 {d18-d19}, [r2, : 128]
+ vmlal.s32 q0, d30, d8
+ add r2, sp, #672
+ vld1.8 {d20-d21}, [r2, : 128]
+ vmlal.s32 q5, d30, d29
+ add r2, sp, #608
+ vld1.8 {d24-d25}, [r2, : 128]
+ vmlal.s32 q1, d30, d28
+ vadd.i64 q13, q0, q11
+ vadd.i64 q14, q5, q11
+ vmlal.s32 q6, d30, d9
+ vshr.s64 q4, q13, #26
+ vshr.s64 q13, q14, #26
+ vadd.i64 q7, q7, q4
+ vshl.i64 q4, q4, #26
+ vadd.i64 q14, q7, q3
+ vadd.i64 q9, q9, q13
+ vshl.i64 q13, q13, #26
+ vadd.i64 q15, q9, q3
+ vsub.i64 q0, q0, q4
+ vshr.s64 q4, q14, #25
+ vsub.i64 q5, q5, q13
+ vshr.s64 q13, q15, #25
+ vadd.i64 q6, q6, q4
+ vshl.i64 q4, q4, #25
+ vadd.i64 q14, q6, q11
+ vadd.i64 q2, q2, q13
+ vsub.i64 q4, q7, q4
+ vshr.s64 q7, q14, #26
+ vshl.i64 q13, q13, #25
+ vadd.i64 q14, q2, q11
+ vadd.i64 q8, q8, q7
+ vshl.i64 q7, q7, #26
+ vadd.i64 q15, q8, q3
+ vsub.i64 q9, q9, q13
+ vshr.s64 q13, q14, #26
+ vsub.i64 q6, q6, q7
+ vshr.s64 q7, q15, #25
+ vadd.i64 q10, q10, q13
+ vshl.i64 q13, q13, #26
+ vadd.i64 q14, q10, q3
+ vadd.i64 q1, q1, q7
+ add r2, r3, #240
+ vshl.i64 q7, q7, #25
+ add r4, r3, #144
+ vadd.i64 q15, q1, q11
+ add r2, r2, #8
+ vsub.i64 q2, q2, q13
+ add r4, r4, #8
+ vshr.s64 q13, q14, #25
+ vsub.i64 q7, q8, q7
+ vshr.s64 q8, q15, #26
+ vadd.i64 q14, q13, q13
+ vadd.i64 q12, q12, q8
+ vtrn.32 d12, d14
+ vshl.i64 q8, q8, #26
+ vtrn.32 d13, d15
+ vadd.i64 q3, q12, q3
+ vadd.i64 q0, q0, q14
+ vst1.8 d12, [r2, : 64]!
+ vshl.i64 q7, q13, #4
+ vst1.8 d13, [r4, : 64]!
+ vsub.i64 q1, q1, q8
+ vshr.s64 q3, q3, #25
+ vadd.i64 q0, q0, q7
+ vadd.i64 q5, q5, q3
+ vshl.i64 q3, q3, #25
+ vadd.i64 q6, q5, q11
+ vadd.i64 q0, q0, q13
+ vshl.i64 q7, q13, #25
+ vadd.i64 q8, q0, q11
+ vsub.i64 q3, q12, q3
+ vshr.s64 q6, q6, #26
+ vsub.i64 q7, q10, q7
+ vtrn.32 d2, d6
+ vshr.s64 q8, q8, #26
+ vtrn.32 d3, d7
+ vadd.i64 q3, q9, q6
+ vst1.8 d2, [r2, : 64]
+ vshl.i64 q6, q6, #26
+ vst1.8 d3, [r4, : 64]
+ vadd.i64 q1, q4, q8
+ vtrn.32 d4, d14
+ vshl.i64 q4, q8, #26
+ vtrn.32 d5, d15
+ vsub.i64 q5, q5, q6
+ add r2, r2, #16
+ vsub.i64 q0, q0, q4
+ vst1.8 d4, [r2, : 64]
+ add r4, r4, #16
+ vst1.8 d5, [r4, : 64]
+ vtrn.32 d10, d6
+ vtrn.32 d11, d7
+ sub r2, r2, #8
+ sub r4, r4, #8
+ vtrn.32 d0, d2
+ vtrn.32 d1, d3
+ vst1.8 d10, [r2, : 64]
+ vst1.8 d11, [r4, : 64]
+ sub r2, r2, #24
+ sub r4, r4, #24
+ vst1.8 d0, [r2, : 64]
+ vst1.8 d1, [r4, : 64]
+ ldr r2, [sp, #488]
+ ldr r4, [sp, #492]
+ subs r5, r2, #1
+ bge ._mainloop
+ add r1, r3, #144
+ add r2, r3, #336
+ vld1.8 {d0-d1}, [r1, : 128]!
+ vld1.8 {d2-d3}, [r1, : 128]!
+ vld1.8 {d4}, [r1, : 64]
+ vst1.8 {d0-d1}, [r2, : 128]!
+ vst1.8 {d2-d3}, [r2, : 128]!
+ vst1.8 d4, [r2, : 64]
+ ldr r1, =0
+._invertloop:
+ add r2, r3, #144
+ ldr r4, =0
+ ldr r5, =2
+ cmp r1, #1
+ ldreq r5, =1
+ addeq r2, r3, #336
+ addeq r4, r3, #48
+ cmp r1, #2
+ ldreq r5, =1
+ addeq r2, r3, #48
+ cmp r1, #3
+ ldreq r5, =5
+ addeq r4, r3, #336
+ cmp r1, #4
+ ldreq r5, =10
+ cmp r1, #5
+ ldreq r5, =20
+ cmp r1, #6
+ ldreq r5, =10
+ addeq r2, r3, #336
+ addeq r4, r3, #336
+ cmp r1, #7
+ ldreq r5, =50
+ cmp r1, #8
+ ldreq r5, =100
+ cmp r1, #9
+ ldreq r5, =50
+ addeq r2, r3, #336
+ cmp r1, #10
+ ldreq r5, =5
+ addeq r2, r3, #48
+ cmp r1, #11
+ ldreq r5, =0
+ addeq r2, r3, #96
+ add r6, r3, #144
+ add r7, r3, #288
+ vld1.8 {d0-d1}, [r6, : 128]!
+ vld1.8 {d2-d3}, [r6, : 128]!
+ vld1.8 {d4}, [r6, : 64]
+ vst1.8 {d0-d1}, [r7, : 128]!
+ vst1.8 {d2-d3}, [r7, : 128]!
+ vst1.8 d4, [r7, : 64]
+ cmp r5, #0
+ beq ._skipsquaringloop
+._squaringloop:
+ add r6, r3, #288
+ add r7, r3, #288
+ add r8, r3, #288
+ vmov.i32 q0, #19
+ vmov.i32 q1, #0
+ vmov.i32 q2, #1
+ vzip.i32 q1, q2
+ vld1.8 {d4-d5}, [r7, : 128]!
+ vld1.8 {d6-d7}, [r7, : 128]!
+ vld1.8 {d9}, [r7, : 64]
+ vld1.8 {d10-d11}, [r6, : 128]!
+ add r7, sp, #416
+ vld1.8 {d12-d13}, [r6, : 128]!
+ vmul.i32 q7, q2, q0
+ vld1.8 {d8}, [r6, : 64]
+ vext.32 d17, d11, d10, #1
+ vmul.i32 q9, q3, q0
+ vext.32 d16, d10, d8, #1
+ vshl.u32 q10, q5, q1
+ vext.32 d22, d14, d4, #1
+ vext.32 d24, d18, d6, #1
+ vshl.u32 q13, q6, q1
+ vshl.u32 d28, d8, d2
+ vrev64.i32 d22, d22
+ vmul.i32 d1, d9, d1
+ vrev64.i32 d24, d24
+ vext.32 d29, d8, d13, #1
+ vext.32 d0, d1, d9, #1
+ vrev64.i32 d0, d0
+ vext.32 d2, d9, d1, #1
+ vext.32 d23, d15, d5, #1
+ vmull.s32 q4, d20, d4
+ vrev64.i32 d23, d23
+ vmlal.s32 q4, d21, d1
+ vrev64.i32 d2, d2
+ vmlal.s32 q4, d26, d19
+ vext.32 d3, d5, d15, #1
+ vmlal.s32 q4, d27, d18
+ vrev64.i32 d3, d3
+ vmlal.s32 q4, d28, d15
+ vext.32 d14, d12, d11, #1
+ vmull.s32 q5, d16, d23
+ vext.32 d15, d13, d12, #1
+ vmlal.s32 q5, d17, d4
+ vst1.8 d8, [r7, : 64]!
+ vmlal.s32 q5, d14, d1
+ vext.32 d12, d9, d8, #0
+ vmlal.s32 q5, d15, d19
+ vmov.i64 d13, #0
+ vmlal.s32 q5, d29, d18
+ vext.32 d25, d19, d7, #1
+ vmlal.s32 q6, d20, d5
+ vrev64.i32 d25, d25
+ vmlal.s32 q6, d21, d4
+ vst1.8 d11, [r7, : 64]!
+ vmlal.s32 q6, d26, d1
+ vext.32 d9, d10, d10, #0
+ vmlal.s32 q6, d27, d19
+ vmov.i64 d8, #0
+ vmlal.s32 q6, d28, d18
+ vmlal.s32 q4, d16, d24
+ vmlal.s32 q4, d17, d5
+ vmlal.s32 q4, d14, d4
+ vst1.8 d12, [r7, : 64]!
+ vmlal.s32 q4, d15, d1
+ vext.32 d10, d13, d12, #0
+ vmlal.s32 q4, d29, d19
+ vmov.i64 d11, #0
+ vmlal.s32 q5, d20, d6
+ vmlal.s32 q5, d21, d5
+ vmlal.s32 q5, d26, d4
+ vext.32 d13, d8, d8, #0
+ vmlal.s32 q5, d27, d1
+ vmov.i64 d12, #0
+ vmlal.s32 q5, d28, d19
+ vst1.8 d9, [r7, : 64]!
+ vmlal.s32 q6, d16, d25
+ vmlal.s32 q6, d17, d6
+ vst1.8 d10, [r7, : 64]
+ vmlal.s32 q6, d14, d5
+ vext.32 d8, d11, d10, #0
+ vmlal.s32 q6, d15, d4
+ vmov.i64 d9, #0
+ vmlal.s32 q6, d29, d1
+ vmlal.s32 q4, d20, d7
+ vmlal.s32 q4, d21, d6
+ vmlal.s32 q4, d26, d5
+ vext.32 d11, d12, d12, #0
+ vmlal.s32 q4, d27, d4
+ vmov.i64 d10, #0
+ vmlal.s32 q4, d28, d1
+ vmlal.s32 q5, d16, d0
+ sub r6, r7, #32
+ vmlal.s32 q5, d17, d7
+ vmlal.s32 q5, d14, d6
+ vext.32 d30, d9, d8, #0
+ vmlal.s32 q5, d15, d5
+ vld1.8 {d31}, [r6, : 64]!
+ vmlal.s32 q5, d29, d4
+ vmlal.s32 q15, d20, d0
+ vext.32 d0, d6, d18, #1
+ vmlal.s32 q15, d21, d25
+ vrev64.i32 d0, d0
+ vmlal.s32 q15, d26, d24
+ vext.32 d1, d7, d19, #1
+ vext.32 d7, d10, d10, #0
+ vmlal.s32 q15, d27, d23
+ vrev64.i32 d1, d1
+ vld1.8 {d6}, [r6, : 64]
+ vmlal.s32 q15, d28, d22
+ vmlal.s32 q3, d16, d4
+ add r6, r6, #24
+ vmlal.s32 q3, d17, d2
+ vext.32 d4, d31, d30, #0
+ vmov d17, d11
+ vmlal.s32 q3, d14, d1
+ vext.32 d11, d13, d13, #0
+ vext.32 d13, d30, d30, #0
+ vmlal.s32 q3, d15, d0
+ vext.32 d1, d8, d8, #0
+ vmlal.s32 q3, d29, d3
+ vld1.8 {d5}, [r6, : 64]
+ sub r6, r6, #16
+ vext.32 d10, d6, d6, #0
+ vmov.i32 q1, #0xffffffff
+ vshl.i64 q4, q1, #25
+ add r7, sp, #512
+ vld1.8 {d14-d15}, [r7, : 128]
+ vadd.i64 q9, q2, q7
+ vshl.i64 q1, q1, #26
+ vshr.s64 q10, q9, #26
+ vld1.8 {d0}, [r6, : 64]!
+ vadd.i64 q5, q5, q10
+ vand q9, q9, q1
+ vld1.8 {d16}, [r6, : 64]!
+ add r6, sp, #528
+ vld1.8 {d20-d21}, [r6, : 128]
+ vadd.i64 q11, q5, q10
+ vsub.i64 q2, q2, q9
+ vshr.s64 q9, q11, #25
+ vext.32 d12, d5, d4, #0
+ vand q11, q11, q4
+ vadd.i64 q0, q0, q9
+ vmov d19, d7
+ vadd.i64 q3, q0, q7
+ vsub.i64 q5, q5, q11
+ vshr.s64 q11, q3, #26
+ vext.32 d18, d11, d10, #0
+ vand q3, q3, q1
+ vadd.i64 q8, q8, q11
+ vadd.i64 q11, q8, q10
+ vsub.i64 q0, q0, q3
+ vshr.s64 q3, q11, #25
+ vand q11, q11, q4
+ vadd.i64 q3, q6, q3
+ vadd.i64 q6, q3, q7
+ vsub.i64 q8, q8, q11
+ vshr.s64 q11, q6, #26
+ vand q6, q6, q1
+ vadd.i64 q9, q9, q11
+ vadd.i64 d25, d19, d21
+ vsub.i64 q3, q3, q6
+ vshr.s64 d23, d25, #25
+ vand q4, q12, q4
+ vadd.i64 d21, d23, d23
+ vshl.i64 d25, d23, #4
+ vadd.i64 d21, d21, d23
+ vadd.i64 d25, d25, d21
+ vadd.i64 d4, d4, d25
+ vzip.i32 q0, q8
+ vadd.i64 d12, d4, d14
+ add r6, r8, #8
+ vst1.8 d0, [r6, : 64]
+ vsub.i64 d19, d19, d9
+ add r6, r6, #16
+ vst1.8 d16, [r6, : 64]
+ vshr.s64 d22, d12, #26
+ vand q0, q6, q1
+ vadd.i64 d10, d10, d22
+ vzip.i32 q3, q9
+ vsub.i64 d4, d4, d0
+ sub r6, r6, #8
+ vst1.8 d6, [r6, : 64]
+ add r6, r6, #16
+ vst1.8 d18, [r6, : 64]
+ vzip.i32 q2, q5
+ sub r6, r6, #32
+ vst1.8 d4, [r6, : 64]
+ subs r5, r5, #1
+ bhi ._squaringloop
+._skipsquaringloop:
+ mov r2, r2
+ add r5, r3, #288
+ add r6, r3, #144
+ vmov.i32 q0, #19
+ vmov.i32 q1, #0
+ vmov.i32 q2, #1
+ vzip.i32 q1, q2
+ vld1.8 {d4-d5}, [r5, : 128]!
+ vld1.8 {d6-d7}, [r5, : 128]!
+ vld1.8 {d9}, [r5, : 64]
+ vld1.8 {d10-d11}, [r2, : 128]!
+ add r5, sp, #416
+ vld1.8 {d12-d13}, [r2, : 128]!
+ vmul.i32 q7, q2, q0
+ vld1.8 {d8}, [r2, : 64]
+ vext.32 d17, d11, d10, #1
+ vmul.i32 q9, q3, q0
+ vext.32 d16, d10, d8, #1
+ vshl.u32 q10, q5, q1
+ vext.32 d22, d14, d4, #1
+ vext.32 d24, d18, d6, #1
+ vshl.u32 q13, q6, q1
+ vshl.u32 d28, d8, d2
+ vrev64.i32 d22, d22
+ vmul.i32 d1, d9, d1
+ vrev64.i32 d24, d24
+ vext.32 d29, d8, d13, #1
+ vext.32 d0, d1, d9, #1
+ vrev64.i32 d0, d0
+ vext.32 d2, d9, d1, #1
+ vext.32 d23, d15, d5, #1
+ vmull.s32 q4, d20, d4
+ vrev64.i32 d23, d23
+ vmlal.s32 q4, d21, d1
+ vrev64.i32 d2, d2
+ vmlal.s32 q4, d26, d19
+ vext.32 d3, d5, d15, #1
+ vmlal.s32 q4, d27, d18
+ vrev64.i32 d3, d3
+ vmlal.s32 q4, d28, d15
+ vext.32 d14, d12, d11, #1
+ vmull.s32 q5, d16, d23
+ vext.32 d15, d13, d12, #1
+ vmlal.s32 q5, d17, d4
+ vst1.8 d8, [r5, : 64]!
+ vmlal.s32 q5, d14, d1
+ vext.32 d12, d9, d8, #0
+ vmlal.s32 q5, d15, d19
+ vmov.i64 d13, #0
+ vmlal.s32 q5, d29, d18
+ vext.32 d25, d19, d7, #1
+ vmlal.s32 q6, d20, d5
+ vrev64.i32 d25, d25
+ vmlal.s32 q6, d21, d4
+ vst1.8 d11, [r5, : 64]!
+ vmlal.s32 q6, d26, d1
+ vext.32 d9, d10, d10, #0
+ vmlal.s32 q6, d27, d19
+ vmov.i64 d8, #0
+ vmlal.s32 q6, d28, d18
+ vmlal.s32 q4, d16, d24
+ vmlal.s32 q4, d17, d5
+ vmlal.s32 q4, d14, d4
+ vst1.8 d12, [r5, : 64]!
+ vmlal.s32 q4, d15, d1
+ vext.32 d10, d13, d12, #0
+ vmlal.s32 q4, d29, d19
+ vmov.i64 d11, #0
+ vmlal.s32 q5, d20, d6
+ vmlal.s32 q5, d21, d5
+ vmlal.s32 q5, d26, d4
+ vext.32 d13, d8, d8, #0
+ vmlal.s32 q5, d27, d1
+ vmov.i64 d12, #0
+ vmlal.s32 q5, d28, d19
+ vst1.8 d9, [r5, : 64]!
+ vmlal.s32 q6, d16, d25
+ vmlal.s32 q6, d17, d6
+ vst1.8 d10, [r5, : 64]
+ vmlal.s32 q6, d14, d5
+ vext.32 d8, d11, d10, #0
+ vmlal.s32 q6, d15, d4
+ vmov.i64 d9, #0
+ vmlal.s32 q6, d29, d1
+ vmlal.s32 q4, d20, d7
+ vmlal.s32 q4, d21, d6
+ vmlal.s32 q4, d26, d5
+ vext.32 d11, d12, d12, #0
+ vmlal.s32 q4, d27, d4
+ vmov.i64 d10, #0
+ vmlal.s32 q4, d28, d1
+ vmlal.s32 q5, d16, d0
+ sub r2, r5, #32
+ vmlal.s32 q5, d17, d7
+ vmlal.s32 q5, d14, d6
+ vext.32 d30, d9, d8, #0
+ vmlal.s32 q5, d15, d5
+ vld1.8 {d31}, [r2, : 64]!
+ vmlal.s32 q5, d29, d4
+ vmlal.s32 q15, d20, d0
+ vext.32 d0, d6, d18, #1
+ vmlal.s32 q15, d21, d25
+ vrev64.i32 d0, d0
+ vmlal.s32 q15, d26, d24
+ vext.32 d1, d7, d19, #1
+ vext.32 d7, d10, d10, #0
+ vmlal.s32 q15, d27, d23
+ vrev64.i32 d1, d1
+ vld1.8 {d6}, [r2, : 64]
+ vmlal.s32 q15, d28, d22
+ vmlal.s32 q3, d16, d4
+ add r2, r2, #24
+ vmlal.s32 q3, d17, d2
+ vext.32 d4, d31, d30, #0
+ vmov d17, d11
+ vmlal.s32 q3, d14, d1
+ vext.32 d11, d13, d13, #0
+ vext.32 d13, d30, d30, #0
+ vmlal.s32 q3, d15, d0
+ vext.32 d1, d8, d8, #0
+ vmlal.s32 q3, d29, d3
+ vld1.8 {d5}, [r2, : 64]
+ sub r2, r2, #16
+ vext.32 d10, d6, d6, #0
+ vmov.i32 q1, #0xffffffff
+ vshl.i64 q4, q1, #25
+ add r5, sp, #512
+ vld1.8 {d14-d15}, [r5, : 128]
+ vadd.i64 q9, q2, q7
+ vshl.i64 q1, q1, #26
+ vshr.s64 q10, q9, #26
+ vld1.8 {d0}, [r2, : 64]!
+ vadd.i64 q5, q5, q10
+ vand q9, q9, q1
+ vld1.8 {d16}, [r2, : 64]!
+ add r2, sp, #528
+ vld1.8 {d20-d21}, [r2, : 128]
+ vadd.i64 q11, q5, q10
+ vsub.i64 q2, q2, q9
+ vshr.s64 q9, q11, #25
+ vext.32 d12, d5, d4, #0
+ vand q11, q11, q4
+ vadd.i64 q0, q0, q9
+ vmov d19, d7
+ vadd.i64 q3, q0, q7
+ vsub.i64 q5, q5, q11
+ vshr.s64 q11, q3, #26
+ vext.32 d18, d11, d10, #0
+ vand q3, q3, q1
+ vadd.i64 q8, q8, q11
+ vadd.i64 q11, q8, q10
+ vsub.i64 q0, q0, q3
+ vshr.s64 q3, q11, #25
+ vand q11, q11, q4
+ vadd.i64 q3, q6, q3
+ vadd.i64 q6, q3, q7
+ vsub.i64 q8, q8, q11
+ vshr.s64 q11, q6, #26
+ vand q6, q6, q1
+ vadd.i64 q9, q9, q11
+ vadd.i64 d25, d19, d21
+ vsub.i64 q3, q3, q6
+ vshr.s64 d23, d25, #25
+ vand q4, q12, q4
+ vadd.i64 d21, d23, d23
+ vshl.i64 d25, d23, #4
+ vadd.i64 d21, d21, d23
+ vadd.i64 d25, d25, d21
+ vadd.i64 d4, d4, d25
+ vzip.i32 q0, q8
+ vadd.i64 d12, d4, d14
+ add r2, r6, #8
+ vst1.8 d0, [r2, : 64]
+ vsub.i64 d19, d19, d9
+ add r2, r2, #16
+ vst1.8 d16, [r2, : 64]
+ vshr.s64 d22, d12, #26
+ vand q0, q6, q1
+ vadd.i64 d10, d10, d22
+ vzip.i32 q3, q9
+ vsub.i64 d4, d4, d0
+ sub r2, r2, #8
+ vst1.8 d6, [r2, : 64]
+ add r2, r2, #16
+ vst1.8 d18, [r2, : 64]
+ vzip.i32 q2, q5
+ sub r2, r2, #32
+ vst1.8 d4, [r2, : 64]
+ cmp r4, #0
+ beq ._skippostcopy
+ add r2, r3, #144
+ mov r4, r4
+ vld1.8 {d0-d1}, [r2, : 128]!
+ vld1.8 {d2-d3}, [r2, : 128]!
+ vld1.8 {d4}, [r2, : 64]
+ vst1.8 {d0-d1}, [r4, : 128]!
+ vst1.8 {d2-d3}, [r4, : 128]!
+ vst1.8 d4, [r4, : 64]
+._skippostcopy:
+ cmp r1, #1
+ bne ._skipfinalcopy
+ add r2, r3, #288
+ add r4, r3, #144
+ vld1.8 {d0-d1}, [r2, : 128]!
+ vld1.8 {d2-d3}, [r2, : 128]!
+ vld1.8 {d4}, [r2, : 64]
+ vst1.8 {d0-d1}, [r4, : 128]!
+ vst1.8 {d2-d3}, [r4, : 128]!
+ vst1.8 d4, [r4, : 64]
+._skipfinalcopy:
+ add r1, r1, #1
+ cmp r1, #12
+ blo ._invertloop
+ add r1, r3, #144
+ ldr r2, [r1], #4
+ ldr r3, [r1], #4
+ ldr r4, [r1], #4
+ ldr r5, [r1], #4
+ ldr r6, [r1], #4
+ ldr r7, [r1], #4
+ ldr r8, [r1], #4
+ ldr r9, [r1], #4
+ ldr r10, [r1], #4
+ ldr r1, [r1]
+ add r11, r1, r1, LSL #4
+ add r11, r11, r1, LSL #1
+ add r11, r11, #16777216
+ mov r11, r11, ASR #25
+ add r11, r11, r2
+ mov r11, r11, ASR #26
+ add r11, r11, r3
+ mov r11, r11, ASR #25
+ add r11, r11, r4
+ mov r11, r11, ASR #26
+ add r11, r11, r5
+ mov r11, r11, ASR #25
+ add r11, r11, r6
+ mov r11, r11, ASR #26
+ add r11, r11, r7
+ mov r11, r11, ASR #25
+ add r11, r11, r8
+ mov r11, r11, ASR #26
+ add r11, r11, r9
+ mov r11, r11, ASR #25
+ add r11, r11, r10
+ mov r11, r11, ASR #26
+ add r11, r11, r1
+ mov r11, r11, ASR #25
+ add r2, r2, r11
+ add r2, r2, r11, LSL #1
+ add r2, r2, r11, LSL #4
+ mov r11, r2, ASR #26
+ add r3, r3, r11
+ sub r2, r2, r11, LSL #26
+ mov r11, r3, ASR #25
+ add r4, r4, r11
+ sub r3, r3, r11, LSL #25
+ mov r11, r4, ASR #26
+ add r5, r5, r11
+ sub r4, r4, r11, LSL #26
+ mov r11, r5, ASR #25
+ add r6, r6, r11
+ sub r5, r5, r11, LSL #25
+ mov r11, r6, ASR #26
+ add r7, r7, r11
+ sub r6, r6, r11, LSL #26
+ mov r11, r7, ASR #25
+ add r8, r8, r11
+ sub r7, r7, r11, LSL #25
+ mov r11, r8, ASR #26
+ add r9, r9, r11
+ sub r8, r8, r11, LSL #26
+ mov r11, r9, ASR #25
+ add r10, r10, r11
+ sub r9, r9, r11, LSL #25
+ mov r11, r10, ASR #26
+ add r1, r1, r11
+ sub r10, r10, r11, LSL #26
+ mov r11, r1, ASR #25
+ sub r1, r1, r11, LSL #25
+ add r2, r2, r3, LSL #26
+ mov r3, r3, LSR #6
+ add r3, r3, r4, LSL #19
+ mov r4, r4, LSR #13
+ add r4, r4, r5, LSL #13
+ mov r5, r5, LSR #19
+ add r5, r5, r6, LSL #6
+ add r6, r7, r8, LSL #25
+ mov r7, r8, LSR #7
+ add r7, r7, r9, LSL #19
+ mov r8, r9, LSR #13
+ add r8, r8, r10, LSL #12
+ mov r9, r10, LSR #20
+ add r1, r9, r1, LSL #6
+ str r2, [r0], #4
+ str r3, [r0], #4
+ str r4, [r0], #4
+ str r5, [r0], #4
+ str r6, [r0], #4
+ str r7, [r0], #4
+ str r8, [r0], #4
+ str r1, [r0]
+ ldrd r4, [sp, #0]
+ ldrd r6, [sp, #8]
+ ldrd r8, [sp, #16]
+ ldrd r10, [sp, #24]
+ ldr r12, [sp, #480]
+ ldr r14, [sp, #484]
+ ldr r0, =0
+ mov sp, r12
+ vpop {q4, q5, q6, q7}
+ bx lr
--
2.19.0
* [PATCH net-next v7 24/28] zinc: Curve25519 ARM implementation
[not found] <20181006025709.4019-1-Jason@zx2c4.com>
` (20 preceding siblings ...)
2018-10-06 2:57 ` [PATCH net-next v7 23/28] zinc: import Bernstein and Schwabe's Curve25519 ARM implementation Jason A. Donenfeld
@ 2018-10-06 2:57 ` Jason A. Donenfeld
2018-10-06 2:57 ` [PATCH net-next v7 25/28] crypto: port Poly1305 to Zinc Jason A. Donenfeld
2018-10-06 2:57 ` [PATCH net-next v7 26/28] crypto: port ChaCha20 " Jason A. Donenfeld
23 siblings, 0 replies; 28+ messages in thread
From: Jason A. Donenfeld @ 2018-10-06 2:57 UTC (permalink / raw)
To: linux-kernel, netdev, davem, gregkh
Cc: Jason A. Donenfeld, Russell King, linux-arm-kernel, Samuel Neves,
Jean-Philippe Aumasson, Andy Lutomirski, Andrew Morton,
Linus Torvalds, kernel-hardening, linux-crypto
This ports the SUPERCOP implementation for usage in kernel space. In
addition to the usual header, macro, and style changes required for
kernel space, it makes a few small changes to the code:
- The stack alignment is relaxed to 16 bytes.
- Superfluous mov statements have been removed.
- ldr for constants has been replaced with movw.
- ldreq has been replaced with moveq.
- The str epilogue has been made more idiomatic.
- SIMD registers are not pushed and popped at the beginning and end.
- The prologue and epilogue have been made idiomatic.
- A hole has been removed from the stack, saving 32 bytes.
- We write back the base register whenever possible for vld1.8.
- Some multiplications have been reordered for better A7 performance.
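As a rough illustration of the stack alignment and stack hole items above, the effect on stack usage can be modelled in C. This is only a sketch, not code from the patch: the helper names align_down() and stack_used() are hypothetical, and the frame sizes and alignments (736 bytes at 32-byte alignment before, 704 bytes at 16-byte alignment after) are read off the old and new prologues in the diff that follows.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round addr down to a multiple of align.
 * This models "and rX, rX, #0xffffffe0" (align = 32) and
 * "and rX, rX, #0xfffffff0" (align = 16). */
static uint32_t align_down(uint32_t addr, uint32_t align)
{
	/* align must be a non-zero power of two */
	assert(align != 0 && (align & (align - 1)) == 0);
	return addr & ~(align - 1);
}

/* Hypothetical helper: bytes consumed below sp once `frame` bytes
 * have been reserved and the result has been aligned down. */
static uint32_t stack_used(uint32_t sp, uint32_t frame, uint32_t align)
{
	return sp - align_down(sp - frame, align);
}
```

For an incoming sp that is already 16-byte aligned, stack_used(sp, 704, 16) is exactly 704, whereas stack_used(sp, 736, 32) can be as large as 752. The model covers only the scratch frame, not the registers pushed in the prologue.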
There are more opportunities for cleanup, since this code is from qhasm,
which doesn't always do the most opportune thing. But even prior to
extensive hand optimizations, this code delivers significant performance
improvements (given in get_cycles() per call):
              ----------- -------------
             | generic C | this commit |
 ------------ ----------- -------------
| Cortex-A7  |     49136 |       22395 |
 ------------ ----------- -------------
| Cortex-A17 |     17326 |        4983 |
 ------------ ----------- -------------
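Expressed as ratios, the figures in the table correspond to roughly a 2.2x speedup on Cortex-A7 and roughly 3.5x on Cortex-A17. The small C sketch below shows the arithmetic; the helper name speedup() is hypothetical and is not part of the patch.

```c
#include <assert.h>

/* Hypothetical helper: ratio of baseline cycles to optimized cycles,
 * both taken as get_cycles() per call. */
static double speedup(unsigned long baseline, unsigned long optimized)
{
	assert(optimized != 0);
	return (double)baseline / (double)optimized;
}
```

Calling speedup(49136, 22395) gives the Cortex-A7 ratio and speedup(17326, 4983) gives the Cortex-A17 ratio.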
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Samuel Neves <sneves@dei.uc.pt>
Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: kernel-hardening@lists.openwall.com
Cc: linux-crypto@vger.kernel.org
---
 lib/zinc/Makefile                             |   1 +
 lib/zinc/curve25519/curve25519-arm-glue.c     |  43 +++
 ...e25519-arm-supercop.S => curve25519-arm.S} | 349 ++++++++----------
 lib/zinc/curve25519/curve25519.c              |   2 +
 4 files changed, 200 insertions(+), 195 deletions(-)
 create mode 100644 lib/zinc/curve25519/curve25519-arm-glue.c
 rename lib/zinc/curve25519/{curve25519-arm-supercop.S => curve25519-arm.S} (92%)
diff --git a/lib/zinc/Makefile b/lib/zinc/Makefile
index 65440438c6e5..be73c342f9ba 100644
--- a/lib/zinc/Makefile
+++ b/lib/zinc/Makefile
@@ -27,4 +27,5 @@ zinc_blake2s-$(CONFIG_ZINC_ARCH_X86_64) += blake2s/blake2s-x86_64.o
obj-$(CONFIG_ZINC_BLAKE2S) += zinc_blake2s.o
zinc_curve25519-y := curve25519/curve25519.o
+zinc_curve25519-$(CONFIG_ZINC_ARCH_ARM) += curve25519/curve25519-arm.o
obj-$(CONFIG_ZINC_CURVE25519) += zinc_curve25519.o
diff --git a/lib/zinc/curve25519/curve25519-arm-glue.c b/lib/zinc/curve25519/curve25519-arm-glue.c
new file mode 100644
index 000000000000..c71c981c3ba9
--- /dev/null
+++ b/lib/zinc/curve25519/curve25519-arm-glue.c
@@ -0,0 +1,43 @@
+// SPDX-License-Identifier: GPL-2.0 OR MIT
+/*
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#include <linux/simd.h>
+#include <asm/hwcap.h>
+#include <asm/neon.h>
+
+asmlinkage void curve25519_neon(u8 mypublic[CURVE25519_KEY_SIZE],
+ const u8 secret[CURVE25519_KEY_SIZE],
+ const u8 basepoint[CURVE25519_KEY_SIZE]);
+
+static bool curve25519_use_neon __ro_after_init;
+static bool *const curve25519_nobs[] __initconst = { &curve25519_use_neon };
+static void __init curve25519_fpu_init(void)
+{
+ curve25519_use_neon = elf_hwcap & HWCAP_NEON;
+}
+
+static inline bool curve25519_arch(u8 mypublic[CURVE25519_KEY_SIZE],
+ const u8 secret[CURVE25519_KEY_SIZE],
+ const u8 basepoint[CURVE25519_KEY_SIZE])
+{
+ simd_context_t simd_context;
+ bool used_arch = false;
+
+ simd_get(&simd_context);
+ if (IS_ENABLED(CONFIG_KERNEL_MODE_NEON) &&
+ !IS_ENABLED(CONFIG_CPU_BIG_ENDIAN) && curve25519_use_neon &&
+ simd_use(&simd_context)) {
+ curve25519_neon(mypublic, secret, basepoint);
+ used_arch = true;
+ }
+ simd_put(&simd_context);
+ return used_arch;
+}
+
+static inline bool curve25519_base_arch(u8 pub[CURVE25519_KEY_SIZE],
+ const u8 secret[CURVE25519_KEY_SIZE])
+{
+ return false;
+}
diff --git a/lib/zinc/curve25519/curve25519-arm-supercop.S b/lib/zinc/curve25519/curve25519-arm.S
similarity index 92%
rename from lib/zinc/curve25519/curve25519-arm-supercop.S
rename to lib/zinc/curve25519/curve25519-arm.S
index f33b85fef382..b63ac48e7f8d 100644
--- a/lib/zinc/curve25519/curve25519-arm-supercop.S
+++ b/lib/zinc/curve25519/curve25519-arm.S
@@ -1,43 +1,36 @@
+/* SPDX-License-Identifier: GPL-2.0 OR MIT */
/*
- * Public domain code from Daniel J. Bernstein and Peter Schwabe, from
- * SUPERCOP's curve25519/neon2/scalarmult.s.
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ *
+ * Based on public domain code from Daniel J. Bernstein and Peter Schwabe. This
+ * began from SUPERCOP's curve25519/neon2/scalarmult.s, but has subsequently been
+ * manually reworked for use in kernel space.
*/
-.fpu neon
+#if defined(CONFIG_KERNEL_MODE_NEON) && !defined(__ARMEB__)
+#include <linux/linkage.h>
+
.text
+.fpu neon
+.arch armv7-a
.align 4
-.global _crypto_scalarmult_curve25519_neon2
-.global crypto_scalarmult_curve25519_neon2
-.type _crypto_scalarmult_curve25519_neon2 STT_FUNC
-.type crypto_scalarmult_curve25519_neon2 STT_FUNC
- _crypto_scalarmult_curve25519_neon2:
- crypto_scalarmult_curve25519_neon2:
- vpush {q4, q5, q6, q7}
- mov r12, sp
- sub sp, sp, #736
- and sp, sp, #0xffffffe0
- strd r4, [sp, #0]
- strd r6, [sp, #8]
- strd r8, [sp, #16]
- strd r10, [sp, #24]
- str r12, [sp, #480]
- str r14, [sp, #484]
- mov r0, r0
- mov r1, r1
- mov r2, r2
- add r3, sp, #32
- ldr r4, =0
- ldr r5, =254
+
+ENTRY(curve25519_neon)
+ push {r4-r11, lr}
+ mov ip, sp
+ sub r3, sp, #704
+ and r3, r3, #0xfffffff0
+ mov sp, r3
+ movw r4, #0
+ movw r5, #254
vmov.i32 q0, #1
vshr.u64 q1, q0, #7
vshr.u64 q0, q0, #8
vmov.i32 d4, #19
vmov.i32 d5, #38
- add r6, sp, #512
- vst1.8 {d2-d3}, [r6, : 128]
- add r6, sp, #528
- vst1.8 {d0-d1}, [r6, : 128]
- add r6, sp, #544
+ add r6, sp, #480
+ vst1.8 {d2-d3}, [r6, : 128]!
+ vst1.8 {d0-d1}, [r6, : 128]!
vst1.8 {d4-d5}, [r6, : 128]
add r6, r3, #0
vmov.i32 q2, #0
@@ -45,12 +38,12 @@
vst1.8 {d4-d5}, [r6, : 128]!
vst1.8 d4, [r6, : 64]
add r6, r3, #0
- ldr r7, =960
+ movw r7, #960
sub r7, r7, #2
neg r7, r7
sub r7, r7, r7, LSL #7
str r7, [r6]
- add r6, sp, #704
+ add r6, sp, #672
vld1.8 {d4-d5}, [r1]!
vld1.8 {d6-d7}, [r1]
vst1.8 {d4-d5}, [r6, : 128]!
@@ -212,15 +205,15 @@
vst1.8 {d0-d1}, [r6, : 128]!
vst1.8 {d2-d3}, [r6, : 128]!
vst1.8 d4, [r6, : 64]
-._mainloop:
+.Lmainloop:
mov r2, r5, LSR #3
and r6, r5, #7
ldrb r2, [r1, r2]
mov r2, r2, LSR r6
and r2, r2, #1
- str r5, [sp, #488]
+ str r5, [sp, #456]
eor r4, r4, r2
- str r2, [sp, #492]
+ str r2, [sp, #460]
neg r2, r4
add r4, r3, #96
add r5, r3, #192
@@ -291,7 +284,7 @@
vsub.i32 q0, q1, q3
vst1.8 d4, [r4, : 64]
vst1.8 d0, [r6, : 64]
- add r2, sp, #544
+ add r2, sp, #512
add r4, r3, #96
add r5, r3, #144
vld1.8 {d0-d1}, [r2, : 128]
@@ -361,14 +354,13 @@
vmlal.s32 q0, d12, d8
vmlal.s32 q0, d13, d17
vmlal.s32 q0, d6, d6
- add r2, sp, #512
- vld1.8 {d18-d19}, [r2, : 128]
+ add r2, sp, #480
+ vld1.8 {d18-d19}, [r2, : 128]!
vmull.s32 q3, d16, d7
vmlal.s32 q3, d10, d15
vmlal.s32 q3, d11, d14
vmlal.s32 q3, d12, d9
vmlal.s32 q3, d13, d8
- add r2, sp, #528
vld1.8 {d8-d9}, [r2, : 128]
vadd.i64 q5, q12, q9
vadd.i64 q6, q15, q9
@@ -502,22 +494,19 @@
vadd.i32 q5, q5, q0
vtrn.32 q11, q14
vadd.i32 q6, q6, q3
- add r2, sp, #560
+ add r2, sp, #528
vadd.i32 q10, q10, q2
vtrn.32 d24, d25
- vst1.8 {d12-d13}, [r2, : 128]
+ vst1.8 {d12-d13}, [r2, : 128]!
vshl.i32 q6, q13, #1
- add r2, sp, #576
- vst1.8 {d20-d21}, [r2, : 128]
+ vst1.8 {d20-d21}, [r2, : 128]!
vshl.i32 q10, q14, #1
- add r2, sp, #592
- vst1.8 {d12-d13}, [r2, : 128]
+ vst1.8 {d12-d13}, [r2, : 128]!
vshl.i32 q15, q12, #1
vadd.i32 q8, q8, q4
vext.32 d10, d31, d30, #0
vadd.i32 q7, q7, q1
- add r2, sp, #608
- vst1.8 {d16-d17}, [r2, : 128]
+ vst1.8 {d16-d17}, [r2, : 128]!
vmull.s32 q8, d18, d5
vmlal.s32 q8, d26, d4
vmlal.s32 q8, d19, d9
@@ -528,8 +517,7 @@
vmlal.s32 q8, d29, d1
vmlal.s32 q8, d24, d6
vmlal.s32 q8, d25, d0
- add r2, sp, #624
- vst1.8 {d14-d15}, [r2, : 128]
+ vst1.8 {d14-d15}, [r2, : 128]!
vmull.s32 q2, d18, d4
vmlal.s32 q2, d12, d9
vmlal.s32 q2, d13, d8
@@ -537,8 +525,7 @@
vmlal.s32 q2, d22, d2
vmlal.s32 q2, d23, d1
vmlal.s32 q2, d24, d0
- add r2, sp, #640
- vst1.8 {d20-d21}, [r2, : 128]
+ vst1.8 {d20-d21}, [r2, : 128]!
vmull.s32 q7, d18, d9
vmlal.s32 q7, d26, d3
vmlal.s32 q7, d19, d8
@@ -547,14 +534,12 @@
vmlal.s32 q7, d28, d1
vmlal.s32 q7, d23, d6
vmlal.s32 q7, d29, d0
- add r2, sp, #656
- vst1.8 {d10-d11}, [r2, : 128]
+ vst1.8 {d10-d11}, [r2, : 128]!
vmull.s32 q5, d18, d3
vmlal.s32 q5, d19, d2
vmlal.s32 q5, d22, d1
vmlal.s32 q5, d23, d0
vmlal.s32 q5, d12, d8
- add r2, sp, #672
vst1.8 {d16-d17}, [r2, : 128]
vmull.s32 q4, d18, d8
vmlal.s32 q4, d26, d2
@@ -566,7 +551,7 @@
vmlal.s32 q8, d26, d1
vmlal.s32 q8, d19, d6
vmlal.s32 q8, d27, d0
- add r2, sp, #576
+ add r2, sp, #544
vld1.8 {d20-d21}, [r2, : 128]
vmlal.s32 q7, d24, d21
vmlal.s32 q7, d25, d20
@@ -575,32 +560,30 @@
vmlal.s32 q8, d22, d21
vmlal.s32 q8, d28, d20
vmlal.s32 q5, d24, d20
- add r2, sp, #576
vst1.8 {d14-d15}, [r2, : 128]
vmull.s32 q7, d18, d6
vmlal.s32 q7, d26, d0
- add r2, sp, #656
+ add r2, sp, #624
vld1.8 {d30-d31}, [r2, : 128]
vmlal.s32 q2, d30, d21
vmlal.s32 q7, d19, d21
vmlal.s32 q7, d27, d20
- add r2, sp, #624
+ add r2, sp, #592
vld1.8 {d26-d27}, [r2, : 128]
vmlal.s32 q4, d25, d27
vmlal.s32 q8, d29, d27
vmlal.s32 q8, d25, d26
vmlal.s32 q7, d28, d27
vmlal.s32 q7, d29, d26
- add r2, sp, #608
+ add r2, sp, #576
vld1.8 {d28-d29}, [r2, : 128]
vmlal.s32 q4, d24, d29
vmlal.s32 q8, d23, d29
vmlal.s32 q8, d24, d28
vmlal.s32 q7, d22, d29
vmlal.s32 q7, d23, d28
- add r2, sp, #608
vst1.8 {d8-d9}, [r2, : 128]
- add r2, sp, #560
+ add r2, sp, #528
vld1.8 {d8-d9}, [r2, : 128]
vmlal.s32 q7, d24, d9
vmlal.s32 q7, d25, d31
@@ -621,36 +604,36 @@
vmlal.s32 q0, d23, d26
vmlal.s32 q0, d24, d31
vmlal.s32 q0, d19, d20
- add r2, sp, #640
+ add r2, sp, #608
vld1.8 {d18-d19}, [r2, : 128]
vmlal.s32 q2, d18, d7
- vmlal.s32 q2, d19, d6
vmlal.s32 q5, d18, d6
- vmlal.s32 q5, d19, d21
vmlal.s32 q1, d18, d21
- vmlal.s32 q1, d19, d29
vmlal.s32 q0, d18, d28
- vmlal.s32 q0, d19, d9
vmlal.s32 q6, d18, d29
+ vmlal.s32 q2, d19, d6
+ vmlal.s32 q5, d19, d21
+ vmlal.s32 q1, d19, d29
+ vmlal.s32 q0, d19, d9
vmlal.s32 q6, d19, d28
- add r2, sp, #592
+ add r2, sp, #560
vld1.8 {d18-d19}, [r2, : 128]
- add r2, sp, #512
+ add r2, sp, #480
vld1.8 {d22-d23}, [r2, : 128]
vmlal.s32 q5, d19, d7
vmlal.s32 q0, d18, d21
vmlal.s32 q0, d19, d29
vmlal.s32 q6, d18, d6
- add r2, sp, #528
+ add r2, sp, #496
vld1.8 {d6-d7}, [r2, : 128]
vmlal.s32 q6, d19, d21
- add r2, sp, #576
+ add r2, sp, #544
vld1.8 {d18-d19}, [r2, : 128]
vmlal.s32 q0, d30, d8
- add r2, sp, #672
+ add r2, sp, #640
vld1.8 {d20-d21}, [r2, : 128]
vmlal.s32 q5, d30, d29
- add r2, sp, #608
+ add r2, sp, #576
vld1.8 {d24-d25}, [r2, : 128]
vmlal.s32 q1, d30, d28
vadd.i64 q13, q0, q11
@@ -823,22 +806,19 @@
vadd.i32 q5, q5, q0
vtrn.32 q11, q14
vadd.i32 q6, q6, q3
- add r2, sp, #560
+ add r2, sp, #528
vadd.i32 q10, q10, q2
vtrn.32 d24, d25
- vst1.8 {d12-d13}, [r2, : 128]
+ vst1.8 {d12-d13}, [r2, : 128]!
vshl.i32 q6, q13, #1
- add r2, sp, #576
- vst1.8 {d20-d21}, [r2, : 128]
+ vst1.8 {d20-d21}, [r2, : 128]!
vshl.i32 q10, q14, #1
- add r2, sp, #592
- vst1.8 {d12-d13}, [r2, : 128]
+ vst1.8 {d12-d13}, [r2, : 128]!
vshl.i32 q15, q12, #1
vadd.i32 q8, q8, q4
vext.32 d10, d31, d30, #0
vadd.i32 q7, q7, q1
- add r2, sp, #608
- vst1.8 {d16-d17}, [r2, : 128]
+ vst1.8 {d16-d17}, [r2, : 128]!
vmull.s32 q8, d18, d5
vmlal.s32 q8, d26, d4
vmlal.s32 q8, d19, d9
@@ -849,8 +829,7 @@
vmlal.s32 q8, d29, d1
vmlal.s32 q8, d24, d6
vmlal.s32 q8, d25, d0
- add r2, sp, #624
- vst1.8 {d14-d15}, [r2, : 128]
+ vst1.8 {d14-d15}, [r2, : 128]!
vmull.s32 q2, d18, d4
vmlal.s32 q2, d12, d9
vmlal.s32 q2, d13, d8
@@ -858,8 +837,7 @@
vmlal.s32 q2, d22, d2
vmlal.s32 q2, d23, d1
vmlal.s32 q2, d24, d0
- add r2, sp, #640
- vst1.8 {d20-d21}, [r2, : 128]
+ vst1.8 {d20-d21}, [r2, : 128]!
vmull.s32 q7, d18, d9
vmlal.s32 q7, d26, d3
vmlal.s32 q7, d19, d8
@@ -868,15 +846,13 @@
vmlal.s32 q7, d28, d1
vmlal.s32 q7, d23, d6
vmlal.s32 q7, d29, d0
- add r2, sp, #656
- vst1.8 {d10-d11}, [r2, : 128]
+ vst1.8 {d10-d11}, [r2, : 128]!
vmull.s32 q5, d18, d3
vmlal.s32 q5, d19, d2
vmlal.s32 q5, d22, d1
vmlal.s32 q5, d23, d0
vmlal.s32 q5, d12, d8
- add r2, sp, #672
- vst1.8 {d16-d17}, [r2, : 128]
+ vst1.8 {d16-d17}, [r2, : 128]!
vmull.s32 q4, d18, d8
vmlal.s32 q4, d26, d2
vmlal.s32 q4, d19, d7
@@ -887,7 +863,7 @@
vmlal.s32 q8, d26, d1
vmlal.s32 q8, d19, d6
vmlal.s32 q8, d27, d0
- add r2, sp, #576
+ add r2, sp, #544
vld1.8 {d20-d21}, [r2, : 128]
vmlal.s32 q7, d24, d21
vmlal.s32 q7, d25, d20
@@ -896,32 +872,30 @@
vmlal.s32 q8, d22, d21
vmlal.s32 q8, d28, d20
vmlal.s32 q5, d24, d20
- add r2, sp, #576
vst1.8 {d14-d15}, [r2, : 128]
vmull.s32 q7, d18, d6
vmlal.s32 q7, d26, d0
- add r2, sp, #656
+ add r2, sp, #624
vld1.8 {d30-d31}, [r2, : 128]
vmlal.s32 q2, d30, d21
vmlal.s32 q7, d19, d21
vmlal.s32 q7, d27, d20
- add r2, sp, #624
+ add r2, sp, #592
vld1.8 {d26-d27}, [r2, : 128]
vmlal.s32 q4, d25, d27
vmlal.s32 q8, d29, d27
vmlal.s32 q8, d25, d26
vmlal.s32 q7, d28, d27
vmlal.s32 q7, d29, d26
- add r2, sp, #608
+ add r2, sp, #576
vld1.8 {d28-d29}, [r2, : 128]
vmlal.s32 q4, d24, d29
vmlal.s32 q8, d23, d29
vmlal.s32 q8, d24, d28
vmlal.s32 q7, d22, d29
vmlal.s32 q7, d23, d28
- add r2, sp, #608
vst1.8 {d8-d9}, [r2, : 128]
- add r2, sp, #560
+ add r2, sp, #528
vld1.8 {d8-d9}, [r2, : 128]
vmlal.s32 q7, d24, d9
vmlal.s32 q7, d25, d31
@@ -942,36 +916,36 @@
vmlal.s32 q0, d23, d26
vmlal.s32 q0, d24, d31
vmlal.s32 q0, d19, d20
- add r2, sp, #640
+ add r2, sp, #608
vld1.8 {d18-d19}, [r2, : 128]
vmlal.s32 q2, d18, d7
- vmlal.s32 q2, d19, d6
vmlal.s32 q5, d18, d6
- vmlal.s32 q5, d19, d21
vmlal.s32 q1, d18, d21
- vmlal.s32 q1, d19, d29
vmlal.s32 q0, d18, d28
- vmlal.s32 q0, d19, d9
vmlal.s32 q6, d18, d29
+ vmlal.s32 q2, d19, d6
+ vmlal.s32 q5, d19, d21
+ vmlal.s32 q1, d19, d29
+ vmlal.s32 q0, d19, d9
vmlal.s32 q6, d19, d28
- add r2, sp, #592
+ add r2, sp, #560
vld1.8 {d18-d19}, [r2, : 128]
- add r2, sp, #512
+ add r2, sp, #480
vld1.8 {d22-d23}, [r2, : 128]
vmlal.s32 q5, d19, d7
vmlal.s32 q0, d18, d21
vmlal.s32 q0, d19, d29
vmlal.s32 q6, d18, d6
- add r2, sp, #528
+ add r2, sp, #496
vld1.8 {d6-d7}, [r2, : 128]
vmlal.s32 q6, d19, d21
- add r2, sp, #576
+ add r2, sp, #544
vld1.8 {d18-d19}, [r2, : 128]
vmlal.s32 q0, d30, d8
- add r2, sp, #672
+ add r2, sp, #640
vld1.8 {d20-d21}, [r2, : 128]
vmlal.s32 q5, d30, d29
- add r2, sp, #608
+ add r2, sp, #576
vld1.8 {d24-d25}, [r2, : 128]
vmlal.s32 q1, d30, d28
vadd.i64 q13, q0, q11
@@ -1069,7 +1043,7 @@
sub r4, r4, #24
vst1.8 d0, [r2, : 64]
vst1.8 d1, [r4, : 64]
- add r2, sp, #544
+ add r2, sp, #512
add r4, r3, #144
add r5, r3, #192
vld1.8 {d0-d1}, [r2, : 128]
@@ -1139,14 +1113,13 @@
vmlal.s32 q0, d12, d8
vmlal.s32 q0, d13, d17
vmlal.s32 q0, d6, d6
- add r2, sp, #512
- vld1.8 {d18-d19}, [r2, : 128]
+ add r2, sp, #480
+ vld1.8 {d18-d19}, [r2, : 128]!
vmull.s32 q3, d16, d7
vmlal.s32 q3, d10, d15
vmlal.s32 q3, d11, d14
vmlal.s32 q3, d12, d9
vmlal.s32 q3, d13, d8
- add r2, sp, #528
vld1.8 {d8-d9}, [r2, : 128]
vadd.i64 q5, q12, q9
vadd.i64 q6, q15, q9
@@ -1295,22 +1268,19 @@
vadd.i32 q5, q5, q0
vtrn.32 q11, q14
vadd.i32 q6, q6, q3
- add r2, sp, #560
+ add r2, sp, #528
vadd.i32 q10, q10, q2
vtrn.32 d24, d25
- vst1.8 {d12-d13}, [r2, : 128]
+ vst1.8 {d12-d13}, [r2, : 128]!
vshl.i32 q6, q13, #1
- add r2, sp, #576
- vst1.8 {d20-d21}, [r2, : 128]
+ vst1.8 {d20-d21}, [r2, : 128]!
vshl.i32 q10, q14, #1
- add r2, sp, #592
- vst1.8 {d12-d13}, [r2, : 128]
+ vst1.8 {d12-d13}, [r2, : 128]!
vshl.i32 q15, q12, #1
vadd.i32 q8, q8, q4
vext.32 d10, d31, d30, #0
vadd.i32 q7, q7, q1
- add r2, sp, #608
- vst1.8 {d16-d17}, [r2, : 128]
+ vst1.8 {d16-d17}, [r2, : 128]!
vmull.s32 q8, d18, d5
vmlal.s32 q8, d26, d4
vmlal.s32 q8, d19, d9
@@ -1321,8 +1291,7 @@
vmlal.s32 q8, d29, d1
vmlal.s32 q8, d24, d6
vmlal.s32 q8, d25, d0
- add r2, sp, #624
- vst1.8 {d14-d15}, [r2, : 128]
+ vst1.8 {d14-d15}, [r2, : 128]!
vmull.s32 q2, d18, d4
vmlal.s32 q2, d12, d9
vmlal.s32 q2, d13, d8
@@ -1330,8 +1299,7 @@
vmlal.s32 q2, d22, d2
vmlal.s32 q2, d23, d1
vmlal.s32 q2, d24, d0
- add r2, sp, #640
- vst1.8 {d20-d21}, [r2, : 128]
+ vst1.8 {d20-d21}, [r2, : 128]!
vmull.s32 q7, d18, d9
vmlal.s32 q7, d26, d3
vmlal.s32 q7, d19, d8
@@ -1340,15 +1308,13 @@
vmlal.s32 q7, d28, d1
vmlal.s32 q7, d23, d6
vmlal.s32 q7, d29, d0
- add r2, sp, #656
- vst1.8 {d10-d11}, [r2, : 128]
+ vst1.8 {d10-d11}, [r2, : 128]!
vmull.s32 q5, d18, d3
vmlal.s32 q5, d19, d2
vmlal.s32 q5, d22, d1
vmlal.s32 q5, d23, d0
vmlal.s32 q5, d12, d8
- add r2, sp, #672
- vst1.8 {d16-d17}, [r2, : 128]
+ vst1.8 {d16-d17}, [r2, : 128]!
vmull.s32 q4, d18, d8
vmlal.s32 q4, d26, d2
vmlal.s32 q4, d19, d7
@@ -1359,7 +1325,7 @@
vmlal.s32 q8, d26, d1
vmlal.s32 q8, d19, d6
vmlal.s32 q8, d27, d0
- add r2, sp, #576
+ add r2, sp, #544
vld1.8 {d20-d21}, [r2, : 128]
vmlal.s32 q7, d24, d21
vmlal.s32 q7, d25, d20
@@ -1368,32 +1334,30 @@
vmlal.s32 q8, d22, d21
vmlal.s32 q8, d28, d20
vmlal.s32 q5, d24, d20
- add r2, sp, #576
vst1.8 {d14-d15}, [r2, : 128]
vmull.s32 q7, d18, d6
vmlal.s32 q7, d26, d0
- add r2, sp, #656
+ add r2, sp, #624
vld1.8 {d30-d31}, [r2, : 128]
vmlal.s32 q2, d30, d21
vmlal.s32 q7, d19, d21
vmlal.s32 q7, d27, d20
- add r2, sp, #624
+ add r2, sp, #592
vld1.8 {d26-d27}, [r2, : 128]
vmlal.s32 q4, d25, d27
vmlal.s32 q8, d29, d27
vmlal.s32 q8, d25, d26
vmlal.s32 q7, d28, d27
vmlal.s32 q7, d29, d26
- add r2, sp, #608
+ add r2, sp, #576
vld1.8 {d28-d29}, [r2, : 128]
vmlal.s32 q4, d24, d29
vmlal.s32 q8, d23, d29
vmlal.s32 q8, d24, d28
vmlal.s32 q7, d22, d29
vmlal.s32 q7, d23, d28
- add r2, sp, #608
vst1.8 {d8-d9}, [r2, : 128]
- add r2, sp, #560
+ add r2, sp, #528
vld1.8 {d8-d9}, [r2, : 128]
vmlal.s32 q7, d24, d9
vmlal.s32 q7, d25, d31
@@ -1414,36 +1378,36 @@
vmlal.s32 q0, d23, d26
vmlal.s32 q0, d24, d31
vmlal.s32 q0, d19, d20
- add r2, sp, #640
+ add r2, sp, #608
vld1.8 {d18-d19}, [r2, : 128]
vmlal.s32 q2, d18, d7
- vmlal.s32 q2, d19, d6
vmlal.s32 q5, d18, d6
- vmlal.s32 q5, d19, d21
vmlal.s32 q1, d18, d21
- vmlal.s32 q1, d19, d29
vmlal.s32 q0, d18, d28
- vmlal.s32 q0, d19, d9
vmlal.s32 q6, d18, d29
+ vmlal.s32 q2, d19, d6
+ vmlal.s32 q5, d19, d21
+ vmlal.s32 q1, d19, d29
+ vmlal.s32 q0, d19, d9
vmlal.s32 q6, d19, d28
- add r2, sp, #592
+ add r2, sp, #560
vld1.8 {d18-d19}, [r2, : 128]
- add r2, sp, #512
+ add r2, sp, #480
vld1.8 {d22-d23}, [r2, : 128]
vmlal.s32 q5, d19, d7
vmlal.s32 q0, d18, d21
vmlal.s32 q0, d19, d29
vmlal.s32 q6, d18, d6
- add r2, sp, #528
+ add r2, sp, #496
vld1.8 {d6-d7}, [r2, : 128]
vmlal.s32 q6, d19, d21
- add r2, sp, #576
+ add r2, sp, #544
vld1.8 {d18-d19}, [r2, : 128]
vmlal.s32 q0, d30, d8
- add r2, sp, #672
+ add r2, sp, #640
vld1.8 {d20-d21}, [r2, : 128]
vmlal.s32 q5, d30, d29
- add r2, sp, #608
+ add r2, sp, #576
vld1.8 {d24-d25}, [r2, : 128]
vmlal.s32 q1, d30, d28
vadd.i64 q13, q0, q11
@@ -1541,10 +1505,10 @@
sub r4, r4, #24
vst1.8 d0, [r2, : 64]
vst1.8 d1, [r4, : 64]
- ldr r2, [sp, #488]
- ldr r4, [sp, #492]
+ ldr r2, [sp, #456]
+ ldr r4, [sp, #460]
subs r5, r2, #1
- bge ._mainloop
+ bge .Lmainloop
add r1, r3, #144
add r2, r3, #336
vld1.8 {d0-d1}, [r1, : 128]!
@@ -1553,41 +1517,41 @@
vst1.8 {d0-d1}, [r2, : 128]!
vst1.8 {d2-d3}, [r2, : 128]!
vst1.8 d4, [r2, : 64]
- ldr r1, =0
-._invertloop:
+ movw r1, #0
+.Linvertloop:
add r2, r3, #144
- ldr r4, =0
- ldr r5, =2
+ movw r4, #0
+ movw r5, #2
cmp r1, #1
- ldreq r5, =1
+ moveq r5, #1
addeq r2, r3, #336
addeq r4, r3, #48
cmp r1, #2
- ldreq r5, =1
+ moveq r5, #1
addeq r2, r3, #48
cmp r1, #3
- ldreq r5, =5
+ moveq r5, #5
addeq r4, r3, #336
cmp r1, #4
- ldreq r5, =10
+ moveq r5, #10
cmp r1, #5
- ldreq r5, =20
+ moveq r5, #20
cmp r1, #6
- ldreq r5, =10
+ moveq r5, #10
addeq r2, r3, #336
addeq r4, r3, #336
cmp r1, #7
- ldreq r5, =50
+ moveq r5, #50
cmp r1, #8
- ldreq r5, =100
+ moveq r5, #100
cmp r1, #9
- ldreq r5, =50
+ moveq r5, #50
addeq r2, r3, #336
cmp r1, #10
- ldreq r5, =5
+ moveq r5, #5
addeq r2, r3, #48
cmp r1, #11
- ldreq r5, =0
+ moveq r5, #0
addeq r2, r3, #96
add r6, r3, #144
add r7, r3, #288
@@ -1598,8 +1562,8 @@
vst1.8 {d2-d3}, [r7, : 128]!
vst1.8 d4, [r7, : 64]
cmp r5, #0
- beq ._skipsquaringloop
-._squaringloop:
+ beq .Lskipsquaringloop
+.Lsquaringloop:
add r6, r3, #288
add r7, r3, #288
add r8, r3, #288
@@ -1611,7 +1575,7 @@
vld1.8 {d6-d7}, [r7, : 128]!
vld1.8 {d9}, [r7, : 64]
vld1.8 {d10-d11}, [r6, : 128]!
- add r7, sp, #416
+ add r7, sp, #384
vld1.8 {d12-d13}, [r6, : 128]!
vmul.i32 q7, q2, q0
vld1.8 {d8}, [r6, : 64]
@@ -1726,7 +1690,7 @@
vext.32 d10, d6, d6, #0
vmov.i32 q1, #0xffffffff
vshl.i64 q4, q1, #25
- add r7, sp, #512
+ add r7, sp, #480
vld1.8 {d14-d15}, [r7, : 128]
vadd.i64 q9, q2, q7
vshl.i64 q1, q1, #26
@@ -1735,7 +1699,7 @@
vadd.i64 q5, q5, q10
vand q9, q9, q1
vld1.8 {d16}, [r6, : 64]!
- add r6, sp, #528
+ add r6, sp, #496
vld1.8 {d20-d21}, [r6, : 128]
vadd.i64 q11, q5, q10
vsub.i64 q2, q2, q9
@@ -1789,8 +1753,8 @@
sub r6, r6, #32
vst1.8 d4, [r6, : 64]
subs r5, r5, #1
- bhi ._squaringloop
-._skipsquaringloop:
+ bhi .Lsquaringloop
+.Lskipsquaringloop:
mov r2, r2
add r5, r3, #288
add r6, r3, #144
@@ -1802,7 +1766,7 @@
vld1.8 {d6-d7}, [r5, : 128]!
vld1.8 {d9}, [r5, : 64]
vld1.8 {d10-d11}, [r2, : 128]!
- add r5, sp, #416
+ add r5, sp, #384
vld1.8 {d12-d13}, [r2, : 128]!
vmul.i32 q7, q2, q0
vld1.8 {d8}, [r2, : 64]
@@ -1917,7 +1881,7 @@
vext.32 d10, d6, d6, #0
vmov.i32 q1, #0xffffffff
vshl.i64 q4, q1, #25
- add r5, sp, #512
+ add r5, sp, #480
vld1.8 {d14-d15}, [r5, : 128]
vadd.i64 q9, q2, q7
vshl.i64 q1, q1, #26
@@ -1926,7 +1890,7 @@
vadd.i64 q5, q5, q10
vand q9, q9, q1
vld1.8 {d16}, [r2, : 64]!
- add r2, sp, #528
+ add r2, sp, #496
vld1.8 {d20-d21}, [r2, : 128]
vadd.i64 q11, q5, q10
vsub.i64 q2, q2, q9
@@ -1980,7 +1944,7 @@
sub r2, r2, #32
vst1.8 d4, [r2, : 64]
cmp r4, #0
- beq ._skippostcopy
+ beq .Lskippostcopy
add r2, r3, #144
mov r4, r4
vld1.8 {d0-d1}, [r2, : 128]!
@@ -1989,9 +1953,9 @@
vst1.8 {d0-d1}, [r4, : 128]!
vst1.8 {d2-d3}, [r4, : 128]!
vst1.8 d4, [r4, : 64]
-._skippostcopy:
+.Lskippostcopy:
cmp r1, #1
- bne ._skipfinalcopy
+ bne .Lskipfinalcopy
add r2, r3, #288
add r4, r3, #144
vld1.8 {d0-d1}, [r2, : 128]!
@@ -2000,10 +1964,10 @@
vst1.8 {d0-d1}, [r4, : 128]!
vst1.8 {d2-d3}, [r4, : 128]!
vst1.8 d4, [r4, : 64]
-._skipfinalcopy:
+.Lskipfinalcopy:
add r1, r1, #1
cmp r1, #12
- blo ._invertloop
+ blo .Linvertloop
add r1, r3, #144
ldr r2, [r1], #4
ldr r3, [r1], #4
@@ -2085,21 +2049,16 @@
add r8, r8, r10, LSL #12
mov r9, r10, LSR #20
add r1, r9, r1, LSL #6
- str r2, [r0], #4
- str r3, [r0], #4
- str r4, [r0], #4
- str r5, [r0], #4
- str r6, [r0], #4
- str r7, [r0], #4
- str r8, [r0], #4
- str r1, [r0]
- ldrd r4, [sp, #0]
- ldrd r6, [sp, #8]
- ldrd r8, [sp, #16]
- ldrd r10, [sp, #24]
- ldr r12, [sp, #480]
- ldr r14, [sp, #484]
- ldr r0, =0
- mov sp, r12
- vpop {q4, q5, q6, q7}
- bx lr
+ str r2, [r0]
+ str r3, [r0, #4]
+ str r4, [r0, #8]
+ str r5, [r0, #12]
+ str r6, [r0, #16]
+ str r7, [r0, #20]
+ str r8, [r0, #24]
+ str r1, [r0, #28]
+ movw r0, #0
+ mov sp, ip
+ pop {r4-r11, pc}
+ENDPROC(curve25519_neon)
+#endif
diff --git a/lib/zinc/curve25519/curve25519.c b/lib/zinc/curve25519/curve25519.c
index 4f9c45ba126d..30dd5c93d130 100644
--- a/lib/zinc/curve25519/curve25519.c
+++ b/lib/zinc/curve25519/curve25519.c
@@ -22,6 +22,8 @@
#if defined(CONFIG_ZINC_ARCH_X86_64)
#include "curve25519-x86_64-glue.c"
+#elif defined(CONFIG_ZINC_ARCH_ARM)
+#include "curve25519-arm-glue.c"
#else
static bool *const curve25519_nobs[] __initconst = { };
static void __init curve25519_fpu_init(void)
--
2.19.0
* [PATCH net-next v7 25/28] crypto: port Poly1305 to Zinc
From: Jason A. Donenfeld @ 2018-10-06 2:57 UTC (permalink / raw)
To: linux-kernel, netdev, davem, gregkh
Cc: Jason A. Donenfeld, Samuel Neves, Andy Lutomirski, linux-crypto
Now that Poly1305 is in Zinc, we can have the crypto API code simply
call into it. We have to do a little bit of bookkeeping here, because
the crypto API receives the key in the first few calls to update.
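
To illustrate the bookkeeping described above, here is a minimal user-space
sketch, not the code from this patch. It assumes the 32-byte Poly1305
one-time key arrives as the leading bytes of the data passed to update,
possibly split across several calls. Every name in it (demo_desc_ctx,
demo_init, demo_update, DEMO_KEY_SIZE) is hypothetical, and a running byte
sum stands in for the real MAC computation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_KEY_SIZE 32 /* assumed: 16-byte r followed by 16-byte s */

/* Hypothetical per-request state; not the layout used by the patch. */
struct demo_desc_ctx {
	unsigned char key[DEMO_KEY_SIZE];
	size_t rem_key_bytes;	/* key bytes still expected from update */
	size_t msg_bytes;	/* message bytes seen after the key */
	unsigned int acc;	/* stand-in for the MAC accumulator */
	int keyed;		/* set once the whole key has arrived */
};

static void demo_init(struct demo_desc_ctx *d)
{
	memset(d, 0, sizeof(*d));
	d->rem_key_bytes = DEMO_KEY_SIZE;
}

static void demo_update(struct demo_desc_ctx *d, const unsigned char *src,
			size_t len)
{
	size_t i;

	if (d->rem_key_bytes) {
		/* Divert leading bytes into the key buffer first. */
		size_t n = len < d->rem_key_bytes ? len : d->rem_key_bytes;

		memcpy(d->key + (DEMO_KEY_SIZE - d->rem_key_bytes), src, n);
		src += n;
		len -= n;
		d->rem_key_bytes -= n;
		if (!d->rem_key_bytes)
			d->keyed = 1; /* a real shim would key the MAC here */
		if (!len)
			return;
	}

	/* Everything after the key is ordinary message data. */
	assert(d->keyed);
	for (i = 0; i < len; i++)
		d->acc += src[i];
	d->msg_bytes += len;
}
```

With this shape, a caller that splits its input arbitrarily (say 10 bytes,
then 30) still ends up with a complete key and only the trailing 8 bytes
treated as message data.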
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Samuel Neves <sneves@dei.uc.pt>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: linux-crypto@vger.kernel.org
---
arch/x86/crypto/Makefile | 3 -
arch/x86/crypto/poly1305-avx2-x86_64.S | 388 ----------------
arch/x86/crypto/poly1305-sse2-x86_64.S | 584 -------------------------
arch/x86/crypto/poly1305_glue.c | 205 ---------
crypto/Kconfig | 15 +-
crypto/Makefile | 2 +-
crypto/chacha20poly1305.c | 12 +-
crypto/poly1305_generic.c | 304 -------------
crypto/poly1305_zinc.c | 98 +++++
include/crypto/poly1305.h | 40 --
10 files changed, 107 insertions(+), 1544 deletions(-)
delete mode 100644 arch/x86/crypto/poly1305-avx2-x86_64.S
delete mode 100644 arch/x86/crypto/poly1305-sse2-x86_64.S
delete mode 100644 arch/x86/crypto/poly1305_glue.c
delete mode 100644 crypto/poly1305_generic.c
create mode 100644 crypto/poly1305_zinc.c
delete mode 100644 include/crypto/poly1305.h
diff --git a/arch/x86/crypto/Makefile b/arch/x86/crypto/Makefile
index a450ad573dcb..cf830219846b 100644
--- a/arch/x86/crypto/Makefile
+++ b/arch/x86/crypto/Makefile
@@ -34,7 +34,6 @@ obj-$(CONFIG_CRYPTO_CRC32_PCLMUL) += crc32-pclmul.o
obj-$(CONFIG_CRYPTO_SHA256_SSSE3) += sha256-ssse3.o
obj-$(CONFIG_CRYPTO_SHA512_SSSE3) += sha512-ssse3.o
obj-$(CONFIG_CRYPTO_CRCT10DIF_PCLMUL) += crct10dif-pclmul.o
-obj-$(CONFIG_CRYPTO_POLY1305_X86_64) += poly1305-x86_64.o
obj-$(CONFIG_CRYPTO_AEGIS128_AESNI_SSE2) += aegis128-aesni.o
obj-$(CONFIG_CRYPTO_AEGIS128L_AESNI_SSE2) += aegis128l-aesni.o
@@ -110,10 +109,8 @@ aesni-intel-y := aesni-intel_asm.o aesni-intel_glue.o fpu.o
aesni-intel-$(CONFIG_64BIT) += aesni-intel_avx-x86_64.o aes_ctrby8_avx-x86_64.o
ghash-clmulni-intel-y := ghash-clmulni-intel_asm.o ghash-clmulni-intel_glue.o
sha1-ssse3-y := sha1_ssse3_asm.o sha1_ssse3_glue.o
-poly1305-x86_64-y := poly1305-sse2-x86_64.o poly1305_glue.o
ifeq ($(avx2_supported),yes)
sha1-ssse3-y += sha1_avx2_x86_64_asm.o
-poly1305-x86_64-y += poly1305-avx2-x86_64.o
endif
ifeq ($(sha1_ni_supported),yes)
sha1-ssse3-y += sha1_ni_asm.o
diff --git a/arch/x86/crypto/poly1305-avx2-x86_64.S b/arch/x86/crypto/poly1305-avx2-x86_64.S
deleted file mode 100644
index 3b6e70d085da..000000000000
--- a/arch/x86/crypto/poly1305-avx2-x86_64.S
+++ /dev/null
@@ -1,388 +0,0 @@
-/*
- * Poly1305 authenticator algorithm, RFC7539, x64 AVX2 functions
- *
- * Copyright (C) 2015 Martin Willi
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- */
-
-#include <linux/linkage.h>
-
-.section .rodata.cst32.ANMASK, "aM", @progbits, 32
-.align 32
-ANMASK: .octa 0x0000000003ffffff0000000003ffffff
- .octa 0x0000000003ffffff0000000003ffffff
-
-.section .rodata.cst32.ORMASK, "aM", @progbits, 32
-.align 32
-ORMASK: .octa 0x00000000010000000000000001000000
- .octa 0x00000000010000000000000001000000
-
-.text
-
-#define h0 0x00(%rdi)
-#define h1 0x04(%rdi)
-#define h2 0x08(%rdi)
-#define h3 0x0c(%rdi)
-#define h4 0x10(%rdi)
-#define r0 0x00(%rdx)
-#define r1 0x04(%rdx)
-#define r2 0x08(%rdx)
-#define r3 0x0c(%rdx)
-#define r4 0x10(%rdx)
-#define u0 0x00(%r8)
-#define u1 0x04(%r8)
-#define u2 0x08(%r8)
-#define u3 0x0c(%r8)
-#define u4 0x10(%r8)
-#define w0 0x14(%r8)
-#define w1 0x18(%r8)
-#define w2 0x1c(%r8)
-#define w3 0x20(%r8)
-#define w4 0x24(%r8)
-#define y0 0x28(%r8)
-#define y1 0x2c(%r8)
-#define y2 0x30(%r8)
-#define y3 0x34(%r8)
-#define y4 0x38(%r8)
-#define m %rsi
-#define hc0 %ymm0
-#define hc1 %ymm1
-#define hc2 %ymm2
-#define hc3 %ymm3
-#define hc4 %ymm4
-#define hc0x %xmm0
-#define hc1x %xmm1
-#define hc2x %xmm2
-#define hc3x %xmm3
-#define hc4x %xmm4
-#define t1 %ymm5
-#define t2 %ymm6
-#define t1x %xmm5
-#define t2x %xmm6
-#define ruwy0 %ymm7
-#define ruwy1 %ymm8
-#define ruwy2 %ymm9
-#define ruwy3 %ymm10
-#define ruwy4 %ymm11
-#define ruwy0x %xmm7
-#define ruwy1x %xmm8
-#define ruwy2x %xmm9
-#define ruwy3x %xmm10
-#define ruwy4x %xmm11
-#define svxz1 %ymm12
-#define svxz2 %ymm13
-#define svxz3 %ymm14
-#define svxz4 %ymm15
-#define d0 %r9
-#define d1 %r10
-#define d2 %r11
-#define d3 %r12
-#define d4 %r13
-
-ENTRY(poly1305_4block_avx2)
- # %rdi: Accumulator h[5]
- # %rsi: 64 byte input block m
- # %rdx: Poly1305 key r[5]
- # %rcx: Quadblock count
- # %r8: Poly1305 derived key r^2 u[5], r^3 w[5], r^4 y[5],
-
- # This four-block variant uses loop unrolled block processing. It
- # requires 4 Poly1305 keys: r, r^2, r^3 and r^4:
- # h = (h + m) * r => h = (h + m1) * r^4 + m2 * r^3 + m3 * r^2 + m4 * r
-
- vzeroupper
- push %rbx
- push %r12
- push %r13
-
- # combine r0,u0,w0,y0
- vmovd y0,ruwy0x
- vmovd w0,t1x
- vpunpcklqdq t1,ruwy0,ruwy0
- vmovd u0,t1x
- vmovd r0,t2x
- vpunpcklqdq t2,t1,t1
- vperm2i128 $0x20,t1,ruwy0,ruwy0
-
- # combine r1,u1,w1,y1 and s1=r1*5,v1=u1*5,x1=w1*5,z1=y1*5
- vmovd y1,ruwy1x
- vmovd w1,t1x
- vpunpcklqdq t1,ruwy1,ruwy1
- vmovd u1,t1x
- vmovd r1,t2x
- vpunpcklqdq t2,t1,t1
- vperm2i128 $0x20,t1,ruwy1,ruwy1
- vpslld $2,ruwy1,svxz1
- vpaddd ruwy1,svxz1,svxz1
-
- # combine r2,u2,w2,y2 and s2=r2*5,v2=u2*5,x2=w2*5,z2=y2*5
- vmovd y2,ruwy2x
- vmovd w2,t1x
- vpunpcklqdq t1,ruwy2,ruwy2
- vmovd u2,t1x
- vmovd r2,t2x
- vpunpcklqdq t2,t1,t1
- vperm2i128 $0x20,t1,ruwy2,ruwy2
- vpslld $2,ruwy2,svxz2
- vpaddd ruwy2,svxz2,svxz2
-
- # combine r3,u3,w3,y3 and s3=r3*5,v3=u3*5,x3=w3*5,z3=y3*5
- vmovd y3,ruwy3x
- vmovd w3,t1x
- vpunpcklqdq t1,ruwy3,ruwy3
- vmovd u3,t1x
- vmovd r3,t2x
- vpunpcklqdq t2,t1,t1
- vperm2i128 $0x20,t1,ruwy3,ruwy3
- vpslld $2,ruwy3,svxz3
- vpaddd ruwy3,svxz3,svxz3
-
- # combine r4,u4,w4,y4 and s4=r4*5,v4=u4*5,x4=w4*5,z4=y4*5
- vmovd y4,ruwy4x
- vmovd w4,t1x
- vpunpcklqdq t1,ruwy4,ruwy4
- vmovd u4,t1x
- vmovd r4,t2x
- vpunpcklqdq t2,t1,t1
- vperm2i128 $0x20,t1,ruwy4,ruwy4
- vpslld $2,ruwy4,svxz4
- vpaddd ruwy4,svxz4,svxz4
-
-.Ldoblock4:
- # hc0 = [m[48-51] & 0x3ffffff, m[32-35] & 0x3ffffff,
- # m[16-19] & 0x3ffffff, m[ 0- 3] & 0x3ffffff + h0]
- vmovd 0x00(m),hc0x
- vmovd 0x10(m),t1x
- vpunpcklqdq t1,hc0,hc0
- vmovd 0x20(m),t1x
- vmovd 0x30(m),t2x
- vpunpcklqdq t2,t1,t1
- vperm2i128 $0x20,t1,hc0,hc0
- vpand ANMASK(%rip),hc0,hc0
- vmovd h0,t1x
- vpaddd t1,hc0,hc0
- # hc1 = [(m[51-54] >> 2) & 0x3ffffff, (m[35-38] >> 2) & 0x3ffffff,
- # (m[19-22] >> 2) & 0x3ffffff, (m[ 3- 6] >> 2) & 0x3ffffff + h1]
- vmovd 0x03(m),hc1x
- vmovd 0x13(m),t1x
- vpunpcklqdq t1,hc1,hc1
- vmovd 0x23(m),t1x
- vmovd 0x33(m),t2x
- vpunpcklqdq t2,t1,t1
- vperm2i128 $0x20,t1,hc1,hc1
- vpsrld $2,hc1,hc1
- vpand ANMASK(%rip),hc1,hc1
- vmovd h1,t1x
- vpaddd t1,hc1,hc1
- # hc2 = [(m[54-57] >> 4) & 0x3ffffff, (m[38-41] >> 4) & 0x3ffffff,
- # (m[22-25] >> 4) & 0x3ffffff, (m[ 6- 9] >> 4) & 0x3ffffff + h2]
- vmovd 0x06(m),hc2x
- vmovd 0x16(m),t1x
- vpunpcklqdq t1,hc2,hc2
- vmovd 0x26(m),t1x
- vmovd 0x36(m),t2x
- vpunpcklqdq t2,t1,t1
- vperm2i128 $0x20,t1,hc2,hc2
- vpsrld $4,hc2,hc2
- vpand ANMASK(%rip),hc2,hc2
- vmovd h2,t1x
- vpaddd t1,hc2,hc2
- # hc3 = [(m[57-60] >> 6) & 0x3ffffff, (m[41-44] >> 6) & 0x3ffffff,
- # (m[25-28] >> 6) & 0x3ffffff, (m[ 9-12] >> 6) & 0x3ffffff + h3]
- vmovd 0x09(m),hc3x
- vmovd 0x19(m),t1x
- vpunpcklqdq t1,hc3,hc3
- vmovd 0x29(m),t1x
- vmovd 0x39(m),t2x
- vpunpcklqdq t2,t1,t1
- vperm2i128 $0x20,t1,hc3,hc3
- vpsrld $6,hc3,hc3
- vpand ANMASK(%rip),hc3,hc3
- vmovd h3,t1x
- vpaddd t1,hc3,hc3
- # hc4 = [(m[60-63] >> 8) | (1<<24), (m[44-47] >> 8) | (1<<24),
- # (m[28-31] >> 8) | (1<<24), (m[12-15] >> 8) | (1<<24) + h4]
- vmovd 0x0c(m),hc4x
- vmovd 0x1c(m),t1x
- vpunpcklqdq t1,hc4,hc4
- vmovd 0x2c(m),t1x
- vmovd 0x3c(m),t2x
- vpunpcklqdq t2,t1,t1
- vperm2i128 $0x20,t1,hc4,hc4
- vpsrld $8,hc4,hc4
- vpor ORMASK(%rip),hc4,hc4
- vmovd h4,t1x
- vpaddd t1,hc4,hc4
-
- # t1 = [ hc0[3] * r0, hc0[2] * u0, hc0[1] * w0, hc0[0] * y0 ]
- vpmuludq hc0,ruwy0,t1
- # t1 += [ hc1[3] * s4, hc1[2] * v4, hc1[1] * x4, hc1[0] * z4 ]
- vpmuludq hc1,svxz4,t2
- vpaddq t2,t1,t1
- # t1 += [ hc2[3] * s3, hc2[2] * v3, hc2[1] * x3, hc2[0] * z3 ]
- vpmuludq hc2,svxz3,t2
- vpaddq t2,t1,t1
- # t1 += [ hc3[3] * s2, hc3[2] * v2, hc3[1] * x2, hc3[0] * z2 ]
- vpmuludq hc3,svxz2,t2
- vpaddq t2,t1,t1
- # t1 += [ hc4[3] * s1, hc4[2] * v1, hc4[1] * x1, hc4[0] * z1 ]
- vpmuludq hc4,svxz1,t2
- vpaddq t2,t1,t1
- # d0 = t1[0] + t1[1] + t[2] + t[3]
- vpermq $0xee,t1,t2
- vpaddq t2,t1,t1
- vpsrldq $8,t1,t2
- vpaddq t2,t1,t1
- vmovq t1x,d0
-
- # t1 = [ hc0[3] * r1, hc0[2] * u1,hc0[1] * w1, hc0[0] * y1 ]
- vpmuludq hc0,ruwy1,t1
- # t1 += [ hc1[3] * r0, hc1[2] * u0, hc1[1] * w0, hc1[0] * y0 ]
- vpmuludq hc1,ruwy0,t2
- vpaddq t2,t1,t1
- # t1 += [ hc2[3] * s4, hc2[2] * v4, hc2[1] * x4, hc2[0] * z4 ]
- vpmuludq hc2,svxz4,t2
- vpaddq t2,t1,t1
- # t1 += [ hc3[3] * s3, hc3[2] * v3, hc3[1] * x3, hc3[0] * z3 ]
- vpmuludq hc3,svxz3,t2
- vpaddq t2,t1,t1
- # t1 += [ hc4[3] * s2, hc4[2] * v2, hc4[1] * x2, hc4[0] * z2 ]
- vpmuludq hc4,svxz2,t2
- vpaddq t2,t1,t1
- # d1 = t1[0] + t1[1] + t1[3] + t1[4]
- vpermq $0xee,t1,t2
- vpaddq t2,t1,t1
- vpsrldq $8,t1,t2
- vpaddq t2,t1,t1
- vmovq t1x,d1
-
- # t1 = [ hc0[3] * r2, hc0[2] * u2, hc0[1] * w2, hc0[0] * y2 ]
- vpmuludq hc0,ruwy2,t1
- # t1 += [ hc1[3] * r1, hc1[2] * u1, hc1[1] * w1, hc1[0] * y1 ]
- vpmuludq hc1,ruwy1,t2
- vpaddq t2,t1,t1
- # t1 += [ hc2[3] * r0, hc2[2] * u0, hc2[1] * w0, hc2[0] * y0 ]
- vpmuludq hc2,ruwy0,t2
- vpaddq t2,t1,t1
- # t1 += [ hc3[3] * s4, hc3[2] * v4, hc3[1] * x4, hc3[0] * z4 ]
- vpmuludq hc3,svxz4,t2
- vpaddq t2,t1,t1
- # t1 += [ hc4[3] * s3, hc4[2] * v3, hc4[1] * x3, hc4[0] * z3 ]
- vpmuludq hc4,svxz3,t2
- vpaddq t2,t1,t1
- # d2 = t1[0] + t1[1] + t1[2] + t1[3]
- vpermq $0xee,t1,t2
- vpaddq t2,t1,t1
- vpsrldq $8,t1,t2
- vpaddq t2,t1,t1
- vmovq t1x,d2
-
- # t1 = [ hc0[3] * r3, hc0[2] * u3, hc0[1] * w3, hc0[0] * y3 ]
- vpmuludq hc0,ruwy3,t1
- # t1 += [ hc1[3] * r2, hc1[2] * u2, hc1[1] * w2, hc1[0] * y2 ]
- vpmuludq hc1,ruwy2,t2
- vpaddq t2,t1,t1
- # t1 += [ hc2[3] * r1, hc2[2] * u1, hc2[1] * w1, hc2[0] * y1 ]
- vpmuludq hc2,ruwy1,t2
- vpaddq t2,t1,t1
- # t1 += [ hc3[3] * r0, hc3[2] * u0, hc3[1] * w0, hc3[0] * y0 ]
- vpmuludq hc3,ruwy0,t2
- vpaddq t2,t1,t1
- # t1 += [ hc4[3] * s4, hc4[2] * v4, hc4[1] * x4, hc4[0] * z4 ]
- vpmuludq hc4,svxz4,t2
- vpaddq t2,t1,t1
- # d3 = t1[0] + t1[1] + t1[2] + t1[3]
- vpermq $0xee,t1,t2
- vpaddq t2,t1,t1
- vpsrldq $8,t1,t2
- vpaddq t2,t1,t1
- vmovq t1x,d3
-
- # t1 = [ hc0[3] * r4, hc0[2] * u4, hc0[1] * w4, hc0[0] * y4 ]
- vpmuludq hc0,ruwy4,t1
- # t1 += [ hc1[3] * r3, hc1[2] * u3, hc1[1] * w3, hc1[0] * y3 ]
- vpmuludq hc1,ruwy3,t2
- vpaddq t2,t1,t1
- # t1 += [ hc2[3] * r2, hc2[2] * u2, hc2[1] * w2, hc2[0] * y2 ]
- vpmuludq hc2,ruwy2,t2
- vpaddq t2,t1,t1
- # t1 += [ hc3[3] * r1, hc3[2] * u1, hc3[1] * w1, hc3[0] * y1 ]
- vpmuludq hc3,ruwy1,t2
- vpaddq t2,t1,t1
- # t1 += [ hc4[3] * r0, hc4[2] * u0, hc4[1] * w0, hc4[0] * y0 ]
- vpmuludq hc4,ruwy0,t2
- vpaddq t2,t1,t1
- # d4 = t1[0] + t1[1] + t1[2] + t1[3]
- vpermq $0xee,t1,t2
- vpaddq t2,t1,t1
- vpsrldq $8,t1,t2
- vpaddq t2,t1,t1
- vmovq t1x,d4
-
- # d1 += d0 >> 26
- mov d0,%rax
- shr $26,%rax
- add %rax,d1
- # h0 = d0 & 0x3ffffff
- mov d0,%rbx
- and $0x3ffffff,%ebx
-
- # d2 += d1 >> 26
- mov d1,%rax
- shr $26,%rax
- add %rax,d2
- # h1 = d1 & 0x3ffffff
- mov d1,%rax
- and $0x3ffffff,%eax
- mov %eax,h1
-
- # d3 += d2 >> 26
- mov d2,%rax
- shr $26,%rax
- add %rax,d3
- # h2 = d2 & 0x3ffffff
- mov d2,%rax
- and $0x3ffffff,%eax
- mov %eax,h2
-
- # d4 += d3 >> 26
- mov d3,%rax
- shr $26,%rax
- add %rax,d4
- # h3 = d3 & 0x3ffffff
- mov d3,%rax
- and $0x3ffffff,%eax
- mov %eax,h3
-
- # h0 += (d4 >> 26) * 5
- mov d4,%rax
- shr $26,%rax
- lea (%eax,%eax,4),%eax
- add %eax,%ebx
- # h4 = d4 & 0x3ffffff
- mov d4,%rax
- and $0x3ffffff,%eax
- mov %eax,h4
-
- # h1 += h0 >> 26
- mov %ebx,%eax
- shr $26,%eax
- add %eax,h1
- # h0 = h0 & 0x3ffffff
- andl $0x3ffffff,%ebx
- mov %ebx,h0
-
- add $0x40,m
- dec %rcx
- jnz .Ldoblock4
-
- vzeroupper
- pop %r13
- pop %r12
- pop %rbx
- ret
-ENDPROC(poly1305_4block_avx2)
diff --git a/arch/x86/crypto/poly1305-sse2-x86_64.S b/arch/x86/crypto/poly1305-sse2-x86_64.S
deleted file mode 100644
index c88c670cb5fc..000000000000
--- a/arch/x86/crypto/poly1305-sse2-x86_64.S
+++ /dev/null
@@ -1,584 +0,0 @@
-/*
- * Poly1305 authenticator algorithm, RFC7539, x64 SSE2 functions
- *
- * Copyright (C) 2015 Martin Willi
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- */
-
-#include <linux/linkage.h>
-
-.section .rodata.cst16.ANMASK, "aM", @progbits, 16
-.align 16
-ANMASK: .octa 0x0000000003ffffff0000000003ffffff
-
-.section .rodata.cst16.ORMASK, "aM", @progbits, 16
-.align 16
-ORMASK: .octa 0x00000000010000000000000001000000
-
-.text
-
-#define h0 0x00(%rdi)
-#define h1 0x04(%rdi)
-#define h2 0x08(%rdi)
-#define h3 0x0c(%rdi)
-#define h4 0x10(%rdi)
-#define r0 0x00(%rdx)
-#define r1 0x04(%rdx)
-#define r2 0x08(%rdx)
-#define r3 0x0c(%rdx)
-#define r4 0x10(%rdx)
-#define s1 0x00(%rsp)
-#define s2 0x04(%rsp)
-#define s3 0x08(%rsp)
-#define s4 0x0c(%rsp)
-#define m %rsi
-#define h01 %xmm0
-#define h23 %xmm1
-#define h44 %xmm2
-#define t1 %xmm3
-#define t2 %xmm4
-#define t3 %xmm5
-#define t4 %xmm6
-#define mask %xmm7
-#define d0 %r8
-#define d1 %r9
-#define d2 %r10
-#define d3 %r11
-#define d4 %r12
-
-ENTRY(poly1305_block_sse2)
- # %rdi: Accumulator h[5]
- # %rsi: 16 byte input block m
- # %rdx: Poly1305 key r[5]
- # %rcx: Block count
-
- # This single block variant tries to improve performance by doing two
- # multiplications in parallel using SSE instructions. There is quite
- # some quardword packing involved, hence the speedup is marginal.
-
- push %rbx
- push %r12
- sub $0x10,%rsp
-
- # s1..s4 = r1..r4 * 5
- mov r1,%eax
- lea (%eax,%eax,4),%eax
- mov %eax,s1
- mov r2,%eax
- lea (%eax,%eax,4),%eax
- mov %eax,s2
- mov r3,%eax
- lea (%eax,%eax,4),%eax
- mov %eax,s3
- mov r4,%eax
- lea (%eax,%eax,4),%eax
- mov %eax,s4
-
- movdqa ANMASK(%rip),mask
-
-.Ldoblock:
- # h01 = [0, h1, 0, h0]
- # h23 = [0, h3, 0, h2]
- # h44 = [0, h4, 0, h4]
- movd h0,h01
- movd h1,t1
- movd h2,h23
- movd h3,t2
- movd h4,h44
- punpcklqdq t1,h01
- punpcklqdq t2,h23
- punpcklqdq h44,h44
-
- # h01 += [ (m[3-6] >> 2) & 0x3ffffff, m[0-3] & 0x3ffffff ]
- movd 0x00(m),t1
- movd 0x03(m),t2
- psrld $2,t2
- punpcklqdq t2,t1
- pand mask,t1
- paddd t1,h01
- # h23 += [ (m[9-12] >> 6) & 0x3ffffff, (m[6-9] >> 4) & 0x3ffffff ]
- movd 0x06(m),t1
- movd 0x09(m),t2
- psrld $4,t1
- psrld $6,t2
- punpcklqdq t2,t1
- pand mask,t1
- paddd t1,h23
- # h44 += [ (m[12-15] >> 8) | (1 << 24), (m[12-15] >> 8) | (1 << 24) ]
- mov 0x0c(m),%eax
- shr $8,%eax
- or $0x01000000,%eax
- movd %eax,t1
- pshufd $0xc4,t1,t1
- paddd t1,h44
-
- # t1[0] = h0 * r0 + h2 * s3
- # t1[1] = h1 * s4 + h3 * s2
- movd r0,t1
- movd s4,t2
- punpcklqdq t2,t1
- pmuludq h01,t1
- movd s3,t2
- movd s2,t3
- punpcklqdq t3,t2
- pmuludq h23,t2
- paddq t2,t1
- # t2[0] = h0 * r1 + h2 * s4
- # t2[1] = h1 * r0 + h3 * s3
- movd r1,t2
- movd r0,t3
- punpcklqdq t3,t2
- pmuludq h01,t2
- movd s4,t3
- movd s3,t4
- punpcklqdq t4,t3
- pmuludq h23,t3
- paddq t3,t2
- # t3[0] = h4 * s1
- # t3[1] = h4 * s2
- movd s1,t3
- movd s2,t4
- punpcklqdq t4,t3
- pmuludq h44,t3
- # d0 = t1[0] + t1[1] + t3[0]
- # d1 = t2[0] + t2[1] + t3[1]
- movdqa t1,t4
- punpcklqdq t2,t4
- punpckhqdq t2,t1
- paddq t4,t1
- paddq t3,t1
- movq t1,d0
- psrldq $8,t1
- movq t1,d1
-
- # t1[0] = h0 * r2 + h2 * r0
- # t1[1] = h1 * r1 + h3 * s4
- movd r2,t1
- movd r1,t2
- punpcklqdq t2,t1
- pmuludq h01,t1
- movd r0,t2
- movd s4,t3
- punpcklqdq t3,t2
- pmuludq h23,t2
- paddq t2,t1
- # t2[0] = h0 * r3 + h2 * r1
- # t2[1] = h1 * r2 + h3 * r0
- movd r3,t2
- movd r2,t3
- punpcklqdq t3,t2
- pmuludq h01,t2
- movd r1,t3
- movd r0,t4
- punpcklqdq t4,t3
- pmuludq h23,t3
- paddq t3,t2
- # t3[0] = h4 * s3
- # t3[1] = h4 * s4
- movd s3,t3
- movd s4,t4
- punpcklqdq t4,t3
- pmuludq h44,t3
- # d2 = t1[0] + t1[1] + t3[0]
- # d3 = t2[0] + t2[1] + t3[1]
- movdqa t1,t4
- punpcklqdq t2,t4
- punpckhqdq t2,t1
- paddq t4,t1
- paddq t3,t1
- movq t1,d2
- psrldq $8,t1
- movq t1,d3
-
- # t1[0] = h0 * r4 + h2 * r2
- # t1[1] = h1 * r3 + h3 * r1
- movd r4,t1
- movd r3,t2
- punpcklqdq t2,t1
- pmuludq h01,t1
- movd r2,t2
- movd r1,t3
- punpcklqdq t3,t2
- pmuludq h23,t2
- paddq t2,t1
- # t3[0] = h4 * r0
- movd r0,t3
- pmuludq h44,t3
- # d4 = t1[0] + t1[1] + t3[0]
- movdqa t1,t4
- psrldq $8,t4
- paddq t4,t1
- paddq t3,t1
- movq t1,d4
-
- # d1 += d0 >> 26
- mov d0,%rax
- shr $26,%rax
- add %rax,d1
- # h0 = d0 & 0x3ffffff
- mov d0,%rbx
- and $0x3ffffff,%ebx
-
- # d2 += d1 >> 26
- mov d1,%rax
- shr $26,%rax
- add %rax,d2
- # h1 = d1 & 0x3ffffff
- mov d1,%rax
- and $0x3ffffff,%eax
- mov %eax,h1
-
- # d3 += d2 >> 26
- mov d2,%rax
- shr $26,%rax
- add %rax,d3
- # h2 = d2 & 0x3ffffff
- mov d2,%rax
- and $0x3ffffff,%eax
- mov %eax,h2
-
- # d4 += d3 >> 26
- mov d3,%rax
- shr $26,%rax
- add %rax,d4
- # h3 = d3 & 0x3ffffff
- mov d3,%rax
- and $0x3ffffff,%eax
- mov %eax,h3
-
- # h0 += (d4 >> 26) * 5
- mov d4,%rax
- shr $26,%rax
- lea (%eax,%eax,4),%eax
- add %eax,%ebx
- # h4 = d4 & 0x3ffffff
- mov d4,%rax
- and $0x3ffffff,%eax
- mov %eax,h4
-
- # h1 += h0 >> 26
- mov %ebx,%eax
- shr $26,%eax
- add %eax,h1
- # h0 = h0 & 0x3ffffff
- andl $0x3ffffff,%ebx
- mov %ebx,h0
-
- add $0x10,m
- dec %rcx
- jnz .Ldoblock
-
- add $0x10,%rsp
- pop %r12
- pop %rbx
- ret
-ENDPROC(poly1305_block_sse2)
-
-
-#define u0 0x00(%r8)
-#define u1 0x04(%r8)
-#define u2 0x08(%r8)
-#define u3 0x0c(%r8)
-#define u4 0x10(%r8)
-#define hc0 %xmm0
-#define hc1 %xmm1
-#define hc2 %xmm2
-#define hc3 %xmm5
-#define hc4 %xmm6
-#define ru0 %xmm7
-#define ru1 %xmm8
-#define ru2 %xmm9
-#define ru3 %xmm10
-#define ru4 %xmm11
-#define sv1 %xmm12
-#define sv2 %xmm13
-#define sv3 %xmm14
-#define sv4 %xmm15
-#undef d0
-#define d0 %r13
-
-ENTRY(poly1305_2block_sse2)
- # %rdi: Accumulator h[5]
- # %rsi: 16 byte input block m
- # %rdx: Poly1305 key r[5]
- # %rcx: Doubleblock count
- # %r8: Poly1305 derived key r^2 u[5]
-
- # This two-block variant further improves performance by using loop
- # unrolled block processing. This is more straight forward and does
- # less byte shuffling, but requires a second Poly1305 key r^2:
- # h = (h + m) * r => h = (h + m1) * r^2 + m2 * r
-
- push %rbx
- push %r12
- push %r13
-
- # combine r0,u0
- movd u0,ru0
- movd r0,t1
- punpcklqdq t1,ru0
-
- # combine r1,u1 and s1=r1*5,v1=u1*5
- movd u1,ru1
- movd r1,t1
- punpcklqdq t1,ru1
- movdqa ru1,sv1
- pslld $2,sv1
- paddd ru1,sv1
-
- # combine r2,u2 and s2=r2*5,v2=u2*5
- movd u2,ru2
- movd r2,t1
- punpcklqdq t1,ru2
- movdqa ru2,sv2
- pslld $2,sv2
- paddd ru2,sv2
-
- # combine r3,u3 and s3=r3*5,v3=u3*5
- movd u3,ru3
- movd r3,t1
- punpcklqdq t1,ru3
- movdqa ru3,sv3
- pslld $2,sv3
- paddd ru3,sv3
-
- # combine r4,u4 and s4=r4*5,v4=u4*5
- movd u4,ru4
- movd r4,t1
- punpcklqdq t1,ru4
- movdqa ru4,sv4
- pslld $2,sv4
- paddd ru4,sv4
-
-.Ldoblock2:
- # hc0 = [ m[16-19] & 0x3ffffff, h0 + m[0-3] & 0x3ffffff ]
- movd 0x00(m),hc0
- movd 0x10(m),t1
- punpcklqdq t1,hc0
- pand ANMASK(%rip),hc0
- movd h0,t1
- paddd t1,hc0
- # hc1 = [ (m[19-22] >> 2) & 0x3ffffff, h1 + (m[3-6] >> 2) & 0x3ffffff ]
- movd 0x03(m),hc1
- movd 0x13(m),t1
- punpcklqdq t1,hc1
- psrld $2,hc1
- pand ANMASK(%rip),hc1
- movd h1,t1
- paddd t1,hc1
- # hc2 = [ (m[22-25] >> 4) & 0x3ffffff, h2 + (m[6-9] >> 4) & 0x3ffffff ]
- movd 0x06(m),hc2
- movd 0x16(m),t1
- punpcklqdq t1,hc2
- psrld $4,hc2
- pand ANMASK(%rip),hc2
- movd h2,t1
- paddd t1,hc2
- # hc3 = [ (m[25-28] >> 6) & 0x3ffffff, h3 + (m[9-12] >> 6) & 0x3ffffff ]
- movd 0x09(m),hc3
- movd 0x19(m),t1
- punpcklqdq t1,hc3
- psrld $6,hc3
- pand ANMASK(%rip),hc3
- movd h3,t1
- paddd t1,hc3
- # hc4 = [ (m[28-31] >> 8) | (1<<24), h4 + (m[12-15] >> 8) | (1<<24) ]
- movd 0x0c(m),hc4
- movd 0x1c(m),t1
- punpcklqdq t1,hc4
- psrld $8,hc4
- por ORMASK(%rip),hc4
- movd h4,t1
- paddd t1,hc4
-
- # t1 = [ hc0[1] * r0, hc0[0] * u0 ]
- movdqa ru0,t1
- pmuludq hc0,t1
- # t1 += [ hc1[1] * s4, hc1[0] * v4 ]
- movdqa sv4,t2
- pmuludq hc1,t2
- paddq t2,t1
- # t1 += [ hc2[1] * s3, hc2[0] * v3 ]
- movdqa sv3,t2
- pmuludq hc2,t2
- paddq t2,t1
- # t1 += [ hc3[1] * s2, hc3[0] * v2 ]
- movdqa sv2,t2
- pmuludq hc3,t2
- paddq t2,t1
- # t1 += [ hc4[1] * s1, hc4[0] * v1 ]
- movdqa sv1,t2
- pmuludq hc4,t2
- paddq t2,t1
- # d0 = t1[0] + t1[1]
- movdqa t1,t2
- psrldq $8,t2
- paddq t2,t1
- movq t1,d0
-
- # t1 = [ hc0[1] * r1, hc0[0] * u1 ]
- movdqa ru1,t1
- pmuludq hc0,t1
- # t1 += [ hc1[1] * r0, hc1[0] * u0 ]
- movdqa ru0,t2
- pmuludq hc1,t2
- paddq t2,t1
- # t1 += [ hc2[1] * s4, hc2[0] * v4 ]
- movdqa sv4,t2
- pmuludq hc2,t2
- paddq t2,t1
- # t1 += [ hc3[1] * s3, hc3[0] * v3 ]
- movdqa sv3,t2
- pmuludq hc3,t2
- paddq t2,t1
- # t1 += [ hc4[1] * s2, hc4[0] * v2 ]
- movdqa sv2,t2
- pmuludq hc4,t2
- paddq t2,t1
- # d1 = t1[0] + t1[1]
- movdqa t1,t2
- psrldq $8,t2
- paddq t2,t1
- movq t1,d1
-
- # t1 = [ hc0[1] * r2, hc0[0] * u2 ]
- movdqa ru2,t1
- pmuludq hc0,t1
- # t1 += [ hc1[1] * r1, hc1[0] * u1 ]
- movdqa ru1,t2
- pmuludq hc1,t2
- paddq t2,t1
- # t1 += [ hc2[1] * r0, hc2[0] * u0 ]
- movdqa ru0,t2
- pmuludq hc2,t2
- paddq t2,t1
- # t1 += [ hc3[1] * s4, hc3[0] * v4 ]
- movdqa sv4,t2
- pmuludq hc3,t2
- paddq t2,t1
- # t1 += [ hc4[1] * s3, hc4[0] * v3 ]
- movdqa sv3,t2
- pmuludq hc4,t2
- paddq t2,t1
- # d2 = t1[0] + t1[1]
- movdqa t1,t2
- psrldq $8,t2
- paddq t2,t1
- movq t1,d2
-
- # t1 = [ hc0[1] * r3, hc0[0] * u3 ]
- movdqa ru3,t1
- pmuludq hc0,t1
- # t1 += [ hc1[1] * r2, hc1[0] * u2 ]
- movdqa ru2,t2
- pmuludq hc1,t2
- paddq t2,t1
- # t1 += [ hc2[1] * r1, hc2[0] * u1 ]
- movdqa ru1,t2
- pmuludq hc2,t2
- paddq t2,t1
- # t1 += [ hc3[1] * r0, hc3[0] * u0 ]
- movdqa ru0,t2
- pmuludq hc3,t2
- paddq t2,t1
- # t1 += [ hc4[1] * s4, hc4[0] * v4 ]
- movdqa sv4,t2
- pmuludq hc4,t2
- paddq t2,t1
- # d3 = t1[0] + t1[1]
- movdqa t1,t2
- psrldq $8,t2
- paddq t2,t1
- movq t1,d3
-
- # t1 = [ hc0[1] * r4, hc0[0] * u4 ]
- movdqa ru4,t1
- pmuludq hc0,t1
- # t1 += [ hc1[1] * r3, hc1[0] * u3 ]
- movdqa ru3,t2
- pmuludq hc1,t2
- paddq t2,t1
- # t1 += [ hc2[1] * r2, hc2[0] * u2 ]
- movdqa ru2,t2
- pmuludq hc2,t2
- paddq t2,t1
- # t1 += [ hc3[1] * r1, hc3[0] * u1 ]
- movdqa ru1,t2
- pmuludq hc3,t2
- paddq t2,t1
- # t1 += [ hc4[1] * r0, hc4[0] * u0 ]
- movdqa ru0,t2
- pmuludq hc4,t2
- paddq t2,t1
- # d4 = t1[0] + t1[1]
- movdqa t1,t2
- psrldq $8,t2
- paddq t2,t1
- movq t1,d4
-
- # d1 += d0 >> 26
- mov d0,%rax
- shr $26,%rax
- add %rax,d1
- # h0 = d0 & 0x3ffffff
- mov d0,%rbx
- and $0x3ffffff,%ebx
-
- # d2 += d1 >> 26
- mov d1,%rax
- shr $26,%rax
- add %rax,d2
- # h1 = d1 & 0x3ffffff
- mov d1,%rax
- and $0x3ffffff,%eax
- mov %eax,h1
-
- # d3 += d2 >> 26
- mov d2,%rax
- shr $26,%rax
- add %rax,d3
- # h2 = d2 & 0x3ffffff
- mov d2,%rax
- and $0x3ffffff,%eax
- mov %eax,h2
-
- # d4 += d3 >> 26
- mov d3,%rax
- shr $26,%rax
- add %rax,d4
- # h3 = d3 & 0x3ffffff
- mov d3,%rax
- and $0x3ffffff,%eax
- mov %eax,h3
-
- # h0 += (d4 >> 26) * 5
- mov d4,%rax
- shr $26,%rax
- lea (%eax,%eax,4),%eax
- add %eax,%ebx
- # h4 = d4 & 0x3ffffff
- mov d4,%rax
- and $0x3ffffff,%eax
- mov %eax,h4
-
- # h1 += h0 >> 26
- mov %ebx,%eax
- shr $26,%eax
- add %eax,h1
- # h0 = h0 & 0x3ffffff
- andl $0x3ffffff,%ebx
- mov %ebx,h0
-
- add $0x20,m
- dec %rcx
- jnz .Ldoblock2
-
- pop %r13
- pop %r12
- pop %rbx
- ret
-ENDPROC(poly1305_2block_sse2)
diff --git a/arch/x86/crypto/poly1305_glue.c b/arch/x86/crypto/poly1305_glue.c
deleted file mode 100644
index f012b7e28ad1..000000000000
--- a/arch/x86/crypto/poly1305_glue.c
+++ /dev/null
@@ -1,205 +0,0 @@
-/*
- * Poly1305 authenticator algorithm, RFC7539, SIMD glue code
- *
- * Copyright (C) 2015 Martin Willi
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- */
-
-#include <crypto/algapi.h>
-#include <crypto/internal/hash.h>
-#include <crypto/poly1305.h>
-#include <linux/crypto.h>
-#include <linux/kernel.h>
-#include <linux/module.h>
-#include <asm/fpu/api.h>
-#include <asm/simd.h>
-
-struct poly1305_simd_desc_ctx {
- struct poly1305_desc_ctx base;
- /* derived key u set? */
- bool uset;
-#ifdef CONFIG_AS_AVX2
- /* derived keys r^3, r^4 set? */
- bool wset;
-#endif
- /* derived Poly1305 key r^2 */
- u32 u[5];
- /* ... silently appended r^3 and r^4 when using AVX2 */
-};
-
-asmlinkage void poly1305_block_sse2(u32 *h, const u8 *src,
- const u32 *r, unsigned int blocks);
-asmlinkage void poly1305_2block_sse2(u32 *h, const u8 *src, const u32 *r,
- unsigned int blocks, const u32 *u);
-#ifdef CONFIG_AS_AVX2
-asmlinkage void poly1305_4block_avx2(u32 *h, const u8 *src, const u32 *r,
- unsigned int blocks, const u32 *u);
-static bool poly1305_use_avx2;
-#endif
-
-static int poly1305_simd_init(struct shash_desc *desc)
-{
- struct poly1305_simd_desc_ctx *sctx = shash_desc_ctx(desc);
-
- sctx->uset = false;
-#ifdef CONFIG_AS_AVX2
- sctx->wset = false;
-#endif
-
- return crypto_poly1305_init(desc);
-}
-
-static void poly1305_simd_mult(u32 *a, const u32 *b)
-{
- u8 m[POLY1305_BLOCK_SIZE];
-
- memset(m, 0, sizeof(m));
- /* The poly1305 block function adds a hi-bit to the accumulator which
- * we don't need for key multiplication; compensate for it. */
- a[4] -= 1 << 24;
- poly1305_block_sse2(a, m, b, 1);
-}
-
-static unsigned int poly1305_simd_blocks(struct poly1305_desc_ctx *dctx,
- const u8 *src, unsigned int srclen)
-{
- struct poly1305_simd_desc_ctx *sctx;
- unsigned int blocks, datalen;
-
- BUILD_BUG_ON(offsetof(struct poly1305_simd_desc_ctx, base));
- sctx = container_of(dctx, struct poly1305_simd_desc_ctx, base);
-
- if (unlikely(!dctx->sset)) {
- datalen = crypto_poly1305_setdesckey(dctx, src, srclen);
- src += srclen - datalen;
- srclen = datalen;
- }
-
-#ifdef CONFIG_AS_AVX2
- if (poly1305_use_avx2 && srclen >= POLY1305_BLOCK_SIZE * 4) {
- if (unlikely(!sctx->wset)) {
- if (!sctx->uset) {
- memcpy(sctx->u, dctx->r, sizeof(sctx->u));
- poly1305_simd_mult(sctx->u, dctx->r);
- sctx->uset = true;
- }
- memcpy(sctx->u + 5, sctx->u, sizeof(sctx->u));
- poly1305_simd_mult(sctx->u + 5, dctx->r);
- memcpy(sctx->u + 10, sctx->u + 5, sizeof(sctx->u));
- poly1305_simd_mult(sctx->u + 10, dctx->r);
- sctx->wset = true;
- }
- blocks = srclen / (POLY1305_BLOCK_SIZE * 4);
- poly1305_4block_avx2(dctx->h, src, dctx->r, blocks, sctx->u);
- src += POLY1305_BLOCK_SIZE * 4 * blocks;
- srclen -= POLY1305_BLOCK_SIZE * 4 * blocks;
- }
-#endif
- if (likely(srclen >= POLY1305_BLOCK_SIZE * 2)) {
- if (unlikely(!sctx->uset)) {
- memcpy(sctx->u, dctx->r, sizeof(sctx->u));
- poly1305_simd_mult(sctx->u, dctx->r);
- sctx->uset = true;
- }
- blocks = srclen / (POLY1305_BLOCK_SIZE * 2);
- poly1305_2block_sse2(dctx->h, src, dctx->r, blocks, sctx->u);
- src += POLY1305_BLOCK_SIZE * 2 * blocks;
- srclen -= POLY1305_BLOCK_SIZE * 2 * blocks;
- }
- if (srclen >= POLY1305_BLOCK_SIZE) {
- poly1305_block_sse2(dctx->h, src, dctx->r, 1);
- srclen -= POLY1305_BLOCK_SIZE;
- }
- return srclen;
-}
-
-static int poly1305_simd_update(struct shash_desc *desc,
- const u8 *src, unsigned int srclen)
-{
- struct poly1305_desc_ctx *dctx = shash_desc_ctx(desc);
- unsigned int bytes;
-
- /* kernel_fpu_begin/end is costly, use fallback for small updates */
- if (srclen <= 288 || !may_use_simd())
- return crypto_poly1305_update(desc, src, srclen);
-
- kernel_fpu_begin();
-
- if (unlikely(dctx->buflen)) {
- bytes = min(srclen, POLY1305_BLOCK_SIZE - dctx->buflen);
- memcpy(dctx->buf + dctx->buflen, src, bytes);
- src += bytes;
- srclen -= bytes;
- dctx->buflen += bytes;
-
- if (dctx->buflen == POLY1305_BLOCK_SIZE) {
- poly1305_simd_blocks(dctx, dctx->buf,
- POLY1305_BLOCK_SIZE);
- dctx->buflen = 0;
- }
- }
-
- if (likely(srclen >= POLY1305_BLOCK_SIZE)) {
- bytes = poly1305_simd_blocks(dctx, src, srclen);
- src += srclen - bytes;
- srclen = bytes;
- }
-
- kernel_fpu_end();
-
- if (unlikely(srclen)) {
- dctx->buflen = srclen;
- memcpy(dctx->buf, src, srclen);
- }
-
- return 0;
-}
-
-static struct shash_alg alg = {
- .digestsize = POLY1305_DIGEST_SIZE,
- .init = poly1305_simd_init,
- .update = poly1305_simd_update,
- .final = crypto_poly1305_final,
- .descsize = sizeof(struct poly1305_simd_desc_ctx),
- .base = {
- .cra_name = "poly1305",
- .cra_driver_name = "poly1305-simd",
- .cra_priority = 300,
- .cra_blocksize = POLY1305_BLOCK_SIZE,
- .cra_module = THIS_MODULE,
- },
-};
-
-static int __init poly1305_simd_mod_init(void)
-{
- if (!boot_cpu_has(X86_FEATURE_XMM2))
- return -ENODEV;
-
-#ifdef CONFIG_AS_AVX2
- poly1305_use_avx2 = boot_cpu_has(X86_FEATURE_AVX) &&
- boot_cpu_has(X86_FEATURE_AVX2) &&
- cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL);
- alg.descsize = sizeof(struct poly1305_simd_desc_ctx);
- if (poly1305_use_avx2)
- alg.descsize += 10 * sizeof(u32);
-#endif
- return crypto_register_shash(&alg);
-}
-
-static void __exit poly1305_simd_mod_exit(void)
-{
- crypto_unregister_shash(&alg);
-}
-
-module_init(poly1305_simd_mod_init);
-module_exit(poly1305_simd_mod_exit);
-
-MODULE_LICENSE("GPL");
-MODULE_AUTHOR("Martin Willi <martin@strongswan.org>");
-MODULE_DESCRIPTION("Poly1305 authenticator");
-MODULE_ALIAS_CRYPTO("poly1305");
-MODULE_ALIAS_CRYPTO("poly1305-simd");
diff --git a/crypto/Kconfig b/crypto/Kconfig
index f3e40ac56d93..47859a0f8052 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -656,24 +656,13 @@ config CRYPTO_GHASH
config CRYPTO_POLY1305
tristate "Poly1305 authenticator algorithm"
select CRYPTO_HASH
+ select ZINC_POLY1305
help
Poly1305 authenticator algorithm, RFC7539.
Poly1305 is an authenticator algorithm designed by Daniel J. Bernstein.
It is used for the ChaCha20-Poly1305 AEAD, specified in RFC7539 for use
- in IETF protocols. This is the portable C implementation of Poly1305.
-
-config CRYPTO_POLY1305_X86_64
- tristate "Poly1305 authenticator algorithm (x86_64/SSE2/AVX2)"
- depends on X86 && 64BIT
- select CRYPTO_POLY1305
- help
- Poly1305 authenticator algorithm, RFC7539.
-
- Poly1305 is an authenticator algorithm designed by Daniel J. Bernstein.
- It is used for the ChaCha20-Poly1305 AEAD, specified in RFC7539 for use
- in IETF protocols. This is the x86_64 assembler implementation using SIMD
- instructions.
+ in IETF protocols.
config CRYPTO_MD4
tristate "MD4 digest algorithm"
diff --git a/crypto/Makefile b/crypto/Makefile
index 6d1d40eeb964..5e60348d02e2 100644
--- a/crypto/Makefile
+++ b/crypto/Makefile
@@ -118,7 +118,7 @@ obj-$(CONFIG_CRYPTO_SEED) += seed.o
obj-$(CONFIG_CRYPTO_SPECK) += speck.o
obj-$(CONFIG_CRYPTO_SALSA20) += salsa20_generic.o
obj-$(CONFIG_CRYPTO_CHACHA20) += chacha20_generic.o
-obj-$(CONFIG_CRYPTO_POLY1305) += poly1305_generic.o
+obj-$(CONFIG_CRYPTO_POLY1305) += poly1305_zinc.o
obj-$(CONFIG_CRYPTO_DEFLATE) += deflate.o
obj-$(CONFIG_CRYPTO_MICHAEL_MIC) += michael_mic.o
obj-$(CONFIG_CRYPTO_CRC32C) += crc32c_generic.o
diff --git a/crypto/chacha20poly1305.c b/crypto/chacha20poly1305.c
index 600afa99941f..bf523797bef3 100644
--- a/crypto/chacha20poly1305.c
+++ b/crypto/chacha20poly1305.c
@@ -14,7 +14,7 @@
#include <crypto/internal/skcipher.h>
#include <crypto/scatterwalk.h>
#include <crypto/chacha20.h>
-#include <crypto/poly1305.h>
+#include <zinc/poly1305.h>
#include <linux/err.h>
#include <linux/init.h>
#include <linux/kernel.h>
@@ -62,7 +62,7 @@ struct chachapoly_req_ctx {
/* the key we generate for Poly1305 using Chacha20 */
u8 key[POLY1305_KEY_SIZE];
/* calculated Poly1305 tag */
- u8 tag[POLY1305_DIGEST_SIZE];
+ u8 tag[POLY1305_MAC_SIZE];
/* length of data to en/decrypt, without ICV */
unsigned int cryptlen;
/* Actual AD, excluding IV */
@@ -471,7 +471,7 @@ static int chachapoly_decrypt(struct aead_request *req)
{
struct chachapoly_req_ctx *rctx = aead_request_ctx(req);
- rctx->cryptlen = req->cryptlen - POLY1305_DIGEST_SIZE;
+ rctx->cryptlen = req->cryptlen - POLY1305_MAC_SIZE;
/* decrypt call chain:
* - poly_genkey/done()
@@ -513,7 +513,7 @@ static int chachapoly_setkey(struct crypto_aead *aead, const u8 *key,
static int chachapoly_setauthsize(struct crypto_aead *tfm,
unsigned int authsize)
{
- if (authsize != POLY1305_DIGEST_SIZE)
+ if (authsize != POLY1305_MAC_SIZE)
return -EINVAL;
return 0;
@@ -613,7 +613,7 @@ static int chachapoly_create(struct crypto_template *tmpl, struct rtattr **tb,
poly_hash = __crypto_hash_alg_common(poly);
err = -EINVAL;
- if (poly_hash->digestsize != POLY1305_DIGEST_SIZE)
+ if (poly_hash->digestsize != POLY1305_MAC_SIZE)
goto out_put_poly;
err = -ENOMEM;
@@ -666,7 +666,7 @@ static int chachapoly_create(struct crypto_template *tmpl, struct rtattr **tb,
ctx->saltlen;
inst->alg.ivsize = ivsize;
inst->alg.chunksize = crypto_skcipher_alg_chunksize(chacha);
- inst->alg.maxauthsize = POLY1305_DIGEST_SIZE;
+ inst->alg.maxauthsize = POLY1305_MAC_SIZE;
inst->alg.init = chachapoly_init;
inst->alg.exit = chachapoly_exit;
inst->alg.encrypt = chachapoly_encrypt;
diff --git a/crypto/poly1305_generic.c b/crypto/poly1305_generic.c
deleted file mode 100644
index 47d3a6b83931..000000000000
--- a/crypto/poly1305_generic.c
+++ /dev/null
@@ -1,304 +0,0 @@
-/*
- * Poly1305 authenticator algorithm, RFC7539
- *
- * Copyright (C) 2015 Martin Willi
- *
- * Based on public domain code by Andrew Moon and Daniel J. Bernstein.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- */
-
-#include <crypto/algapi.h>
-#include <crypto/internal/hash.h>
-#include <crypto/poly1305.h>
-#include <linux/crypto.h>
-#include <linux/kernel.h>
-#include <linux/module.h>
-#include <asm/unaligned.h>
-
-static inline u64 mlt(u64 a, u64 b)
-{
- return a * b;
-}
-
-static inline u32 sr(u64 v, u_char n)
-{
- return v >> n;
-}
-
-static inline u32 and(u32 v, u32 mask)
-{
- return v & mask;
-}
-
-int crypto_poly1305_init(struct shash_desc *desc)
-{
- struct poly1305_desc_ctx *dctx = shash_desc_ctx(desc);
-
- memset(dctx->h, 0, sizeof(dctx->h));
- dctx->buflen = 0;
- dctx->rset = false;
- dctx->sset = false;
-
- return 0;
-}
-EXPORT_SYMBOL_GPL(crypto_poly1305_init);
-
-static void poly1305_setrkey(struct poly1305_desc_ctx *dctx, const u8 *key)
-{
- /* r &= 0xffffffc0ffffffc0ffffffc0fffffff */
- dctx->r[0] = (get_unaligned_le32(key + 0) >> 0) & 0x3ffffff;
- dctx->r[1] = (get_unaligned_le32(key + 3) >> 2) & 0x3ffff03;
- dctx->r[2] = (get_unaligned_le32(key + 6) >> 4) & 0x3ffc0ff;
- dctx->r[3] = (get_unaligned_le32(key + 9) >> 6) & 0x3f03fff;
- dctx->r[4] = (get_unaligned_le32(key + 12) >> 8) & 0x00fffff;
-}
-
-static void poly1305_setskey(struct poly1305_desc_ctx *dctx, const u8 *key)
-{
- dctx->s[0] = get_unaligned_le32(key + 0);
- dctx->s[1] = get_unaligned_le32(key + 4);
- dctx->s[2] = get_unaligned_le32(key + 8);
- dctx->s[3] = get_unaligned_le32(key + 12);
-}
-
-/*
- * Poly1305 requires a unique key for each tag, which implies that we can't set
- * it on the tfm that gets accessed by multiple users simultaneously. Instead we
- * expect the key as the first 32 bytes in the update() call.
- */
-unsigned int crypto_poly1305_setdesckey(struct poly1305_desc_ctx *dctx,
- const u8 *src, unsigned int srclen)
-{
- if (!dctx->sset) {
- if (!dctx->rset && srclen >= POLY1305_BLOCK_SIZE) {
- poly1305_setrkey(dctx, src);
- src += POLY1305_BLOCK_SIZE;
- srclen -= POLY1305_BLOCK_SIZE;
- dctx->rset = true;
- }
- if (srclen >= POLY1305_BLOCK_SIZE) {
- poly1305_setskey(dctx, src);
- src += POLY1305_BLOCK_SIZE;
- srclen -= POLY1305_BLOCK_SIZE;
- dctx->sset = true;
- }
- }
- return srclen;
-}
-EXPORT_SYMBOL_GPL(crypto_poly1305_setdesckey);
-
-static unsigned int poly1305_blocks(struct poly1305_desc_ctx *dctx,
- const u8 *src, unsigned int srclen,
- u32 hibit)
-{
- u32 r0, r1, r2, r3, r4;
- u32 s1, s2, s3, s4;
- u32 h0, h1, h2, h3, h4;
- u64 d0, d1, d2, d3, d4;
- unsigned int datalen;
-
- if (unlikely(!dctx->sset)) {
- datalen = crypto_poly1305_setdesckey(dctx, src, srclen);
- src += srclen - datalen;
- srclen = datalen;
- }
-
- r0 = dctx->r[0];
- r1 = dctx->r[1];
- r2 = dctx->r[2];
- r3 = dctx->r[3];
- r4 = dctx->r[4];
-
- s1 = r1 * 5;
- s2 = r2 * 5;
- s3 = r3 * 5;
- s4 = r4 * 5;
-
- h0 = dctx->h[0];
- h1 = dctx->h[1];
- h2 = dctx->h[2];
- h3 = dctx->h[3];
- h4 = dctx->h[4];
-
- while (likely(srclen >= POLY1305_BLOCK_SIZE)) {
-
- /* h += m[i] */
- h0 += (get_unaligned_le32(src + 0) >> 0) & 0x3ffffff;
- h1 += (get_unaligned_le32(src + 3) >> 2) & 0x3ffffff;
- h2 += (get_unaligned_le32(src + 6) >> 4) & 0x3ffffff;
- h3 += (get_unaligned_le32(src + 9) >> 6) & 0x3ffffff;
- h4 += (get_unaligned_le32(src + 12) >> 8) | hibit;
-
- /* h *= r */
- d0 = mlt(h0, r0) + mlt(h1, s4) + mlt(h2, s3) +
- mlt(h3, s2) + mlt(h4, s1);
- d1 = mlt(h0, r1) + mlt(h1, r0) + mlt(h2, s4) +
- mlt(h3, s3) + mlt(h4, s2);
- d2 = mlt(h0, r2) + mlt(h1, r1) + mlt(h2, r0) +
- mlt(h3, s4) + mlt(h4, s3);
- d3 = mlt(h0, r3) + mlt(h1, r2) + mlt(h2, r1) +
- mlt(h3, r0) + mlt(h4, s4);
- d4 = mlt(h0, r4) + mlt(h1, r3) + mlt(h2, r2) +
- mlt(h3, r1) + mlt(h4, r0);
-
- /* (partial) h %= p */
- d1 += sr(d0, 26); h0 = and(d0, 0x3ffffff);
- d2 += sr(d1, 26); h1 = and(d1, 0x3ffffff);
- d3 += sr(d2, 26); h2 = and(d2, 0x3ffffff);
- d4 += sr(d3, 26); h3 = and(d3, 0x3ffffff);
- h0 += sr(d4, 26) * 5; h4 = and(d4, 0x3ffffff);
- h1 += h0 >> 26; h0 = h0 & 0x3ffffff;
-
- src += POLY1305_BLOCK_SIZE;
- srclen -= POLY1305_BLOCK_SIZE;
- }
-
- dctx->h[0] = h0;
- dctx->h[1] = h1;
- dctx->h[2] = h2;
- dctx->h[3] = h3;
- dctx->h[4] = h4;
-
- return srclen;
-}
-
-int crypto_poly1305_update(struct shash_desc *desc,
- const u8 *src, unsigned int srclen)
-{
- struct poly1305_desc_ctx *dctx = shash_desc_ctx(desc);
- unsigned int bytes;
-
- if (unlikely(dctx->buflen)) {
- bytes = min(srclen, POLY1305_BLOCK_SIZE - dctx->buflen);
- memcpy(dctx->buf + dctx->buflen, src, bytes);
- src += bytes;
- srclen -= bytes;
- dctx->buflen += bytes;
-
- if (dctx->buflen == POLY1305_BLOCK_SIZE) {
- poly1305_blocks(dctx, dctx->buf,
- POLY1305_BLOCK_SIZE, 1 << 24);
- dctx->buflen = 0;
- }
- }
-
- if (likely(srclen >= POLY1305_BLOCK_SIZE)) {
- bytes = poly1305_blocks(dctx, src, srclen, 1 << 24);
- src += srclen - bytes;
- srclen = bytes;
- }
-
- if (unlikely(srclen)) {
- dctx->buflen = srclen;
- memcpy(dctx->buf, src, srclen);
- }
-
- return 0;
-}
-EXPORT_SYMBOL_GPL(crypto_poly1305_update);
-
-int crypto_poly1305_final(struct shash_desc *desc, u8 *dst)
-{
- struct poly1305_desc_ctx *dctx = shash_desc_ctx(desc);
- u32 h0, h1, h2, h3, h4;
- u32 g0, g1, g2, g3, g4;
- u32 mask;
- u64 f = 0;
-
- if (unlikely(!dctx->sset))
- return -ENOKEY;
-
- if (unlikely(dctx->buflen)) {
- dctx->buf[dctx->buflen++] = 1;
- memset(dctx->buf + dctx->buflen, 0,
- POLY1305_BLOCK_SIZE - dctx->buflen);
- poly1305_blocks(dctx, dctx->buf, POLY1305_BLOCK_SIZE, 0);
- }
-
- /* fully carry h */
- h0 = dctx->h[0];
- h1 = dctx->h[1];
- h2 = dctx->h[2];
- h3 = dctx->h[3];
- h4 = dctx->h[4];
-
- h2 += (h1 >> 26); h1 = h1 & 0x3ffffff;
- h3 += (h2 >> 26); h2 = h2 & 0x3ffffff;
- h4 += (h3 >> 26); h3 = h3 & 0x3ffffff;
- h0 += (h4 >> 26) * 5; h4 = h4 & 0x3ffffff;
- h1 += (h0 >> 26); h0 = h0 & 0x3ffffff;
-
- /* compute h + -p */
- g0 = h0 + 5;
- g1 = h1 + (g0 >> 26); g0 &= 0x3ffffff;
- g2 = h2 + (g1 >> 26); g1 &= 0x3ffffff;
- g3 = h3 + (g2 >> 26); g2 &= 0x3ffffff;
- g4 = h4 + (g3 >> 26) - (1 << 26); g3 &= 0x3ffffff;
-
- /* select h if h < p, or h + -p if h >= p */
- mask = (g4 >> ((sizeof(u32) * 8) - 1)) - 1;
- g0 &= mask;
- g1 &= mask;
- g2 &= mask;
- g3 &= mask;
- g4 &= mask;
- mask = ~mask;
- h0 = (h0 & mask) | g0;
- h1 = (h1 & mask) | g1;
- h2 = (h2 & mask) | g2;
- h3 = (h3 & mask) | g3;
- h4 = (h4 & mask) | g4;
-
- /* h = h % (2^128) */
- h0 = (h0 >> 0) | (h1 << 26);
- h1 = (h1 >> 6) | (h2 << 20);
- h2 = (h2 >> 12) | (h3 << 14);
- h3 = (h3 >> 18) | (h4 << 8);
-
- /* mac = (h + s) % (2^128) */
- f = (f >> 32) + h0 + dctx->s[0]; put_unaligned_le32(f, dst + 0);
- f = (f >> 32) + h1 + dctx->s[1]; put_unaligned_le32(f, dst + 4);
- f = (f >> 32) + h2 + dctx->s[2]; put_unaligned_le32(f, dst + 8);
- f = (f >> 32) + h3 + dctx->s[3]; put_unaligned_le32(f, dst + 12);
-
- return 0;
-}
-EXPORT_SYMBOL_GPL(crypto_poly1305_final);
-
-static struct shash_alg poly1305_alg = {
- .digestsize = POLY1305_DIGEST_SIZE,
- .init = crypto_poly1305_init,
- .update = crypto_poly1305_update,
- .final = crypto_poly1305_final,
- .descsize = sizeof(struct poly1305_desc_ctx),
- .base = {
- .cra_name = "poly1305",
- .cra_driver_name = "poly1305-generic",
- .cra_priority = 100,
- .cra_blocksize = POLY1305_BLOCK_SIZE,
- .cra_module = THIS_MODULE,
- },
-};
-
-static int __init poly1305_mod_init(void)
-{
- return crypto_register_shash(&poly1305_alg);
-}
-
-static void __exit poly1305_mod_exit(void)
-{
- crypto_unregister_shash(&poly1305_alg);
-}
-
-module_init(poly1305_mod_init);
-module_exit(poly1305_mod_exit);
-
-MODULE_LICENSE("GPL");
-MODULE_AUTHOR("Martin Willi <martin@strongswan.org>");
-MODULE_DESCRIPTION("Poly1305 authenticator");
-MODULE_ALIAS_CRYPTO("poly1305");
-MODULE_ALIAS_CRYPTO("poly1305-generic");
diff --git a/crypto/poly1305_zinc.c b/crypto/poly1305_zinc.c
new file mode 100644
index 000000000000..4794442edf26
--- /dev/null
+++ b/crypto/poly1305_zinc.c
@@ -0,0 +1,98 @@
+/* SPDX-License-Identifier: GPL-2.0
+ *
+ * Copyright (C) 2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#include <crypto/algapi.h>
+#include <crypto/internal/hash.h>
+#include <zinc/poly1305.h>
+#include <linux/crypto.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/simd.h>
+
+struct poly1305_desc_ctx {
+ struct poly1305_ctx ctx;
+ u8 key[POLY1305_KEY_SIZE];
+ unsigned int rem_key_bytes;
+};
+
+static int crypto_poly1305_init(struct shash_desc *desc)
+{
+ struct poly1305_desc_ctx *dctx = shash_desc_ctx(desc);
+ dctx->rem_key_bytes = POLY1305_KEY_SIZE;
+ return 0;
+}
+
+static int crypto_poly1305_update(struct shash_desc *desc, const u8 *src,
+ unsigned int srclen)
+{
+ struct poly1305_desc_ctx *dctx = shash_desc_ctx(desc);
+ simd_context_t simd_context;
+
+ if (unlikely(dctx->rem_key_bytes)) {
+ unsigned int key_bytes = min(srclen, dctx->rem_key_bytes);
+ memcpy(dctx->key + (POLY1305_KEY_SIZE - dctx->rem_key_bytes),
+ src, key_bytes);
+ src += key_bytes;
+ srclen -= key_bytes;
+ dctx->rem_key_bytes -= key_bytes;
+ if (!dctx->rem_key_bytes) {
+ poly1305_init(&dctx->ctx, dctx->key);
+ memzero_explicit(dctx->key, sizeof(dctx->key));
+ }
+ if (!srclen)
+ return 0;
+ }
+
+ simd_get(&simd_context);
+ poly1305_update(&dctx->ctx, src, srclen, &simd_context);
+ simd_put(&simd_context);
+
+ return 0;
+}
+
+static int crypto_poly1305_final(struct shash_desc *desc, u8 *dst)
+{
+ struct poly1305_desc_ctx *dctx = shash_desc_ctx(desc);
+ simd_context_t simd_context;
+
+ simd_get(&simd_context);
+ poly1305_final(&dctx->ctx, dst, &simd_context);
+ simd_put(&simd_context);
+ return 0;
+}
+
+static struct shash_alg poly1305_alg = {
+ .digestsize = POLY1305_MAC_SIZE,
+ .init = crypto_poly1305_init,
+ .update = crypto_poly1305_update,
+ .final = crypto_poly1305_final,
+ .descsize = sizeof(struct poly1305_desc_ctx),
+ .base = {
+ .cra_name = "poly1305",
+ .cra_driver_name = "poly1305-software",
+ .cra_priority = 100,
+ .cra_blocksize = POLY1305_BLOCK_SIZE,
+ .cra_module = THIS_MODULE,
+ },
+};
+
+static int __init poly1305_mod_init(void)
+{
+ return crypto_register_shash(&poly1305_alg);
+}
+
+static void __exit poly1305_mod_exit(void)
+{
+ crypto_unregister_shash(&poly1305_alg);
+}
+
+module_init(poly1305_mod_init);
+module_exit(poly1305_mod_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Jason A. Donenfeld <Jason@zx2c4.com>");
+MODULE_DESCRIPTION("Poly1305 authenticator");
+MODULE_ALIAS_CRYPTO("poly1305");
+MODULE_ALIAS_CRYPTO("poly1305-software");
diff --git a/include/crypto/poly1305.h b/include/crypto/poly1305.h
deleted file mode 100644
index f718a19da82f..000000000000
--- a/include/crypto/poly1305.h
+++ /dev/null
@@ -1,40 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-/*
- * Common values for the Poly1305 algorithm
- */
-
-#ifndef _CRYPTO_POLY1305_H
-#define _CRYPTO_POLY1305_H
-
-#include <linux/types.h>
-#include <linux/crypto.h>
-
-#define POLY1305_BLOCK_SIZE 16
-#define POLY1305_KEY_SIZE 32
-#define POLY1305_DIGEST_SIZE 16
-
-struct poly1305_desc_ctx {
- /* key */
- u32 r[5];
- /* finalize key */
- u32 s[4];
- /* accumulator */
- u32 h[5];
- /* partial buffer */
- u8 buf[POLY1305_BLOCK_SIZE];
- /* bytes used in partial buffer */
- unsigned int buflen;
- /* r key has been set */
- bool rset;
- /* s key has been set */
- bool sset;
-};
-
-int crypto_poly1305_init(struct shash_desc *desc);
-unsigned int crypto_poly1305_setdesckey(struct poly1305_desc_ctx *dctx,
- const u8 *src, unsigned int srclen);
-int crypto_poly1305_update(struct shash_desc *desc,
- const u8 *src, unsigned int srclen);
-int crypto_poly1305_final(struct shash_desc *desc, u8 *dst);
-
-#endif
--
2.19.0
* [PATCH net-next v7 26/28] crypto: port ChaCha20 to Zinc
[not found] <20181006025709.4019-1-Jason@zx2c4.com>
` (22 preceding siblings ...)
2018-10-06 2:57 ` [PATCH net-next v7 25/28] crypto: port Poly1305 to Zinc Jason A. Donenfeld
@ 2018-10-06 2:57 ` Jason A. Donenfeld
2018-10-06 13:07 ` Martin Willi
23 siblings, 1 reply; 28+ messages in thread
From: Jason A. Donenfeld @ 2018-10-06 2:57 UTC (permalink / raw)
To: linux-kernel, netdev, davem, gregkh
Cc: Jason A. Donenfeld, Samuel Neves, Andy Lutomirski, linux-crypto
Now that ChaCha20 is in Zinc, we can have the crypto API code simply
call into it. The crypto API expects to have a stored key per instance
and independent nonces, so we follow suit and store the key and
initialize the nonce independently.
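As a rough illustration of that split (a userspace sketch, not the code in
this patch), the key can be stored once per instance and the rest of the
ChaCha20 state filled in per request from the IV. All demo_* names are
hypothetical, and the 16-byte IV layout (32-bit little-endian block counter
followed by a 96-bit nonce, as in RFC 7539) is an assumption made for the
example:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for illustration; not the identifiers in this patch. */
enum { DEMO_KEY_WORDS = 8, DEMO_IV_SIZE = 16 };

/* Per-instance context: only the key is stored here. */
struct demo_chacha20_tfm_ctx {
	uint32_t key[DEMO_KEY_WORDS];
};

/* Load a 32-bit little-endian word from a possibly unaligned buffer. */
static uint32_t demo_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
	       (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Called once per instance: remember the 256-bit key. */
static void demo_setkey(struct demo_chacha20_tfm_ctx *ctx,
			const uint8_t key[32])
{
	size_t i;

	assert(ctx && key);
	for (i = 0; i < DEMO_KEY_WORDS; ++i)
		ctx->key[i] = demo_le32(key + 4 * i);
}

/*
 * Called once per request: build a fresh state from the stored key and the
 * caller's IV. Assumed IV layout: counter (4 bytes), then nonce (12 bytes).
 */
static void demo_init_state(uint32_t state[16],
			    const struct demo_chacha20_tfm_ctx *ctx,
			    const uint8_t iv[DEMO_IV_SIZE])
{
	size_t i;

	/* "expand 32-byte k" constants from RFC 7539 */
	state[0] = 0x61707865;
	state[1] = 0x3320646e;
	state[2] = 0x79622d32;
	state[3] = 0x6b206574;

	/* Words 4..11: the per-instance key */
	for (i = 0; i < DEMO_KEY_WORDS; ++i)
		state[4 + i] = ctx->key[i];

	/* Words 12..15: the per-request counter and nonce */
	for (i = 0; i < 4; ++i)
		state[12 + i] = demo_le32(iv + 4 * i);
}
```

With this shape, two requests on the same instance share ctx->key but never
share state[], so each request's nonce and counter stay independent.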
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Samuel Neves <sneves@dei.uc.pt>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: linux-crypto@vger.kernel.org
---
arch/arm/configs/exynos_defconfig | 1 -
arch/arm/configs/multi_v7_defconfig | 1 -
arch/arm/configs/omap2plus_defconfig | 1 -
arch/arm/crypto/Kconfig | 6 -
arch/arm/crypto/Makefile | 2 -
arch/arm/crypto/chacha20-neon-core.S | 521 --------------------
arch/arm/crypto/chacha20-neon-glue.c | 127 -----
arch/arm64/configs/defconfig | 1 -
arch/arm64/crypto/Kconfig | 6 -
arch/arm64/crypto/Makefile | 3 -
arch/arm64/crypto/chacha20-neon-core.S | 450 -----------------
arch/arm64/crypto/chacha20-neon-glue.c | 133 -----
arch/x86/crypto/Makefile | 3 -
arch/x86/crypto/chacha20-avx2-x86_64.S | 448 -----------------
arch/x86/crypto/chacha20-ssse3-x86_64.S | 630 ------------------------
arch/x86/crypto/chacha20_glue.c | 146 ------
crypto/Kconfig | 17 +-
crypto/Makefile | 2 +-
crypto/chacha20_generic.c | 136 -----
crypto/chacha20_zinc.c | 90 ++++
crypto/chacha20poly1305.c | 8 +-
include/crypto/chacha20.h | 12 -
22 files changed, 96 insertions(+), 2648 deletions(-)
delete mode 100644 arch/arm/crypto/chacha20-neon-core.S
delete mode 100644 arch/arm/crypto/chacha20-neon-glue.c
delete mode 100644 arch/arm64/crypto/chacha20-neon-core.S
delete mode 100644 arch/arm64/crypto/chacha20-neon-glue.c
delete mode 100644 arch/x86/crypto/chacha20-avx2-x86_64.S
delete mode 100644 arch/x86/crypto/chacha20-ssse3-x86_64.S
delete mode 100644 arch/x86/crypto/chacha20_glue.c
delete mode 100644 crypto/chacha20_generic.c
create mode 100644 crypto/chacha20_zinc.c
diff --git a/arch/arm/configs/exynos_defconfig b/arch/arm/configs/exynos_defconfig
index 27ea6dfcf2f2..95929b5e7b10 100644
--- a/arch/arm/configs/exynos_defconfig
+++ b/arch/arm/configs/exynos_defconfig
@@ -350,7 +350,6 @@ CONFIG_CRYPTO_SHA1_ARM_NEON=m
CONFIG_CRYPTO_SHA256_ARM=m
CONFIG_CRYPTO_SHA512_ARM=m
CONFIG_CRYPTO_AES_ARM_BS=m
-CONFIG_CRYPTO_CHACHA20_NEON=m
CONFIG_CRC_CCITT=y
CONFIG_FONTS=y
CONFIG_FONT_7x14=y
diff --git a/arch/arm/configs/multi_v7_defconfig b/arch/arm/configs/multi_v7_defconfig
index fc33444e94f0..63be07724db3 100644
--- a/arch/arm/configs/multi_v7_defconfig
+++ b/arch/arm/configs/multi_v7_defconfig
@@ -1000,4 +1000,3 @@ CONFIG_CRYPTO_AES_ARM_BS=m
CONFIG_CRYPTO_AES_ARM_CE=m
CONFIG_CRYPTO_GHASH_ARM_CE=m
CONFIG_CRYPTO_CRC32_ARM_CE=m
-CONFIG_CRYPTO_CHACHA20_NEON=m
diff --git a/arch/arm/configs/omap2plus_defconfig b/arch/arm/configs/omap2plus_defconfig
index 6491419b1dad..f585a8ecc336 100644
--- a/arch/arm/configs/omap2plus_defconfig
+++ b/arch/arm/configs/omap2plus_defconfig
@@ -547,7 +547,6 @@ CONFIG_CRYPTO_SHA512_ARM=m
CONFIG_CRYPTO_AES_ARM=m
CONFIG_CRYPTO_AES_ARM_BS=m
CONFIG_CRYPTO_GHASH_ARM_CE=m
-CONFIG_CRYPTO_CHACHA20_NEON=m
CONFIG_CRC_CCITT=y
CONFIG_CRC_T10DIF=y
CONFIG_CRC_ITU_T=y
diff --git a/arch/arm/crypto/Kconfig b/arch/arm/crypto/Kconfig
index 925d1364727a..fb80fd89f0e7 100644
--- a/arch/arm/crypto/Kconfig
+++ b/arch/arm/crypto/Kconfig
@@ -115,12 +115,6 @@ config CRYPTO_CRC32_ARM_CE
depends on KERNEL_MODE_NEON && CRC32
select CRYPTO_HASH
-config CRYPTO_CHACHA20_NEON
- tristate "NEON accelerated ChaCha20 symmetric cipher"
- depends on KERNEL_MODE_NEON
- select CRYPTO_BLKCIPHER
- select CRYPTO_CHACHA20
-
config CRYPTO_SPECK_NEON
tristate "NEON accelerated Speck cipher algorithms"
depends on KERNEL_MODE_NEON
diff --git a/arch/arm/crypto/Makefile b/arch/arm/crypto/Makefile
index 8de542c48ade..bbfa98447063 100644
--- a/arch/arm/crypto/Makefile
+++ b/arch/arm/crypto/Makefile
@@ -9,7 +9,6 @@ obj-$(CONFIG_CRYPTO_SHA1_ARM) += sha1-arm.o
obj-$(CONFIG_CRYPTO_SHA1_ARM_NEON) += sha1-arm-neon.o
obj-$(CONFIG_CRYPTO_SHA256_ARM) += sha256-arm.o
obj-$(CONFIG_CRYPTO_SHA512_ARM) += sha512-arm.o
-obj-$(CONFIG_CRYPTO_CHACHA20_NEON) += chacha20-neon.o
obj-$(CONFIG_CRYPTO_SPECK_NEON) += speck-neon.o
ce-obj-$(CONFIG_CRYPTO_AES_ARM_CE) += aes-arm-ce.o
@@ -53,7 +52,6 @@ aes-arm-ce-y := aes-ce-core.o aes-ce-glue.o
ghash-arm-ce-y := ghash-ce-core.o ghash-ce-glue.o
crct10dif-arm-ce-y := crct10dif-ce-core.o crct10dif-ce-glue.o
crc32-arm-ce-y:= crc32-ce-core.o crc32-ce-glue.o
-chacha20-neon-y := chacha20-neon-core.o chacha20-neon-glue.o
speck-neon-y := speck-neon-core.o speck-neon-glue.o
ifdef REGENERATE_ARM_CRYPTO
diff --git a/arch/arm/crypto/chacha20-neon-core.S b/arch/arm/crypto/chacha20-neon-core.S
deleted file mode 100644
index 451a849ad518..000000000000
--- a/arch/arm/crypto/chacha20-neon-core.S
+++ /dev/null
@@ -1,521 +0,0 @@
-/*
- * ChaCha20 256-bit cipher algorithm, RFC7539, ARM NEON functions
- *
- * Copyright (C) 2016 Linaro, Ltd. <ard.biesheuvel@linaro.org>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * Based on:
- * ChaCha20 256-bit cipher algorithm, RFC7539, x64 SSE3 functions
- *
- * Copyright (C) 2015 Martin Willi
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- */
-
-#include <linux/linkage.h>
-
- .text
- .fpu neon
- .align 5
-
-ENTRY(chacha20_block_xor_neon)
- // r0: Input state matrix, s
- // r1: 1 data block output, o
- // r2: 1 data block input, i
-
- //
- // This function encrypts one ChaCha20 block by loading the state matrix
- // in four NEON registers. It performs matrix operation on four words in
- // parallel, but requireds shuffling to rearrange the words after each
- // round.
- //
-
- // x0..3 = s0..3
- add ip, r0, #0x20
- vld1.32 {q0-q1}, [r0]
- vld1.32 {q2-q3}, [ip]
-
- vmov q8, q0
- vmov q9, q1
- vmov q10, q2
- vmov q11, q3
-
- mov r3, #10
-
-.Ldoubleround:
- // x0 += x1, x3 = rotl32(x3 ^ x0, 16)
- vadd.i32 q0, q0, q1
- veor q3, q3, q0
- vrev32.16 q3, q3
-
- // x2 += x3, x1 = rotl32(x1 ^ x2, 12)
- vadd.i32 q2, q2, q3
- veor q4, q1, q2
- vshl.u32 q1, q4, #12
- vsri.u32 q1, q4, #20
-
- // x0 += x1, x3 = rotl32(x3 ^ x0, 8)
- vadd.i32 q0, q0, q1
- veor q4, q3, q0
- vshl.u32 q3, q4, #8
- vsri.u32 q3, q4, #24
-
- // x2 += x3, x1 = rotl32(x1 ^ x2, 7)
- vadd.i32 q2, q2, q3
- veor q4, q1, q2
- vshl.u32 q1, q4, #7
- vsri.u32 q1, q4, #25
-
- // x1 = shuffle32(x1, MASK(0, 3, 2, 1))
- vext.8 q1, q1, q1, #4
- // x2 = shuffle32(x2, MASK(1, 0, 3, 2))
- vext.8 q2, q2, q2, #8
- // x3 = shuffle32(x3, MASK(2, 1, 0, 3))
- vext.8 q3, q3, q3, #12
-
- // x0 += x1, x3 = rotl32(x3 ^ x0, 16)
- vadd.i32 q0, q0, q1
- veor q3, q3, q0
- vrev32.16 q3, q3
-
- // x2 += x3, x1 = rotl32(x1 ^ x2, 12)
- vadd.i32 q2, q2, q3
- veor q4, q1, q2
- vshl.u32 q1, q4, #12
- vsri.u32 q1, q4, #20
-
- // x0 += x1, x3 = rotl32(x3 ^ x0, 8)
- vadd.i32 q0, q0, q1
- veor q4, q3, q0
- vshl.u32 q3, q4, #8
- vsri.u32 q3, q4, #24
-
- // x2 += x3, x1 = rotl32(x1 ^ x2, 7)
- vadd.i32 q2, q2, q3
- veor q4, q1, q2
- vshl.u32 q1, q4, #7
- vsri.u32 q1, q4, #25
-
- // x1 = shuffle32(x1, MASK(2, 1, 0, 3))
- vext.8 q1, q1, q1, #12
- // x2 = shuffle32(x2, MASK(1, 0, 3, 2))
- vext.8 q2, q2, q2, #8
- // x3 = shuffle32(x3, MASK(0, 3, 2, 1))
- vext.8 q3, q3, q3, #4
-
- subs r3, r3, #1
- bne .Ldoubleround
-
- add ip, r2, #0x20
- vld1.8 {q4-q5}, [r2]
- vld1.8 {q6-q7}, [ip]
-
- // o0 = i0 ^ (x0 + s0)
- vadd.i32 q0, q0, q8
- veor q0, q0, q4
-
- // o1 = i1 ^ (x1 + s1)
- vadd.i32 q1, q1, q9
- veor q1, q1, q5
-
- // o2 = i2 ^ (x2 + s2)
- vadd.i32 q2, q2, q10
- veor q2, q2, q6
-
- // o3 = i3 ^ (x3 + s3)
- vadd.i32 q3, q3, q11
- veor q3, q3, q7
-
- add ip, r1, #0x20
- vst1.8 {q0-q1}, [r1]
- vst1.8 {q2-q3}, [ip]
-
- bx lr
-ENDPROC(chacha20_block_xor_neon)
-
- .align 5
-ENTRY(chacha20_4block_xor_neon)
- push {r4-r6, lr}
- mov ip, sp // preserve the stack pointer
- sub r3, sp, #0x20 // allocate a 32 byte buffer
- bic r3, r3, #0x1f // aligned to 32 bytes
- mov sp, r3
-
- // r0: Input state matrix, s
- // r1: 4 data blocks output, o
- // r2: 4 data blocks input, i
-
- //
- // This function encrypts four consecutive ChaCha20 blocks by loading
- // the state matrix in NEON registers four times. The algorithm performs
- // each operation on the corresponding word of each state matrix, hence
- // requires no word shuffling. For final XORing step we transpose the
- // matrix by interleaving 32- and then 64-bit words, which allows us to
- // do XOR in NEON registers.
- //
-
- // x0..15[0-3] = s0..3[0..3]
- add r3, r0, #0x20
- vld1.32 {q0-q1}, [r0]
- vld1.32 {q2-q3}, [r3]
-
- adr r3, CTRINC
- vdup.32 q15, d7[1]
- vdup.32 q14, d7[0]
- vld1.32 {q11}, [r3, :128]
- vdup.32 q13, d6[1]
- vdup.32 q12, d6[0]
- vadd.i32 q12, q12, q11 // x12 += counter values 0-3
- vdup.32 q11, d5[1]
- vdup.32 q10, d5[0]
- vdup.32 q9, d4[1]
- vdup.32 q8, d4[0]
- vdup.32 q7, d3[1]
- vdup.32 q6, d3[0]
- vdup.32 q5, d2[1]
- vdup.32 q4, d2[0]
- vdup.32 q3, d1[1]
- vdup.32 q2, d1[0]
- vdup.32 q1, d0[1]
- vdup.32 q0, d0[0]
-
- mov r3, #10
-
-.Ldoubleround4:
- // x0 += x4, x12 = rotl32(x12 ^ x0, 16)
- // x1 += x5, x13 = rotl32(x13 ^ x1, 16)
- // x2 += x6, x14 = rotl32(x14 ^ x2, 16)
- // x3 += x7, x15 = rotl32(x15 ^ x3, 16)
- vadd.i32 q0, q0, q4
- vadd.i32 q1, q1, q5
- vadd.i32 q2, q2, q6
- vadd.i32 q3, q3, q7
-
- veor q12, q12, q0
- veor q13, q13, q1
- veor q14, q14, q2
- veor q15, q15, q3
-
- vrev32.16 q12, q12
- vrev32.16 q13, q13
- vrev32.16 q14, q14
- vrev32.16 q15, q15
-
- // x8 += x12, x4 = rotl32(x4 ^ x8, 12)
- // x9 += x13, x5 = rotl32(x5 ^ x9, 12)
- // x10 += x14, x6 = rotl32(x6 ^ x10, 12)
- // x11 += x15, x7 = rotl32(x7 ^ x11, 12)
- vadd.i32 q8, q8, q12
- vadd.i32 q9, q9, q13
- vadd.i32 q10, q10, q14
- vadd.i32 q11, q11, q15
-
- vst1.32 {q8-q9}, [sp, :256]
-
- veor q8, q4, q8
- veor q9, q5, q9
- vshl.u32 q4, q8, #12
- vshl.u32 q5, q9, #12
- vsri.u32 q4, q8, #20
- vsri.u32 q5, q9, #20
-
- veor q8, q6, q10
- veor q9, q7, q11
- vshl.u32 q6, q8, #12
- vshl.u32 q7, q9, #12
- vsri.u32 q6, q8, #20
- vsri.u32 q7, q9, #20
-
- // x0 += x4, x12 = rotl32(x12 ^ x0, 8)
- // x1 += x5, x13 = rotl32(x13 ^ x1, 8)
- // x2 += x6, x14 = rotl32(x14 ^ x2, 8)
- // x3 += x7, x15 = rotl32(x15 ^ x3, 8)
- vadd.i32 q0, q0, q4
- vadd.i32 q1, q1, q5
- vadd.i32 q2, q2, q6
- vadd.i32 q3, q3, q7
-
- veor q8, q12, q0
- veor q9, q13, q1
- vshl.u32 q12, q8, #8
- vshl.u32 q13, q9, #8
- vsri.u32 q12, q8, #24
- vsri.u32 q13, q9, #24
-
- veor q8, q14, q2
- veor q9, q15, q3
- vshl.u32 q14, q8, #8
- vshl.u32 q15, q9, #8
- vsri.u32 q14, q8, #24
- vsri.u32 q15, q9, #24
-
- vld1.32 {q8-q9}, [sp, :256]
-
- // x8 += x12, x4 = rotl32(x4 ^ x8, 7)
- // x9 += x13, x5 = rotl32(x5 ^ x9, 7)
- // x10 += x14, x6 = rotl32(x6 ^ x10, 7)
- // x11 += x15, x7 = rotl32(x7 ^ x11, 7)
- vadd.i32 q8, q8, q12
- vadd.i32 q9, q9, q13
- vadd.i32 q10, q10, q14
- vadd.i32 q11, q11, q15
-
- vst1.32 {q8-q9}, [sp, :256]
-
- veor q8, q4, q8
- veor q9, q5, q9
- vshl.u32 q4, q8, #7
- vshl.u32 q5, q9, #7
- vsri.u32 q4, q8, #25
- vsri.u32 q5, q9, #25
-
- veor q8, q6, q10
- veor q9, q7, q11
- vshl.u32 q6, q8, #7
- vshl.u32 q7, q9, #7
- vsri.u32 q6, q8, #25
- vsri.u32 q7, q9, #25
-
- vld1.32 {q8-q9}, [sp, :256]
-
- // x0 += x5, x15 = rotl32(x15 ^ x0, 16)
- // x1 += x6, x12 = rotl32(x12 ^ x1, 16)
- // x2 += x7, x13 = rotl32(x13 ^ x2, 16)
- // x3 += x4, x14 = rotl32(x14 ^ x3, 16)
- vadd.i32 q0, q0, q5
- vadd.i32 q1, q1, q6
- vadd.i32 q2, q2, q7
- vadd.i32 q3, q3, q4
-
- veor q15, q15, q0
- veor q12, q12, q1
- veor q13, q13, q2
- veor q14, q14, q3
-
- vrev32.16 q15, q15
- vrev32.16 q12, q12
- vrev32.16 q13, q13
- vrev32.16 q14, q14
-
- // x10 += x15, x5 = rotl32(x5 ^ x10, 12)
- // x11 += x12, x6 = rotl32(x6 ^ x11, 12)
- // x8 += x13, x7 = rotl32(x7 ^ x8, 12)
- // x9 += x14, x4 = rotl32(x4 ^ x9, 12)
- vadd.i32 q10, q10, q15
- vadd.i32 q11, q11, q12
- vadd.i32 q8, q8, q13
- vadd.i32 q9, q9, q14
-
- vst1.32 {q8-q9}, [sp, :256]
-
- veor q8, q7, q8
- veor q9, q4, q9
- vshl.u32 q7, q8, #12
- vshl.u32 q4, q9, #12
- vsri.u32 q7, q8, #20
- vsri.u32 q4, q9, #20
-
- veor q8, q5, q10
- veor q9, q6, q11
- vshl.u32 q5, q8, #12
- vshl.u32 q6, q9, #12
- vsri.u32 q5, q8, #20
- vsri.u32 q6, q9, #20
-
- // x0 += x5, x15 = rotl32(x15 ^ x0, 8)
- // x1 += x6, x12 = rotl32(x12 ^ x1, 8)
- // x2 += x7, x13 = rotl32(x13 ^ x2, 8)
- // x3 += x4, x14 = rotl32(x14 ^ x3, 8)
- vadd.i32 q0, q0, q5
- vadd.i32 q1, q1, q6
- vadd.i32 q2, q2, q7
- vadd.i32 q3, q3, q4
-
- veor q8, q15, q0
- veor q9, q12, q1
- vshl.u32 q15, q8, #8
- vshl.u32 q12, q9, #8
- vsri.u32 q15, q8, #24
- vsri.u32 q12, q9, #24
-
- veor q8, q13, q2
- veor q9, q14, q3
- vshl.u32 q13, q8, #8
- vshl.u32 q14, q9, #8
- vsri.u32 q13, q8, #24
- vsri.u32 q14, q9, #24
-
- vld1.32 {q8-q9}, [sp, :256]
-
- // x10 += x15, x5 = rotl32(x5 ^ x10, 7)
- // x11 += x12, x6 = rotl32(x6 ^ x11, 7)
- // x8 += x13, x7 = rotl32(x7 ^ x8, 7)
- // x9 += x14, x4 = rotl32(x4 ^ x9, 7)
- vadd.i32 q10, q10, q15
- vadd.i32 q11, q11, q12
- vadd.i32 q8, q8, q13
- vadd.i32 q9, q9, q14
-
- vst1.32 {q8-q9}, [sp, :256]
-
- veor q8, q7, q8
- veor q9, q4, q9
- vshl.u32 q7, q8, #7
- vshl.u32 q4, q9, #7
- vsri.u32 q7, q8, #25
- vsri.u32 q4, q9, #25
-
- veor q8, q5, q10
- veor q9, q6, q11
- vshl.u32 q5, q8, #7
- vshl.u32 q6, q9, #7
- vsri.u32 q5, q8, #25
- vsri.u32 q6, q9, #25
-
- subs r3, r3, #1
- beq 0f
-
- vld1.32 {q8-q9}, [sp, :256]
- b .Ldoubleround4
-
- // x0[0-3] += s0[0]
- // x1[0-3] += s0[1]
- // x2[0-3] += s0[2]
- // x3[0-3] += s0[3]
-0: ldmia r0!, {r3-r6}
- vdup.32 q8, r3
- vdup.32 q9, r4
- vadd.i32 q0, q0, q8
- vadd.i32 q1, q1, q9
- vdup.32 q8, r5
- vdup.32 q9, r6
- vadd.i32 q2, q2, q8
- vadd.i32 q3, q3, q9
-
- // x4[0-3] += s1[0]
- // x5[0-3] += s1[1]
- // x6[0-3] += s1[2]
- // x7[0-3] += s1[3]
- ldmia r0!, {r3-r6}
- vdup.32 q8, r3
- vdup.32 q9, r4
- vadd.i32 q4, q4, q8
- vadd.i32 q5, q5, q9
- vdup.32 q8, r5
- vdup.32 q9, r6
- vadd.i32 q6, q6, q8
- vadd.i32 q7, q7, q9
-
- // interleave 32-bit words in state n, n+1
- vzip.32 q0, q1
- vzip.32 q2, q3
- vzip.32 q4, q5
- vzip.32 q6, q7
-
- // interleave 64-bit words in state n, n+2
- vswp d1, d4
- vswp d3, d6
- vswp d9, d12
- vswp d11, d14
-
- // xor with corresponding input, write to output
- vld1.8 {q8-q9}, [r2]!
- veor q8, q8, q0
- veor q9, q9, q4
- vst1.8 {q8-q9}, [r1]!
-
- vld1.32 {q8-q9}, [sp, :256]
-
- // x8[0-3] += s2[0]
- // x9[0-3] += s2[1]
- // x10[0-3] += s2[2]
- // x11[0-3] += s2[3]
- ldmia r0!, {r3-r6}
- vdup.32 q0, r3
- vdup.32 q4, r4
- vadd.i32 q8, q8, q0
- vadd.i32 q9, q9, q4
- vdup.32 q0, r5
- vdup.32 q4, r6
- vadd.i32 q10, q10, q0
- vadd.i32 q11, q11, q4
-
- // x12[0-3] += s3[0]
- // x13[0-3] += s3[1]
- // x14[0-3] += s3[2]
- // x15[0-3] += s3[3]
- ldmia r0!, {r3-r6}
- vdup.32 q0, r3
- vdup.32 q4, r4
- adr r3, CTRINC
- vadd.i32 q12, q12, q0
- vld1.32 {q0}, [r3, :128]
- vadd.i32 q13, q13, q4
- vadd.i32 q12, q12, q0 // x12 += counter values 0-3
-
- vdup.32 q0, r5
- vdup.32 q4, r6
- vadd.i32 q14, q14, q0
- vadd.i32 q15, q15, q4
-
- // interleave 32-bit words in state n, n+1
- vzip.32 q8, q9
- vzip.32 q10, q11
- vzip.32 q12, q13
- vzip.32 q14, q15
-
- // interleave 64-bit words in state n, n+2
- vswp d17, d20
- vswp d19, d22
- vswp d25, d28
- vswp d27, d30
-
- vmov q4, q1
-
- vld1.8 {q0-q1}, [r2]!
- veor q0, q0, q8
- veor q1, q1, q12
- vst1.8 {q0-q1}, [r1]!
-
- vld1.8 {q0-q1}, [r2]!
- veor q0, q0, q2
- veor q1, q1, q6
- vst1.8 {q0-q1}, [r1]!
-
- vld1.8 {q0-q1}, [r2]!
- veor q0, q0, q10
- veor q1, q1, q14
- vst1.8 {q0-q1}, [r1]!
-
- vld1.8 {q0-q1}, [r2]!
- veor q0, q0, q4
- veor q1, q1, q5
- vst1.8 {q0-q1}, [r1]!
-
- vld1.8 {q0-q1}, [r2]!
- veor q0, q0, q9
- veor q1, q1, q13
- vst1.8 {q0-q1}, [r1]!
-
- vld1.8 {q0-q1}, [r2]!
- veor q0, q0, q3
- veor q1, q1, q7
- vst1.8 {q0-q1}, [r1]!
-
- vld1.8 {q0-q1}, [r2]
- veor q0, q0, q11
- veor q1, q1, q15
- vst1.8 {q0-q1}, [r1]
-
- mov sp, ip
- pop {r4-r6, pc}
-ENDPROC(chacha20_4block_xor_neon)
-
- .align 4
-CTRINC: .word 0, 1, 2, 3
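Note on the routine removed above: the comments in .Ldoubleround4 spell out the ChaCha20 double round word by word. As a reading aid, here is a minimal C sketch of that same sequence, assuming a plain 16-word state array. The helper names (rotl32, quarter_round, double_round) are illustrative only and are not taken from this patch.

```c
#include <stdint.h>

/* Rotate a 32-bit word left by n bits (n is reduced mod 32). */
static uint32_t rotl32(uint32_t v, unsigned int n)
{
	return (v << (n & 31)) | (v >> ((32 - n) & 31));
}

/*
 * One quarter round on state words a, b, c, d, following the pattern
 * in the assembly comments: "x0 += x4, x12 = rotl32(x12 ^ x0, 16)" etc.
 */
static void quarter_round(uint32_t x[16], int a, int b, int c, int d)
{
	x[a] += x[b]; x[d] = rotl32(x[d] ^ x[a], 16);
	x[c] += x[d]; x[b] = rotl32(x[b] ^ x[c], 12);
	x[a] += x[b]; x[d] = rotl32(x[d] ^ x[a], 8);
	x[c] += x[d]; x[b] = rotl32(x[b] ^ x[c], 7);
}

/* One double round: four column rounds, then four diagonal rounds. */
static void double_round(uint32_t x[16])
{
	/* columns: (x0,x4,x8,x12) ... (x3,x7,x11,x15) */
	quarter_round(x, 0, 4, 8, 12);
	quarter_round(x, 1, 5, 9, 13);
	quarter_round(x, 2, 6, 10, 14);
	quarter_round(x, 3, 7, 11, 15);

	/* diagonals: (x0,x5,x10,x15) ... (x3,x4,x9,x14) */
	quarter_round(x, 0, 5, 10, 15);
	quarter_round(x, 1, 6, 11, 12);
	quarter_round(x, 2, 7, 8, 13);
	quarter_round(x, 3, 4, 9, 14);
}
```

The assembly runs this double round ten times (the "mov r3, #10" loop counter) and holds four independent states in the lanes of each NEON register, which this scalar sketch does not attempt to model.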
diff --git a/arch/arm/crypto/chacha20-neon-glue.c b/arch/arm/crypto/chacha20-neon-glue.c
deleted file mode 100644
index 59a7be08e80c..000000000000
--- a/arch/arm/crypto/chacha20-neon-glue.c
+++ /dev/null
@@ -1,127 +0,0 @@
-/*
- * ChaCha20 256-bit cipher algorithm, RFC7539, ARM NEON functions
- *
- * Copyright (C) 2016 Linaro, Ltd. <ard.biesheuvel@linaro.org>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * Based on:
- * ChaCha20 256-bit cipher algorithm, RFC7539, SIMD glue code
- *
- * Copyright (C) 2015 Martin Willi
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- */
-
-#include <crypto/algapi.h>
-#include <crypto/chacha20.h>
-#include <crypto/internal/skcipher.h>
-#include <linux/kernel.h>
-#include <linux/module.h>
-
-#include <asm/hwcap.h>
-#include <asm/neon.h>
-#include <asm/simd.h>
-
-asmlinkage void chacha20_block_xor_neon(u32 *state, u8 *dst, const u8 *src);
-asmlinkage void chacha20_4block_xor_neon(u32 *state, u8 *dst, const u8 *src);
-
-static void chacha20_doneon(u32 *state, u8 *dst, const u8 *src,
- unsigned int bytes)
-{
- u8 buf[CHACHA20_BLOCK_SIZE];
-
- while (bytes >= CHACHA20_BLOCK_SIZE * 4) {
- chacha20_4block_xor_neon(state, dst, src);
- bytes -= CHACHA20_BLOCK_SIZE * 4;
- src += CHACHA20_BLOCK_SIZE * 4;
- dst += CHACHA20_BLOCK_SIZE * 4;
- state[12] += 4;
- }
- while (bytes >= CHACHA20_BLOCK_SIZE) {
- chacha20_block_xor_neon(state, dst, src);
- bytes -= CHACHA20_BLOCK_SIZE;
- src += CHACHA20_BLOCK_SIZE;
- dst += CHACHA20_BLOCK_SIZE;
- state[12]++;
- }
- if (bytes) {
- memcpy(buf, src, bytes);
- chacha20_block_xor_neon(state, buf, buf);
- memcpy(dst, buf, bytes);
- }
-}
-
-static int chacha20_neon(struct skcipher_request *req)
-{
- struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
- struct chacha20_ctx *ctx = crypto_skcipher_ctx(tfm);
- struct skcipher_walk walk;
- u32 state[16];
- int err;
-
- if (req->cryptlen <= CHACHA20_BLOCK_SIZE || !may_use_simd())
- return crypto_chacha20_crypt(req);
-
- err = skcipher_walk_virt(&walk, req, true);
-
- crypto_chacha20_init(state, ctx, walk.iv);
-
- kernel_neon_begin();
- while (walk.nbytes > 0) {
- unsigned int nbytes = walk.nbytes;
-
- if (nbytes < walk.total)
- nbytes = round_down(nbytes, walk.stride);
-
- chacha20_doneon(state, walk.dst.virt.addr, walk.src.virt.addr,
- nbytes);
- err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
- }
- kernel_neon_end();
-
- return err;
-}
-
-static struct skcipher_alg alg = {
- .base.cra_name = "chacha20",
- .base.cra_driver_name = "chacha20-neon",
- .base.cra_priority = 300,
- .base.cra_blocksize = 1,
- .base.cra_ctxsize = sizeof(struct chacha20_ctx),
- .base.cra_module = THIS_MODULE,
-
- .min_keysize = CHACHA20_KEY_SIZE,
- .max_keysize = CHACHA20_KEY_SIZE,
- .ivsize = CHACHA20_IV_SIZE,
- .chunksize = CHACHA20_BLOCK_SIZE,
- .walksize = 4 * CHACHA20_BLOCK_SIZE,
- .setkey = crypto_chacha20_setkey,
- .encrypt = chacha20_neon,
- .decrypt = chacha20_neon,
-};
-
-static int __init chacha20_simd_mod_init(void)
-{
- if (!(elf_hwcap & HWCAP_NEON))
- return -ENODEV;
-
- return crypto_register_skcipher(&alg);
-}
-
-static void __exit chacha20_simd_mod_fini(void)
-{
- crypto_unregister_skcipher(&alg);
-}
-
-module_init(chacha20_simd_mod_init);
-module_exit(chacha20_simd_mod_fini);
-
-MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
-MODULE_LICENSE("GPL v2");
-MODULE_ALIAS_CRYPTO("chacha20");
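Note on the glue code removed above: chacha20_doneon() splits a request into 4-block chunks, then single blocks, then a partial tail that goes through a bounce buffer. The sketch below reproduces only that dispatch logic in userspace C, as an illustration. The fake_* functions are stand-ins that copy data instead of encrypting, the call counters are hypothetical, and the bounce buffer is zero-initialized here, which the original did not do.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 64 /* ChaCha20 block size in bytes */

/* Hypothetical counters, present only to make the dispatch visible. */
static unsigned int calls_4block, calls_1block;

/* Stand-in for chacha20_4block_xor_neon(): copies instead of encrypting. */
static void fake_4block_xor(uint32_t *state, uint8_t *dst, const uint8_t *src)
{
	(void)state;
	memmove(dst, src, 4 * BLOCK_SIZE);
	calls_4block++;
}

/* Stand-in for chacha20_block_xor_neon(): always touches a full block. */
static void fake_block_xor(uint32_t *state, uint8_t *dst, const uint8_t *src)
{
	(void)state;
	memmove(dst, src, BLOCK_SIZE);
	calls_1block++;
}

/* Same control flow as the removed chacha20_doneon(). */
static void doneon_sketch(uint32_t *state, uint8_t *dst, const uint8_t *src,
			  unsigned int bytes)
{
	uint8_t buf[BLOCK_SIZE] = { 0 }; /* assumption: zeroed for the sketch */

	assert(state && dst && src);

	while (bytes >= BLOCK_SIZE * 4) {
		fake_4block_xor(state, dst, src);
		bytes -= BLOCK_SIZE * 4;
		src += BLOCK_SIZE * 4;
		dst += BLOCK_SIZE * 4;
		state[12] += 4; /* word 12 is the block counter */
	}
	while (bytes >= BLOCK_SIZE) {
		fake_block_xor(state, dst, src);
		bytes -= BLOCK_SIZE;
		src += BLOCK_SIZE;
		dst += BLOCK_SIZE;
		state[12]++;
	}
	if (bytes) {
		/* Partial tail: the block routine needs a full-size buffer. */
		memcpy(buf, src, bytes);
		fake_block_xor(state, buf, buf);
		memcpy(dst, buf, bytes);
	}
}
```

As in the removed code, the counter in state[12] is not advanced after the partial tail, since a partial block can only be the last one in a request.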
diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index db8d364f8476..6cc3c8a0ad88 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -709,5 +709,4 @@ CONFIG_CRYPTO_CRCT10DIF_ARM64_CE=m
CONFIG_CRYPTO_CRC32_ARM64_CE=m
CONFIG_CRYPTO_AES_ARM64_CE_CCM=y
CONFIG_CRYPTO_AES_ARM64_CE_BLK=y
-CONFIG_CRYPTO_CHACHA20_NEON=m
CONFIG_CRYPTO_AES_ARM64_BS=m
diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig
index e3fdb0fd6f70..9db6d775a880 100644
--- a/arch/arm64/crypto/Kconfig
+++ b/arch/arm64/crypto/Kconfig
@@ -105,12 +105,6 @@ config CRYPTO_AES_ARM64_NEON_BLK
select CRYPTO_AES
select CRYPTO_SIMD
-config CRYPTO_CHACHA20_NEON
- tristate "NEON accelerated ChaCha20 symmetric cipher"
- depends on KERNEL_MODE_NEON
- select CRYPTO_BLKCIPHER
- select CRYPTO_CHACHA20
-
config CRYPTO_AES_ARM64_BS
tristate "AES in ECB/CBC/CTR/XTS modes using bit-sliced NEON algorithm"
depends on KERNEL_MODE_NEON
diff --git a/arch/arm64/crypto/Makefile b/arch/arm64/crypto/Makefile
index bcafd016618e..507c4bfb86e3 100644
--- a/arch/arm64/crypto/Makefile
+++ b/arch/arm64/crypto/Makefile
@@ -53,9 +53,6 @@ sha256-arm64-y := sha256-glue.o sha256-core.o
obj-$(CONFIG_CRYPTO_SHA512_ARM64) += sha512-arm64.o
sha512-arm64-y := sha512-glue.o sha512-core.o
-obj-$(CONFIG_CRYPTO_CHACHA20_NEON) += chacha20-neon.o
-chacha20-neon-y := chacha20-neon-core.o chacha20-neon-glue.o
-
obj-$(CONFIG_CRYPTO_SPECK_NEON) += speck-neon.o
speck-neon-y := speck-neon-core.o speck-neon-glue.o
diff --git a/arch/arm64/crypto/chacha20-neon-core.S b/arch/arm64/crypto/chacha20-neon-core.S
deleted file mode 100644
index 13c85e272c2a..000000000000
--- a/arch/arm64/crypto/chacha20-neon-core.S
+++ /dev/null
@@ -1,450 +0,0 @@
-/*
- * ChaCha20 256-bit cipher algorithm, RFC7539, arm64 NEON functions
- *
- * Copyright (C) 2016 Linaro, Ltd. <ard.biesheuvel@linaro.org>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * Based on:
- * ChaCha20 256-bit cipher algorithm, RFC7539, x64 SSSE3 functions
- *
- * Copyright (C) 2015 Martin Willi
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- */
-
-#include <linux/linkage.h>
-
- .text
- .align 6
-
-ENTRY(chacha20_block_xor_neon)
- // x0: Input state matrix, s
- // x1: 1 data block output, o
- // x2: 1 data block input, i
-
- //
- // This function encrypts one ChaCha20 block by loading the state matrix
- // in four NEON registers. It performs matrix operation on four words in
- // parallel, but requires shuffling to rearrange the words after each
- // round.
- //
-
- // x0..3 = s0..3
- adr x3, ROT8
- ld1 {v0.4s-v3.4s}, [x0]
- ld1 {v8.4s-v11.4s}, [x0]
- ld1 {v12.4s}, [x3]
-
- mov x3, #10
-
-.Ldoubleround:
- // x0 += x1, x3 = rotl32(x3 ^ x0, 16)
- add v0.4s, v0.4s, v1.4s
- eor v3.16b, v3.16b, v0.16b
- rev32 v3.8h, v3.8h
-
- // x2 += x3, x1 = rotl32(x1 ^ x2, 12)
- add v2.4s, v2.4s, v3.4s
- eor v4.16b, v1.16b, v2.16b
- shl v1.4s, v4.4s, #12
- sri v1.4s, v4.4s, #20
-
- // x0 += x1, x3 = rotl32(x3 ^ x0, 8)
- add v0.4s, v0.4s, v1.4s
- eor v3.16b, v3.16b, v0.16b
- tbl v3.16b, {v3.16b}, v12.16b
-
- // x2 += x3, x1 = rotl32(x1 ^ x2, 7)
- add v2.4s, v2.4s, v3.4s
- eor v4.16b, v1.16b, v2.16b
- shl v1.4s, v4.4s, #7
- sri v1.4s, v4.4s, #25
-
- // x1 = shuffle32(x1, MASK(0, 3, 2, 1))
- ext v1.16b, v1.16b, v1.16b, #4
- // x2 = shuffle32(x2, MASK(1, 0, 3, 2))
- ext v2.16b, v2.16b, v2.16b, #8
- // x3 = shuffle32(x3, MASK(2, 1, 0, 3))
- ext v3.16b, v3.16b, v3.16b, #12
-
- // x0 += x1, x3 = rotl32(x3 ^ x0, 16)
- add v0.4s, v0.4s, v1.4s
- eor v3.16b, v3.16b, v0.16b
- rev32 v3.8h, v3.8h
-
- // x2 += x3, x1 = rotl32(x1 ^ x2, 12)
- add v2.4s, v2.4s, v3.4s
- eor v4.16b, v1.16b, v2.16b
- shl v1.4s, v4.4s, #12
- sri v1.4s, v4.4s, #20
-
- // x0 += x1, x3 = rotl32(x3 ^ x0, 8)
- add v0.4s, v0.4s, v1.4s
- eor v3.16b, v3.16b, v0.16b
- tbl v3.16b, {v3.16b}, v12.16b
-
- // x2 += x3, x1 = rotl32(x1 ^ x2, 7)
- add v2.4s, v2.4s, v3.4s
- eor v4.16b, v1.16b, v2.16b
- shl v1.4s, v4.4s, #7
- sri v1.4s, v4.4s, #25
-
- // x1 = shuffle32(x1, MASK(2, 1, 0, 3))
- ext v1.16b, v1.16b, v1.16b, #12
- // x2 = shuffle32(x2, MASK(1, 0, 3, 2))
- ext v2.16b, v2.16b, v2.16b, #8
- // x3 = shuffle32(x3, MASK(0, 3, 2, 1))
- ext v3.16b, v3.16b, v3.16b, #4
-
- subs x3, x3, #1
- b.ne .Ldoubleround
-
- ld1 {v4.16b-v7.16b}, [x2]
-
- // o0 = i0 ^ (x0 + s0)
- add v0.4s, v0.4s, v8.4s
- eor v0.16b, v0.16b, v4.16b
-
- // o1 = i1 ^ (x1 + s1)
- add v1.4s, v1.4s, v9.4s
- eor v1.16b, v1.16b, v5.16b
-
- // o2 = i2 ^ (x2 + s2)
- add v2.4s, v2.4s, v10.4s
- eor v2.16b, v2.16b, v6.16b
-
- // o3 = i3 ^ (x3 + s3)
- add v3.4s, v3.4s, v11.4s
- eor v3.16b, v3.16b, v7.16b
-
- st1 {v0.16b-v3.16b}, [x1]
-
- ret
-ENDPROC(chacha20_block_xor_neon)
-
- .align 6
-ENTRY(chacha20_4block_xor_neon)
- // x0: Input state matrix, s
- // x1: 4 data blocks output, o
- // x2: 4 data blocks input, i
-
- //
- // This function encrypts four consecutive ChaCha20 blocks by loading
- // the state matrix in NEON registers four times. The algorithm performs
- // each operation on the corresponding word of each state matrix, hence
- // requires no word shuffling. For final XORing step we transpose the
- // matrix by interleaving 32- and then 64-bit words, which allows us to
- // do XOR in NEON registers.
- //
- adr x3, CTRINC // ... and ROT8
- ld1 {v30.4s-v31.4s}, [x3]
-
- // x0..15[0-3] = s0..3[0..3]
- mov x4, x0
- ld4r { v0.4s- v3.4s}, [x4], #16
- ld4r { v4.4s- v7.4s}, [x4], #16
- ld4r { v8.4s-v11.4s}, [x4], #16
- ld4r {v12.4s-v15.4s}, [x4]
-
- // x12 += counter values 0-3
- add v12.4s, v12.4s, v30.4s
-
- mov x3, #10
-
-.Ldoubleround4:
- // x0 += x4, x12 = rotl32(x12 ^ x0, 16)
- // x1 += x5, x13 = rotl32(x13 ^ x1, 16)
- // x2 += x6, x14 = rotl32(x14 ^ x2, 16)
- // x3 += x7, x15 = rotl32(x15 ^ x3, 16)
- add v0.4s, v0.4s, v4.4s
- add v1.4s, v1.4s, v5.4s
- add v2.4s, v2.4s, v6.4s
- add v3.4s, v3.4s, v7.4s
-
- eor v12.16b, v12.16b, v0.16b
- eor v13.16b, v13.16b, v1.16b
- eor v14.16b, v14.16b, v2.16b
- eor v15.16b, v15.16b, v3.16b
-
- rev32 v12.8h, v12.8h
- rev32 v13.8h, v13.8h
- rev32 v14.8h, v14.8h
- rev32 v15.8h, v15.8h
-
- // x8 += x12, x4 = rotl32(x4 ^ x8, 12)
- // x9 += x13, x5 = rotl32(x5 ^ x9, 12)
- // x10 += x14, x6 = rotl32(x6 ^ x10, 12)
- // x11 += x15, x7 = rotl32(x7 ^ x11, 12)
- add v8.4s, v8.4s, v12.4s
- add v9.4s, v9.4s, v13.4s
- add v10.4s, v10.4s, v14.4s
- add v11.4s, v11.4s, v15.4s
-
- eor v16.16b, v4.16b, v8.16b
- eor v17.16b, v5.16b, v9.16b
- eor v18.16b, v6.16b, v10.16b
- eor v19.16b, v7.16b, v11.16b
-
- shl v4.4s, v16.4s, #12
- shl v5.4s, v17.4s, #12
- shl v6.4s, v18.4s, #12
- shl v7.4s, v19.4s, #12
-
- sri v4.4s, v16.4s, #20
- sri v5.4s, v17.4s, #20
- sri v6.4s, v18.4s, #20
- sri v7.4s, v19.4s, #20
-
- // x0 += x4, x12 = rotl32(x12 ^ x0, 8)
- // x1 += x5, x13 = rotl32(x13 ^ x1, 8)
- // x2 += x6, x14 = rotl32(x14 ^ x2, 8)
- // x3 += x7, x15 = rotl32(x15 ^ x3, 8)
- add v0.4s, v0.4s, v4.4s
- add v1.4s, v1.4s, v5.4s
- add v2.4s, v2.4s, v6.4s
- add v3.4s, v3.4s, v7.4s
-
- eor v12.16b, v12.16b, v0.16b
- eor v13.16b, v13.16b, v1.16b
- eor v14.16b, v14.16b, v2.16b
- eor v15.16b, v15.16b, v3.16b
-
- tbl v12.16b, {v12.16b}, v31.16b
- tbl v13.16b, {v13.16b}, v31.16b
- tbl v14.16b, {v14.16b}, v31.16b
- tbl v15.16b, {v15.16b}, v31.16b
-
- // x8 += x12, x4 = rotl32(x4 ^ x8, 7)
- // x9 += x13, x5 = rotl32(x5 ^ x9, 7)
- // x10 += x14, x6 = rotl32(x6 ^ x10, 7)
- // x11 += x15, x7 = rotl32(x7 ^ x11, 7)
- add v8.4s, v8.4s, v12.4s
- add v9.4s, v9.4s, v13.4s
- add v10.4s, v10.4s, v14.4s
- add v11.4s, v11.4s, v15.4s
-
- eor v16.16b, v4.16b, v8.16b
- eor v17.16b, v5.16b, v9.16b
- eor v18.16b, v6.16b, v10.16b
- eor v19.16b, v7.16b, v11.16b
-
- shl v4.4s, v16.4s, #7
- shl v5.4s, v17.4s, #7
- shl v6.4s, v18.4s, #7
- shl v7.4s, v19.4s, #7
-
- sri v4.4s, v16.4s, #25
- sri v5.4s, v17.4s, #25
- sri v6.4s, v18.4s, #25
- sri v7.4s, v19.4s, #25
-
- // x0 += x5, x15 = rotl32(x15 ^ x0, 16)
- // x1 += x6, x12 = rotl32(x12 ^ x1, 16)
- // x2 += x7, x13 = rotl32(x13 ^ x2, 16)
- // x3 += x4, x14 = rotl32(x14 ^ x3, 16)
- add v0.4s, v0.4s, v5.4s
- add v1.4s, v1.4s, v6.4s
- add v2.4s, v2.4s, v7.4s
- add v3.4s, v3.4s, v4.4s
-
- eor v15.16b, v15.16b, v0.16b
- eor v12.16b, v12.16b, v1.16b
- eor v13.16b, v13.16b, v2.16b
- eor v14.16b, v14.16b, v3.16b
-
- rev32 v15.8h, v15.8h
- rev32 v12.8h, v12.8h
- rev32 v13.8h, v13.8h
- rev32 v14.8h, v14.8h
-
- // x10 += x15, x5 = rotl32(x5 ^ x10, 12)
- // x11 += x12, x6 = rotl32(x6 ^ x11, 12)
- // x8 += x13, x7 = rotl32(x7 ^ x8, 12)
- // x9 += x14, x4 = rotl32(x4 ^ x9, 12)
- add v10.4s, v10.4s, v15.4s
- add v11.4s, v11.4s, v12.4s
- add v8.4s, v8.4s, v13.4s
- add v9.4s, v9.4s, v14.4s
-
- eor v16.16b, v5.16b, v10.16b
- eor v17.16b, v6.16b, v11.16b
- eor v18.16b, v7.16b, v8.16b
- eor v19.16b, v4.16b, v9.16b
-
- shl v5.4s, v16.4s, #12
- shl v6.4s, v17.4s, #12
- shl v7.4s, v18.4s, #12
- shl v4.4s, v19.4s, #12
-
- sri v5.4s, v16.4s, #20
- sri v6.4s, v17.4s, #20
- sri v7.4s, v18.4s, #20
- sri v4.4s, v19.4s, #20
-
- // x0 += x5, x15 = rotl32(x15 ^ x0, 8)
- // x1 += x6, x12 = rotl32(x12 ^ x1, 8)
- // x2 += x7, x13 = rotl32(x13 ^ x2, 8)
- // x3 += x4, x14 = rotl32(x14 ^ x3, 8)
- add v0.4s, v0.4s, v5.4s
- add v1.4s, v1.4s, v6.4s
- add v2.4s, v2.4s, v7.4s
- add v3.4s, v3.4s, v4.4s
-
- eor v15.16b, v15.16b, v0.16b
- eor v12.16b, v12.16b, v1.16b
- eor v13.16b, v13.16b, v2.16b
- eor v14.16b, v14.16b, v3.16b
-
- tbl v15.16b, {v15.16b}, v31.16b
- tbl v12.16b, {v12.16b}, v31.16b
- tbl v13.16b, {v13.16b}, v31.16b
- tbl v14.16b, {v14.16b}, v31.16b
-
- // x10 += x15, x5 = rotl32(x5 ^ x10, 7)
- // x11 += x12, x6 = rotl32(x6 ^ x11, 7)
- // x8 += x13, x7 = rotl32(x7 ^ x8, 7)
- // x9 += x14, x4 = rotl32(x4 ^ x9, 7)
- add v10.4s, v10.4s, v15.4s
- add v11.4s, v11.4s, v12.4s
- add v8.4s, v8.4s, v13.4s
- add v9.4s, v9.4s, v14.4s
-
- eor v16.16b, v5.16b, v10.16b
- eor v17.16b, v6.16b, v11.16b
- eor v18.16b, v7.16b, v8.16b
- eor v19.16b, v4.16b, v9.16b
-
- shl v5.4s, v16.4s, #7
- shl v6.4s, v17.4s, #7
- shl v7.4s, v18.4s, #7
- shl v4.4s, v19.4s, #7
-
- sri v5.4s, v16.4s, #25
- sri v6.4s, v17.4s, #25
- sri v7.4s, v18.4s, #25
- sri v4.4s, v19.4s, #25
-
- subs x3, x3, #1
- b.ne .Ldoubleround4
-
- ld4r {v16.4s-v19.4s}, [x0], #16
- ld4r {v20.4s-v23.4s}, [x0], #16
-
- // x12 += counter values 0-3
- add v12.4s, v12.4s, v30.4s
-
- // x0[0-3] += s0[0]
- // x1[0-3] += s0[1]
- // x2[0-3] += s0[2]
- // x3[0-3] += s0[3]
- add v0.4s, v0.4s, v16.4s
- add v1.4s, v1.4s, v17.4s
- add v2.4s, v2.4s, v18.4s
- add v3.4s, v3.4s, v19.4s
-
- ld4r {v24.4s-v27.4s}, [x0], #16
- ld4r {v28.4s-v31.4s}, [x0]
-
- // x4[0-3] += s1[0]
- // x5[0-3] += s1[1]
- // x6[0-3] += s1[2]
- // x7[0-3] += s1[3]
- add v4.4s, v4.4s, v20.4s
- add v5.4s, v5.4s, v21.4s
- add v6.4s, v6.4s, v22.4s
- add v7.4s, v7.4s, v23.4s
-
- // x8[0-3] += s2[0]
- // x9[0-3] += s2[1]
- // x10[0-3] += s2[2]
- // x11[0-3] += s2[3]
- add v8.4s, v8.4s, v24.4s
- add v9.4s, v9.4s, v25.4s
- add v10.4s, v10.4s, v26.4s
- add v11.4s, v11.4s, v27.4s
-
- // x12[0-3] += s3[0]
- // x13[0-3] += s3[1]
- // x14[0-3] += s3[2]
- // x15[0-3] += s3[3]
- add v12.4s, v12.4s, v28.4s
- add v13.4s, v13.4s, v29.4s
- add v14.4s, v14.4s, v30.4s
- add v15.4s, v15.4s, v31.4s
-
- // interleave 32-bit words in state n, n+1
- zip1 v16.4s, v0.4s, v1.4s
- zip2 v17.4s, v0.4s, v1.4s
- zip1 v18.4s, v2.4s, v3.4s
- zip2 v19.4s, v2.4s, v3.4s
- zip1 v20.4s, v4.4s, v5.4s
- zip2 v21.4s, v4.4s, v5.4s
- zip1 v22.4s, v6.4s, v7.4s
- zip2 v23.4s, v6.4s, v7.4s
- zip1 v24.4s, v8.4s, v9.4s
- zip2 v25.4s, v8.4s, v9.4s
- zip1 v26.4s, v10.4s, v11.4s
- zip2 v27.4s, v10.4s, v11.4s
- zip1 v28.4s, v12.4s, v13.4s
- zip2 v29.4s, v12.4s, v13.4s
- zip1 v30.4s, v14.4s, v15.4s
- zip2 v31.4s, v14.4s, v15.4s
-
- // interleave 64-bit words in state n, n+2
- zip1 v0.2d, v16.2d, v18.2d
- zip2 v4.2d, v16.2d, v18.2d
- zip1 v8.2d, v17.2d, v19.2d
- zip2 v12.2d, v17.2d, v19.2d
- ld1 {v16.16b-v19.16b}, [x2], #64
-
- zip1 v1.2d, v20.2d, v22.2d
- zip2 v5.2d, v20.2d, v22.2d
- zip1 v9.2d, v21.2d, v23.2d
- zip2 v13.2d, v21.2d, v23.2d
- ld1 {v20.16b-v23.16b}, [x2], #64
-
- zip1 v2.2d, v24.2d, v26.2d
- zip2 v6.2d, v24.2d, v26.2d
- zip1 v10.2d, v25.2d, v27.2d
- zip2 v14.2d, v25.2d, v27.2d
- ld1 {v24.16b-v27.16b}, [x2], #64
-
- zip1 v3.2d, v28.2d, v30.2d
- zip2 v7.2d, v28.2d, v30.2d
- zip1 v11.2d, v29.2d, v31.2d
- zip2 v15.2d, v29.2d, v31.2d
- ld1 {v28.16b-v31.16b}, [x2]
-
- // xor with corresponding input, write to output
- eor v16.16b, v16.16b, v0.16b
- eor v17.16b, v17.16b, v1.16b
- eor v18.16b, v18.16b, v2.16b
- eor v19.16b, v19.16b, v3.16b
- eor v20.16b, v20.16b, v4.16b
- eor v21.16b, v21.16b, v5.16b
- st1 {v16.16b-v19.16b}, [x1], #64
- eor v22.16b, v22.16b, v6.16b
- eor v23.16b, v23.16b, v7.16b
- eor v24.16b, v24.16b, v8.16b
- eor v25.16b, v25.16b, v9.16b
- st1 {v20.16b-v23.16b}, [x1], #64
- eor v26.16b, v26.16b, v10.16b
- eor v27.16b, v27.16b, v11.16b
- eor v28.16b, v28.16b, v12.16b
- st1 {v24.16b-v27.16b}, [x1], #64
- eor v29.16b, v29.16b, v13.16b
- eor v30.16b, v30.16b, v14.16b
- eor v31.16b, v31.16b, v15.16b
- st1 {v28.16b-v31.16b}, [x1]
-
- ret
-ENDPROC(chacha20_4block_xor_neon)
-
-CTRINC: .word 0, 1, 2, 3
-ROT8: .word 0x02010003, 0x06050407, 0x0a09080b, 0x0e0d0c0f
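Note on the ROT8 table removed above: the arm64 code rotates each 32-bit word left by 8 with a single tbl byte shuffle instead of a shl/sri pair. The C sketch below emulates that lookup on little-endian byte images of the words, to show why this particular index table is equivalent to rotl32(x, 8). The helper names are illustrative, and masking the index with 15 is a simplification of tbl, which would return zero for out-of-range indices (none occur in this table).

```c
#include <stdint.h>

/* The ROT8 constant from the removed file, as 32-bit words. */
static const uint32_t rot8_words[4] = {
	0x02010003, 0x06050407, 0x0a09080b, 0x0e0d0c0f
};

/* Reference rotation, for comparison with the shuffle. */
static uint32_t rotl32(uint32_t v, unsigned int n)
{
	return (v << (n & 31)) | (v >> ((32 - n) & 31));
}

/* Store a 32-bit word as four little-endian bytes. */
static void put_le32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)v;
	p[1] = (uint8_t)(v >> 8);
	p[2] = (uint8_t)(v >> 16);
	p[3] = (uint8_t)(v >> 24);
}

/* Load a 32-bit word from four little-endian bytes. */
static uint32_t get_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Emulate a 16-byte table lookup: dst[i] = src[idx[i]]. */
static void rot8_by_shuffle(uint32_t out[4], const uint32_t in[4])
{
	uint8_t src[16], idx[16], dst[16];
	int i;

	for (i = 0; i < 4; i++) {
		put_le32(src + 4 * i, in[i]);
		put_le32(idx + 4 * i, rot8_words[i]);
	}
	for (i = 0; i < 16; i++)
		dst[i] = src[idx[i] & 15];
	for (i = 0; i < 4; i++)
		out[i] = get_le32(dst + 4 * i);
}
```

Within each word the index bytes read 3, 0, 1, 2 in memory order, so the most significant byte moves to the least significant position and the other three bytes shift up by one, which is exactly a left rotation by 8 bits.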
diff --git a/arch/arm64/crypto/chacha20-neon-glue.c b/arch/arm64/crypto/chacha20-neon-glue.c
deleted file mode 100644
index 727579c93ded..000000000000
--- a/arch/arm64/crypto/chacha20-neon-glue.c
+++ /dev/null
@@ -1,133 +0,0 @@
-/*
- * ChaCha20 256-bit cipher algorithm, RFC7539, arm64 NEON functions
- *
- * Copyright (C) 2016 - 2017 Linaro, Ltd. <ard.biesheuvel@linaro.org>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * Based on:
- * ChaCha20 256-bit cipher algorithm, RFC7539, SIMD glue code
- *
- * Copyright (C) 2015 Martin Willi
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- */
-
-#include <crypto/algapi.h>
-#include <crypto/chacha20.h>
-#include <crypto/internal/skcipher.h>
-#include <linux/kernel.h>
-#include <linux/module.h>
-
-#include <asm/hwcap.h>
-#include <asm/neon.h>
-#include <asm/simd.h>
-
-asmlinkage void chacha20_block_xor_neon(u32 *state, u8 *dst, const u8 *src);
-asmlinkage void chacha20_4block_xor_neon(u32 *state, u8 *dst, const u8 *src);
-
-static void chacha20_doneon(u32 *state, u8 *dst, const u8 *src,
- unsigned int bytes)
-{
- u8 buf[CHACHA20_BLOCK_SIZE];
-
- while (bytes >= CHACHA20_BLOCK_SIZE * 4) {
- kernel_neon_begin();
- chacha20_4block_xor_neon(state, dst, src);
- kernel_neon_end();
- bytes -= CHACHA20_BLOCK_SIZE * 4;
- src += CHACHA20_BLOCK_SIZE * 4;
- dst += CHACHA20_BLOCK_SIZE * 4;
- state[12] += 4;
- }
-
- if (!bytes)
- return;
-
- kernel_neon_begin();
- while (bytes >= CHACHA20_BLOCK_SIZE) {
- chacha20_block_xor_neon(state, dst, src);
- bytes -= CHACHA20_BLOCK_SIZE;
- src += CHACHA20_BLOCK_SIZE;
- dst += CHACHA20_BLOCK_SIZE;
- state[12]++;
- }
- if (bytes) {
- memcpy(buf, src, bytes);
- chacha20_block_xor_neon(state, buf, buf);
- memcpy(dst, buf, bytes);
- }
- kernel_neon_end();
-}
-
-static int chacha20_neon(struct skcipher_request *req)
-{
- struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
- struct chacha20_ctx *ctx = crypto_skcipher_ctx(tfm);
- struct skcipher_walk walk;
- u32 state[16];
- int err;
-
- if (!may_use_simd() || req->cryptlen <= CHACHA20_BLOCK_SIZE)
- return crypto_chacha20_crypt(req);
-
- err = skcipher_walk_virt(&walk, req, false);
-
- crypto_chacha20_init(state, ctx, walk.iv);
-
- while (walk.nbytes > 0) {
- unsigned int nbytes = walk.nbytes;
-
- if (nbytes < walk.total)
- nbytes = round_down(nbytes, walk.stride);
-
- chacha20_doneon(state, walk.dst.virt.addr, walk.src.virt.addr,
- nbytes);
- err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
- }
-
- return err;
-}
-
-static struct skcipher_alg alg = {
- .base.cra_name = "chacha20",
- .base.cra_driver_name = "chacha20-neon",
- .base.cra_priority = 300,
- .base.cra_blocksize = 1,
- .base.cra_ctxsize = sizeof(struct chacha20_ctx),
- .base.cra_module = THIS_MODULE,
-
- .min_keysize = CHACHA20_KEY_SIZE,
- .max_keysize = CHACHA20_KEY_SIZE,
- .ivsize = CHACHA20_IV_SIZE,
- .chunksize = CHACHA20_BLOCK_SIZE,
- .walksize = 4 * CHACHA20_BLOCK_SIZE,
- .setkey = crypto_chacha20_setkey,
- .encrypt = chacha20_neon,
- .decrypt = chacha20_neon,
-};
-
-static int __init chacha20_simd_mod_init(void)
-{
- if (!(elf_hwcap & HWCAP_ASIMD))
- return -ENODEV;
-
- return crypto_register_skcipher(&alg);
-}
-
-static void __exit chacha20_simd_mod_fini(void)
-{
- crypto_unregister_skcipher(&alg);
-}
-
-module_init(chacha20_simd_mod_init);
-module_exit(chacha20_simd_mod_fini);
-
-MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
-MODULE_LICENSE("GPL v2");
-MODULE_ALIAS_CRYPTO("chacha20");
diff --git a/arch/x86/crypto/Makefile b/arch/x86/crypto/Makefile
index cf830219846b..419212c31246 100644
--- a/arch/x86/crypto/Makefile
+++ b/arch/x86/crypto/Makefile
@@ -23,7 +23,6 @@ obj-$(CONFIG_CRYPTO_CAMELLIA_X86_64) += camellia-x86_64.o
obj-$(CONFIG_CRYPTO_BLOWFISH_X86_64) += blowfish-x86_64.o
obj-$(CONFIG_CRYPTO_TWOFISH_X86_64) += twofish-x86_64.o
obj-$(CONFIG_CRYPTO_TWOFISH_X86_64_3WAY) += twofish-x86_64-3way.o
-obj-$(CONFIG_CRYPTO_CHACHA20_X86_64) += chacha20-x86_64.o
obj-$(CONFIG_CRYPTO_SERPENT_SSE2_X86_64) += serpent-sse2-x86_64.o
obj-$(CONFIG_CRYPTO_AES_NI_INTEL) += aesni-intel.o
obj-$(CONFIG_CRYPTO_GHASH_CLMUL_NI_INTEL) += ghash-clmulni-intel.o
@@ -76,7 +75,6 @@ camellia-x86_64-y := camellia-x86_64-asm_64.o camellia_glue.o
blowfish-x86_64-y := blowfish-x86_64-asm_64.o blowfish_glue.o
twofish-x86_64-y := twofish-x86_64-asm_64.o twofish_glue.o
twofish-x86_64-3way-y := twofish-x86_64-asm_64-3way.o twofish_glue_3way.o
-chacha20-x86_64-y := chacha20-ssse3-x86_64.o chacha20_glue.o
serpent-sse2-x86_64-y := serpent-sse2-x86_64-asm_64.o serpent_sse2_glue.o
aegis128-aesni-y := aegis128-aesni-asm.o aegis128-aesni-glue.o
@@ -99,7 +97,6 @@ endif
ifeq ($(avx2_supported),yes)
camellia-aesni-avx2-y := camellia-aesni-avx2-asm_64.o camellia_aesni_avx2_glue.o
- chacha20-x86_64-y += chacha20-avx2-x86_64.o
serpent-avx2-y := serpent-avx2-asm_64.o serpent_avx2_glue.o
morus1280-avx2-y := morus1280-avx2-asm.o morus1280-avx2-glue.o
diff --git a/arch/x86/crypto/chacha20-avx2-x86_64.S b/arch/x86/crypto/chacha20-avx2-x86_64.S
deleted file mode 100644
index f3cd26f48332..000000000000
--- a/arch/x86/crypto/chacha20-avx2-x86_64.S
+++ /dev/null
@@ -1,448 +0,0 @@
-/*
- * ChaCha20 256-bit cipher algorithm, RFC7539, x64 AVX2 functions
- *
- * Copyright (C) 2015 Martin Willi
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- */
-
-#include <linux/linkage.h>
-
-.section .rodata.cst32.ROT8, "aM", @progbits, 32
-.align 32
-ROT8: .octa 0x0e0d0c0f0a09080b0605040702010003
- .octa 0x0e0d0c0f0a09080b0605040702010003
-
-.section .rodata.cst32.ROT16, "aM", @progbits, 32
-.align 32
-ROT16: .octa 0x0d0c0f0e09080b0a0504070601000302
- .octa 0x0d0c0f0e09080b0a0504070601000302
-
-.section .rodata.cst32.CTRINC, "aM", @progbits, 32
-.align 32
-CTRINC: .octa 0x00000003000000020000000100000000
- .octa 0x00000007000000060000000500000004
-
-.text
-
-ENTRY(chacha20_8block_xor_avx2)
- # %rdi: Input state matrix, s
- # %rsi: 8 data blocks output, o
- # %rdx: 8 data blocks input, i
-
- # This function encrypts eight consecutive ChaCha20 blocks by loading
- # the state matrix in AVX registers eight times. As we need some
- # scratch registers, we save the first four registers on the stack. The
- # algorithm performs each operation on the corresponding word of each
- # state matrix, hence requires no word shuffling. For final XORing step
- # we transpose the matrix by interleaving 32-, 64- and then 128-bit
- # words, which allows us to do XOR in AVX registers. 8/16-bit word
- # rotation is done with the slightly better performing byte shuffling,
- # 7/12-bit word rotation uses traditional shift+OR.
-
- vzeroupper
- # 4 * 32 byte stack, 32-byte aligned
- lea 8(%rsp),%r10
- and $~31, %rsp
- sub $0x80, %rsp
-
- # x0..15[0-7] = s[0..15]
- vpbroadcastd 0x00(%rdi),%ymm0
- vpbroadcastd 0x04(%rdi),%ymm1
- vpbroadcastd 0x08(%rdi),%ymm2
- vpbroadcastd 0x0c(%rdi),%ymm3
- vpbroadcastd 0x10(%rdi),%ymm4
- vpbroadcastd 0x14(%rdi),%ymm5
- vpbroadcastd 0x18(%rdi),%ymm6
- vpbroadcastd 0x1c(%rdi),%ymm7
- vpbroadcastd 0x20(%rdi),%ymm8
- vpbroadcastd 0x24(%rdi),%ymm9
- vpbroadcastd 0x28(%rdi),%ymm10
- vpbroadcastd 0x2c(%rdi),%ymm11
- vpbroadcastd 0x30(%rdi),%ymm12
- vpbroadcastd 0x34(%rdi),%ymm13
- vpbroadcastd 0x38(%rdi),%ymm14
- vpbroadcastd 0x3c(%rdi),%ymm15
- # x0..3 on stack
- vmovdqa %ymm0,0x00(%rsp)
- vmovdqa %ymm1,0x20(%rsp)
- vmovdqa %ymm2,0x40(%rsp)
- vmovdqa %ymm3,0x60(%rsp)
-
- vmovdqa CTRINC(%rip),%ymm1
- vmovdqa ROT8(%rip),%ymm2
- vmovdqa ROT16(%rip),%ymm3
-
- # x12 += counter values 0-3
- vpaddd %ymm1,%ymm12,%ymm12
-
- mov $10,%ecx
-
-.Ldoubleround8:
- # x0 += x4, x12 = rotl32(x12 ^ x0, 16)
- vpaddd 0x00(%rsp),%ymm4,%ymm0
- vmovdqa %ymm0,0x00(%rsp)
- vpxor %ymm0,%ymm12,%ymm12
- vpshufb %ymm3,%ymm12,%ymm12
- # x1 += x5, x13 = rotl32(x13 ^ x1, 16)
- vpaddd 0x20(%rsp),%ymm5,%ymm0
- vmovdqa %ymm0,0x20(%rsp)
- vpxor %ymm0,%ymm13,%ymm13
- vpshufb %ymm3,%ymm13,%ymm13
- # x2 += x6, x14 = rotl32(x14 ^ x2, 16)
- vpaddd 0x40(%rsp),%ymm6,%ymm0
- vmovdqa %ymm0,0x40(%rsp)
- vpxor %ymm0,%ymm14,%ymm14
- vpshufb %ymm3,%ymm14,%ymm14
- # x3 += x7, x15 = rotl32(x15 ^ x3, 16)
- vpaddd 0x60(%rsp),%ymm7,%ymm0
- vmovdqa %ymm0,0x60(%rsp)
- vpxor %ymm0,%ymm15,%ymm15
- vpshufb %ymm3,%ymm15,%ymm15
-
- # x8 += x12, x4 = rotl32(x4 ^ x8, 12)
- vpaddd %ymm12,%ymm8,%ymm8
- vpxor %ymm8,%ymm4,%ymm4
- vpslld $12,%ymm4,%ymm0
- vpsrld $20,%ymm4,%ymm4
- vpor %ymm0,%ymm4,%ymm4
- # x9 += x13, x5 = rotl32(x5 ^ x9, 12)
- vpaddd %ymm13,%ymm9,%ymm9
- vpxor %ymm9,%ymm5,%ymm5
- vpslld $12,%ymm5,%ymm0
- vpsrld $20,%ymm5,%ymm5
- vpor %ymm0,%ymm5,%ymm5
- # x10 += x14, x6 = rotl32(x6 ^ x10, 12)
- vpaddd %ymm14,%ymm10,%ymm10
- vpxor %ymm10,%ymm6,%ymm6
- vpslld $12,%ymm6,%ymm0
- vpsrld $20,%ymm6,%ymm6
- vpor %ymm0,%ymm6,%ymm6
- # x11 += x15, x7 = rotl32(x7 ^ x11, 12)
- vpaddd %ymm15,%ymm11,%ymm11
- vpxor %ymm11,%ymm7,%ymm7
- vpslld $12,%ymm7,%ymm0
- vpsrld $20,%ymm7,%ymm7
- vpor %ymm0,%ymm7,%ymm7
-
- # x0 += x4, x12 = rotl32(x12 ^ x0, 8)
- vpaddd 0x00(%rsp),%ymm4,%ymm0
- vmovdqa %ymm0,0x00(%rsp)
- vpxor %ymm0,%ymm12,%ymm12
- vpshufb %ymm2,%ymm12,%ymm12
- # x1 += x5, x13 = rotl32(x13 ^ x1, 8)
- vpaddd 0x20(%rsp),%ymm5,%ymm0
- vmovdqa %ymm0,0x20(%rsp)
- vpxor %ymm0,%ymm13,%ymm13
- vpshufb %ymm2,%ymm13,%ymm13
- # x2 += x6, x14 = rotl32(x14 ^ x2, 8)
- vpaddd 0x40(%rsp),%ymm6,%ymm0
- vmovdqa %ymm0,0x40(%rsp)
- vpxor %ymm0,%ymm14,%ymm14
- vpshufb %ymm2,%ymm14,%ymm14
- # x3 += x7, x15 = rotl32(x15 ^ x3, 8)
- vpaddd 0x60(%rsp),%ymm7,%ymm0
- vmovdqa %ymm0,0x60(%rsp)
- vpxor %ymm0,%ymm15,%ymm15
- vpshufb %ymm2,%ymm15,%ymm15
-
- # x8 += x12, x4 = rotl32(x4 ^ x8, 7)
- vpaddd %ymm12,%ymm8,%ymm8
- vpxor %ymm8,%ymm4,%ymm4
- vpslld $7,%ymm4,%ymm0
- vpsrld $25,%ymm4,%ymm4
- vpor %ymm0,%ymm4,%ymm4
- # x9 += x13, x5 = rotl32(x5 ^ x9, 7)
- vpaddd %ymm13,%ymm9,%ymm9
- vpxor %ymm9,%ymm5,%ymm5
- vpslld $7,%ymm5,%ymm0
- vpsrld $25,%ymm5,%ymm5
- vpor %ymm0,%ymm5,%ymm5
- # x10 += x14, x6 = rotl32(x6 ^ x10, 7)
- vpaddd %ymm14,%ymm10,%ymm10
- vpxor %ymm10,%ymm6,%ymm6
- vpslld $7,%ymm6,%ymm0
- vpsrld $25,%ymm6,%ymm6
- vpor %ymm0,%ymm6,%ymm6
- # x11 += x15, x7 = rotl32(x7 ^ x11, 7)
- vpaddd %ymm15,%ymm11,%ymm11
- vpxor %ymm11,%ymm7,%ymm7
- vpslld $7,%ymm7,%ymm0
- vpsrld $25,%ymm7,%ymm7
- vpor %ymm0,%ymm7,%ymm7
-
- # x0 += x5, x15 = rotl32(x15 ^ x0, 16)
- vpaddd 0x00(%rsp),%ymm5,%ymm0
- vmovdqa %ymm0,0x00(%rsp)
- vpxor %ymm0,%ymm15,%ymm15
- vpshufb %ymm3,%ymm15,%ymm15
- # x1 += x6, x12 = rotl32(x12 ^ x1, 16)%ymm0
- vpaddd 0x20(%rsp),%ymm6,%ymm0
- vmovdqa %ymm0,0x20(%rsp)
- vpxor %ymm0,%ymm12,%ymm12
- vpshufb %ymm3,%ymm12,%ymm12
- # x2 += x7, x13 = rotl32(x13 ^ x2, 16)
- vpaddd 0x40(%rsp),%ymm7,%ymm0
- vmovdqa %ymm0,0x40(%rsp)
- vpxor %ymm0,%ymm13,%ymm13
- vpshufb %ymm3,%ymm13,%ymm13
- # x3 += x4, x14 = rotl32(x14 ^ x3, 16)
- vpaddd 0x60(%rsp),%ymm4,%ymm0
- vmovdqa %ymm0,0x60(%rsp)
- vpxor %ymm0,%ymm14,%ymm14
- vpshufb %ymm3,%ymm14,%ymm14
-
- # x10 += x15, x5 = rotl32(x5 ^ x10, 12)
- vpaddd %ymm15,%ymm10,%ymm10
- vpxor %ymm10,%ymm5,%ymm5
- vpslld $12,%ymm5,%ymm0
- vpsrld $20,%ymm5,%ymm5
- vpor %ymm0,%ymm5,%ymm5
- # x11 += x12, x6 = rotl32(x6 ^ x11, 12)
- vpaddd %ymm12,%ymm11,%ymm11
- vpxor %ymm11,%ymm6,%ymm6
- vpslld $12,%ymm6,%ymm0
- vpsrld $20,%ymm6,%ymm6
- vpor %ymm0,%ymm6,%ymm6
- # x8 += x13, x7 = rotl32(x7 ^ x8, 12)
- vpaddd %ymm13,%ymm8,%ymm8
- vpxor %ymm8,%ymm7,%ymm7
- vpslld $12,%ymm7,%ymm0
- vpsrld $20,%ymm7,%ymm7
- vpor %ymm0,%ymm7,%ymm7
- # x9 += x14, x4 = rotl32(x4 ^ x9, 12)
- vpaddd %ymm14,%ymm9,%ymm9
- vpxor %ymm9,%ymm4,%ymm4
- vpslld $12,%ymm4,%ymm0
- vpsrld $20,%ymm4,%ymm4
- vpor %ymm0,%ymm4,%ymm4
-
- # x0 += x5, x15 = rotl32(x15 ^ x0, 8)
- vpaddd 0x00(%rsp),%ymm5,%ymm0
- vmovdqa %ymm0,0x00(%rsp)
- vpxor %ymm0,%ymm15,%ymm15
- vpshufb %ymm2,%ymm15,%ymm15
- # x1 += x6, x12 = rotl32(x12 ^ x1, 8)
- vpaddd 0x20(%rsp),%ymm6,%ymm0
- vmovdqa %ymm0,0x20(%rsp)
- vpxor %ymm0,%ymm12,%ymm12
- vpshufb %ymm2,%ymm12,%ymm12
- # x2 += x7, x13 = rotl32(x13 ^ x2, 8)
- vpaddd 0x40(%rsp),%ymm7,%ymm0
- vmovdqa %ymm0,0x40(%rsp)
- vpxor %ymm0,%ymm13,%ymm13
- vpshufb %ymm2,%ymm13,%ymm13
- # x3 += x4, x14 = rotl32(x14 ^ x3, 8)
- vpaddd 0x60(%rsp),%ymm4,%ymm0
- vmovdqa %ymm0,0x60(%rsp)
- vpxor %ymm0,%ymm14,%ymm14
- vpshufb %ymm2,%ymm14,%ymm14
-
- # x10 += x15, x5 = rotl32(x5 ^ x10, 7)
- vpaddd %ymm15,%ymm10,%ymm10
- vpxor %ymm10,%ymm5,%ymm5
- vpslld $7,%ymm5,%ymm0
- vpsrld $25,%ymm5,%ymm5
- vpor %ymm0,%ymm5,%ymm5
- # x11 += x12, x6 = rotl32(x6 ^ x11, 7)
- vpaddd %ymm12,%ymm11,%ymm11
- vpxor %ymm11,%ymm6,%ymm6
- vpslld $7,%ymm6,%ymm0
- vpsrld $25,%ymm6,%ymm6
- vpor %ymm0,%ymm6,%ymm6
- # x8 += x13, x7 = rotl32(x7 ^ x8, 7)
- vpaddd %ymm13,%ymm8,%ymm8
- vpxor %ymm8,%ymm7,%ymm7
- vpslld $7,%ymm7,%ymm0
- vpsrld $25,%ymm7,%ymm7
- vpor %ymm0,%ymm7,%ymm7
- # x9 += x14, x4 = rotl32(x4 ^ x9, 7)
- vpaddd %ymm14,%ymm9,%ymm9
- vpxor %ymm9,%ymm4,%ymm4
- vpslld $7,%ymm4,%ymm0
- vpsrld $25,%ymm4,%ymm4
- vpor %ymm0,%ymm4,%ymm4
-
- dec %ecx
- jnz .Ldoubleround8
-
- # x0..15[0-3] += s[0..15]
- vpbroadcastd 0x00(%rdi),%ymm0
- vpaddd 0x00(%rsp),%ymm0,%ymm0
- vmovdqa %ymm0,0x00(%rsp)
- vpbroadcastd 0x04(%rdi),%ymm0
- vpaddd 0x20(%rsp),%ymm0,%ymm0
- vmovdqa %ymm0,0x20(%rsp)
- vpbroadcastd 0x08(%rdi),%ymm0
- vpaddd 0x40(%rsp),%ymm0,%ymm0
- vmovdqa %ymm0,0x40(%rsp)
- vpbroadcastd 0x0c(%rdi),%ymm0
- vpaddd 0x60(%rsp),%ymm0,%ymm0
- vmovdqa %ymm0,0x60(%rsp)
- vpbroadcastd 0x10(%rdi),%ymm0
- vpaddd %ymm0,%ymm4,%ymm4
- vpbroadcastd 0x14(%rdi),%ymm0
- vpaddd %ymm0,%ymm5,%ymm5
- vpbroadcastd 0x18(%rdi),%ymm0
- vpaddd %ymm0,%ymm6,%ymm6
- vpbroadcastd 0x1c(%rdi),%ymm0
- vpaddd %ymm0,%ymm7,%ymm7
- vpbroadcastd 0x20(%rdi),%ymm0
- vpaddd %ymm0,%ymm8,%ymm8
- vpbroadcastd 0x24(%rdi),%ymm0
- vpaddd %ymm0,%ymm9,%ymm9
- vpbroadcastd 0x28(%rdi),%ymm0
- vpaddd %ymm0,%ymm10,%ymm10
- vpbroadcastd 0x2c(%rdi),%ymm0
- vpaddd %ymm0,%ymm11,%ymm11
- vpbroadcastd 0x30(%rdi),%ymm0
- vpaddd %ymm0,%ymm12,%ymm12
- vpbroadcastd 0x34(%rdi),%ymm0
- vpaddd %ymm0,%ymm13,%ymm13
- vpbroadcastd 0x38(%rdi),%ymm0
- vpaddd %ymm0,%ymm14,%ymm14
- vpbroadcastd 0x3c(%rdi),%ymm0
- vpaddd %ymm0,%ymm15,%ymm15
-
- # x12 += counter values 0-3
- vpaddd %ymm1,%ymm12,%ymm12
-
- # interleave 32-bit words in state n, n+1
- vmovdqa 0x00(%rsp),%ymm0
- vmovdqa 0x20(%rsp),%ymm1
- vpunpckldq %ymm1,%ymm0,%ymm2
- vpunpckhdq %ymm1,%ymm0,%ymm1
- vmovdqa %ymm2,0x00(%rsp)
- vmovdqa %ymm1,0x20(%rsp)
- vmovdqa 0x40(%rsp),%ymm0
- vmovdqa 0x60(%rsp),%ymm1
- vpunpckldq %ymm1,%ymm0,%ymm2
- vpunpckhdq %ymm1,%ymm0,%ymm1
- vmovdqa %ymm2,0x40(%rsp)
- vmovdqa %ymm1,0x60(%rsp)
- vmovdqa %ymm4,%ymm0
- vpunpckldq %ymm5,%ymm0,%ymm4
- vpunpckhdq %ymm5,%ymm0,%ymm5
- vmovdqa %ymm6,%ymm0
- vpunpckldq %ymm7,%ymm0,%ymm6
- vpunpckhdq %ymm7,%ymm0,%ymm7
- vmovdqa %ymm8,%ymm0
- vpunpckldq %ymm9,%ymm0,%ymm8
- vpunpckhdq %ymm9,%ymm0,%ymm9
- vmovdqa %ymm10,%ymm0
- vpunpckldq %ymm11,%ymm0,%ymm10
- vpunpckhdq %ymm11,%ymm0,%ymm11
- vmovdqa %ymm12,%ymm0
- vpunpckldq %ymm13,%ymm0,%ymm12
- vpunpckhdq %ymm13,%ymm0,%ymm13
- vmovdqa %ymm14,%ymm0
- vpunpckldq %ymm15,%ymm0,%ymm14
- vpunpckhdq %ymm15,%ymm0,%ymm15
-
- # interleave 64-bit words in state n, n+2
- vmovdqa 0x00(%rsp),%ymm0
- vmovdqa 0x40(%rsp),%ymm2
- vpunpcklqdq %ymm2,%ymm0,%ymm1
- vpunpckhqdq %ymm2,%ymm0,%ymm2
- vmovdqa %ymm1,0x00(%rsp)
- vmovdqa %ymm2,0x40(%rsp)
- vmovdqa 0x20(%rsp),%ymm0
- vmovdqa 0x60(%rsp),%ymm2
- vpunpcklqdq %ymm2,%ymm0,%ymm1
- vpunpckhqdq %ymm2,%ymm0,%ymm2
- vmovdqa %ymm1,0x20(%rsp)
- vmovdqa %ymm2,0x60(%rsp)
- vmovdqa %ymm4,%ymm0
- vpunpcklqdq %ymm6,%ymm0,%ymm4
- vpunpckhqdq %ymm6,%ymm0,%ymm6
- vmovdqa %ymm5,%ymm0
- vpunpcklqdq %ymm7,%ymm0,%ymm5
- vpunpckhqdq %ymm7,%ymm0,%ymm7
- vmovdqa %ymm8,%ymm0
- vpunpcklqdq %ymm10,%ymm0,%ymm8
- vpunpckhqdq %ymm10,%ymm0,%ymm10
- vmovdqa %ymm9,%ymm0
- vpunpcklqdq %ymm11,%ymm0,%ymm9
- vpunpckhqdq %ymm11,%ymm0,%ymm11
- vmovdqa %ymm12,%ymm0
- vpunpcklqdq %ymm14,%ymm0,%ymm12
- vpunpckhqdq %ymm14,%ymm0,%ymm14
- vmovdqa %ymm13,%ymm0
- vpunpcklqdq %ymm15,%ymm0,%ymm13
- vpunpckhqdq %ymm15,%ymm0,%ymm15
-
- # interleave 128-bit words in state n, n+4
- vmovdqa 0x00(%rsp),%ymm0
- vperm2i128 $0x20,%ymm4,%ymm0,%ymm1
- vperm2i128 $0x31,%ymm4,%ymm0,%ymm4
- vmovdqa %ymm1,0x00(%rsp)
- vmovdqa 0x20(%rsp),%ymm0
- vperm2i128 $0x20,%ymm5,%ymm0,%ymm1
- vperm2i128 $0x31,%ymm5,%ymm0,%ymm5
- vmovdqa %ymm1,0x20(%rsp)
- vmovdqa 0x40(%rsp),%ymm0
- vperm2i128 $0x20,%ymm6,%ymm0,%ymm1
- vperm2i128 $0x31,%ymm6,%ymm0,%ymm6
- vmovdqa %ymm1,0x40(%rsp)
- vmovdqa 0x60(%rsp),%ymm0
- vperm2i128 $0x20,%ymm7,%ymm0,%ymm1
- vperm2i128 $0x31,%ymm7,%ymm0,%ymm7
- vmovdqa %ymm1,0x60(%rsp)
- vperm2i128 $0x20,%ymm12,%ymm8,%ymm0
- vperm2i128 $0x31,%ymm12,%ymm8,%ymm12
- vmovdqa %ymm0,%ymm8
- vperm2i128 $0x20,%ymm13,%ymm9,%ymm0
- vperm2i128 $0x31,%ymm13,%ymm9,%ymm13
- vmovdqa %ymm0,%ymm9
- vperm2i128 $0x20,%ymm14,%ymm10,%ymm0
- vperm2i128 $0x31,%ymm14,%ymm10,%ymm14
- vmovdqa %ymm0,%ymm10
- vperm2i128 $0x20,%ymm15,%ymm11,%ymm0
- vperm2i128 $0x31,%ymm15,%ymm11,%ymm15
- vmovdqa %ymm0,%ymm11
-
- # xor with corresponding input, write to output
- vmovdqa 0x00(%rsp),%ymm0
- vpxor 0x0000(%rdx),%ymm0,%ymm0
- vmovdqu %ymm0,0x0000(%rsi)
- vmovdqa 0x20(%rsp),%ymm0
- vpxor 0x0080(%rdx),%ymm0,%ymm0
- vmovdqu %ymm0,0x0080(%rsi)
- vmovdqa 0x40(%rsp),%ymm0
- vpxor 0x0040(%rdx),%ymm0,%ymm0
- vmovdqu %ymm0,0x0040(%rsi)
- vmovdqa 0x60(%rsp),%ymm0
- vpxor 0x00c0(%rdx),%ymm0,%ymm0
- vmovdqu %ymm0,0x00c0(%rsi)
- vpxor 0x0100(%rdx),%ymm4,%ymm4
- vmovdqu %ymm4,0x0100(%rsi)
- vpxor 0x0180(%rdx),%ymm5,%ymm5
- vmovdqu %ymm5,0x00180(%rsi)
- vpxor 0x0140(%rdx),%ymm6,%ymm6
- vmovdqu %ymm6,0x0140(%rsi)
- vpxor 0x01c0(%rdx),%ymm7,%ymm7
- vmovdqu %ymm7,0x01c0(%rsi)
- vpxor 0x0020(%rdx),%ymm8,%ymm8
- vmovdqu %ymm8,0x0020(%rsi)
- vpxor 0x00a0(%rdx),%ymm9,%ymm9
- vmovdqu %ymm9,0x00a0(%rsi)
- vpxor 0x0060(%rdx),%ymm10,%ymm10
- vmovdqu %ymm10,0x0060(%rsi)
- vpxor 0x00e0(%rdx),%ymm11,%ymm11
- vmovdqu %ymm11,0x00e0(%rsi)
- vpxor 0x0120(%rdx),%ymm12,%ymm12
- vmovdqu %ymm12,0x0120(%rsi)
- vpxor 0x01a0(%rdx),%ymm13,%ymm13
- vmovdqu %ymm13,0x01a0(%rsi)
- vpxor 0x0160(%rdx),%ymm14,%ymm14
- vmovdqu %ymm14,0x0160(%rsi)
- vpxor 0x01e0(%rdx),%ymm15,%ymm15
- vmovdqu %ymm15,0x01e0(%rsi)
-
- vzeroupper
- lea -8(%r10),%rsp
- ret
-ENDPROC(chacha20_8block_xor_avx2)
diff --git a/arch/x86/crypto/chacha20-ssse3-x86_64.S b/arch/x86/crypto/chacha20-ssse3-x86_64.S
deleted file mode 100644
index 512a2b500fd1..000000000000
--- a/arch/x86/crypto/chacha20-ssse3-x86_64.S
+++ /dev/null
@@ -1,630 +0,0 @@
-/*
- * ChaCha20 256-bit cipher algorithm, RFC7539, x64 SSSE3 functions
- *
- * Copyright (C) 2015 Martin Willi
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- */
-
-#include <linux/linkage.h>
-
-.section .rodata.cst16.ROT8, "aM", @progbits, 16
-.align 16
-ROT8: .octa 0x0e0d0c0f0a09080b0605040702010003
-.section .rodata.cst16.ROT16, "aM", @progbits, 16
-.align 16
-ROT16: .octa 0x0d0c0f0e09080b0a0504070601000302
-.section .rodata.cst16.CTRINC, "aM", @progbits, 16
-.align 16
-CTRINC: .octa 0x00000003000000020000000100000000
-
-.text
-
-ENTRY(chacha20_block_xor_ssse3)
- # %rdi: Input state matrix, s
- # %rsi: 1 data block output, o
- # %rdx: 1 data block input, i
-
- # This function encrypts one ChaCha20 block by loading the state matrix
- # in four SSE registers. It performs matrix operation on four words in
- # parallel, but requireds shuffling to rearrange the words after each
- # round. 8/16-bit word rotation is done with the slightly better
- # performing SSSE3 byte shuffling, 7/12-bit word rotation uses
- # traditional shift+OR.
-
- # x0..3 = s0..3
- movdqa 0x00(%rdi),%xmm0
- movdqa 0x10(%rdi),%xmm1
- movdqa 0x20(%rdi),%xmm2
- movdqa 0x30(%rdi),%xmm3
- movdqa %xmm0,%xmm8
- movdqa %xmm1,%xmm9
- movdqa %xmm2,%xmm10
- movdqa %xmm3,%xmm11
-
- movdqa ROT8(%rip),%xmm4
- movdqa ROT16(%rip),%xmm5
-
- mov $10,%ecx
-
-.Ldoubleround:
-
- # x0 += x1, x3 = rotl32(x3 ^ x0, 16)
- paddd %xmm1,%xmm0
- pxor %xmm0,%xmm3
- pshufb %xmm5,%xmm3
-
- # x2 += x3, x1 = rotl32(x1 ^ x2, 12)
- paddd %xmm3,%xmm2
- pxor %xmm2,%xmm1
- movdqa %xmm1,%xmm6
- pslld $12,%xmm6
- psrld $20,%xmm1
- por %xmm6,%xmm1
-
- # x0 += x1, x3 = rotl32(x3 ^ x0, 8)
- paddd %xmm1,%xmm0
- pxor %xmm0,%xmm3
- pshufb %xmm4,%xmm3
-
- # x2 += x3, x1 = rotl32(x1 ^ x2, 7)
- paddd %xmm3,%xmm2
- pxor %xmm2,%xmm1
- movdqa %xmm1,%xmm7
- pslld $7,%xmm7
- psrld $25,%xmm1
- por %xmm7,%xmm1
-
- # x1 = shuffle32(x1, MASK(0, 3, 2, 1))
- pshufd $0x39,%xmm1,%xmm1
- # x2 = shuffle32(x2, MASK(1, 0, 3, 2))
- pshufd $0x4e,%xmm2,%xmm2
- # x3 = shuffle32(x3, MASK(2, 1, 0, 3))
- pshufd $0x93,%xmm3,%xmm3
-
- # x0 += x1, x3 = rotl32(x3 ^ x0, 16)
- paddd %xmm1,%xmm0
- pxor %xmm0,%xmm3
- pshufb %xmm5,%xmm3
-
- # x2 += x3, x1 = rotl32(x1 ^ x2, 12)
- paddd %xmm3,%xmm2
- pxor %xmm2,%xmm1
- movdqa %xmm1,%xmm6
- pslld $12,%xmm6
- psrld $20,%xmm1
- por %xmm6,%xmm1
-
- # x0 += x1, x3 = rotl32(x3 ^ x0, 8)
- paddd %xmm1,%xmm0
- pxor %xmm0,%xmm3
- pshufb %xmm4,%xmm3
-
- # x2 += x3, x1 = rotl32(x1 ^ x2, 7)
- paddd %xmm3,%xmm2
- pxor %xmm2,%xmm1
- movdqa %xmm1,%xmm7
- pslld $7,%xmm7
- psrld $25,%xmm1
- por %xmm7,%xmm1
-
- # x1 = shuffle32(x1, MASK(2, 1, 0, 3))
- pshufd $0x93,%xmm1,%xmm1
- # x2 = shuffle32(x2, MASK(1, 0, 3, 2))
- pshufd $0x4e,%xmm2,%xmm2
- # x3 = shuffle32(x3, MASK(0, 3, 2, 1))
- pshufd $0x39,%xmm3,%xmm3
-
- dec %ecx
- jnz .Ldoubleround
-
- # o0 = i0 ^ (x0 + s0)
- movdqu 0x00(%rdx),%xmm4
- paddd %xmm8,%xmm0
- pxor %xmm4,%xmm0
- movdqu %xmm0,0x00(%rsi)
- # o1 = i1 ^ (x1 + s1)
- movdqu 0x10(%rdx),%xmm5
- paddd %xmm9,%xmm1
- pxor %xmm5,%xmm1
- movdqu %xmm1,0x10(%rsi)
- # o2 = i2 ^ (x2 + s2)
- movdqu 0x20(%rdx),%xmm6
- paddd %xmm10,%xmm2
- pxor %xmm6,%xmm2
- movdqu %xmm2,0x20(%rsi)
- # o3 = i3 ^ (x3 + s3)
- movdqu 0x30(%rdx),%xmm7
- paddd %xmm11,%xmm3
- pxor %xmm7,%xmm3
- movdqu %xmm3,0x30(%rsi)
-
- ret
-ENDPROC(chacha20_block_xor_ssse3)
-
-ENTRY(chacha20_4block_xor_ssse3)
- # %rdi: Input state matrix, s
- # %rsi: 4 data blocks output, o
- # %rdx: 4 data blocks input, i
-
- # This function encrypts four consecutive ChaCha20 blocks by loading the
- # the state matrix in SSE registers four times. As we need some scratch
- # registers, we save the first four registers on the stack. The
- # algorithm performs each operation on the corresponding word of each
- # state matrix, hence requires no word shuffling. For final XORing step
- # we transpose the matrix by interleaving 32- and then 64-bit words,
- # which allows us to do XOR in SSE registers. 8/16-bit word rotation is
- # done with the slightly better performing SSSE3 byte shuffling,
- # 7/12-bit word rotation uses traditional shift+OR.
-
- lea 8(%rsp),%r10
- sub $0x80,%rsp
- and $~63,%rsp
-
- # x0..15[0-3] = s0..3[0..3]
- movq 0x00(%rdi),%xmm1
- pshufd $0x00,%xmm1,%xmm0
- pshufd $0x55,%xmm1,%xmm1
- movq 0x08(%rdi),%xmm3
- pshufd $0x00,%xmm3,%xmm2
- pshufd $0x55,%xmm3,%xmm3
- movq 0x10(%rdi),%xmm5
- pshufd $0x00,%xmm5,%xmm4
- pshufd $0x55,%xmm5,%xmm5
- movq 0x18(%rdi),%xmm7
- pshufd $0x00,%xmm7,%xmm6
- pshufd $0x55,%xmm7,%xmm7
- movq 0x20(%rdi),%xmm9
- pshufd $0x00,%xmm9,%xmm8
- pshufd $0x55,%xmm9,%xmm9
- movq 0x28(%rdi),%xmm11
- pshufd $0x00,%xmm11,%xmm10
- pshufd $0x55,%xmm11,%xmm11
- movq 0x30(%rdi),%xmm13
- pshufd $0x00,%xmm13,%xmm12
- pshufd $0x55,%xmm13,%xmm13
- movq 0x38(%rdi),%xmm15
- pshufd $0x00,%xmm15,%xmm14
- pshufd $0x55,%xmm15,%xmm15
- # x0..3 on stack
- movdqa %xmm0,0x00(%rsp)
- movdqa %xmm1,0x10(%rsp)
- movdqa %xmm2,0x20(%rsp)
- movdqa %xmm3,0x30(%rsp)
-
- movdqa CTRINC(%rip),%xmm1
- movdqa ROT8(%rip),%xmm2
- movdqa ROT16(%rip),%xmm3
-
- # x12 += counter values 0-3
- paddd %xmm1,%xmm12
-
- mov $10,%ecx
-
-.Ldoubleround4:
- # x0 += x4, x12 = rotl32(x12 ^ x0, 16)
- movdqa 0x00(%rsp),%xmm0
- paddd %xmm4,%xmm0
- movdqa %xmm0,0x00(%rsp)
- pxor %xmm0,%xmm12
- pshufb %xmm3,%xmm12
- # x1 += x5, x13 = rotl32(x13 ^ x1, 16)
- movdqa 0x10(%rsp),%xmm0
- paddd %xmm5,%xmm0
- movdqa %xmm0,0x10(%rsp)
- pxor %xmm0,%xmm13
- pshufb %xmm3,%xmm13
- # x2 += x6, x14 = rotl32(x14 ^ x2, 16)
- movdqa 0x20(%rsp),%xmm0
- paddd %xmm6,%xmm0
- movdqa %xmm0,0x20(%rsp)
- pxor %xmm0,%xmm14
- pshufb %xmm3,%xmm14
- # x3 += x7, x15 = rotl32(x15 ^ x3, 16)
- movdqa 0x30(%rsp),%xmm0
- paddd %xmm7,%xmm0
- movdqa %xmm0,0x30(%rsp)
- pxor %xmm0,%xmm15
- pshufb %xmm3,%xmm15
-
- # x8 += x12, x4 = rotl32(x4 ^ x8, 12)
- paddd %xmm12,%xmm8
- pxor %xmm8,%xmm4
- movdqa %xmm4,%xmm0
- pslld $12,%xmm0
- psrld $20,%xmm4
- por %xmm0,%xmm4
- # x9 += x13, x5 = rotl32(x5 ^ x9, 12)
- paddd %xmm13,%xmm9
- pxor %xmm9,%xmm5
- movdqa %xmm5,%xmm0
- pslld $12,%xmm0
- psrld $20,%xmm5
- por %xmm0,%xmm5
- # x10 += x14, x6 = rotl32(x6 ^ x10, 12)
- paddd %xmm14,%xmm10
- pxor %xmm10,%xmm6
- movdqa %xmm6,%xmm0
- pslld $12,%xmm0
- psrld $20,%xmm6
- por %xmm0,%xmm6
- # x11 += x15, x7 = rotl32(x7 ^ x11, 12)
- paddd %xmm15,%xmm11
- pxor %xmm11,%xmm7
- movdqa %xmm7,%xmm0
- pslld $12,%xmm0
- psrld $20,%xmm7
- por %xmm0,%xmm7
-
- # x0 += x4, x12 = rotl32(x12 ^ x0, 8)
- movdqa 0x00(%rsp),%xmm0
- paddd %xmm4,%xmm0
- movdqa %xmm0,0x00(%rsp)
- pxor %xmm0,%xmm12
- pshufb %xmm2,%xmm12
- # x1 += x5, x13 = rotl32(x13 ^ x1, 8)
- movdqa 0x10(%rsp),%xmm0
- paddd %xmm5,%xmm0
- movdqa %xmm0,0x10(%rsp)
- pxor %xmm0,%xmm13
- pshufb %xmm2,%xmm13
- # x2 += x6, x14 = rotl32(x14 ^ x2, 8)
- movdqa 0x20(%rsp),%xmm0
- paddd %xmm6,%xmm0
- movdqa %xmm0,0x20(%rsp)
- pxor %xmm0,%xmm14
- pshufb %xmm2,%xmm14
- # x3 += x7, x15 = rotl32(x15 ^ x3, 8)
- movdqa 0x30(%rsp),%xmm0
- paddd %xmm7,%xmm0
- movdqa %xmm0,0x30(%rsp)
- pxor %xmm0,%xmm15
- pshufb %xmm2,%xmm15
-
- # x8 += x12, x4 = rotl32(x4 ^ x8, 7)
- paddd %xmm12,%xmm8
- pxor %xmm8,%xmm4
- movdqa %xmm4,%xmm0
- pslld $7,%xmm0
- psrld $25,%xmm4
- por %xmm0,%xmm4
- # x9 += x13, x5 = rotl32(x5 ^ x9, 7)
- paddd %xmm13,%xmm9
- pxor %xmm9,%xmm5
- movdqa %xmm5,%xmm0
- pslld $7,%xmm0
- psrld $25,%xmm5
- por %xmm0,%xmm5
- # x10 += x14, x6 = rotl32(x6 ^ x10, 7)
- paddd %xmm14,%xmm10
- pxor %xmm10,%xmm6
- movdqa %xmm6,%xmm0
- pslld $7,%xmm0
- psrld $25,%xmm6
- por %xmm0,%xmm6
- # x11 += x15, x7 = rotl32(x7 ^ x11, 7)
- paddd %xmm15,%xmm11
- pxor %xmm11,%xmm7
- movdqa %xmm7,%xmm0
- pslld $7,%xmm0
- psrld $25,%xmm7
- por %xmm0,%xmm7
-
- # x0 += x5, x15 = rotl32(x15 ^ x0, 16)
- movdqa 0x00(%rsp),%xmm0
- paddd %xmm5,%xmm0
- movdqa %xmm0,0x00(%rsp)
- pxor %xmm0,%xmm15
- pshufb %xmm3,%xmm15
- # x1 += x6, x12 = rotl32(x12 ^ x1, 16)
- movdqa 0x10(%rsp),%xmm0
- paddd %xmm6,%xmm0
- movdqa %xmm0,0x10(%rsp)
- pxor %xmm0,%xmm12
- pshufb %xmm3,%xmm12
- # x2 += x7, x13 = rotl32(x13 ^ x2, 16)
- movdqa 0x20(%rsp),%xmm0
- paddd %xmm7,%xmm0
- movdqa %xmm0,0x20(%rsp)
- pxor %xmm0,%xmm13
- pshufb %xmm3,%xmm13
- # x3 += x4, x14 = rotl32(x14 ^ x3, 16)
- movdqa 0x30(%rsp),%xmm0
- paddd %xmm4,%xmm0
- movdqa %xmm0,0x30(%rsp)
- pxor %xmm0,%xmm14
- pshufb %xmm3,%xmm14
-
- # x10 += x15, x5 = rotl32(x5 ^ x10, 12)
- paddd %xmm15,%xmm10
- pxor %xmm10,%xmm5
- movdqa %xmm5,%xmm0
- pslld $12,%xmm0
- psrld $20,%xmm5
- por %xmm0,%xmm5
- # x11 += x12, x6 = rotl32(x6 ^ x11, 12)
- paddd %xmm12,%xmm11
- pxor %xmm11,%xmm6
- movdqa %xmm6,%xmm0
- pslld $12,%xmm0
- psrld $20,%xmm6
- por %xmm0,%xmm6
- # x8 += x13, x7 = rotl32(x7 ^ x8, 12)
- paddd %xmm13,%xmm8
- pxor %xmm8,%xmm7
- movdqa %xmm7,%xmm0
- pslld $12,%xmm0
- psrld $20,%xmm7
- por %xmm0,%xmm7
- # x9 += x14, x4 = rotl32(x4 ^ x9, 12)
- paddd %xmm14,%xmm9
- pxor %xmm9,%xmm4
- movdqa %xmm4,%xmm0
- pslld $12,%xmm0
- psrld $20,%xmm4
- por %xmm0,%xmm4
-
- # x0 += x5, x15 = rotl32(x15 ^ x0, 8)
- movdqa 0x00(%rsp),%xmm0
- paddd %xmm5,%xmm0
- movdqa %xmm0,0x00(%rsp)
- pxor %xmm0,%xmm15
- pshufb %xmm2,%xmm15
- # x1 += x6, x12 = rotl32(x12 ^ x1, 8)
- movdqa 0x10(%rsp),%xmm0
- paddd %xmm6,%xmm0
- movdqa %xmm0,0x10(%rsp)
- pxor %xmm0,%xmm12
- pshufb %xmm2,%xmm12
- # x2 += x7, x13 = rotl32(x13 ^ x2, 8)
- movdqa 0x20(%rsp),%xmm0
- paddd %xmm7,%xmm0
- movdqa %xmm0,0x20(%rsp)
- pxor %xmm0,%xmm13
- pshufb %xmm2,%xmm13
- # x3 += x4, x14 = rotl32(x14 ^ x3, 8)
- movdqa 0x30(%rsp),%xmm0
- paddd %xmm4,%xmm0
- movdqa %xmm0,0x30(%rsp)
- pxor %xmm0,%xmm14
- pshufb %xmm2,%xmm14
-
- # x10 += x15, x5 = rotl32(x5 ^ x10, 7)
- paddd %xmm15,%xmm10
- pxor %xmm10,%xmm5
- movdqa %xmm5,%xmm0
- pslld $7,%xmm0
- psrld $25,%xmm5
- por %xmm0,%xmm5
- # x11 += x12, x6 = rotl32(x6 ^ x11, 7)
- paddd %xmm12,%xmm11
- pxor %xmm11,%xmm6
- movdqa %xmm6,%xmm0
- pslld $7,%xmm0
- psrld $25,%xmm6
- por %xmm0,%xmm6
- # x8 += x13, x7 = rotl32(x7 ^ x8, 7)
- paddd %xmm13,%xmm8
- pxor %xmm8,%xmm7
- movdqa %xmm7,%xmm0
- pslld $7,%xmm0
- psrld $25,%xmm7
- por %xmm0,%xmm7
- # x9 += x14, x4 = rotl32(x4 ^ x9, 7)
- paddd %xmm14,%xmm9
- pxor %xmm9,%xmm4
- movdqa %xmm4,%xmm0
- pslld $7,%xmm0
- psrld $25,%xmm4
- por %xmm0,%xmm4
-
- dec %ecx
- jnz .Ldoubleround4
-
- # x0[0-3] += s0[0]
- # x1[0-3] += s0[1]
- movq 0x00(%rdi),%xmm3
- pshufd $0x00,%xmm3,%xmm2
- pshufd $0x55,%xmm3,%xmm3
- paddd 0x00(%rsp),%xmm2
- movdqa %xmm2,0x00(%rsp)
- paddd 0x10(%rsp),%xmm3
- movdqa %xmm3,0x10(%rsp)
- # x2[0-3] += s0[2]
- # x3[0-3] += s0[3]
- movq 0x08(%rdi),%xmm3
- pshufd $0x00,%xmm3,%xmm2
- pshufd $0x55,%xmm3,%xmm3
- paddd 0x20(%rsp),%xmm2
- movdqa %xmm2,0x20(%rsp)
- paddd 0x30(%rsp),%xmm3
- movdqa %xmm3,0x30(%rsp)
-
- # x4[0-3] += s1[0]
- # x5[0-3] += s1[1]
- movq 0x10(%rdi),%xmm3
- pshufd $0x00,%xmm3,%xmm2
- pshufd $0x55,%xmm3,%xmm3
- paddd %xmm2,%xmm4
- paddd %xmm3,%xmm5
- # x6[0-3] += s1[2]
- # x7[0-3] += s1[3]
- movq 0x18(%rdi),%xmm3
- pshufd $0x00,%xmm3,%xmm2
- pshufd $0x55,%xmm3,%xmm3
- paddd %xmm2,%xmm6
- paddd %xmm3,%xmm7
-
- # x8[0-3] += s2[0]
- # x9[0-3] += s2[1]
- movq 0x20(%rdi),%xmm3
- pshufd $0x00,%xmm3,%xmm2
- pshufd $0x55,%xmm3,%xmm3
- paddd %xmm2,%xmm8
- paddd %xmm3,%xmm9
- # x10[0-3] += s2[2]
- # x11[0-3] += s2[3]
- movq 0x28(%rdi),%xmm3
- pshufd $0x00,%xmm3,%xmm2
- pshufd $0x55,%xmm3,%xmm3
- paddd %xmm2,%xmm10
- paddd %xmm3,%xmm11
-
- # x12[0-3] += s3[0]
- # x13[0-3] += s3[1]
- movq 0x30(%rdi),%xmm3
- pshufd $0x00,%xmm3,%xmm2
- pshufd $0x55,%xmm3,%xmm3
- paddd %xmm2,%xmm12
- paddd %xmm3,%xmm13
- # x14[0-3] += s3[2]
- # x15[0-3] += s3[3]
- movq 0x38(%rdi),%xmm3
- pshufd $0x00,%xmm3,%xmm2
- pshufd $0x55,%xmm3,%xmm3
- paddd %xmm2,%xmm14
- paddd %xmm3,%xmm15
-
- # x12 += counter values 0-3
- paddd %xmm1,%xmm12
-
- # interleave 32-bit words in state n, n+1
- movdqa 0x00(%rsp),%xmm0
- movdqa 0x10(%rsp),%xmm1
- movdqa %xmm0,%xmm2
- punpckldq %xmm1,%xmm2
- punpckhdq %xmm1,%xmm0
- movdqa %xmm2,0x00(%rsp)
- movdqa %xmm0,0x10(%rsp)
- movdqa 0x20(%rsp),%xmm0
- movdqa 0x30(%rsp),%xmm1
- movdqa %xmm0,%xmm2
- punpckldq %xmm1,%xmm2
- punpckhdq %xmm1,%xmm0
- movdqa %xmm2,0x20(%rsp)
- movdqa %xmm0,0x30(%rsp)
- movdqa %xmm4,%xmm0
- punpckldq %xmm5,%xmm4
- punpckhdq %xmm5,%xmm0
- movdqa %xmm0,%xmm5
- movdqa %xmm6,%xmm0
- punpckldq %xmm7,%xmm6
- punpckhdq %xmm7,%xmm0
- movdqa %xmm0,%xmm7
- movdqa %xmm8,%xmm0
- punpckldq %xmm9,%xmm8
- punpckhdq %xmm9,%xmm0
- movdqa %xmm0,%xmm9
- movdqa %xmm10,%xmm0
- punpckldq %xmm11,%xmm10
- punpckhdq %xmm11,%xmm0
- movdqa %xmm0,%xmm11
- movdqa %xmm12,%xmm0
- punpckldq %xmm13,%xmm12
- punpckhdq %xmm13,%xmm0
- movdqa %xmm0,%xmm13
- movdqa %xmm14,%xmm0
- punpckldq %xmm15,%xmm14
- punpckhdq %xmm15,%xmm0
- movdqa %xmm0,%xmm15
-
- # interleave 64-bit words in state n, n+2
- movdqa 0x00(%rsp),%xmm0
- movdqa 0x20(%rsp),%xmm1
- movdqa %xmm0,%xmm2
- punpcklqdq %xmm1,%xmm2
- punpckhqdq %xmm1,%xmm0
- movdqa %xmm2,0x00(%rsp)
- movdqa %xmm0,0x20(%rsp)
- movdqa 0x10(%rsp),%xmm0
- movdqa 0x30(%rsp),%xmm1
- movdqa %xmm0,%xmm2
- punpcklqdq %xmm1,%xmm2
- punpckhqdq %xmm1,%xmm0
- movdqa %xmm2,0x10(%rsp)
- movdqa %xmm0,0x30(%rsp)
- movdqa %xmm4,%xmm0
- punpcklqdq %xmm6,%xmm4
- punpckhqdq %xmm6,%xmm0
- movdqa %xmm0,%xmm6
- movdqa %xmm5,%xmm0
- punpcklqdq %xmm7,%xmm5
- punpckhqdq %xmm7,%xmm0
- movdqa %xmm0,%xmm7
- movdqa %xmm8,%xmm0
- punpcklqdq %xmm10,%xmm8
- punpckhqdq %xmm10,%xmm0
- movdqa %xmm0,%xmm10
- movdqa %xmm9,%xmm0
- punpcklqdq %xmm11,%xmm9
- punpckhqdq %xmm11,%xmm0
- movdqa %xmm0,%xmm11
- movdqa %xmm12,%xmm0
- punpcklqdq %xmm14,%xmm12
- punpckhqdq %xmm14,%xmm0
- movdqa %xmm0,%xmm14
- movdqa %xmm13,%xmm0
- punpcklqdq %xmm15,%xmm13
- punpckhqdq %xmm15,%xmm0
- movdqa %xmm0,%xmm15
-
- # xor with corresponding input, write to output
- movdqa 0x00(%rsp),%xmm0
- movdqu 0x00(%rdx),%xmm1
- pxor %xmm1,%xmm0
- movdqu %xmm0,0x00(%rsi)
- movdqa 0x10(%rsp),%xmm0
- movdqu 0x80(%rdx),%xmm1
- pxor %xmm1,%xmm0
- movdqu %xmm0,0x80(%rsi)
- movdqa 0x20(%rsp),%xmm0
- movdqu 0x40(%rdx),%xmm1
- pxor %xmm1,%xmm0
- movdqu %xmm0,0x40(%rsi)
- movdqa 0x30(%rsp),%xmm0
- movdqu 0xc0(%rdx),%xmm1
- pxor %xmm1,%xmm0
- movdqu %xmm0,0xc0(%rsi)
- movdqu 0x10(%rdx),%xmm1
- pxor %xmm1,%xmm4
- movdqu %xmm4,0x10(%rsi)
- movdqu 0x90(%rdx),%xmm1
- pxor %xmm1,%xmm5
- movdqu %xmm5,0x90(%rsi)
- movdqu 0x50(%rdx),%xmm1
- pxor %xmm1,%xmm6
- movdqu %xmm6,0x50(%rsi)
- movdqu 0xd0(%rdx),%xmm1
- pxor %xmm1,%xmm7
- movdqu %xmm7,0xd0(%rsi)
- movdqu 0x20(%rdx),%xmm1
- pxor %xmm1,%xmm8
- movdqu %xmm8,0x20(%rsi)
- movdqu 0xa0(%rdx),%xmm1
- pxor %xmm1,%xmm9
- movdqu %xmm9,0xa0(%rsi)
- movdqu 0x60(%rdx),%xmm1
- pxor %xmm1,%xmm10
- movdqu %xmm10,0x60(%rsi)
- movdqu 0xe0(%rdx),%xmm1
- pxor %xmm1,%xmm11
- movdqu %xmm11,0xe0(%rsi)
- movdqu 0x30(%rdx),%xmm1
- pxor %xmm1,%xmm12
- movdqu %xmm12,0x30(%rsi)
- movdqu 0xb0(%rdx),%xmm1
- pxor %xmm1,%xmm13
- movdqu %xmm13,0xb0(%rsi)
- movdqu 0x70(%rdx),%xmm1
- pxor %xmm1,%xmm14
- movdqu %xmm14,0x70(%rsi)
- movdqu 0xf0(%rdx),%xmm1
- pxor %xmm1,%xmm15
- movdqu %xmm15,0xf0(%rsi)
-
- lea -8(%r10),%rsp
- ret
-ENDPROC(chacha20_4block_xor_ssse3)
diff --git a/arch/x86/crypto/chacha20_glue.c b/arch/x86/crypto/chacha20_glue.c
deleted file mode 100644
index dce7c5d39c2f..000000000000
--- a/arch/x86/crypto/chacha20_glue.c
+++ /dev/null
@@ -1,146 +0,0 @@
-/*
- * ChaCha20 256-bit cipher algorithm, RFC7539, SIMD glue code
- *
- * Copyright (C) 2015 Martin Willi
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- */
-
-#include <crypto/algapi.h>
-#include <crypto/chacha20.h>
-#include <crypto/internal/skcipher.h>
-#include <linux/kernel.h>
-#include <linux/module.h>
-#include <asm/fpu/api.h>
-#include <asm/simd.h>
-
-#define CHACHA20_STATE_ALIGN 16
-
-asmlinkage void chacha20_block_xor_ssse3(u32 *state, u8 *dst, const u8 *src);
-asmlinkage void chacha20_4block_xor_ssse3(u32 *state, u8 *dst, const u8 *src);
-#ifdef CONFIG_AS_AVX2
-asmlinkage void chacha20_8block_xor_avx2(u32 *state, u8 *dst, const u8 *src);
-static bool chacha20_use_avx2;
-#endif
-
-static void chacha20_dosimd(u32 *state, u8 *dst, const u8 *src,
- unsigned int bytes)
-{
- u8 buf[CHACHA20_BLOCK_SIZE];
-
-#ifdef CONFIG_AS_AVX2
- if (chacha20_use_avx2) {
- while (bytes >= CHACHA20_BLOCK_SIZE * 8) {
- chacha20_8block_xor_avx2(state, dst, src);
- bytes -= CHACHA20_BLOCK_SIZE * 8;
- src += CHACHA20_BLOCK_SIZE * 8;
- dst += CHACHA20_BLOCK_SIZE * 8;
- state[12] += 8;
- }
- }
-#endif
- while (bytes >= CHACHA20_BLOCK_SIZE * 4) {
- chacha20_4block_xor_ssse3(state, dst, src);
- bytes -= CHACHA20_BLOCK_SIZE * 4;
- src += CHACHA20_BLOCK_SIZE * 4;
- dst += CHACHA20_BLOCK_SIZE * 4;
- state[12] += 4;
- }
- while (bytes >= CHACHA20_BLOCK_SIZE) {
- chacha20_block_xor_ssse3(state, dst, src);
- bytes -= CHACHA20_BLOCK_SIZE;
- src += CHACHA20_BLOCK_SIZE;
- dst += CHACHA20_BLOCK_SIZE;
- state[12]++;
- }
- if (bytes) {
- memcpy(buf, src, bytes);
- chacha20_block_xor_ssse3(state, buf, buf);
- memcpy(dst, buf, bytes);
- }
-}
-
-static int chacha20_simd(struct skcipher_request *req)
-{
- struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
- struct chacha20_ctx *ctx = crypto_skcipher_ctx(tfm);
- u32 *state, state_buf[16 + 2] __aligned(8);
- struct skcipher_walk walk;
- int err;
-
- BUILD_BUG_ON(CHACHA20_STATE_ALIGN != 16);
- state = PTR_ALIGN(state_buf + 0, CHACHA20_STATE_ALIGN);
-
- if (req->cryptlen <= CHACHA20_BLOCK_SIZE || !may_use_simd())
- return crypto_chacha20_crypt(req);
-
- err = skcipher_walk_virt(&walk, req, true);
-
- crypto_chacha20_init(state, ctx, walk.iv);
-
- kernel_fpu_begin();
-
- while (walk.nbytes >= CHACHA20_BLOCK_SIZE) {
- chacha20_dosimd(state, walk.dst.virt.addr, walk.src.virt.addr,
- rounddown(walk.nbytes, CHACHA20_BLOCK_SIZE));
- err = skcipher_walk_done(&walk,
- walk.nbytes % CHACHA20_BLOCK_SIZE);
- }
-
- if (walk.nbytes) {
- chacha20_dosimd(state, walk.dst.virt.addr, walk.src.virt.addr,
- walk.nbytes);
- err = skcipher_walk_done(&walk, 0);
- }
-
- kernel_fpu_end();
-
- return err;
-}
-
-static struct skcipher_alg alg = {
- .base.cra_name = "chacha20",
- .base.cra_driver_name = "chacha20-simd",
- .base.cra_priority = 300,
- .base.cra_blocksize = 1,
- .base.cra_ctxsize = sizeof(struct chacha20_ctx),
- .base.cra_module = THIS_MODULE,
-
- .min_keysize = CHACHA20_KEY_SIZE,
- .max_keysize = CHACHA20_KEY_SIZE,
- .ivsize = CHACHA20_IV_SIZE,
- .chunksize = CHACHA20_BLOCK_SIZE,
- .setkey = crypto_chacha20_setkey,
- .encrypt = chacha20_simd,
- .decrypt = chacha20_simd,
-};
-
-static int __init chacha20_simd_mod_init(void)
-{
- if (!boot_cpu_has(X86_FEATURE_SSSE3))
- return -ENODEV;
-
-#ifdef CONFIG_AS_AVX2
- chacha20_use_avx2 = boot_cpu_has(X86_FEATURE_AVX) &&
- boot_cpu_has(X86_FEATURE_AVX2) &&
- cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL);
-#endif
- return crypto_register_skcipher(&alg);
-}
-
-static void __exit chacha20_simd_mod_fini(void)
-{
- crypto_unregister_skcipher(&alg);
-}
-
-module_init(chacha20_simd_mod_init);
-module_exit(chacha20_simd_mod_fini);
-
-MODULE_LICENSE("GPL");
-MODULE_AUTHOR("Martin Willi <martin@strongswan.org>");
-MODULE_DESCRIPTION("chacha20 cipher algorithm, SIMD accelerated");
-MODULE_ALIAS_CRYPTO("chacha20");
-MODULE_ALIAS_CRYPTO("chacha20-simd");
diff --git a/crypto/Kconfig b/crypto/Kconfig
index 47859a0f8052..42dc48aa9b81 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -1428,27 +1428,12 @@ config CRYPTO_SALSA20
config CRYPTO_CHACHA20
tristate "ChaCha20 cipher algorithm"
select CRYPTO_BLKCIPHER
+ select ZINC_CHACHA20
help
ChaCha20 cipher algorithm, RFC7539.
ChaCha20 is a 256-bit high-speed stream cipher designed by Daniel J.
Bernstein and further specified in RFC7539 for use in IETF protocols.
- This is the portable C implementation of ChaCha20.
-
- See also:
- <http://cr.yp.to/chacha/chacha-20080128.pdf>
-
-config CRYPTO_CHACHA20_X86_64
- tristate "ChaCha20 cipher algorithm (x86_64/SSSE3/AVX2)"
- depends on X86 && 64BIT
- select CRYPTO_BLKCIPHER
- select CRYPTO_CHACHA20
- help
- ChaCha20 cipher algorithm, RFC7539.
-
- ChaCha20 is a 256-bit high-speed stream cipher designed by Daniel J.
- Bernstein and further specified in RFC7539 for use in IETF protocols.
- This is the x86_64 assembler implementation using SIMD instructions.
See also:
<http://cr.yp.to/chacha/chacha-20080128.pdf>
diff --git a/crypto/Makefile b/crypto/Makefile
index 5e60348d02e2..587103b87890 100644
--- a/crypto/Makefile
+++ b/crypto/Makefile
@@ -117,7 +117,7 @@ obj-$(CONFIG_CRYPTO_ANUBIS) += anubis.o
obj-$(CONFIG_CRYPTO_SEED) += seed.o
obj-$(CONFIG_CRYPTO_SPECK) += speck.o
obj-$(CONFIG_CRYPTO_SALSA20) += salsa20_generic.o
-obj-$(CONFIG_CRYPTO_CHACHA20) += chacha20_generic.o
+obj-$(CONFIG_CRYPTO_CHACHA20) += chacha20_zinc.o
obj-$(CONFIG_CRYPTO_POLY1305) += poly1305_zinc.o
obj-$(CONFIG_CRYPTO_DEFLATE) += deflate.o
obj-$(CONFIG_CRYPTO_MICHAEL_MIC) += michael_mic.o
diff --git a/crypto/chacha20_generic.c b/crypto/chacha20_generic.c
deleted file mode 100644
index e451c3cb6a56..000000000000
--- a/crypto/chacha20_generic.c
+++ /dev/null
@@ -1,136 +0,0 @@
-/*
- * ChaCha20 256-bit cipher algorithm, RFC7539
- *
- * Copyright (C) 2015 Martin Willi
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- */
-
-#include <asm/unaligned.h>
-#include <crypto/algapi.h>
-#include <crypto/chacha20.h>
-#include <crypto/internal/skcipher.h>
-#include <linux/module.h>
-
-static void chacha20_docrypt(u32 *state, u8 *dst, const u8 *src,
- unsigned int bytes)
-{
- u32 stream[CHACHA20_BLOCK_WORDS];
-
- if (dst != src)
- memcpy(dst, src, bytes);
-
- while (bytes >= CHACHA20_BLOCK_SIZE) {
- chacha20_block(state, stream);
- crypto_xor(dst, (const u8 *)stream, CHACHA20_BLOCK_SIZE);
- bytes -= CHACHA20_BLOCK_SIZE;
- dst += CHACHA20_BLOCK_SIZE;
- }
- if (bytes) {
- chacha20_block(state, stream);
- crypto_xor(dst, (const u8 *)stream, bytes);
- }
-}
-
-void crypto_chacha20_init(u32 *state, struct chacha20_ctx *ctx, u8 *iv)
-{
- state[0] = 0x61707865; /* "expa" */
- state[1] = 0x3320646e; /* "nd 3" */
- state[2] = 0x79622d32; /* "2-by" */
- state[3] = 0x6b206574; /* "te k" */
- state[4] = ctx->key[0];
- state[5] = ctx->key[1];
- state[6] = ctx->key[2];
- state[7] = ctx->key[3];
- state[8] = ctx->key[4];
- state[9] = ctx->key[5];
- state[10] = ctx->key[6];
- state[11] = ctx->key[7];
- state[12] = get_unaligned_le32(iv + 0);
- state[13] = get_unaligned_le32(iv + 4);
- state[14] = get_unaligned_le32(iv + 8);
- state[15] = get_unaligned_le32(iv + 12);
-}
-EXPORT_SYMBOL_GPL(crypto_chacha20_init);
-
-int crypto_chacha20_setkey(struct crypto_skcipher *tfm, const u8 *key,
- unsigned int keysize)
-{
- struct chacha20_ctx *ctx = crypto_skcipher_ctx(tfm);
- int i;
-
- if (keysize != CHACHA20_KEY_SIZE)
- return -EINVAL;
-
- for (i = 0; i < ARRAY_SIZE(ctx->key); i++)
- ctx->key[i] = get_unaligned_le32(key + i * sizeof(u32));
-
- return 0;
-}
-EXPORT_SYMBOL_GPL(crypto_chacha20_setkey);
-
-int crypto_chacha20_crypt(struct skcipher_request *req)
-{
- struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
- struct chacha20_ctx *ctx = crypto_skcipher_ctx(tfm);
- struct skcipher_walk walk;
- u32 state[16];
- int err;
-
- err = skcipher_walk_virt(&walk, req, true);
-
- crypto_chacha20_init(state, ctx, walk.iv);
-
- while (walk.nbytes > 0) {
- unsigned int nbytes = walk.nbytes;
-
- if (nbytes < walk.total)
- nbytes = round_down(nbytes, walk.stride);
-
- chacha20_docrypt(state, walk.dst.virt.addr, walk.src.virt.addr,
- nbytes);
- err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
- }
-
- return err;
-}
-EXPORT_SYMBOL_GPL(crypto_chacha20_crypt);
-
-static struct skcipher_alg alg = {
- .base.cra_name = "chacha20",
- .base.cra_driver_name = "chacha20-generic",
- .base.cra_priority = 100,
- .base.cra_blocksize = 1,
- .base.cra_ctxsize = sizeof(struct chacha20_ctx),
- .base.cra_module = THIS_MODULE,
-
- .min_keysize = CHACHA20_KEY_SIZE,
- .max_keysize = CHACHA20_KEY_SIZE,
- .ivsize = CHACHA20_IV_SIZE,
- .chunksize = CHACHA20_BLOCK_SIZE,
- .setkey = crypto_chacha20_setkey,
- .encrypt = crypto_chacha20_crypt,
- .decrypt = crypto_chacha20_crypt,
-};
-
-static int __init chacha20_generic_mod_init(void)
-{
- return crypto_register_skcipher(&alg);
-}
-
-static void __exit chacha20_generic_mod_fini(void)
-{
- crypto_unregister_skcipher(&alg);
-}
-
-module_init(chacha20_generic_mod_init);
-module_exit(chacha20_generic_mod_fini);
-
-MODULE_LICENSE("GPL");
-MODULE_AUTHOR("Martin Willi <martin@strongswan.org>");
-MODULE_DESCRIPTION("chacha20 cipher algorithm");
-MODULE_ALIAS_CRYPTO("chacha20");
-MODULE_ALIAS_CRYPTO("chacha20-generic");
diff --git a/crypto/chacha20_zinc.c b/crypto/chacha20_zinc.c
new file mode 100644
index 000000000000..55e4585de08c
--- /dev/null
+++ b/crypto/chacha20_zinc.c
@@ -0,0 +1,90 @@
+/* SPDX-License-Identifier: GPL-2.0
+ *
+ * Copyright (C) 2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#include <asm/unaligned.h>
+#include <crypto/algapi.h>
+#include <crypto/internal/skcipher.h>
+#include <zinc/chacha20.h>
+#include <linux/module.h>
+
+static int crypto_chacha20_setkey(struct crypto_skcipher *tfm, const u8 *key,
+ unsigned int keysize)
+{
+ struct chacha20_ctx *ctx = crypto_skcipher_ctx(tfm);
+
+ if (keysize != CHACHA20_KEY_SIZE)
+ return -EINVAL;
+ chacha20_init(ctx, key, 0);
+ return 0;
+}
+
+static int crypto_chacha20_crypt(struct skcipher_request *req)
+{
+ struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+ struct chacha20_ctx ctx = *(struct chacha20_ctx *)crypto_skcipher_ctx(tfm);
+ struct skcipher_walk walk;
+ simd_context_t simd_context;
+ int err, i;
+
+ err = skcipher_walk_virt(&walk, req, true);
+ if (unlikely(err))
+ return err;
+
+ for (i = 0; i < ARRAY_SIZE(ctx.counter); ++i)
+ ctx.counter[i] = get_unaligned_le32(walk.iv + i * sizeof(u32));
+
+ simd_get(&simd_context);
+ while (walk.nbytes > 0) {
+ unsigned int nbytes = walk.nbytes;
+
+ if (nbytes < walk.total)
+ nbytes = round_down(nbytes, walk.stride);
+
+ chacha20(&ctx, walk.dst.virt.addr, walk.src.virt.addr, nbytes,
+ &simd_context);
+
+ err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
+ simd_relax(&simd_context);
+ }
+ simd_put(&simd_context);
+
+ return err;
+}
+
+static struct skcipher_alg alg = {
+ .base.cra_name = "chacha20",
+ .base.cra_driver_name = "chacha20-software",
+ .base.cra_priority = 100,
+ .base.cra_blocksize = 1,
+ .base.cra_ctxsize = sizeof(struct chacha20_ctx),
+ .base.cra_module = THIS_MODULE,
+
+ .min_keysize = CHACHA20_KEY_SIZE,
+ .max_keysize = CHACHA20_KEY_SIZE,
+ .ivsize = CHACHA20_NONCE_SIZE,
+ .chunksize = CHACHA20_BLOCK_SIZE,
+ .setkey = crypto_chacha20_setkey,
+ .encrypt = crypto_chacha20_crypt,
+ .decrypt = crypto_chacha20_crypt,
+};
+
+static int __init chacha20_mod_init(void)
+{
+ return crypto_register_skcipher(&alg);
+}
+
+static void __exit chacha20_mod_exit(void)
+{
+ crypto_unregister_skcipher(&alg);
+}
+
+module_init(chacha20_mod_init);
+module_exit(chacha20_mod_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Jason A. Donenfeld <Jason@zx2c4.com>");
+MODULE_DESCRIPTION("ChaCha20 stream cipher");
+MODULE_ALIAS_CRYPTO("chacha20");
+MODULE_ALIAS_CRYPTO("chacha20-software");
diff --git a/crypto/chacha20poly1305.c b/crypto/chacha20poly1305.c
index bf523797bef3..585c7ef4f543 100644
--- a/crypto/chacha20poly1305.c
+++ b/crypto/chacha20poly1305.c
@@ -13,7 +13,7 @@
#include <crypto/internal/hash.h>
#include <crypto/internal/skcipher.h>
#include <crypto/scatterwalk.h>
-#include <crypto/chacha20.h>
+#include <zinc/chacha20.h>
#include <zinc/poly1305.h>
#include <linux/err.h>
#include <linux/init.h>
@@ -51,7 +51,7 @@ struct poly_req {
};
struct chacha_req {
- u8 iv[CHACHA20_IV_SIZE];
+ u8 iv[CHACHA20_NONCE_SIZE];
struct scatterlist src[1];
struct skcipher_request req; /* must be last member */
};
@@ -91,7 +91,7 @@ static void chacha_iv(u8 *iv, struct aead_request *req, u32 icb)
memcpy(iv, &leicb, sizeof(leicb));
memcpy(iv + sizeof(leicb), ctx->salt, ctx->saltlen);
memcpy(iv + sizeof(leicb) + ctx->saltlen, req->iv,
- CHACHA20_IV_SIZE - sizeof(leicb) - ctx->saltlen);
+ CHACHA20_NONCE_SIZE - sizeof(leicb) - ctx->saltlen);
}
static int poly_verify_tag(struct aead_request *req)
@@ -639,7 +639,7 @@ static int chachapoly_create(struct crypto_template *tmpl, struct rtattr **tb,
err = -EINVAL;
/* Need 16-byte IV size, including Initial Block Counter value */
- if (crypto_skcipher_alg_ivsize(chacha) != CHACHA20_IV_SIZE)
+ if (crypto_skcipher_alg_ivsize(chacha) != CHACHA20_NONCE_SIZE)
goto out_drop_chacha;
/* Not a stream cipher? */
if (chacha->base.cra_blocksize != 1)
diff --git a/include/crypto/chacha20.h b/include/crypto/chacha20.h
index b83d66073db0..3b92f58f3891 100644
--- a/include/crypto/chacha20.h
+++ b/include/crypto/chacha20.h
@@ -6,23 +6,11 @@
#ifndef _CRYPTO_CHACHA20_H
#define _CRYPTO_CHACHA20_H
-#include <crypto/skcipher.h>
-#include <linux/types.h>
-#include <linux/crypto.h>
-
#define CHACHA20_IV_SIZE 16
#define CHACHA20_KEY_SIZE 32
#define CHACHA20_BLOCK_SIZE 64
#define CHACHA20_BLOCK_WORDS (CHACHA20_BLOCK_SIZE / sizeof(u32))
-struct chacha20_ctx {
- u32 key[8];
-};
-
void chacha20_block(u32 *state, u32 *stream);
-void crypto_chacha20_init(u32 *state, struct chacha20_ctx *ctx, u8 *iv);
-int crypto_chacha20_setkey(struct crypto_skcipher *tfm, const u8 *key,
- unsigned int keysize);
-int crypto_chacha20_crypt(struct skcipher_request *req);
#endif
--
2.19.0
* Re: [PATCH net-next v7 26/28] crypto: port ChaCha20 to Zinc
2018-10-06 2:57 ` [PATCH net-next v7 26/28] crypto: port ChaCha20 " Jason A. Donenfeld
@ 2018-10-06 13:07 ` Martin Willi
0 siblings, 0 replies; 28+ messages in thread
From: Martin Willi @ 2018-10-06 13:07 UTC (permalink / raw)
To: Jason A. Donenfeld, linux-kernel, netdev, davem, gregkh
Cc: Samuel Neves, Andy Lutomirski, linux-crypto
Hi Jason,
> Now that ChaCha20 is in Zinc, we can have the crypto API code simply
> call into it.
> delete mode 100644 arch/x86/crypto/chacha20-avx2-x86_64.S
> delete mode 100644 arch/x86/crypto/chacha20-ssse3-x86_64.S
I did some more testing with that new Zinc ChaCha20 code on x64, and
I'm still not convinced that it is an improvement compared to the
existing implementation.
From a performance perspective, Zinc is faster when working on sizes
that are not a multiple of the ChaCha block size. This is due to the more
aggressive use of SSE/AVX code paths compared to the conservative use
in the existing implementation; instead of calculating the two separate
blocks that are actually required, one can calculate four of them and
just discard two. This certainly improves benchmark results, but also
has some side effects regarding energy usage, thermal budget or even
shared hyper-threading resources.
One can certainly argue that the more aggressive approach is
preferable. However, I did some fairly trivial (non-optimized) changes
to the existing implementations to use a similar aggressive approach.
Numbers for SSE are slightly in favor of the existing implementation,
while the AVX path is almost on par, see below (produces some
interesting graphs, btw.).
When looking at your code, the assembly generated from Perl is
certainly harder to work with. The plain C version does make some heavy
use of macros and other tricks, but with a very questionable effect at
least on my system.
That being said, I think that whole mystic Zinc thing does not really
help in having a common base to work with or handling questions like
these above. In the end, these are just some crypto functions that it
provides, and these IMHO can very well live where they belong.
Best regards
Martin
---
ChaCha20 benchmark using tcrypt, numbers in kOps/s, current
implementation with a more aggressive SSE/AVX use vs. zinc:
size crnt zinc
8 5750 5818
16 5843 5726
24 5746 5757
32 5820 5813
40 5761 5710
48 5735 5761
56 5723 5742
64 5871 5685
72 3714 3520
80 3587 3475
88 3686 3424
96 3580 3371
104 3712 3313
112 3582 3207
120 3679 3150
128 3567 3568
136 3674 3690
144 3525 3599
152 3684 3566
160 3593 3515
168 3682 3437
176 3564 3325
184 3671 3279
192 3573 3762
200 3667 3702
208 3576 3622
216 3662 3518
224 3566 3445
232 3654 3422
240 3565 3317
248 3640 3279
256 3720 3723
264 3615 3639
272 3594 3597
280 3587 3565
288 3502 3484
296 3605 3422
304 3620 3352
312 3592 3308
320 3488 3694
328 3580 3681
336 3585 3599
344 3587 3523
352 3486 3419
360 3579 3403
368 3601 3334
376 3581 3257
384 3498 3715
392 3601 3612
400 3600 3553
408 3596 3496
416 3495 3430
424 3591 3402
432 3568 3311
440 3576 3275
448 3501 3689
456 3563 3618
464 3592 3576
472 3581 3509
480 3480 3405
488 3556 3397
496 3563 3298
504 3567 3277
512 3656 3735
520 2575 2209
528 2524 2148
536 2571 2164
544 2519 2138
552 2570 2126
560 2510 2035
568 2526 2041
576 2633 2199
584 2151 2183
592 2113 2145
600 2159 2155
608 2108 2133
616 2157 2115
624 2104 2064
632 2159 2045
640 2104 2188
648 2142 2182
656 2115 2158
664 2151 2147
672 2113 2139
680 2146 2114
688 2097 2077
696 2137 2043
704 2101 2208
712 2137 2189
720 2117 2169
728 2132 2145
736 2107 2142
744 2136 2081
752 2105 2064
760 2136 2043
768 2166 2211
776 2122 2192
784 2129 2146
792 2126 2141
800 2094 2094
808 2126 2100
816 2133 2061
824 2134 2045
832 2103 2223
840 2143 2184
848 2130 2173
856 2135 2145
864 2084 2126
872 2134 2105
880 2128 2056
888 2131 2043
896 2093 2219
904 2127 2192
912 2130 2170
920 2127 2149
928 2082 2125
936 2113 2098
944 2126 2060
952 2120 2049
960 2085 2204
968 2088 2187
976 1927 2166
984 1943 2136
992 1911 2119
1000 1959 2101
1008 2116 2042
1016 2124 2048
1024 2152 2195
1032 1729 1565
1040 1708 1544
1048 1726 1554
1056 1702 1541
1064 1724 1523
1072 1699 1507
1080 1719 1497
1088 1767 1592
1096 1536 1575
1104 1506 1563
1112 1529 1544
1120 1518 1521
1128 1526 1521
1136 1518 1501
1144 1535 1491
1152 1507 1575
1160 1525 1558
1168 1500 1554
1176 1524 1545
1184 1516 1538
1192 1532 1530
1200 1511 1493
1208 1512 1498
1216 1505 1581
1224 1518 1563
1232 1513 1549
1240 1533 1538
1248 1504 1527
1256 1532 1520
1264 1510 1505
1272 1525 1492
1280 1539 1574
1288 1518 1573
1296 1522 1551
1304 1520 1548
1312 1508 1535
1320 1524 1524
1328 1522 1508
1336 1515 1500
1344 1496 1579
1352 1517 1573
1360 1522 1546
1368 1515 1545
1376 1494 1536
1384 1516 1526
1392 1522 1504
1400 1520 1480
1408 1501 1589
1416 1511 1558
1424 1516 1546
1432 1516 1537
1440 1502 1523
1448 1516 1512
1456 1510 1491
1464 1509 1481
1472 1496 1577
1480 1514 1559
1488 1512 1548
1496 1513 1534
* Re: [PATCH net-next v7 25/28] crypto: port Poly1305 to Zinc
2018-10-06 2:57 ` [PATCH net-next v7 25/28] crypto: port Poly1305 to Zinc Jason A. Donenfeld
@ 2018-10-08 23:21 ` Eric Biggers
2018-10-09 0:02 ` Jason A. Donenfeld
0 siblings, 1 reply; 28+ messages in thread
From: Eric Biggers @ 2018-10-08 23:21 UTC (permalink / raw)
To: Jason A. Donenfeld
Cc: linux-kernel, netdev, davem, gregkh, Samuel Neves,
Andy Lutomirski, linux-crypto
On Sat, Oct 06, 2018 at 04:57:06AM +0200, Jason A. Donenfeld wrote:
> diff --git a/crypto/poly1305_zinc.c b/crypto/poly1305_zinc.c
> new file mode 100644
> index 000000000000..4794442edf26
> --- /dev/null
> +++ b/crypto/poly1305_zinc.c
> @@ -0,0 +1,98 @@
> +/* SPDX-License-Identifier: GPL-2.0
> + *
> + * Copyright (C) 2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
> + */
> +
> +#include <crypto/algapi.h>
> +#include <crypto/internal/hash.h>
> +#include <zinc/poly1305.h>
> +#include <linux/crypto.h>
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/simd.h>
> +
> +struct poly1305_desc_ctx {
> + struct poly1305_ctx ctx;
> + u8 key[POLY1305_KEY_SIZE];
> + unsigned int rem_key_bytes;
> +};
> +
> +static int crypto_poly1305_init(struct shash_desc *desc)
> +{
> + struct poly1305_desc_ctx *dctx = shash_desc_ctx(desc);
> + dctx->rem_key_bytes = POLY1305_KEY_SIZE;
> + return 0;
> +}
> +
> +static int crypto_poly1305_update(struct shash_desc *desc, const u8 *src,
> + unsigned int srclen)
> +{
> + struct poly1305_desc_ctx *dctx = shash_desc_ctx(desc);
> + simd_context_t simd_context;
> +
> + if (unlikely(dctx->rem_key_bytes)) {
> + unsigned int key_bytes = min(srclen, dctx->rem_key_bytes);
> + memcpy(dctx->key + (POLY1305_KEY_SIZE - dctx->rem_key_bytes),
> + src, key_bytes);
> + src += key_bytes;
> + srclen -= key_bytes;
> + dctx->rem_key_bytes -= key_bytes;
> + if (!dctx->rem_key_bytes) {
> + poly1305_init(&dctx->ctx, dctx->key);
> + memzero_explicit(dctx->key, sizeof(dctx->key));
> + }
> + if (!srclen)
> + return 0;
> + }
> +
> + simd_get(&simd_context);
> + poly1305_update(&dctx->ctx, src, srclen, &simd_context);
> + simd_put(&simd_context);
> +
> + return 0;
> +}
> +
> +static int crypto_poly1305_final(struct shash_desc *desc, u8 *dst)
> +{
> + struct poly1305_desc_ctx *dctx = shash_desc_ctx(desc);
> + simd_context_t simd_context;
> +
> + simd_get(&simd_context);
> + poly1305_final(&dctx->ctx, dst, &simd_context);
> + simd_put(&simd_context);
> + return 0;
> +}
This crashes on very short inputs. crypto_poly1305_final() is missing:
if (dctx->rem_key_bytes)
return -ENOKEY;
- Eric
* Re: [PATCH net-next v7 03/28] zinc: introduce minimal cryptography library
2018-10-06 2:56 ` [PATCH net-next v7 03/28] zinc: introduce minimal cryptography library Jason A. Donenfeld
@ 2018-10-08 23:22 ` Eric Biggers
0 siblings, 0 replies; 28+ messages in thread
From: Eric Biggers @ 2018-10-08 23:22 UTC (permalink / raw)
To: Jason A. Donenfeld
Cc: linux-kernel, netdev, davem, gregkh, Samuel Neves,
Jean-Philippe Aumasson, Andy Lutomirski, Andrew Morton,
Linus Torvalds, kernel-hardening, linux-crypto
On Sat, Oct 06, 2018 at 04:56:44AM +0200, Jason A. Donenfeld wrote:
> Zinc stands for "Zinc Is Neat Crypto" or "Zinc as IN Crypto". It's also
> short, easy to type, and plays nicely with the recent trend of naming
> crypto libraries after elements. The guiding principle is "don't overdo
> it". It's less of a library and more of a directory tree for organizing
> well-curated direct implementations of cryptography primitives.
>
As I've asked before: please Cc linux-crypto on the whole patch series,
including the cover letter.
- Eric
* Re: [PATCH net-next v7 25/28] crypto: port Poly1305 to Zinc
2018-10-08 23:21 ` Eric Biggers
@ 2018-10-09 0:02 ` Jason A. Donenfeld
0 siblings, 0 replies; 28+ messages in thread
From: Jason A. Donenfeld @ 2018-10-09 0:02 UTC (permalink / raw)
To: Eric Biggers
Cc: LKML, Netdev, David Miller, Greg Kroah-Hartman, Samuel Neves,
Andrew Lutomirski, Linux Crypto Mailing List
Hi Eric,
On Tue, Oct 9, 2018 at 1:21 AM Eric Biggers <ebiggers@kernel.org> wrote:
> This crashes on very short inputs. crypto_poly1305_final() is missing:
>
> if (dctx->rem_key_bytes)
> return -ENOKEY;
Good catch, thanks. Queued for v8.
Jason
Thread overview: 28+ messages
[not found] <20181006025709.4019-1-Jason@zx2c4.com>
2018-10-06 2:56 ` [PATCH net-next v7 03/28] zinc: introduce minimal cryptography library Jason A. Donenfeld
2018-10-08 23:22 ` Eric Biggers
2018-10-06 2:56 ` [PATCH net-next v7 04/28] zinc: ChaCha20 generic C implementation and selftest Jason A. Donenfeld
2018-10-06 2:56 ` [PATCH net-next v7 05/28] zinc: import Andy Polyakov's ChaCha20 x86_64 implementation Jason A. Donenfeld
2018-10-06 2:56 ` [PATCH net-next v7 06/28] zinc: " Jason A. Donenfeld
2018-10-06 2:56 ` [PATCH net-next v7 07/28] zinc: import Andy Polyakov's ChaCha20 ARM and ARM64 implementations Jason A. Donenfeld
2018-10-06 2:56 ` [PATCH net-next v7 08/28] zinc: port " Jason A. Donenfeld
2018-10-06 2:56 ` [PATCH net-next v7 09/28] zinc: " Jason A. Donenfeld
2018-10-06 2:56 ` [PATCH net-next v7 10/28] zinc: ChaCha20 MIPS32r2 implementation Jason A. Donenfeld
2018-10-06 2:56 ` [PATCH net-next v7 11/28] zinc: Poly1305 generic C implementations and selftest Jason A. Donenfeld
2018-10-06 2:56 ` [PATCH net-next v7 12/28] zinc: import Andy Polyakov's Poly1305 x86_64 implementation Jason A. Donenfeld
2018-10-06 2:56 ` [PATCH net-next v7 13/28] zinc: " Jason A. Donenfeld
2018-10-06 2:56 ` [PATCH net-next v7 14/28] zinc: import Andy Polyakov's Poly1305 ARM and ARM64 implementations Jason A. Donenfeld
2018-10-06 2:56 ` [PATCH net-next v7 15/28] zinc: " Jason A. Donenfeld
2018-10-06 2:56 ` [PATCH net-next v7 16/28] zinc: import Andy Polyakov's Poly1305 MIPS64 implementation Jason A. Donenfeld
2018-10-06 2:56 ` [PATCH net-next v7 17/28] zinc: Poly1305 MIPS32r2 and MIPS64 implementations Jason A. Donenfeld
2018-10-06 2:56 ` [PATCH net-next v7 18/28] zinc: ChaCha20Poly1305 construction and selftest Jason A. Donenfeld
2018-10-06 2:57 ` [PATCH net-next v7 19/28] zinc: BLAKE2s generic C implementation " Jason A. Donenfeld
2018-10-06 2:57 ` [PATCH net-next v7 20/28] zinc: BLAKE2s x86_64 implementation Jason A. Donenfeld
2018-10-06 2:57 ` [PATCH net-next v7 21/28] zinc: Curve25519 generic C implementations and selftest Jason A. Donenfeld
2018-10-06 2:57 ` [PATCH net-next v7 22/28] zinc: Curve25519 x86_64 implementation Jason A. Donenfeld
2018-10-06 2:57 ` [PATCH net-next v7 23/28] zinc: import Bernstein and Schwabe's Curve25519 ARM implementation Jason A. Donenfeld
2018-10-06 2:57 ` [PATCH net-next v7 24/28] zinc: " Jason A. Donenfeld
2018-10-06 2:57 ` [PATCH net-next v7 25/28] crypto: port Poly1305 to Zinc Jason A. Donenfeld
2018-10-08 23:21 ` Eric Biggers
2018-10-09 0:02 ` Jason A. Donenfeld
2018-10-06 2:57 ` [PATCH net-next v7 26/28] crypto: port ChaCha20 " Jason A. Donenfeld
2018-10-06 13:07 ` Martin Willi