* [PATCH 0/2] Implement PMULL using host intrinsics
@ 2023-06-01 12:33 Ard Biesheuvel
  2023-06-01 12:33 ` [PATCH 1/2] target/arm: Use x86 intrinsics to implement PMULL.P64 Ard Biesheuvel
  2023-06-01 12:33 ` [PATCH 2/2] target/i386: Implement PCLMULQDQ using AArch64 PMULL instructions Ard Biesheuvel
  0 siblings, 2 replies; 6+ messages in thread
From: Ard Biesheuvel @ 2023-06-01 12:33 UTC
To: qemu-arm
Cc: qemu-devel, Ard Biesheuvel, Peter Maydell, Alex Bennée, Richard Henderson, Philippe Mathieu-Daudé

Another set of RFC patches - this time for 64x64->128 polynomial
multiplication. Playing around with this on top of the AES changes I sent
out earlier this week, I noticed that the speedup is rather substantial.

PMULL is relevant for GCM encryption, which combines AES in counter mode
with GHASH, which is based on multiplication in GF(2^128). The
significance of PMULL to this encryption mode is basically why PMULL is
part of the AES crypto extension on AArch64.

Note that user emulation on an AArch64 host of x86 binaries that perform
any kind of HTTPS communication under the hood would likely benefit from
this.

Again, this approach is likely too ad-hoc, but it helps span the space of
what we might want to cover in terms of host acceleration API. (I'm not a
TCG expert, but I guess this raises the question of what to cover in
helpers and what to cover using native TCG ops?)

Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>

Ard Biesheuvel (2):
  target/arm: Use x86 intrinsics to implement PMULL.P64
  target/i386: Implement PCLMULQDQ using AArch64 PMULL instructions

 host/include/aarch64/host/cpuinfo.h |  1 +
 host/include/i386/host/cpuinfo.h    |  1 +
 target/arm/tcg/vec_helper.c         | 26 +++++++++++++++++++-
 target/i386/ops_sse.h               | 24 ++++++++++++++++++
 util/cpuinfo-aarch64.c              |  1 +
 util/cpuinfo-i386.c                 |  1 +
 6 files changed, 53 insertions(+), 1 deletion(-)

-- 
2.39.2
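For readers unfamiliar with the operation the cover letter refers to, here is a minimal portable sketch of a 64x64->128 polynomial (carry-less) multiplication, the operation that PMULL.P64 and PCLMULQDQ perform on one pair of 64-bit operands. The names clmul128 and clmul64 are hypothetical and appear nowhere in the series; this is a bit-by-bit reference, not the code QEMU uses, and it covers only the multiplication step, not the GF(2^128) reduction that GHASH applies afterwards.

```c
#include <stdint.h>

/* 128-bit product split into two 64-bit halves (hypothetical helper type). */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} clmul128;

/*
 * Carry-less multiply over GF(2): for every set bit i of b, XOR the value
 * (a << i) into the 128-bit accumulator. XOR replaces addition, so no
 * carries propagate between bit positions.
 */
static clmul128 clmul64(uint64_t a, uint64_t b)
{
    clmul128 r = { 0, 0 };

    for (int i = 0; i < 64; i++) {
        if ((b >> i) & 1) {
            r.lo ^= a << i;
            /* Bits shifted out of the low half; skip i == 0 (shift by 64). */
            if (i != 0) {
                r.hi ^= a >> (64 - i);
            }
        }
    }
    return r;
}
```

As a sanity check, clmul64(3, 3) yields lo = 5, hi = 0, because (x + 1)(x + 1) = x^2 + 1 when coefficients are reduced modulo 2.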
* [PATCH 1/2] target/arm: Use x86 intrinsics to implement PMULL.P64
From: Ard Biesheuvel @ 2023-06-01 12:33 UTC
To: qemu-arm
Cc: qemu-devel, Ard Biesheuvel, Peter Maydell, Alex Bennée, Richard Henderson, Philippe Mathieu-Daudé

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 host/include/i386/host/cpuinfo.h |  1 +
 target/arm/tcg/vec_helper.c      | 26 +++++++++++++++++++-
 util/cpuinfo-i386.c              |  1 +
 3 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/host/include/i386/host/cpuinfo.h b/host/include/i386/host/cpuinfo.h
index 073d0a426f31487d..cf4ced844760d28f 100644
--- a/host/include/i386/host/cpuinfo.h
+++ b/host/include/i386/host/cpuinfo.h
@@ -27,6 +27,7 @@
 #define CPUINFO_ATOMIC_VMOVDQA  (1u << 16)
 #define CPUINFO_ATOMIC_VMOVDQU  (1u << 17)
 #define CPUINFO_AES             (1u << 18)
+#define CPUINFO_PMULL           (1u << 19)
 
 /* Initialized with a constructor. */
 extern unsigned cpuinfo;
diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
index f59d3b26eacf08f8..fb422627588439b3 100644
--- a/target/arm/tcg/vec_helper.c
+++ b/target/arm/tcg/vec_helper.c
@@ -25,6 +25,14 @@
 #include "qemu/int128.h"
 #include "vec_internal.h"
 
+#ifdef __x86_64__
+#include "host/cpuinfo.h"
+#include <wmmintrin.h>
+#define TARGET_PMULL __attribute__((__target__("pclmul")))
+#else
+#define TARGET_PMULL
+#endif
+
 /*
  * Data for expanding active predicate bits to bytes, for byte elements.
  *
@@ -2010,12 +2018,28 @@ void HELPER(gvec_pmul_b)(void *vd, void *vn, void *vm, uint32_t desc)
  * Because of the lanes are not accessed in strict columns,
  * this probably cannot be turned into a generic helper.
  */
-void HELPER(gvec_pmull_q)(void *vd, void *vn, void *vm, uint32_t desc)
+void TARGET_PMULL HELPER(gvec_pmull_q)(void *vd, void *vn, void *vm, uint32_t desc)
 {
     intptr_t i, j, opr_sz = simd_oprsz(desc);
     intptr_t hi = simd_data(desc);
     uint64_t *d = vd, *n = vn, *m = vm;
 
+#ifdef __x86_64__
+    if (cpuinfo & CPUINFO_PMULL) {
+        switch (hi) {
+        case 0:
+            *(__m128i *)vd = _mm_clmulepi64_si128(*(__m128i *)vm, *(__m128i *)vn, 0x0);
+            break;
+        case 1:
+            *(__m128i *)vd = _mm_clmulepi64_si128(*(__m128i *)vm, *(__m128i *)vn, 0x11);
+            break;
+        default:
+            g_assert_not_reached();
+        }
+        return;
+    }
+#endif
+
     for (i = 0; i < opr_sz / 8; i += 2) {
         uint64_t nn = n[i + hi];
         uint64_t mm = m[i + hi];
diff --git a/util/cpuinfo-i386.c b/util/cpuinfo-i386.c
index 3043f066c0182dc8..8930e13451201a64 100644
--- a/util/cpuinfo-i386.c
+++ b/util/cpuinfo-i386.c
@@ -40,6 +40,7 @@ unsigned __attribute__((constructor)) cpuinfo_init(void)
         info |= (c & bit_MOVBE ? CPUINFO_MOVBE : 0);
         info |= (c & bit_POPCNT ? CPUINFO_POPCNT : 0);
         info |= (c & bit_AES ? CPUINFO_AES : 0);
+        info |= (c & bit_PCLMULQDQ ? CPUINFO_PMULL : 0);
 
         /* For AVX features, we must check available and usable. */
         if ((c & bit_AVX) && (c & bit_OSXSAVE)) {
-- 
2.39.2
* Re: [PATCH 1/2] target/arm: Use x86 intrinsics to implement PMULL.P64
From: Peter Maydell @ 2023-06-01 13:00 UTC
To: Ard Biesheuvel
Cc: qemu-arm, qemu-devel, Alex Bennée, Richard Henderson, Philippe Mathieu-Daudé

On Thu, 1 Jun 2023 at 13:33, Ard Biesheuvel <ardb@kernel.org> wrote:
>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
>  host/include/i386/host/cpuinfo.h |  1 +
>  target/arm/tcg/vec_helper.c      | 26 +++++++++++++++++++-
>  util/cpuinfo-i386.c              |  1 +
>  3 files changed, 27 insertions(+), 1 deletion(-)
>
> diff --git a/host/include/i386/host/cpuinfo.h b/host/include/i386/host/cpuinfo.h
> index 073d0a426f31487d..cf4ced844760d28f 100644
> --- a/host/include/i386/host/cpuinfo.h
> +++ b/host/include/i386/host/cpuinfo.h
> @@ -27,6 +27,7 @@
>  #define CPUINFO_ATOMIC_VMOVDQA  (1u << 16)
>  #define CPUINFO_ATOMIC_VMOVDQU  (1u << 17)
>  #define CPUINFO_AES             (1u << 18)
> +#define CPUINFO_PMULL           (1u << 19)
>
>  /* Initialized with a constructor. */
>  extern unsigned cpuinfo;
> diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
> index f59d3b26eacf08f8..fb422627588439b3 100644
> --- a/target/arm/tcg/vec_helper.c
> +++ b/target/arm/tcg/vec_helper.c
> @@ -25,6 +25,14 @@
>  #include "qemu/int128.h"
>  #include "vec_internal.h"
>
> +#ifdef __x86_64__
> +#include "host/cpuinfo.h"
> +#include <wmmintrin.h>
> +#define TARGET_PMULL __attribute__((__target__("pclmul")))
> +#else
> +#define TARGET_PMULL
> +#endif
> +
>  /*
>   * Data for expanding active predicate bits to bytes, for byte elements.
>   *
> @@ -2010,12 +2018,28 @@ void HELPER(gvec_pmul_b)(void *vd, void *vn, void *vm, uint32_t desc)
>   * Because of the lanes are not accessed in strict columns,
>   * this probably cannot be turned into a generic helper.
>   */
> -void HELPER(gvec_pmull_q)(void *vd, void *vn, void *vm, uint32_t desc)
> +void TARGET_PMULL HELPER(gvec_pmull_q)(void *vd, void *vn, void *vm, uint32_t desc)
>  {
>      intptr_t i, j, opr_sz = simd_oprsz(desc);
>      intptr_t hi = simd_data(desc);
>      uint64_t *d = vd, *n = vn, *m = vm;
>
> +#ifdef __x86_64__
> +    if (cpuinfo & CPUINFO_PMULL) {
> +        switch (hi) {
> +        case 0:
> +            *(__m128i *)vd = _mm_clmulepi64_si128(*(__m128i *)vm, *(__m128i *)vn, 0x0);
> +            break;
> +        case 1:
> +            *(__m128i *)vd = _mm_clmulepi64_si128(*(__m128i *)vm, *(__m128i *)vn, 0x11);
> +            break;
> +        default:
> +            g_assert_not_reached();
> +        }
> +        return;
> +    }
> +#endif

This needs to cope with the input vectors being more than
just 128 bits wide, I think. Also you probably still
need the clear_tail() to clear any high bits of the register.

thanks
-- PMM
* Re: [PATCH 1/2] target/arm: Use x86 intrinsics to implement PMULL.P64
From: Ard Biesheuvel @ 2023-06-01 15:28 UTC
To: Peter Maydell
Cc: qemu-arm, qemu-devel, Alex Bennée, Richard Henderson, Philippe Mathieu-Daudé

On Thu, 1 Jun 2023 at 15:01, Peter Maydell <peter.maydell@linaro.org> wrote:
>
> On Thu, 1 Jun 2023 at 13:33, Ard Biesheuvel <ardb@kernel.org> wrote:
> >
> > Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> > ---
> >  host/include/i386/host/cpuinfo.h |  1 +
> >  target/arm/tcg/vec_helper.c      | 26 +++++++++++++++++++-
> >  util/cpuinfo-i386.c              |  1 +
> >  3 files changed, 27 insertions(+), 1 deletion(-)
> >
> > diff --git a/host/include/i386/host/cpuinfo.h b/host/include/i386/host/cpuinfo.h
> > index 073d0a426f31487d..cf4ced844760d28f 100644
> > --- a/host/include/i386/host/cpuinfo.h
> > +++ b/host/include/i386/host/cpuinfo.h
> > @@ -27,6 +27,7 @@
> >  #define CPUINFO_ATOMIC_VMOVDQA  (1u << 16)
> >  #define CPUINFO_ATOMIC_VMOVDQU  (1u << 17)
> >  #define CPUINFO_AES             (1u << 18)
> > +#define CPUINFO_PMULL           (1u << 19)
> >
> >  /* Initialized with a constructor. */
> >  extern unsigned cpuinfo;
> > diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
> > index f59d3b26eacf08f8..fb422627588439b3 100644
> > --- a/target/arm/tcg/vec_helper.c
> > +++ b/target/arm/tcg/vec_helper.c
> > @@ -25,6 +25,14 @@
> >  #include "qemu/int128.h"
> >  #include "vec_internal.h"
> >
> > +#ifdef __x86_64__
> > +#include "host/cpuinfo.h"
> > +#include <wmmintrin.h>
> > +#define TARGET_PMULL __attribute__((__target__("pclmul")))
> > +#else
> > +#define TARGET_PMULL
> > +#endif
> > +
> >  /*
> >   * Data for expanding active predicate bits to bytes, for byte elements.
> >   *
> > @@ -2010,12 +2018,28 @@ void HELPER(gvec_pmul_b)(void *vd, void *vn, void *vm, uint32_t desc)
> >   * Because of the lanes are not accessed in strict columns,
> >   * this probably cannot be turned into a generic helper.
> >   */
> > -void HELPER(gvec_pmull_q)(void *vd, void *vn, void *vm, uint32_t desc)
> > +void TARGET_PMULL HELPER(gvec_pmull_q)(void *vd, void *vn, void *vm, uint32_t desc)
> >  {
> >      intptr_t i, j, opr_sz = simd_oprsz(desc);
> >      intptr_t hi = simd_data(desc);
> >      uint64_t *d = vd, *n = vn, *m = vm;
> >
> > +#ifdef __x86_64__
> > +    if (cpuinfo & CPUINFO_PMULL) {
> > +        switch (hi) {
> > +        case 0:
> > +            *(__m128i *)vd = _mm_clmulepi64_si128(*(__m128i *)vm, *(__m128i *)vn, 0x0);
> > +            break;
> > +        case 1:
> > +            *(__m128i *)vd = _mm_clmulepi64_si128(*(__m128i *)vm, *(__m128i *)vn, 0x11);
> > +            break;
> > +        default:
> > +            g_assert_not_reached();
> > +        }
> > +        return;
> > +    }
> > +#endif
>
> This needs to cope with the input vectors being more than
> just 128 bits wide, I think. Also you probably still
> need the clear_tail() to clear any high bits of the register.
>

Ah yes, I missed that completely.
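To make the two review points concrete, here is a hedged, portable sketch of what handling wider vectors and clearing the tail could look like. It is not a proposed revision of the patch: pmull_q_sketch, its opr_sz/max_sz parameters, and the bit-by-bit clmul64 stand-in for the host instruction are all assumptions made for illustration. The loop shape follows the existing generic loop in the helper (one 128-bit product per 16-byte segment, with hi selecting the odd 64-bit lane), and the final memset models what clear_tail() is described as doing, namely zeroing the bytes above the operation size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Portable 64x64->128 carry-less multiply; stands in for the host instruction. */
static void clmul64(uint64_t a, uint64_t b, uint64_t out[2])
{
    uint64_t lo = 0, hi = 0;

    for (int i = 0; i < 64; i++) {
        if ((b >> i) & 1) {
            lo ^= a << i;
            if (i != 0) {
                hi ^= a >> (64 - i);
            }
        }
    }
    out[0] = lo;
    out[1] = hi;
}

/*
 * Hypothetical stand-in for the helper body. opr_sz is the number of bytes
 * the operation covers, max_sz the full register size in bytes, and hi
 * selects the odd (1) or even (0) 64-bit lane of each 128-bit segment.
 */
static void pmull_q_sketch(uint64_t *d, const uint64_t *n, const uint64_t *m,
                           size_t opr_sz, size_t max_sz, int hi)
{
    assert(opr_sz % 16 == 0 && opr_sz <= max_sz);

    /* Visit every 16-byte segment, not only the first one. */
    for (size_t i = 0; i < opr_sz / 8; i += 2) {
        uint64_t r[2];

        /* Compute into a temporary first: d may alias n or m. */
        clmul64(n[i + hi], m[i + hi], r);
        d[i] = r[0];
        d[i + 1] = r[1];
    }

    /* Zero the bytes between opr_sz and max_sz (the role of clear_tail()). */
    memset((uint8_t *)d + opr_sz, 0, max_sz - opr_sz);
}
```

In an accelerated version, the call to clmul64 is the part that a host intrinsic would replace; the loop and the tail clearing would stay the same under these assumptions.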
* [PATCH 2/2] target/i386: Implement PCLMULQDQ using AArch64 PMULL instructions
From: Ard Biesheuvel @ 2023-06-01 12:33 UTC
To: qemu-arm
Cc: qemu-devel, Ard Biesheuvel, Peter Maydell, Alex Bennée, Richard Henderson, Philippe Mathieu-Daudé

Use the AArch64 PMULL{2}.P64 instructions to implement PCLMULQDQ instead
of emulating them in C code if the host supports this. This is used in
the implementation of GCM, which is widely used in IPsec VPN and HTTPS.

Somewhat surprising results: on my ThunderX2, enabling this on top of
the AES acceleration I sent out earlier, the speedup is substantial.

(1420 is a typical IPsec block size - in HTTPS, GCM operates on much
larger block sizes but the kernel mode benchmarks are not the best place
to measure its performance in this mode)

tcrypt: testing speed of rfc4106(gcm(aes)) (rfc4106-gcm-aesni) encryption

No acceleration
tcrypt: test 5 (160 bit key, 1420 byte blocks): 10046 operations in 1 seconds (14265320 bytes)

AES acceleration
tcrypt: test 5 (160 bit key, 1420 byte blocks): 13970 operations in 1 seconds (19837400 bytes)

AES + PMULL acceleration
tcrypt: test 5 (160 bit key, 1420 byte blocks): 24372 operations in 1 seconds (34608240 bytes)

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 host/include/aarch64/host/cpuinfo.h |  1 +
 target/i386/ops_sse.h               | 24 ++++++++++++++++++++
 util/cpuinfo-aarch64.c              |  1 +
 3 files changed, 26 insertions(+)

diff --git a/host/include/aarch64/host/cpuinfo.h b/host/include/aarch64/host/cpuinfo.h
index 05feeb4f4369fc19..da268dce1390cac0 100644
--- a/host/include/aarch64/host/cpuinfo.h
+++ b/host/include/aarch64/host/cpuinfo.h
@@ -10,6 +10,7 @@
 #define CPUINFO_LSE             (1u << 1)
 #define CPUINFO_LSE2            (1u << 2)
 #define CPUINFO_AES             (1u << 3)
+#define CPUINFO_PMULL           (1u << 4)
 
 /* Initialized with a constructor. */
 extern unsigned cpuinfo;
diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index db79132778efd211..d7e7bd8b733122a8 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -2157,6 +2157,30 @@ void glue(helper_pclmulqdq, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s,
     uint64_t a, b;
     int i;
 
+#ifdef __aarch64__
+    if (cpuinfo & CPUINFO_PMULL) {
+        aes_vec_t vv = *(aes_vec_t *)v, vs = *(aes_vec_t *)s;
+        aes_vec_t *vd = (aes_vec_t *)d;
+
+        switch (ctrl & 0x11) {
+        case 0x1:
+            asm("ext %0.16b, %0.16b, %0.16b, #8":"+w"(vv));
+            /* fallthrough */
+        case 0x0:
+            asm(".arch_extension aes\n"
+                "pmull %0.1q, %1.1d, %2.1d":"=w"(*vd):"w"(vv),"w"(vs));
+            break;
+        case 0x10:
+            asm("ext %0.16b, %0.16b, %0.16b, #8":"+w"(vv));
+            /* fallthrough */
+        case 0x11:
+            asm(".arch_extension aes\n"
+                "pmull2 %0.1q, %1.2d, %2.2d":"=w"(*vd):"w"(vv),"w"(vs));
+        }
+        return;
+    }
+#endif
+
     for (i = 0; i < 1 << SHIFT; i += 2) {
         a = v->Q(((ctrl & 1) != 0) + i);
         b = s->Q(((ctrl & 16) != 0) + i);
diff --git a/util/cpuinfo-aarch64.c b/util/cpuinfo-aarch64.c
index 769cdfeb2fc32d5e..95ec1f4adfc829b9 100644
--- a/util/cpuinfo-aarch64.c
+++ b/util/cpuinfo-aarch64.c
@@ -57,6 +57,7 @@ unsigned __attribute__((constructor)) cpuinfo_init(void)
     info |= (hwcap & HWCAP_ATOMICS ? CPUINFO_LSE : 0);
     info |= (hwcap & HWCAP_USCAT ? CPUINFO_LSE2 : 0);
     info |= (hwcap & HWCAP_AES ? CPUINFO_AES : 0);
+    info |= (hwcap & HWCAP_PMULL ? CPUINFO_PMULL : 0);
 #endif
 #ifdef CONFIG_DARWIN
     info |= sysctl_for_bool("hw.optional.arm.FEAT_LSE") * CPUINFO_LSE;
-- 
2.39.2
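The operand selection that the switch on (ctrl & 0x11) implements can be checked against a small portable reference. The sketch below mirrors the generic loop's indexing for a single 128-bit lane (bit 0 of ctrl picks the quadword of v, bit 4 picks the quadword of s). The xmm_t type and the pclmulqdq_ref and clmul64 names are hypothetical and are not QEMU's Reg type or helpers. Read this way, case 0x1 in the patch rotates v by 8 bytes so its high quadword lands in the low lane before PMULL, and case 0x10 rotates v so its low quadword lands in the high lane before PMULL2.

```c
#include <stdint.h>

/* Hypothetical view of one 128-bit register: q[0] is the low quadword. */
typedef struct {
    uint64_t q[2];
} xmm_t;

/* Portable 64x64->128 carry-less multiply, one bit of b at a time. */
static xmm_t clmul64(uint64_t a, uint64_t b)
{
    xmm_t r = { { 0, 0 } };

    for (int i = 0; i < 64; i++) {
        if ((b >> i) & 1) {
            r.q[0] ^= a << i;
            if (i != 0) {
                r.q[1] ^= a >> (64 - i);
            }
        }
    }
    return r;
}

/*
 * Reference for one lane of PCLMULQDQ:
 *   ctrl & 0x01 selects the high (1) or low (0) quadword of v,
 *   ctrl & 0x10 selects the high (1) or low (0) quadword of s.
 */
static xmm_t pclmulqdq_ref(xmm_t v, xmm_t s, uint8_t ctrl)
{
    uint64_t a = v.q[(ctrl & 0x01) != 0];
    uint64_t b = s.q[(ctrl & 0x10) != 0];

    return clmul64(a, b);
}
```

Comparing the accelerated path against a reference of this kind for all four selector values (0x00, 0x01, 0x10, 0x11) would be one way to test the fall-through cases.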
* Re: [PATCH 2/2] target/i386: Implement PCLMULQDQ using AArch64 PMULL instructions
From: Ard Biesheuvel @ 2023-06-01 17:13 UTC
To: qemu-arm
Cc: qemu-devel, Peter Maydell, Alex Bennée, Richard Henderson, Philippe Mathieu-Daudé

On Thu, 1 Jun 2023 at 14:33, Ard Biesheuvel <ardb@kernel.org> wrote:
>
> Use the AArch64 PMULL{2}.P64 instructions to implement PCLMULQDQ instead
> of emulating them in C code if the host supports this. This is used in
> the implementation of GCM, which is widely used in IPsec VPN and HTTPS.
>
> Somewhat surprising results: on my ThunderX2, enabling this on top of
> the AES acceleration I sent out earlier, the speedup is substantial.
>
> (1420 is a typical IPsec block size - in HTTPS, GCM operates on much
> larger block sizes but the kernel mode benchmarks are not the best place
> to measure its performance in this mode)
>
> tcrypt: testing speed of rfc4106(gcm(aes)) (rfc4106-gcm-aesni) encryption
>
> No acceleration
> tcrypt: test 5 (160 bit key, 1420 byte blocks): 10046 operations in 1 seconds (14265320 bytes)
>
> AES acceleration
> tcrypt: test 5 (160 bit key, 1420 byte blocks): 13970 operations in 1 seconds (19837400 bytes)
>
> AES + PMULL acceleration
> tcrypt: test 5 (160 bit key, 1420 byte blocks): 24372 operations in 1 seconds (34608240 bytes)
>

User space benchmark (using OS's qemu-x86_64 vs one built with these
changes applied)

Speedup is about 5x for the largest block sizes (roughly 1.6x at 16 bytes,
rising to about 4.8x at 16384 bytes)

ard@gambale:~/build/openssl$ apps/openssl speed -evp aes-128-gcm
Doing AES-128-GCM for 3s on 16 size blocks: 1692138 AES-128-GCM's in 2.98s
Doing AES-128-GCM for 3s on 64 size blocks: 665012 AES-128-GCM's in 3.00s
Doing AES-128-GCM for 3s on 256 size blocks: 203784 AES-128-GCM's in 3.00s
Doing AES-128-GCM for 3s on 1024 size blocks: 49397 AES-128-GCM's in 3.00s
Doing AES-128-GCM for 3s on 8192 size blocks: 6447 AES-128-GCM's in 3.00s
Doing AES-128-GCM for 3s on 16384 size blocks: 3058 AES-128-GCM's in 3.00s
version: 3.2.0-dev
built on: Thu Jun 1 17:06:09 2023 UTC
options: bn(64,64)
compiler: x86_64-linux-gnu-gcc -pthread -m64 -Wa,--noexecstack -Wall -O3 -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_BUILDING_OPENSSL -DNDEBUG
CPUINFO: OPENSSL_ia32cap=0xfed8320b0fcbfffd:0x8001020c01d843a9
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
AES-128-GCM      9085.30k    14186.92k    17389.57k    16860.84k    17604.61k    16700.76k

ard@gambale:~/build/openssl$ ../qemu/build/qemu-x86_64 apps/openssl speed -evp aes-128-gcm
Doing AES-128-GCM for 3s on 16 size blocks: 2703271 AES-128-GCM's in 2.99s
Doing AES-128-GCM for 3s on 64 size blocks: 1537884 AES-128-GCM's in 3.00s
Doing AES-128-GCM for 3s on 256 size blocks: 653008 AES-128-GCM's in 3.00s
Doing AES-128-GCM for 3s on 1024 size blocks: 203579 AES-128-GCM's in 3.00s
Doing AES-128-GCM for 3s on 8192 size blocks: 29020 AES-128-GCM's in 3.00s
Doing AES-128-GCM for 3s on 16384 size blocks: 14716 AES-128-GCM's in 2.99s
version: 3.2.0-dev
built on: Thu Jun 1 17:06:09 2023 UTC
options: bn(64,64)
compiler: x86_64-linux-gnu-gcc -pthread -m64 -Wa,--noexecstack -Wall -O3 -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_BUILDING_OPENSSL -DNDEBUG
CPUINFO: OPENSSL_ia32cap=0xfed8320b0fcbfffd:0x8001020c01d843a9
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
AES-128-GCM     14465.66k    32808.19k    55723.35k    69488.30k    79243.95k    80637.77k
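The per-block-size speedup can be derived directly from the two openssl throughput rows above. The following is a small sketch of that arithmetic; the array and function names are made up for illustration, and the values are copied from the two runs as reported.

```c
#include <stddef.h>

/* Throughput in 1000s of bytes per second, copied from the two runs above. */
static const double baseline_kbps[] = {
    9085.30, 14186.92, 17389.57, 16860.84, 17604.61, 16700.76
};
static const double patched_kbps[] = {
    14465.66, 32808.19, 55723.35, 69488.30, 79243.95, 80637.77
};

/* Speedup of the patched build over the OS build for block-size index i. */
static double speedup(size_t i)
{
    return patched_kbps[i] / baseline_kbps[i];
}
```

Index 0 corresponds to 16-byte blocks and index 5 to 16384-byte blocks. The ratio grows with block size, which would be consistent with fixed per-call overhead dominating for small inputs.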
Thread overview: 6+ messages
2023-06-01 12:33 [PATCH 0/2] Implement PMULL using host intrinsics Ard Biesheuvel
2023-06-01 12:33 ` [PATCH 1/2] target/arm: Use x86 intrinsics to implement PMULL.P64 Ard Biesheuvel
2023-06-01 13:00   ` Peter Maydell
2023-06-01 15:28     ` Ard Biesheuvel
2023-06-01 12:33 ` [PATCH 2/2] target/i386: Implement PCLMULQDQ using AArch64 PMULL instructions Ard Biesheuvel
2023-06-01 17:13   ` Ard Biesheuvel