Linux cryptographic layer development
 help / color / mirror / Atom feed
* Re: [PATCH v2 00/11] getting back -Wmaybe-uninitialized
From: Linus Torvalds @ 2016-11-11 17:13 UTC (permalink / raw)
  To: Arnd Bergmann, Srinivas Kandagatla, sayli karnik,
	Jonathan Cameron, Mark Brown
  Cc: Andrew Morton, Anna Schumaker, David S. Miller, Herbert Xu,
	Ilya Dryomov, Javier Martinez Canillas, Jiri Kosina, Ley Foon Tan,
	Luis R . Rodriguez, Martin Schwidefsky, Mauro Carvalho Chehab,
	Michal Marek, Russell King, Sean Young, Sebastian Ott,
	Trond Myklebust, the arch/x86 maintainers,
	Linux Kbuild mailing list, Linux
In-Reply-To: <20161110164454.293477-1-arnd-r2nGTMty4D4@public.gmane.org>

On Thu, Nov 10, 2016 at 8:44 AM, Arnd Bergmann <arnd-r2nGTMty4D4@public.gmane.org> wrote:
>
> Please merge these directly if you are happy with the result.

I will take this.

I do see two warnings, but they both seem to be valid and recent,
though, so I have no issues with the spurious cases.

Warning #1:

  sound/soc/qcom/lpass-platform.c: In function ‘lpass_platform_pcmops_open’:
  sound/soc/qcom/lpass-platform.c:83:29: warning: ‘dma_ch’ may be used
uninitialized in this function [-Wmaybe-uninitialized]
    drvdata->substream[dma_ch] = substream;
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~

and 'dma_ch' usage there really is crazy and wrong. Broken by
022d00ee0b55 ("ASoC: lpass-platform: Fix broken pcm data usage")

Warning #2 is not a real bug, but it's reasonable that gcc doesn't
know that storage_bytes (chip->read_size) has to be 2/4. Again,
introduced recently by commit 231147ee77f3 ("iio: maxim_thermocouple:
Align 16 bit big endian value of raw reads"), so you didn't see it.

  drivers/iio/temperature/maxim_thermocouple.c: In function
‘maxim_thermocouple_read_raw’:
  drivers/iio/temperature/maxim_thermocouple.c:141:5: warning: ‘ret’
may be used uninitialized in this function [-Wmaybe-uninitialized]
    if (ret)
       ^
  drivers/iio/temperature/maxim_thermocouple.c:128:6: note: ‘ret’ was
declared here
    int ret;
        ^~~

and I guess that code can just initialize 'ret' to '-EINVAL' or
something to just make the theoretical "somehow we had a wrong
chip->read_size" case error out cleanly.

                Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* [PATCH -next] hwrng: atmel - use clk_disable_unprepare instead of clk_disable
From: Wei Yongjun @ 2016-11-11 14:56 UTC (permalink / raw)
  To: Matt Mackall, Herbert Xu, Wenyou Yang, Nicolas Ferre
  Cc: Wei Yongjun, linux-crypto

From: Wei Yongjun <weiyongjun1@huawei.com>

Since clk_prepare_enable() is used to get trng->clk, we should
use clk_disable_unprepare() to release it for the error path.

Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
---
 drivers/char/hw_random/atmel-rng.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/char/hw_random/atmel-rng.c b/drivers/char/hw_random/atmel-rng.c
index ae7cae5..661c82c 100644
--- a/drivers/char/hw_random/atmel-rng.c
+++ b/drivers/char/hw_random/atmel-rng.c
@@ -94,7 +94,7 @@ static int atmel_trng_probe(struct platform_device *pdev)
 	return 0;
 
 err_register:
-	clk_disable(trng->clk);
+	clk_disable_unprepare(trng->clk);
 	return ret;
 }

^ permalink raw reply related

* Re: [PATCH v3] crypto: only call put_page on referenced and used pages
From: Stephan Mueller @ 2016-11-11 14:28 UTC (permalink / raw)
  To: Herbert Xu; +Cc: linux-crypto
In-Reply-To: <6581903.GBJMzZudEe@tauon.atsec.com>

Am Dienstag, 13. September 2016, 13:27:34 CET schrieb Stephan Mueller:

Hi Herbert,

> Am Dienstag, 13. September 2016, 18:08:16 CEST schrieb Herbert Xu:
> 
> Hi Herbert,
> 
> > This patch appears to be papering over a real bug.
> > 
> > The async path should be exactly the same as the sync path, except
> > that we don't wait for completion.  So the question is why are we
> > getting this crash here for async but not sync?
> 
> At least one reason is found in skcipher_recvmsg_async with the following
> code path:
> 
>  if (txbufs == tx_nents) {
>                         struct scatterlist *tmp;
>                         int x;
>                         /* Ran out of tx slots in async request
>                          * need to expand */
>                         tmp = kcalloc(tx_nents * 2, sizeof(*tmp),
>                                       GFP_KERNEL);
>                         if (!tmp)
>                                 goto free;
> 
>                         sg_init_table(tmp, tx_nents * 2);
>                         for (x = 0; x < tx_nents; x++)
>                                 sg_set_page(&tmp[x], sg_page(&sreq->tsg[x]),
> sreq->tsg[x].length,
>                                             sreq->tsg[x].offset);
>                         kfree(sreq->tsg);
>                         sreq->tsg = tmp;
>                         tx_nents *= 2;
>                         mark = true;
>                 }
> 
> 
> ==> the code allocates twice the amount of the previously existing memory,
> copies the existing SGs over, but does not set the remaining SGs to
> anything. If the caller provides less pages than the number of allocated
> SGs, some SGs are unset. Hence, the deallocation must not do anything with
> the yet uninitialized SGs.

I looked into the issue a bit deeper. In addition to the aforementioned code, 
the following code seems to be a second culprit:

	tx_nents = skcipher_all_sg_nents(ctx);
	sreq->tsg = kcalloc(tx_nents, sizeof(*sg), GFP_KERNEL);
	if (unlikely(!sreq->tsg))
		goto unlock;
	sg_init_table(sreq->tsg, tx_nents);

Here again, an SGL is initialized, but there are no pages mapped to the SGs.

May I ask you to reconsider this patch as well as the patch "[PATCH] crypto: 
call put_page on used pages only" from September 10 since the current code of 
libkcapi can easily trigger these bugs and lead to a kernel crash.

If you consider the patches papering over the heart of the problem, may I ask 
for suggestions on how the mentioned code should be changed such that the 
issues are removed? If the suggestion is to re-architect the memory handling 
in the async part, may I ask to at least apply the patches for now with the 
goal to have time for re-architecting the async code and yet have no open 
holes that lead to crashes?

Thanks.

Ciao
Stephan

^ permalink raw reply

* [PATCH] crypto: arm64/sha2: integrate OpenSSL implementations of SHA256/SHA512
From: Ard Biesheuvel @ 2016-11-11 13:51 UTC (permalink / raw)
  To: linux-crypto, linux-arm-kernel, herbert
  Cc: daniel.thompson, Ard Biesheuvel, catalin.marinas, will.deacon,
	appro, victor.chong

This integrates both the accelerated scalar and the NEON implementations
of SHA-224/256 as well as SHA-384/512 from the OpenSSL project.

Relative performance compared to the respective generic C versions:

                 |  SHA256-scalar  | SHA256-NEON* |  SHA512  |
     ------------+-----------------+--------------+----------+
     Cortex-A53  |      1.63x      |     1.63x    |   2.34x  |
     Cortex-A57  |      1.43x      |     1.59x    |   1.95x  |
     Cortex-A73  |      1.26x      |     1.56x    |     ?    |

The core crypto code was authored by Andy Polyakov of the OpenSSL
project, in collaboration with whom the upstream code was adapted so
that this module can be built from the same version of sha512-armv8.pl.

The version in this patch was taken from OpenSSL commit

   866e505e0d66 sha/asm/sha512-armv8.pl: add NEON version of SHA256.

* The core SHA algorithm is fundamentally sequential, but there is a
  secondary transformation involved, called the schedule update, which
  can be performed independently. The NEON version of SHA-224/SHA-256
  only implements this part of the algorithm using NEON instructions,
  the sequential part is always done using scalar instructions.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---

This supersedes the SHA-256-NEON-only patch I sent out about 6 weeks ago.

Will, Catalin: note that this pulls in a .pl script, and adds a build rule
locally in arch/arm64/crypto to generate .S files on the fly from Perl
scripts. I will leave it to you to decide whether you are ok with this as
is, or whether you prefer .S_shipped files, in which case the Perl script
is only included as a reference (this is how we did it for arch/arm in the
past, but given that it adds about 3000 lines of generated code to the patch,
I think we may want to simply keep it as below)
 
 arch/arm64/crypto/Kconfig         |   8 +
 arch/arm64/crypto/Makefile        |  15 +
 arch/arm64/crypto/sha256-glue.c   | 185 +++++
 arch/arm64/crypto/sha512-armv8.pl | 778 ++++++++++++++++++++
 arch/arm64/crypto/sha512-glue.c   |  94 +++
 5 files changed, 1080 insertions(+)

diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig
index 2cf32e9887e1..5f4a617e2957 100644
--- a/arch/arm64/crypto/Kconfig
+++ b/arch/arm64/crypto/Kconfig
@@ -8,6 +8,14 @@ menuconfig ARM64_CRYPTO
 
 if ARM64_CRYPTO
 
+config CRYPTO_SHA256_ARM64
+	tristate "SHA-224/SHA-256 digest algorithm for arm64"
+	select CRYPTO_HASH
+
+config CRYPTO_SHA512_ARM64
+	tristate "SHA-384/SHA-512 digest algorithm for arm64"
+	select CRYPTO_HASH
+
 config CRYPTO_SHA1_ARM64_CE
 	tristate "SHA-1 digest algorithm (ARMv8 Crypto Extensions)"
 	depends on ARM64 && KERNEL_MODE_NEON
diff --git a/arch/arm64/crypto/Makefile b/arch/arm64/crypto/Makefile
index abb79b3cfcfe..861589faf6ef 100644
--- a/arch/arm64/crypto/Makefile
+++ b/arch/arm64/crypto/Makefile
@@ -29,6 +29,12 @@ aes-ce-blk-y := aes-glue-ce.o aes-ce.o
 obj-$(CONFIG_CRYPTO_AES_ARM64_NEON_BLK) += aes-neon-blk.o
 aes-neon-blk-y := aes-glue-neon.o aes-neon.o
 
+obj-$(CONFIG_CRYPTO_SHA256_ARM64) += sha256-arm64.o
+sha256-arm64-y := sha256-glue.o sha256-core.o
+
+obj-$(CONFIG_CRYPTO_SHA512_ARM64) += sha512-arm64.o
+sha512-arm64-y := sha512-glue.o sha512-core.o
+
 AFLAGS_aes-ce.o		:= -DINTERLEAVE=4
 AFLAGS_aes-neon.o	:= -DINTERLEAVE=4
 
@@ -40,3 +46,12 @@ CFLAGS_crc32-arm64.o	:= -mcpu=generic+crc
 
 $(obj)/aes-glue-%.o: $(src)/aes-glue.c FORCE
 	$(call if_changed_rule,cc_o_c)
+
+quiet_cmd_perl = PERLASM $@
+      cmd_perl = $(PERL) $(<) void $(@)
+
+$(obj)/sha256-core.S: $(src)/sha512-armv8.pl
+	$(call cmd,perl)
+
+$(obj)/sha512-core.S: $(src)/sha512-armv8.pl
+	$(call cmd,perl)
diff --git a/arch/arm64/crypto/sha256-glue.c b/arch/arm64/crypto/sha256-glue.c
new file mode 100644
index 000000000000..a2226f841960
--- /dev/null
+++ b/arch/arm64/crypto/sha256-glue.c
@@ -0,0 +1,185 @@
+/*
+ * Linux/arm64 port of the OpenSSL SHA256 implementation for AArch64
+ *
+ * Copyright (c) 2016 Linaro Ltd. <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the Free
+ * Software Foundation; either version 2 of the License, or (at your option)
+ * any later version.
+ *
+ */
+
+#include <asm/hwcap.h>
+#include <asm/neon.h>
+#include <asm/simd.h>
+#include <crypto/internal/hash.h>
+#include <crypto/sha.h>
+#include <crypto/sha256_base.h>
+#include <linux/cryptohash.h>
+#include <linux/types.h>
+#include <linux/string.h>
+
+MODULE_DESCRIPTION("SHA-224/SHA-256 secure hash for arm64");
+MODULE_AUTHOR("Andy Polyakov <appro@openssl.org>");
+MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
+MODULE_LICENSE("GPL v2");
+MODULE_ALIAS_CRYPTO("sha224");
+MODULE_ALIAS_CRYPTO("sha256");
+
+asmlinkage void sha256_block_data_order(u32 *digest, const void *data,
+					unsigned int num_blks);
+
+asmlinkage void sha256_block_neon(u32 *digest, const void *data,
+				  unsigned int num_blks);
+
+static int sha256_update(struct shash_desc *desc, const u8 *data,
+			 unsigned int len)
+{
+	return sha256_base_do_update(desc, data, len,
+				(sha256_block_fn *)sha256_block_data_order);
+}
+
+static int sha256_finup(struct shash_desc *desc, const u8 *data,
+			unsigned int len, u8 *out)
+{
+	if (len)
+		sha256_base_do_update(desc, data, len,
+				(sha256_block_fn *)sha256_block_data_order);
+	sha256_base_do_finalize(desc,
+				(sha256_block_fn *)sha256_block_data_order);
+
+	return sha256_base_finish(desc, out);
+}
+
+static int sha256_final(struct shash_desc *desc, u8 *out)
+{
+	return sha256_finup(desc, NULL, 0, out);
+}
+
+static struct shash_alg algs[] = { {
+	.digestsize		= SHA256_DIGEST_SIZE,
+	.init			= sha256_base_init,
+	.update			= sha256_update,
+	.final			= sha256_final,
+	.finup			= sha256_finup,
+	.descsize		= sizeof(struct sha256_state),
+	.base.cra_name		= "sha256",
+	.base.cra_driver_name	= "sha256-arm64",
+	.base.cra_priority	= 100,
+	.base.cra_flags		= CRYPTO_ALG_TYPE_SHASH,
+	.base.cra_blocksize	= SHA256_BLOCK_SIZE,
+	.base.cra_module	= THIS_MODULE,
+}, {
+	.digestsize		= SHA224_DIGEST_SIZE,
+	.init			= sha224_base_init,
+	.update			= sha256_update,
+	.final			= sha256_final,
+	.finup			= sha256_finup,
+	.descsize		= sizeof(struct sha256_state),
+	.base.cra_name		= "sha224",
+	.base.cra_driver_name	= "sha224-arm64",
+	.base.cra_priority	= 100,
+	.base.cra_flags		= CRYPTO_ALG_TYPE_SHASH,
+	.base.cra_blocksize	= SHA224_BLOCK_SIZE,
+	.base.cra_module	= THIS_MODULE,
+} };
+
+static int sha256_update_neon(struct shash_desc *desc, const u8 *data,
+			      unsigned int len)
+{
+	/*
+	 * Stacking and unstacking a substantial slice of the NEON register
+	 * file may significantly affect performance for small updates when
+	 * executing in interrupt context, so fall back to the scalar code
+	 * in that case.
+	 */
+	if (!may_use_simd())
+		return sha256_base_do_update(desc, data, len,
+				(sha256_block_fn *)sha256_block_data_order);
+
+	kernel_neon_begin();
+	sha256_base_do_update(desc, data, len,
+				(sha256_block_fn *)sha256_block_neon);
+	kernel_neon_end();
+
+	return 0;
+}
+
+static int sha256_finup_neon(struct shash_desc *desc, const u8 *data,
+			     unsigned int len, u8 *out)
+{
+	if (!may_use_simd()) {
+		if (len)
+			sha256_base_do_update(desc, data, len,
+				(sha256_block_fn *)sha256_block_data_order);
+		sha256_base_do_finalize(desc,
+				(sha256_block_fn *)sha256_block_data_order);
+	} else {
+		kernel_neon_begin();
+		if (len)
+			sha256_base_do_update(desc, data, len,
+				(sha256_block_fn *)sha256_block_neon);
+		sha256_base_do_finalize(desc,
+				(sha256_block_fn *)sha256_block_neon);
+		kernel_neon_end();
+	}
+	return sha256_base_finish(desc, out);
+}
+
+static int sha256_final_neon(struct shash_desc *desc, u8 *out)
+{
+	return sha256_finup_neon(desc, NULL, 0, out);
+}
+
+static struct shash_alg neon_algs[] = { {
+	.digestsize		= SHA256_DIGEST_SIZE,
+	.init			= sha256_base_init,
+	.update			= sha256_update_neon,
+	.final			= sha256_final_neon,
+	.finup			= sha256_finup_neon,
+	.descsize		= sizeof(struct sha256_state),
+	.base.cra_name		= "sha256",
+	.base.cra_driver_name	= "sha256-arm64-neon",
+	.base.cra_priority	= 150,
+	.base.cra_flags		= CRYPTO_ALG_TYPE_SHASH,
+	.base.cra_blocksize	= SHA256_BLOCK_SIZE,
+	.base.cra_module	= THIS_MODULE,
+}, {
+	.digestsize		= SHA224_DIGEST_SIZE,
+	.init			= sha224_base_init,
+	.update			= sha256_update_neon,
+	.final			= sha256_final_neon,
+	.finup			= sha256_finup_neon,
+	.descsize		= sizeof(struct sha256_state),
+	.base.cra_name		= "sha224",
+	.base.cra_driver_name	= "sha224-arm64-neon",
+	.base.cra_priority	= 150,
+	.base.cra_flags		= CRYPTO_ALG_TYPE_SHASH,
+	.base.cra_blocksize	= SHA224_BLOCK_SIZE,
+	.base.cra_module	= THIS_MODULE,
+} };
+
+static int __init sha256_mod_init(void)
+{
+	int ret = crypto_register_shashes(algs, ARRAY_SIZE(algs));
+	if (ret)
+		return ret;
+
+	if (elf_hwcap & HWCAP_ASIMD) {
+		ret = crypto_register_shashes(neon_algs, ARRAY_SIZE(neon_algs));
+		if (ret)
+			crypto_unregister_shashes(algs, ARRAY_SIZE(algs));
+	}
+	return ret;
+}
+
+static void __exit sha256_mod_fini(void)
+{
+	if (elf_hwcap & HWCAP_ASIMD)
+		crypto_unregister_shashes(neon_algs, ARRAY_SIZE(neon_algs));
+	crypto_unregister_shashes(algs, ARRAY_SIZE(algs));
+}
+
+module_init(sha256_mod_init);
+module_exit(sha256_mod_fini);
diff --git a/arch/arm64/crypto/sha512-armv8.pl b/arch/arm64/crypto/sha512-armv8.pl
new file mode 100644
index 000000000000..ffae5f23bcd8
--- /dev/null
+++ b/arch/arm64/crypto/sha512-armv8.pl
@@ -0,0 +1,778 @@
+#! /usr/bin/env perl
+# Copyright 2014-2016 The OpenSSL Project Authors. All Rights Reserved.
+#
+# Licensed under the OpenSSL license (the "License").  You may not use
+# this file except in compliance with the License.  You can obtain a copy
+# in the file LICENSE in the source distribution or at
+# https://www.openssl.org/source/license.html
+
+# ====================================================================
+# Written by Andy Polyakov <appro@openssl.org> for the OpenSSL
+# project. The module is, however, dual licensed under OpenSSL and
+# CRYPTOGAMS licenses depending on where you obtain it. For further
+# details see http://www.openssl.org/~appro/cryptogams/.
+#
+# Permission to use under GPLv2 terms is granted.
+# ====================================================================
+#
+# SHA256/512 for ARMv8.
+#
+# Performance in cycles per processed byte and improvement coefficient
+# over code generated with "default" compiler:
+#
+#		SHA256-hw	SHA256(*)	SHA512
+# Apple A7	1.97		10.5 (+33%)	6.73 (-1%(**))
+# Cortex-A53	2.38		15.5 (+115%)	10.0 (+150%(***))
+# Cortex-A57	2.31		11.6 (+86%)	7.51 (+260%(***))
+# Denver	2.01		10.5 (+26%)	6.70 (+8%)
+# X-Gene			20.0 (+100%)	12.8 (+300%(***))
+# Mongoose	2.36		13.0 (+50%)	8.36 (+33%)
+#
+# (*)	Software SHA256 results are of lesser relevance, presented
+#	mostly for informational purposes.
+# (**)	The result is a trade-off: it's possible to improve it by
+#	10% (or by 1 cycle per round), but at the cost of 20% loss
+#	on Cortex-A53 (or by 4 cycles per round).
+# (***)	Super-impressive coefficients over gcc-generated code are
+#	indication of some compiler "pathology", most notably code
+#	generated with -mgeneral-regs-only is significanty faster
+#	and the gap is only 40-90%.
+#
+# October 2016.
+#
+# Originally it was reckoned that it makes no sense to implement NEON
+# version of SHA256 for 64-bit processors. This is because performance
+# improvement on most wide-spread Cortex-A5x processors was observed
+# to be marginal, same on Cortex-A53 and ~10% on A57. But then it was
+# observed that 32-bit NEON SHA256 performs significantly better than
+# 64-bit scalar version on *some* of the more recent processors. As
+# result 64-bit NEON version of SHA256 was added to provide best
+# all-round performance. For example it executes ~30% faster on X-Gene
+# and Mongoose. [For reference, NEON version of SHA512 is bound to
+# deliver much less improvement, likely *negative* on Cortex-A5x.
+# Which is why NEON support is limited to SHA256.]
+
+$output=pop;
+$flavour=pop;
+
+if ($flavour && $flavour ne "void") {
+    $0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1;
+    ( $xlate="${dir}arm-xlate.pl" and -f $xlate ) or
+    ( $xlate="${dir}../../perlasm/arm-xlate.pl" and -f $xlate) or
+    die "can't locate arm-xlate.pl";
+
+    open OUT,"| \"$^X\" $xlate $flavour $output";
+    *STDOUT=*OUT;
+} else {
+    open STDOUT,">$output";
+}
+
+if ($output =~ /512/) {
+	$BITS=512;
+	$SZ=8;
+	@Sigma0=(28,34,39);
+	@Sigma1=(14,18,41);
+	@sigma0=(1,  8, 7);
+	@sigma1=(19,61, 6);
+	$rounds=80;
+	$reg_t="x";
+} else {
+	$BITS=256;
+	$SZ=4;
+	@Sigma0=( 2,13,22);
+	@Sigma1=( 6,11,25);
+	@sigma0=( 7,18, 3);
+	@sigma1=(17,19,10);
+	$rounds=64;
+	$reg_t="w";
+}
+
+$func="sha${BITS}_block_data_order";
+
+($ctx,$inp,$num,$Ktbl)=map("x$_",(0..2,30));
+
+@X=map("$reg_t$_",(3..15,0..2));
+@V=($A,$B,$C,$D,$E,$F,$G,$H)=map("$reg_t$_",(20..27));
+($t0,$t1,$t2,$t3)=map("$reg_t$_",(16,17,19,28));
+
+sub BODY_00_xx {
+my ($i,$a,$b,$c,$d,$e,$f,$g,$h)=@_;
+my $j=($i+1)&15;
+my ($T0,$T1,$T2)=(@X[($i-8)&15],@X[($i-9)&15],@X[($i-10)&15]);
+   $T0=@X[$i+3] if ($i<11);
+
+$code.=<<___	if ($i<16);
+#ifndef	__ARMEB__
+	rev	@X[$i],@X[$i]			// $i
+#endif
+___
+$code.=<<___	if ($i<13 && ($i&1));
+	ldp	@X[$i+1],@X[$i+2],[$inp],#2*$SZ
+___
+$code.=<<___	if ($i==13);
+	ldp	@X[14],@X[15],[$inp]
+___
+$code.=<<___	if ($i>=14);
+	ldr	@X[($i-11)&15],[sp,#`$SZ*(($i-11)%4)`]
+___
+$code.=<<___	if ($i>0 && $i<16);
+	add	$a,$a,$t1			// h+=Sigma0(a)
+___
+$code.=<<___	if ($i>=11);
+	str	@X[($i-8)&15],[sp,#`$SZ*(($i-8)%4)`]
+___
+# While ARMv8 specifies merged rotate-n-logical operation such as
+# 'eor x,y,z,ror#n', it was found to negatively affect performance
+# on Apple A7. The reason seems to be that it requires even 'y' to
+# be available earlier. This means that such merged instruction is
+# not necessarily best choice on critical path... On the other hand
+# Cortex-A5x handles merged instructions much better than disjoint
+# rotate and logical... See (**) footnote above.
+$code.=<<___	if ($i<15);
+	ror	$t0,$e,#$Sigma1[0]
+	add	$h,$h,$t2			// h+=K[i]
+	eor	$T0,$e,$e,ror#`$Sigma1[2]-$Sigma1[1]`
+	and	$t1,$f,$e
+	bic	$t2,$g,$e
+	add	$h,$h,@X[$i&15]			// h+=X[i]
+	orr	$t1,$t1,$t2			// Ch(e,f,g)
+	eor	$t2,$a,$b			// a^b, b^c in next round
+	eor	$t0,$t0,$T0,ror#$Sigma1[1]	// Sigma1(e)
+	ror	$T0,$a,#$Sigma0[0]
+	add	$h,$h,$t1			// h+=Ch(e,f,g)
+	eor	$t1,$a,$a,ror#`$Sigma0[2]-$Sigma0[1]`
+	add	$h,$h,$t0			// h+=Sigma1(e)
+	and	$t3,$t3,$t2			// (b^c)&=(a^b)
+	add	$d,$d,$h			// d+=h
+	eor	$t3,$t3,$b			// Maj(a,b,c)
+	eor	$t1,$T0,$t1,ror#$Sigma0[1]	// Sigma0(a)
+	add	$h,$h,$t3			// h+=Maj(a,b,c)
+	ldr	$t3,[$Ktbl],#$SZ		// *K++, $t2 in next round
+	//add	$h,$h,$t1			// h+=Sigma0(a)
+___
+$code.=<<___	if ($i>=15);
+	ror	$t0,$e,#$Sigma1[0]
+	add	$h,$h,$t2			// h+=K[i]
+	ror	$T1,@X[($j+1)&15],#$sigma0[0]
+	and	$t1,$f,$e
+	ror	$T2,@X[($j+14)&15],#$sigma1[0]
+	bic	$t2,$g,$e
+	ror	$T0,$a,#$Sigma0[0]
+	add	$h,$h,@X[$i&15]			// h+=X[i]
+	eor	$t0,$t0,$e,ror#$Sigma1[1]
+	eor	$T1,$T1,@X[($j+1)&15],ror#$sigma0[1]
+	orr	$t1,$t1,$t2			// Ch(e,f,g)
+	eor	$t2,$a,$b			// a^b, b^c in next round
+	eor	$t0,$t0,$e,ror#$Sigma1[2]	// Sigma1(e)
+	eor	$T0,$T0,$a,ror#$Sigma0[1]
+	add	$h,$h,$t1			// h+=Ch(e,f,g)
+	and	$t3,$t3,$t2			// (b^c)&=(a^b)
+	eor	$T2,$T2,@X[($j+14)&15],ror#$sigma1[1]
+	eor	$T1,$T1,@X[($j+1)&15],lsr#$sigma0[2]	// sigma0(X[i+1])
+	add	$h,$h,$t0			// h+=Sigma1(e)
+	eor	$t3,$t3,$b			// Maj(a,b,c)
+	eor	$t1,$T0,$a,ror#$Sigma0[2]	// Sigma0(a)
+	eor	$T2,$T2,@X[($j+14)&15],lsr#$sigma1[2]	// sigma1(X[i+14])
+	add	@X[$j],@X[$j],@X[($j+9)&15]
+	add	$d,$d,$h			// d+=h
+	add	$h,$h,$t3			// h+=Maj(a,b,c)
+	ldr	$t3,[$Ktbl],#$SZ		// *K++, $t2 in next round
+	add	@X[$j],@X[$j],$T1
+	add	$h,$h,$t1			// h+=Sigma0(a)
+	add	@X[$j],@X[$j],$T2
+___
+	($t2,$t3)=($t3,$t2);
+}
+
+$code.=<<___;
+#ifndef	__KERNEL__
+# include "arm_arch.h"
+#endif
+
+.text
+
+.extern	OPENSSL_armcap_P
+.globl	$func
+.type	$func,%function
+.align	6
+$func:
+___
+$code.=<<___	if ($SZ==4);
+#ifndef	__KERNEL__
+# ifdef	__ILP32__
+	ldrsw	x16,.LOPENSSL_armcap_P
+# else
+	ldr	x16,.LOPENSSL_armcap_P
+# endif
+	adr	x17,.LOPENSSL_armcap_P
+	add	x16,x16,x17
+	ldr	w16,[x16]
+	tst	w16,#ARMV8_SHA256
+	b.ne	.Lv8_entry
+	tst	w16,#ARMV7_NEON
+	b.ne	.Lneon_entry
+#endif
+___
+$code.=<<___;
+	stp	x29,x30,[sp,#-128]!
+	add	x29,sp,#0
+
+	stp	x19,x20,[sp,#16]
+	stp	x21,x22,[sp,#32]
+	stp	x23,x24,[sp,#48]
+	stp	x25,x26,[sp,#64]
+	stp	x27,x28,[sp,#80]
+	sub	sp,sp,#4*$SZ
+
+	ldp	$A,$B,[$ctx]				// load context
+	ldp	$C,$D,[$ctx,#2*$SZ]
+	ldp	$E,$F,[$ctx,#4*$SZ]
+	add	$num,$inp,$num,lsl#`log(16*$SZ)/log(2)`	// end of input
+	ldp	$G,$H,[$ctx,#6*$SZ]
+	adr	$Ktbl,.LK$BITS
+	stp	$ctx,$num,[x29,#96]
+
+.Loop:
+	ldp	@X[0],@X[1],[$inp],#2*$SZ
+	ldr	$t2,[$Ktbl],#$SZ			// *K++
+	eor	$t3,$B,$C				// magic seed
+	str	$inp,[x29,#112]
+___
+for ($i=0;$i<16;$i++)	{ &BODY_00_xx($i,@V); unshift(@V,pop(@V)); }
+$code.=".Loop_16_xx:\n";
+for (;$i<32;$i++)	{ &BODY_00_xx($i,@V); unshift(@V,pop(@V)); }
+$code.=<<___;
+	cbnz	$t2,.Loop_16_xx
+
+	ldp	$ctx,$num,[x29,#96]
+	ldr	$inp,[x29,#112]
+	sub	$Ktbl,$Ktbl,#`$SZ*($rounds+1)`		// rewind
+
+	ldp	@X[0],@X[1],[$ctx]
+	ldp	@X[2],@X[3],[$ctx,#2*$SZ]
+	add	$inp,$inp,#14*$SZ			// advance input pointer
+	ldp	@X[4],@X[5],[$ctx,#4*$SZ]
+	add	$A,$A,@X[0]
+	ldp	@X[6],@X[7],[$ctx,#6*$SZ]
+	add	$B,$B,@X[1]
+	add	$C,$C,@X[2]
+	add	$D,$D,@X[3]
+	stp	$A,$B,[$ctx]
+	add	$E,$E,@X[4]
+	add	$F,$F,@X[5]
+	stp	$C,$D,[$ctx,#2*$SZ]
+	add	$G,$G,@X[6]
+	add	$H,$H,@X[7]
+	cmp	$inp,$num
+	stp	$E,$F,[$ctx,#4*$SZ]
+	stp	$G,$H,[$ctx,#6*$SZ]
+	b.ne	.Loop
+
+	ldp	x19,x20,[x29,#16]
+	add	sp,sp,#4*$SZ
+	ldp	x21,x22,[x29,#32]
+	ldp	x23,x24,[x29,#48]
+	ldp	x25,x26,[x29,#64]
+	ldp	x27,x28,[x29,#80]
+	ldp	x29,x30,[sp],#128
+	ret
+.size	$func,.-$func
+
+.align	6
+.type	.LK$BITS,%object
+.LK$BITS:
+___
+$code.=<<___ if ($SZ==8);
+	.quad	0x428a2f98d728ae22,0x7137449123ef65cd
+	.quad	0xb5c0fbcfec4d3b2f,0xe9b5dba58189dbbc
+	.quad	0x3956c25bf348b538,0x59f111f1b605d019
+	.quad	0x923f82a4af194f9b,0xab1c5ed5da6d8118
+	.quad	0xd807aa98a3030242,0x12835b0145706fbe
+	.quad	0x243185be4ee4b28c,0x550c7dc3d5ffb4e2
+	.quad	0x72be5d74f27b896f,0x80deb1fe3b1696b1
+	.quad	0x9bdc06a725c71235,0xc19bf174cf692694
+	.quad	0xe49b69c19ef14ad2,0xefbe4786384f25e3
+	.quad	0x0fc19dc68b8cd5b5,0x240ca1cc77ac9c65
+	.quad	0x2de92c6f592b0275,0x4a7484aa6ea6e483
+	.quad	0x5cb0a9dcbd41fbd4,0x76f988da831153b5
+	.quad	0x983e5152ee66dfab,0xa831c66d2db43210
+	.quad	0xb00327c898fb213f,0xbf597fc7beef0ee4
+	.quad	0xc6e00bf33da88fc2,0xd5a79147930aa725
+	.quad	0x06ca6351e003826f,0x142929670a0e6e70
+	.quad	0x27b70a8546d22ffc,0x2e1b21385c26c926
+	.quad	0x4d2c6dfc5ac42aed,0x53380d139d95b3df
+	.quad	0x650a73548baf63de,0x766a0abb3c77b2a8
+	.quad	0x81c2c92e47edaee6,0x92722c851482353b
+	.quad	0xa2bfe8a14cf10364,0xa81a664bbc423001
+	.quad	0xc24b8b70d0f89791,0xc76c51a30654be30
+	.quad	0xd192e819d6ef5218,0xd69906245565a910
+	.quad	0xf40e35855771202a,0x106aa07032bbd1b8
+	.quad	0x19a4c116b8d2d0c8,0x1e376c085141ab53
+	.quad	0x2748774cdf8eeb99,0x34b0bcb5e19b48a8
+	.quad	0x391c0cb3c5c95a63,0x4ed8aa4ae3418acb
+	.quad	0x5b9cca4f7763e373,0x682e6ff3d6b2b8a3
+	.quad	0x748f82ee5defb2fc,0x78a5636f43172f60
+	.quad	0x84c87814a1f0ab72,0x8cc702081a6439ec
+	.quad	0x90befffa23631e28,0xa4506cebde82bde9
+	.quad	0xbef9a3f7b2c67915,0xc67178f2e372532b
+	.quad	0xca273eceea26619c,0xd186b8c721c0c207
+	.quad	0xeada7dd6cde0eb1e,0xf57d4f7fee6ed178
+	.quad	0x06f067aa72176fba,0x0a637dc5a2c898a6
+	.quad	0x113f9804bef90dae,0x1b710b35131c471b
+	.quad	0x28db77f523047d84,0x32caab7b40c72493
+	.quad	0x3c9ebe0a15c9bebc,0x431d67c49c100d4c
+	.quad	0x4cc5d4becb3e42b6,0x597f299cfc657e2a
+	.quad	0x5fcb6fab3ad6faec,0x6c44198c4a475817
+	.quad	0	// terminator
+___
+$code.=<<___ if ($SZ==4);
+	.long	0x428a2f98,0x71374491,0xb5c0fbcf,0xe9b5dba5
+	.long	0x3956c25b,0x59f111f1,0x923f82a4,0xab1c5ed5
+	.long	0xd807aa98,0x12835b01,0x243185be,0x550c7dc3
+	.long	0x72be5d74,0x80deb1fe,0x9bdc06a7,0xc19bf174
+	.long	0xe49b69c1,0xefbe4786,0x0fc19dc6,0x240ca1cc
+	.long	0x2de92c6f,0x4a7484aa,0x5cb0a9dc,0x76f988da
+	.long	0x983e5152,0xa831c66d,0xb00327c8,0xbf597fc7
+	.long	0xc6e00bf3,0xd5a79147,0x06ca6351,0x14292967
+	.long	0x27b70a85,0x2e1b2138,0x4d2c6dfc,0x53380d13
+	.long	0x650a7354,0x766a0abb,0x81c2c92e,0x92722c85
+	.long	0xa2bfe8a1,0xa81a664b,0xc24b8b70,0xc76c51a3
+	.long	0xd192e819,0xd6990624,0xf40e3585,0x106aa070
+	.long	0x19a4c116,0x1e376c08,0x2748774c,0x34b0bcb5
+	.long	0x391c0cb3,0x4ed8aa4a,0x5b9cca4f,0x682e6ff3
+	.long	0x748f82ee,0x78a5636f,0x84c87814,0x8cc70208
+	.long	0x90befffa,0xa4506ceb,0xbef9a3f7,0xc67178f2
+	.long	0	//terminator
+___
+$code.=<<___;
+.size	.LK$BITS,.-.LK$BITS
+#ifndef	__KERNEL__
+.align	3
+.LOPENSSL_armcap_P:
+# ifdef	__ILP32__
+	.long	OPENSSL_armcap_P-.
+# else
+	.quad	OPENSSL_armcap_P-.
+# endif
+#endif
+.asciz	"SHA$BITS block transform for ARMv8, CRYPTOGAMS by <appro\@openssl.org>"
+.align	2
+___
+
+if ($SZ==4) {
+my $Ktbl="x3";
+
+my ($ABCD,$EFGH,$abcd)=map("v$_.16b",(0..2));
+my @MSG=map("v$_.16b",(4..7));
+my ($W0,$W1)=("v16.4s","v17.4s");
+my ($ABCD_SAVE,$EFGH_SAVE)=("v18.16b","v19.16b");
+
+$code.=<<___;
+#ifndef	__KERNEL__
+.type	sha256_block_armv8,%function
+.align	6
+sha256_block_armv8:
+.Lv8_entry:
+	stp		x29,x30,[sp,#-16]!
+	add		x29,sp,#0
+
+	ld1.32		{$ABCD,$EFGH},[$ctx]
+	adr		$Ktbl,.LK256
+
+.Loop_hw:
+	ld1		{@MSG[0]-@MSG[3]},[$inp],#64
+	sub		$num,$num,#1
+	ld1.32		{$W0},[$Ktbl],#16
+	rev32		@MSG[0],@MSG[0]
+	rev32		@MSG[1],@MSG[1]
+	rev32		@MSG[2],@MSG[2]
+	rev32		@MSG[3],@MSG[3]
+	orr		$ABCD_SAVE,$ABCD,$ABCD		// offload
+	orr		$EFGH_SAVE,$EFGH,$EFGH
+___
+for($i=0;$i<12;$i++) {
+$code.=<<___;
+	ld1.32		{$W1},[$Ktbl],#16
+	add.i32		$W0,$W0,@MSG[0]
+	sha256su0	@MSG[0],@MSG[1]
+	orr		$abcd,$ABCD,$ABCD
+	sha256h		$ABCD,$EFGH,$W0
+	sha256h2	$EFGH,$abcd,$W0
+	sha256su1	@MSG[0],@MSG[2],@MSG[3]
+___
+	($W0,$W1)=($W1,$W0);	push(@MSG,shift(@MSG));
+}
+$code.=<<___;
+	ld1.32		{$W1},[$Ktbl],#16
+	add.i32		$W0,$W0,@MSG[0]
+	orr		$abcd,$ABCD,$ABCD
+	sha256h		$ABCD,$EFGH,$W0
+	sha256h2	$EFGH,$abcd,$W0
+
+	ld1.32		{$W0},[$Ktbl],#16
+	add.i32		$W1,$W1,@MSG[1]
+	orr		$abcd,$ABCD,$ABCD
+	sha256h		$ABCD,$EFGH,$W1
+	sha256h2	$EFGH,$abcd,$W1
+
+	ld1.32		{$W1},[$Ktbl]
+	add.i32		$W0,$W0,@MSG[2]
+	sub		$Ktbl,$Ktbl,#$rounds*$SZ-16	// rewind
+	orr		$abcd,$ABCD,$ABCD
+	sha256h		$ABCD,$EFGH,$W0
+	sha256h2	$EFGH,$abcd,$W0
+
+	add.i32		$W1,$W1,@MSG[3]
+	orr		$abcd,$ABCD,$ABCD
+	sha256h		$ABCD,$EFGH,$W1
+	sha256h2	$EFGH,$abcd,$W1
+
+	add.i32		$ABCD,$ABCD,$ABCD_SAVE
+	add.i32		$EFGH,$EFGH,$EFGH_SAVE
+
+	cbnz		$num,.Loop_hw
+
+	st1.32		{$ABCD,$EFGH},[$ctx]
+
+	ldr		x29,[sp],#16
+	ret
+.size	sha256_block_armv8,.-sha256_block_armv8
+#endif
+___
+}
+
+if ($SZ==4) {	######################################### NEON stuff #
+# You'll surely note a lot of similarities with sha256-armv4 module,
+# and of course it's not a coincidence. sha256-armv4 was used as
+# initial template, but was adapted for ARMv8 instruction set and
+# extensively re-tuned for all-round performance.
+
+my @V = ($A,$B,$C,$D,$E,$F,$G,$H) = map("w$_",(3..10));
+my ($t0,$t1,$t2,$t3,$t4) = map("w$_",(11..15));
+my $Ktbl="x16";
+my $Xfer="x17";
+my @X = map("q$_",(0..3));
+my ($T0,$T1,$T2,$T3,$T4,$T5,$T6,$T7) = map("q$_",(4..7,16..19));
+my $j=0;
+
+sub AUTOLOAD()          # thunk [simplified] x86-style perlasm
+{ my $opcode = $AUTOLOAD; $opcode =~ s/.*:://; $opcode =~ s/_/\./;
+  my $arg = pop;
+    $arg = "#$arg" if ($arg*1 eq $arg);
+    $code .= "\t$opcode\t".join(',',@_,$arg)."\n";
+}
+
+sub Dscalar { shift =~ m|[qv]([0-9]+)|?"d$1":""; }
+sub Dlo     { shift =~ m|[qv]([0-9]+)|?"v$1.d[0]":""; }
+sub Dhi     { shift =~ m|[qv]([0-9]+)|?"v$1.d[1]":""; }
+
+sub Xupdate()
+{ use integer;
+  my $body = shift;
+  my @insns = (&$body,&$body,&$body,&$body);
+  my ($a,$b,$c,$d,$e,$f,$g,$h);
+
+	&ext_8		($T0,@X[0],@X[1],4);	# X[1..4]
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	&ext_8		($T3,@X[2],@X[3],4);	# X[9..12]
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	&mov		(&Dscalar($T7),&Dhi(@X[3]));	# X[14..15]
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	&ushr_32	($T2,$T0,$sigma0[0]);
+	 eval(shift(@insns));
+	&ushr_32	($T1,$T0,$sigma0[2]);
+	 eval(shift(@insns));
+	&add_32 	(@X[0],@X[0],$T3);	# X[0..3] += X[9..12]
+	 eval(shift(@insns));
+	&sli_32		($T2,$T0,32-$sigma0[0]);
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	&ushr_32	($T3,$T0,$sigma0[1]);
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	&eor_8		($T1,$T1,$T2);
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	&sli_32		($T3,$T0,32-$sigma0[1]);
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	  &ushr_32	($T4,$T7,$sigma1[0]);
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	&eor_8		($T1,$T1,$T3);		# sigma0(X[1..4])
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	  &sli_32	($T4,$T7,32-$sigma1[0]);
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	  &ushr_32	($T5,$T7,$sigma1[2]);
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	  &ushr_32	($T3,$T7,$sigma1[1]);
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	&add_32		(@X[0],@X[0],$T1);	# X[0..3] += sigma0(X[1..4])
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	  &sli_u32	($T3,$T7,32-$sigma1[1]);
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	  &eor_8	($T5,$T5,$T4);
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	  &eor_8	($T5,$T5,$T3);		# sigma1(X[14..15])
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	&add_32		(@X[0],@X[0],$T5);	# X[0..1] += sigma1(X[14..15])
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	  &ushr_32	($T6,@X[0],$sigma1[0]);
+	 eval(shift(@insns));
+	  &ushr_32	($T7,@X[0],$sigma1[2]);
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	  &sli_32	($T6,@X[0],32-$sigma1[0]);
+	 eval(shift(@insns));
+	  &ushr_32	($T5,@X[0],$sigma1[1]);
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	  &eor_8	($T7,$T7,$T6);
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	  &sli_32	($T5,@X[0],32-$sigma1[1]);
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	&ld1_32		("{$T0}","[$Ktbl], #16");
+	 eval(shift(@insns));
+	  &eor_8	($T7,$T7,$T5);		# sigma1(X[16..17])
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	&eor_8		($T5,$T5,$T5);
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	&mov		(&Dhi($T5), &Dlo($T7));
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	&add_32		(@X[0],@X[0],$T5);	# X[2..3] += sigma1(X[16..17])
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	&add_32		($T0,$T0,@X[0]);
+	 while($#insns>=1) { eval(shift(@insns)); }
+	&st1_32		("{$T0}","[$Xfer], #16");
+	 eval(shift(@insns));
+
+	push(@X,shift(@X));		# "rotate" X[]
+}
+
+sub Xpreload()
+{ use integer;
+  my $body = shift;
+  my @insns = (&$body,&$body,&$body,&$body);
+  my ($a,$b,$c,$d,$e,$f,$g,$h);
+
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	&ld1_8		("{@X[0]}","[$inp],#16");
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	&ld1_32		("{$T0}","[$Ktbl],#16");
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	&rev32		(@X[0],@X[0]);
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	&add_32		($T0,$T0,@X[0]);
+	 foreach (@insns) { eval; }	# remaining instructions
+	&st1_32		("{$T0}","[$Xfer], #16");
+
+	push(@X,shift(@X));		# "rotate" X[]
+}
+
+sub body_00_15 () {
+	(
+	'($a,$b,$c,$d,$e,$f,$g,$h)=@V;'.
+	'&add	($h,$h,$t1)',			# h+=X[i]+K[i]
+	'&add	($a,$a,$t4);'.			# h+=Sigma0(a) from the past
+	'&and	($t1,$f,$e)',
+	'&bic	($t4,$g,$e)',
+	'&eor	($t0,$e,$e,"ror#".($Sigma1[1]-$Sigma1[0]))',
+	'&add	($a,$a,$t2)',			# h+=Maj(a,b,c) from the past
+	'&orr	($t1,$t1,$t4)',			# Ch(e,f,g)
+	'&eor	($t0,$t0,$e,"ror#".($Sigma1[2]-$Sigma1[0]))',	# Sigma1(e)
+	'&eor	($t4,$a,$a,"ror#".($Sigma0[1]-$Sigma0[0]))',
+	'&add	($h,$h,$t1)',			# h+=Ch(e,f,g)
+	'&ror	($t0,$t0,"#$Sigma1[0]")',
+	'&eor	($t2,$a,$b)',			# a^b, b^c in next round
+	'&eor	($t4,$t4,$a,"ror#".($Sigma0[2]-$Sigma0[0]))',	# Sigma0(a)
+	'&add	($h,$h,$t0)',			# h+=Sigma1(e)
+	'&ldr	($t1,sprintf "[sp,#%d]",4*(($j+1)&15))	if (($j&15)!=15);'.
+	'&ldr	($t1,"[$Ktbl]")				if ($j==15);'.
+	'&and	($t3,$t3,$t2)',			# (b^c)&=(a^b)
+	'&ror	($t4,$t4,"#$Sigma0[0]")',
+	'&add	($d,$d,$h)',			# d+=h
+	'&eor	($t3,$t3,$b)',			# Maj(a,b,c)
+	'$j++;	unshift(@V,pop(@V)); ($t2,$t3)=($t3,$t2);'
+	)
+}
+
+$code.=<<___;
+#ifdef	__KERNEL__
+.globl	sha256_block_neon
+#endif
+.type	sha256_block_neon,%function
+.align	4
+sha256_block_neon:
+.Lneon_entry:
+	stp	x29, x30, [sp, #-16]!
+	mov	x29, sp
+	sub	sp,sp,#16*4
+
+	adr	$Ktbl,.LK256
+	add	$num,$inp,$num,lsl#6	// len to point at the end of inp
+
+	ld1.8	{@X[0]},[$inp], #16
+	ld1.8	{@X[1]},[$inp], #16
+	ld1.8	{@X[2]},[$inp], #16
+	ld1.8	{@X[3]},[$inp], #16
+	ld1.32	{$T0},[$Ktbl], #16
+	ld1.32	{$T1},[$Ktbl], #16
+	ld1.32	{$T2},[$Ktbl], #16
+	ld1.32	{$T3},[$Ktbl], #16
+	rev32	@X[0],@X[0]		// yes, even on
+	rev32	@X[1],@X[1]		// big-endian
+	rev32	@X[2],@X[2]
+	rev32	@X[3],@X[3]
+	mov	$Xfer,sp
+	add.32	$T0,$T0,@X[0]
+	add.32	$T1,$T1,@X[1]
+	add.32	$T2,$T2,@X[2]
+	st1.32	{$T0-$T1},[$Xfer], #32
+	add.32	$T3,$T3,@X[3]
+	st1.32	{$T2-$T3},[$Xfer]
+	sub	$Xfer,$Xfer,#32
+
+	ldp	$A,$B,[$ctx]
+	ldp	$C,$D,[$ctx,#8]
+	ldp	$E,$F,[$ctx,#16]
+	ldp	$G,$H,[$ctx,#24]
+	ldr	$t1,[sp,#0]
+	mov	$t2,wzr
+	eor	$t3,$B,$C
+	mov	$t4,wzr
+	b	.L_00_48
+
+.align	4
+.L_00_48:
+___
+	&Xupdate(\&body_00_15);
+	&Xupdate(\&body_00_15);
+	&Xupdate(\&body_00_15);
+	&Xupdate(\&body_00_15);
+$code.=<<___;
+	cmp	$t1,#0				// check for K256 terminator
+	ldr	$t1,[sp,#0]
+	sub	$Xfer,$Xfer,#64
+	bne	.L_00_48
+
+	sub	$Ktbl,$Ktbl,#256		// rewind $Ktbl
+	cmp	$inp,$num
+	mov	$Xfer, #64
+	csel	$Xfer, $Xfer, xzr, eq
+	sub	$inp,$inp,$Xfer			// avoid SEGV
+	mov	$Xfer,sp
+___
+	&Xpreload(\&body_00_15);
+	&Xpreload(\&body_00_15);
+	&Xpreload(\&body_00_15);
+	&Xpreload(\&body_00_15);
+$code.=<<___;
+	add	$A,$A,$t4			// h+=Sigma0(a) from the past
+	ldp	$t0,$t1,[$ctx,#0]
+	add	$A,$A,$t2			// h+=Maj(a,b,c) from the past
+	ldp	$t2,$t3,[$ctx,#8]
+	add	$A,$A,$t0			// accumulate
+	add	$B,$B,$t1
+	ldp	$t0,$t1,[$ctx,#16]
+	add	$C,$C,$t2
+	add	$D,$D,$t3
+	ldp	$t2,$t3,[$ctx,#24]
+	add	$E,$E,$t0
+	add	$F,$F,$t1
+	 ldr	$t1,[sp,#0]
+	stp	$A,$B,[$ctx,#0]
+	add	$G,$G,$t2
+	 mov	$t2,wzr
+	stp	$C,$D,[$ctx,#8]
+	add	$H,$H,$t3
+	stp	$E,$F,[$ctx,#16]
+	 eor	$t3,$B,$C
+	stp	$G,$H,[$ctx,#24]
+	 mov	$t4,wzr
+	 mov	$Xfer,sp
+	b.ne	.L_00_48
+
+	ldr	x29,[x29]
+	add	sp,sp,#16*4+16
+	ret
+.size	sha256_block_neon,.-sha256_block_neon
+___
+}
+
+$code.=<<___;
+#ifndef	__KERNEL__
+.comm	OPENSSL_armcap_P,4,4
+#endif
+___
+
+{   my  %opcode = (
+	"sha256h"	=> 0x5e004000,	"sha256h2"	=> 0x5e005000,
+	"sha256su0"	=> 0x5e282800,	"sha256su1"	=> 0x5e006000	);
+
+    sub unsha256 {
+	my ($mnemonic,$arg)=@_;
+
+	$arg =~ m/[qv]([0-9]+)[^,]*,\s*[qv]([0-9]+)[^,]*(?:,\s*[qv]([0-9]+))?/o
+	&&
+	sprintf ".inst\t0x%08x\t//%s %s",
+			$opcode{$mnemonic}|$1|($2<<5)|($3<<16),
+			$mnemonic,$arg;
+    }
+}
+
+open SELF,$0;
+while(<SELF>) {
+        next if (/^#!/);
+        last if (!s/^#/\/\// and !/^$/);
+        print;
+}
+close SELF;
+
+foreach(split("\n",$code)) {
+
+	s/\`([^\`]*)\`/eval($1)/ge;
+
+	s/\b(sha256\w+)\s+([qv].*)/unsha256($1,$2)/ge;
+
+	s/\bq([0-9]+)\b/v$1.16b/g;		# old->new registers
+
+	s/\.[ui]?8(\s)/$1/;
+	s/\.\w?32\b//		and s/\.16b/\.4s/g;
+	m/(ld|st)1[^\[]+\[0\]/	and s/\.4s/\.s/g;
+
+	print $_,"\n";
+}
+
+close STDOUT;
diff --git a/arch/arm64/crypto/sha512-glue.c b/arch/arm64/crypto/sha512-glue.c
new file mode 100644
index 000000000000..aff35c9992a4
--- /dev/null
+++ b/arch/arm64/crypto/sha512-glue.c
@@ -0,0 +1,94 @@
+/*
+ * Linux/arm64 port of the OpenSSL SHA512 implementation for AArch64
+ *
+ * Copyright (c) 2016 Linaro Ltd. <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the Free
+ * Software Foundation; either version 2 of the License, or (at your option)
+ * any later version.
+ *
+ */
+
+#include <crypto/internal/hash.h>
+#include <linux/cryptohash.h>
+#include <linux/types.h>
+#include <linux/string.h>
+#include <crypto/sha.h>
+#include <crypto/sha512_base.h>
+#include <asm/neon.h>
+
+MODULE_DESCRIPTION("SHA-384/SHA-512 secure hash for arm64");
+MODULE_AUTHOR("Andy Polyakov <appro@openssl.org>");
+MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
+MODULE_LICENSE("GPL v2");
+MODULE_ALIAS_CRYPTO("sha384");
+MODULE_ALIAS_CRYPTO("sha512");
+
+asmlinkage void sha512_block_data_order(u32 *digest, const void *data,
+					unsigned int num_blks);
+
+static int sha512_update(struct shash_desc *desc, const u8 *data,
+			 unsigned int len)
+{
+	return sha512_base_do_update(desc, data, len,
+			(sha512_block_fn *)sha512_block_data_order);
+}
+
+static int sha512_finup(struct shash_desc *desc, const u8 *data,
+			unsigned int len, u8 *out)
+{
+	if (len)
+		sha512_base_do_update(desc, data, len,
+			(sha512_block_fn *)sha512_block_data_order);
+	sha512_base_do_finalize(desc,
+			(sha512_block_fn *)sha512_block_data_order);
+
+	return sha512_base_finish(desc, out);
+}
+
+static int sha512_final(struct shash_desc *desc, u8 *out)
+{
+	return sha512_finup(desc, NULL, 0, out);
+}
+
+static struct shash_alg algs[] = { {
+	.digestsize		= SHA512_DIGEST_SIZE,
+	.init			= sha512_base_init,
+	.update			= sha512_update,
+	.final			= sha512_final,
+	.finup			= sha512_finup,
+	.descsize		= sizeof(struct sha512_state),
+	.base.cra_name		= "sha512",
+	.base.cra_driver_name	= "sha512-arm64",
+	.base.cra_priority	= 150,
+	.base.cra_flags		= CRYPTO_ALG_TYPE_SHASH,
+	.base.cra_blocksize	= SHA512_BLOCK_SIZE,
+	.base.cra_module	= THIS_MODULE,
+}, {
+	.digestsize		= SHA384_DIGEST_SIZE,
+	.init			= sha384_base_init,
+	.update			= sha512_update,
+	.final			= sha512_final,
+	.finup			= sha512_finup,
+	.descsize		= sizeof(struct sha512_state),
+	.base.cra_name		= "sha384",
+	.base.cra_driver_name	= "sha384-arm64",
+	.base.cra_priority	= 150,
+	.base.cra_flags		= CRYPTO_ALG_TYPE_SHASH,
+	.base.cra_blocksize	= SHA384_BLOCK_SIZE,
+	.base.cra_module	= THIS_MODULE,
+} };
+
+static int __init sha512_mod_init(void)
+{
+	return crypto_register_shashes(algs, ARRAY_SIZE(algs));
+}
+
+static void __exit sha512_mod_fini(void)
+{
+	crypto_unregister_shashes(algs, ARRAY_SIZE(algs));
+}
+
+module_init(sha512_mod_init);
+module_exit(sha512_mod_fini);
-- 
2.7.4

^ permalink raw reply related

* Re: algif_aead: AIO broken with more than one iocb
From: Stephan Mueller @ 2016-11-11 13:46 UTC (permalink / raw)
  To: Herbert Xu; +Cc: linux-crypto
In-Reply-To: <20160913101246.GA30851@gondor.apana.org.au>

Am Dienstag, 13. September 2016, 18:12:46 CET schrieb Herbert Xu:

Hi Herbert,

> On Sun, Sep 11, 2016 at 04:59:19AM +0200, Stephan Mueller wrote:
> > Hi Herbert,
> > 
> > The AIO support for algif_aead is broken when submitting more than one
> > iocb.> 
> > The break happens in aead_recvmsg_async at the following code:
> >         /* ensure output buffer is sufficiently large */
> >         if (usedpages < outlen)
> >         
> >                 goto free;
> > 
> > The reason is that when submitting, say, two iocb, ctx->used contains the
> > buffer length for two AEAD operations (as expected). However, the recvmsg
> > code
> I don't think we should allow that.  We should make it so that you
> must start a recvmsg before you can send data for a new request.
> 
> Remember that the async path should be identical to the sync path,
> except that you don't wait for completion.

Just as a followup: with the patch submitted the other day to cover the AAD 
and tag handling, the algif_aead now supports also multiple iocb.

Ciao
Stephan

^ permalink raw reply

* [PATCH] crypto: nx - drop duplicate header types.h
From: Geliang Tang @ 2016-11-11 12:50 UTC (permalink / raw)
  To: Leonidas S. Barbosa, Paulo Flabiano Smorigo,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Herbert Xu, David S. Miller
  Cc: Geliang Tang, linux-crypto, linuxppc-dev, linux-kernel

Drop duplicate header types.h from nx.c.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
---
 drivers/crypto/nx/nx.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/crypto/nx/nx.c b/drivers/crypto/nx/nx.c
index 42f0f22..036057a 100644
--- a/drivers/crypto/nx/nx.c
+++ b/drivers/crypto/nx/nx.c
@@ -32,7 +32,6 @@
 #include <linux/scatterlist.h>
 #include <linux/device.h>
 #include <linux/of.h>
-#include <linux/types.h>
 #include <asm/hvcall.h>
 #include <asm/vio.h>
 
-- 
2.9.3

^ permalink raw reply related

* [PATCH] crypto: jitterentropy - drop duplicate header module.h
From: Geliang Tang @ 2016-11-11 12:45 UTC (permalink / raw)
  To: Herbert Xu, David S. Miller; +Cc: Geliang Tang, linux-crypto, linux-kernel

Drop duplicate header module.h from jitterentropy-kcapi.c.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
---
 crypto/jitterentropy-kcapi.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/crypto/jitterentropy-kcapi.c b/crypto/jitterentropy-kcapi.c
index c4938497..787dccc 100644
--- a/crypto/jitterentropy-kcapi.c
+++ b/crypto/jitterentropy-kcapi.c
@@ -39,7 +39,6 @@
 
 #include <linux/module.h>
 #include <linux/slab.h>
-#include <linux/module.h>
 #include <linux/fips.h>
 #include <linux/time.h>
 #include <linux/crypto.h>
-- 
2.9.3

^ permalink raw reply related

* Re: [PATCH 1/16] crypto: skcipher - Add skcipher walk interface
From: Herbert Xu @ 2016-11-11 11:19 UTC (permalink / raw)
  To: Eric Biggers; +Cc: Linux Crypto Mailing List
In-Reply-To: <20161102205420.GA17645@google.com>

On Wed, Nov 02, 2016 at 01:54:20PM -0700, Eric Biggers wrote:
>
> I think the case where skcipher_copy_iv() fails may be handled incorrectly.
> Wouldn't it need to set walk.nbytes to 0 so as to not confuse callers which
> expect that behavior?  Or maybe it should be calling skcipher_walk_done().

Good catch.  I'll fix and repost.

> Setting walk->in.sg and walk->out.sg is redundant with the scatterwalk_start()
> calls.

Will remove.

> This gets called with uninitialized 'walk.flags'.  This was somewhat of a
> theoretical problem with the old blkcipher_walk code but it looks like now it
> will interact badly with the new SKCIPHER_WALK_SLEEP flag.  As far as I can see,
> whether the flag will end up set or not can depend on the uninitialized value.
> It would be nice if this problem could be avoided entirely be setting flags=0.

Right.  I'll fix this as well.

> I'm also wondering about the choice to not look at 'atomic' until after the call
> to skcipher_walk_skcipher().  Wouldn't this mean that the choice of 'atomic'
> would not be respected in e.g. the kmalloc() in skcipher_copy_iv()?

The atomic flag is meant to be used in cases such as aesni where
you need to do kernel_fpu_begin after the call to start the walk.
IOW sleeping is fine at the start but not on subsequent walk calls.

> I don't see any users of the "async" walking being introduced; are some planned?

skcipher_walk is meant to unite blkcipher_walk and ablkcipher_walk.
The latter will use the async case.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* [PATCH v2 11/11] Kbuild: enable -Wmaybe-uninitialized warnings by default
From: Arnd Bergmann @ 2016-11-10 16:44 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Sean Young, Trond Myklebust, linux-s390, Herbert Xu, x86,
	Sebastian Ott, Russell King, Javier Martinez Canillas,
	Ilya Dryomov, linux-snps-arc, linux-media, Arnd Bergmann,
	linux-kbuild, Jiri Kosina, Michal Marek, nios2-dev,
	Mauro Carvalho Chehab, linux-nfs, linux-kernel, Anna Schumaker,
	Luis R . Rodriguez, linux-crypto, Martin Schwidefsky,
	Ley Foon Tan, Andrew Morton
In-Reply-To: <20161110164454.293477-1-arnd@arndb.de>

Previously the warnings were added back at the W=1 level and above,
this now turns them on again by default, assuming that we have addressed
all warnings and again have a clean build for v4.10.

I found a number of new warnings in linux-next already and submitted
bugfixes for those. Hopefully they are caught by the 0day builder
in the future as soon as this patch is merged.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
Please check if there are any remaining warnings with this
patch before applying.

The one known issue right now is commit 32cb7d27e65d ("iio:
maxim_thermocouple: detect invalid storage size in read()"), which
is currently in linux-next but not yet in mainline.

There are a couple of warnings that I get with randconfig builds,
and I have submitted patches for all of them at some point and
will follow up on them to make sure they get addressed eventually.
---
 scripts/Makefile.extrawarn | 2 --
 1 file changed, 2 deletions(-)

diff --git a/scripts/Makefile.extrawarn b/scripts/Makefile.extrawarn
index 7fc2c5a..7c321a6 100644
--- a/scripts/Makefile.extrawarn
+++ b/scripts/Makefile.extrawarn
@@ -60,8 +60,6 @@ endif
 KBUILD_CFLAGS += $(warning)
 else
 
-KBUILD_CFLAGS += $(call cc-disable-warning, maybe-uninitialized)
-
 ifeq ($(cc-name),clang)
 KBUILD_CFLAGS += $(call cc-disable-warning, initializer-overrides)
 KBUILD_CFLAGS += $(call cc-disable-warning, unused-value)
-- 
2.9.0

^ permalink raw reply related

* [PATCH v2 10/11] pcmcia: fix return value of soc_pcmcia_regulator_set
From: Arnd Bergmann @ 2016-11-10 16:44 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, Arnd Bergmann, Anna Schumaker, David S. Miller,
	Herbert Xu, Ilya Dryomov, Javier Martinez Canillas, Jiri Kosina,
	Jonathan Cameron, Ley Foon Tan, Luis R . Rodriguez,
	Martin Schwidefsky, Mauro Carvalho Chehab, Michal Marek,
	Russell King, Sean Young, Sebastian Ott, Trond Myklebust, x86,
	linux-kbuild
In-Reply-To: <20161110164454.293477-1-arnd@arndb.de>

The newly introduced soc_pcmcia_regulator_set() function sometimes returns
without setting its return code, as shown by this warning:

drivers/pcmcia/soc_common.c: In function 'soc_pcmcia_regulator_set':
drivers/pcmcia/soc_common.c:112:5: error: 'ret' may be used uninitialized in this function [-Werror=maybe-uninitialized]

This changes it to propagate the regulator_disable() result instead.

Fixes: ac61b6001a63 ("pcmcia: soc_common: add support for Vcc and Vpp regulators")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 drivers/pcmcia/soc_common.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pcmcia/soc_common.c b/drivers/pcmcia/soc_common.c
index 153f312..b6b316d 100644
--- a/drivers/pcmcia/soc_common.c
+++ b/drivers/pcmcia/soc_common.c
@@ -107,7 +107,7 @@ int soc_pcmcia_regulator_set(struct soc_pcmcia_socket *skt,
 
 		ret = regulator_enable(r->reg);
 	} else {
-		regulator_disable(r->reg);
+		ret = regulator_disable(r->reg);
 	}
 	if (ret == 0)
 		r->on = on;
-- 
2.9.0

^ permalink raw reply related

* [PATCH v2 09/11] [v3] infiniband: shut up a maybe-uninitialized warning
From: Arnd Bergmann @ 2016-11-10 16:44 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Sean Young, Trond Myklebust, linux-s390, Herbert Xu, x86,
	Sebastian Ott, Russell King, Javier Martinez Canillas,
	Ilya Dryomov, linux-snps-arc, linux-media, Arnd Bergmann,
	linux-kbuild, Jiri Kosina, Michal Marek, nios2-dev,
	Mauro Carvalho Chehab, linux-nfs, linux-kernel, Anna Schumaker,
	Luis R . Rodriguez, linux-crypto, Martin Schwidefsky,
	Ley Foon Tan, Andrew Morton
In-Reply-To: <20161110164454.293477-1-arnd@arndb.de>

Some configurations produce this harmless warning when built with
gcc -Wmaybe-uninitialized:

infiniband/core/cma.c: In function 'cma_get_net_dev':
infiniband/core/cma.c:1242:12: warning: 'src_addr_storage.sin_addr.s_addr' may be used uninitialized in this function [-Wmaybe-uninitialized]

I previously reported this for the powerpc64 defconfig, but have now
reproduced the same thing for x86 as well, using gcc-5 or higher.

The code looks correct to me, and this change just rearranges it
by making sure we alway initialize the entire address structure
to make the warning disappear. My first approach added an
initialization at the time of the declaration, which Doug commented
may be too costly, so I hope this version doesn't add overhead.

Link: http://arm-soc.lixom.net/buildlogs/mainline/v4.7-rc6/buildall.powerpc.ppc64_defconfig.log.passed
Link: https://patchwork.kernel.org/patch/9212825/
Acked-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

---
This is marked v2 as the rest of the series but is actually version
three of the patch as I had to do some other changes already.

v3: remove accidental leftover change of the original patch

 drivers/infiniband/core/cma.c | 54 ++++++++++++++++++++++---------------------
 1 file changed, 28 insertions(+), 26 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 36bf50e..89a6b05 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1094,47 +1094,47 @@ static void cma_save_ib_info(struct sockaddr *src_addr,
 	}
 }
 
-static void cma_save_ip4_info(struct sockaddr *src_addr,
-			      struct sockaddr *dst_addr,
+static void cma_save_ip4_info(struct sockaddr_in *src_addr,
+			      struct sockaddr_in *dst_addr,
 			      struct cma_hdr *hdr,
 			      __be16 local_port)
 {
-	struct sockaddr_in *ip4;
-
 	if (src_addr) {
-		ip4 = (struct sockaddr_in *)src_addr;
-		ip4->sin_family = AF_INET;
-		ip4->sin_addr.s_addr = hdr->dst_addr.ip4.addr;
-		ip4->sin_port = local_port;
+		*src_addr = (struct sockaddr_in) {
+			.sin_family = AF_INET,
+			.sin_addr.s_addr = hdr->dst_addr.ip4.addr,
+			.sin_port = local_port,
+		};
 	}
 
 	if (dst_addr) {
-		ip4 = (struct sockaddr_in *)dst_addr;
-		ip4->sin_family = AF_INET;
-		ip4->sin_addr.s_addr = hdr->src_addr.ip4.addr;
-		ip4->sin_port = hdr->port;
+		*dst_addr = (struct sockaddr_in) {
+			.sin_family = AF_INET,
+			.sin_addr.s_addr = hdr->src_addr.ip4.addr,
+			.sin_port = hdr->port,
+		};
 	}
 }
 
-static void cma_save_ip6_info(struct sockaddr *src_addr,
-			      struct sockaddr *dst_addr,
+static void cma_save_ip6_info(struct sockaddr_in6 *src_addr,
+			      struct sockaddr_in6 *dst_addr,
 			      struct cma_hdr *hdr,
 			      __be16 local_port)
 {
-	struct sockaddr_in6 *ip6;
-
 	if (src_addr) {
-		ip6 = (struct sockaddr_in6 *)src_addr;
-		ip6->sin6_family = AF_INET6;
-		ip6->sin6_addr = hdr->dst_addr.ip6;
-		ip6->sin6_port = local_port;
+		*src_addr = (struct sockaddr_in6) {
+			.sin6_family = AF_INET6,
+			.sin6_addr = hdr->dst_addr.ip6,
+			.sin6_port = local_port,
+		};
 	}
 
 	if (dst_addr) {
-		ip6 = (struct sockaddr_in6 *)dst_addr;
-		ip6->sin6_family = AF_INET6;
-		ip6->sin6_addr = hdr->src_addr.ip6;
-		ip6->sin6_port = hdr->port;
+		*dst_addr = (struct sockaddr_in6) {
+			.sin6_family = AF_INET6,
+			.sin6_addr = hdr->src_addr.ip6,
+			.sin6_port = hdr->port,
+		};
 	}
 }
 
@@ -1159,10 +1159,12 @@ static int cma_save_ip_info(struct sockaddr *src_addr,
 
 	switch (cma_get_ip_ver(hdr)) {
 	case 4:
-		cma_save_ip4_info(src_addr, dst_addr, hdr, port);
+		cma_save_ip4_info((struct sockaddr_in *)src_addr,
+				  (struct sockaddr_in *)dst_addr, hdr, port);
 		break;
 	case 6:
-		cma_save_ip6_info(src_addr, dst_addr, hdr, port);
+		cma_save_ip6_info((struct sockaddr_in6 *)src_addr,
+				  (struct sockaddr_in6 *)dst_addr, hdr, port);
 		break;
 	default:
 		return -EAFNOSUPPORT;
-- 
2.9.0

^ permalink raw reply related

* [PATCH v2 08/11] crypto: aesni: shut up -Wmaybe-uninitialized warning
From: Arnd Bergmann @ 2016-11-10 16:44 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, Arnd Bergmann, Anna Schumaker, David S. Miller,
	Herbert Xu, Ilya Dryomov, Javier Martinez Canillas, Jiri Kosina,
	Jonathan Cameron, Ley Foon Tan, Luis R . Rodriguez,
	Martin Schwidefsky, Mauro Carvalho Chehab, Michal Marek,
	Russell King, Sean Young, Sebastian Ott, Trond Myklebust, x86,
	linux-kbuild
In-Reply-To: <20161110164454.293477-1-arnd@arndb.de>

The rfc4106 encrypy/decrypt helper functions cause an annoying
false-positive warning in allmodconfig if we turn on
-Wmaybe-uninitialized warnings again:

arch/x86/crypto/aesni-intel_glue.c: In function ‘helper_rfc4106_decrypt’:
include/linux/scatterlist.h:67:31: warning: ‘dst_sg_walk.sg’ may be used uninitialized in this function [-Wmaybe-uninitialized]

The problem seems to be that the compiler doesn't track the state of the
'one_entry_in_sg' variable across the kernel_fpu_begin/kernel_fpu_end
section.

This takes the easy way out by adding a bogus initialization, which
should be harmless enough to get the patch into v4.9 so we can turn
on this warning again by default without producing useless output.
A follow-up patch for v4.10 rearranges the code to make the warning
go away.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/x86/crypto/aesni-intel_glue.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/crypto/aesni-intel_glue.c b/arch/x86/crypto/aesni-intel_glue.c
index 0ab5ee1..aa8b067 100644
--- a/arch/x86/crypto/aesni-intel_glue.c
+++ b/arch/x86/crypto/aesni-intel_glue.c
@@ -888,7 +888,7 @@ static int helper_rfc4106_encrypt(struct aead_request *req)
 	unsigned long auth_tag_len = crypto_aead_authsize(tfm);
 	u8 iv[16] __attribute__ ((__aligned__(AESNI_ALIGN)));
 	struct scatter_walk src_sg_walk;
-	struct scatter_walk dst_sg_walk;
+	struct scatter_walk dst_sg_walk = {};
 	unsigned int i;
 
 	/* Assuming we are supporting rfc4106 64-bit extended */
@@ -968,7 +968,7 @@ static int helper_rfc4106_decrypt(struct aead_request *req)
 	u8 iv[16] __attribute__ ((__aligned__(AESNI_ALIGN)));
 	u8 authTag[16];
 	struct scatter_walk src_sg_walk;
-	struct scatter_walk dst_sg_walk;
+	struct scatter_walk dst_sg_walk = {};
 	unsigned int i;
 
 	if (unlikely(req->assoclen != 16 && req->assoclen != 20))
-- 
2.9.0

^ permalink raw reply related

* [PATCH v2 07/11] [media] rc: print correct variable for z8f0811
From: Arnd Bergmann @ 2016-11-10 16:44 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Sean Young, Trond Myklebust, linux-s390, Herbert Xu, x86,
	Sebastian Ott, Russell King, Javier Martinez Canillas,
	Ilya Dryomov, linux-snps-arc, linux-media, Arnd Bergmann,
	linux-kbuild, Jiri Kosina, Michal Marek, nios2-dev,
	Mauro Carvalho Chehab, linux-nfs, linux-kernel, Anna Schumaker,
	Luis R . Rodriguez, linux-crypto, Martin Schwidefsky,
	Ley Foon Tan, Andrew Morton
In-Reply-To: <20161110164454.293477-1-arnd@arndb.de>

A recent rework accidentally left a debugging printk untouched
while changing the meaning of the variables, leading to an
uninitialized variable being printed:

drivers/media/i2c/ir-kbd-i2c.c: In function 'get_key_haup_common':
drivers/media/i2c/ir-kbd-i2c.c:62:2: error: 'toggle' may be used uninitialized in this function [-Werror=maybe-uninitialized]

This prints the correct one instead, as we did before the patch.

Fixes: 00bb820755ed ("[media] rc: Hauppauge z8f0811 can decode RC6")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 drivers/media/i2c/ir-kbd-i2c.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

I submitted this repeatedly as it is a v4.9 regression, but
I never saw a reply for it.

diff --git a/drivers/media/i2c/ir-kbd-i2c.c b/drivers/media/i2c/ir-kbd-i2c.c
index f95a6bc..cede397 100644
--- a/drivers/media/i2c/ir-kbd-i2c.c
+++ b/drivers/media/i2c/ir-kbd-i2c.c
@@ -118,7 +118,7 @@ static int get_key_haup_common(struct IR_i2c *ir, enum rc_type *protocol,
 			*protocol = RC_TYPE_RC6_MCE;
 			dev &= 0x7f;
 			dprintk(1, "ir hauppauge (rc6-mce): t%d vendor=%d dev=%d code=%d\n",
-						toggle, vendor, dev, code);
+						*ptoggle, vendor, dev, code);
 		} else {
 			*ptoggle = 0;
 			*protocol = RC_TYPE_RC6_6A_32;
-- 
2.9.0

^ permalink raw reply related

* [PATCH v2 06/11] [media] dib0700: fix nec repeat handling
From: Arnd Bergmann @ 2016-11-10 16:44 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, Arnd Bergmann, Anna Schumaker, David S. Miller,
	Herbert Xu, Ilya Dryomov, Javier Martinez Canillas, Jiri Kosina,
	Jonathan Cameron, Ley Foon Tan, Luis R . Rodriguez,
	Martin Schwidefsky, Mauro Carvalho Chehab, Michal Marek,
	Russell King, Sean Young, Sebastian Ott, Trond Myklebust, x86,
	linux-kbuild
In-Reply-To: <20161110164454.293477-1-arnd@arndb.de>

From: Sean Young <sean@mess.org>

When receiving a nec repeat, ensure the correct scancode is repeated
rather than a random value from the stack. This removes the need
for the bogus uninitialized_var() and also fixes the warnings:

    drivers/media/usb/dvb-usb/dib0700_core.c: In function ‘dib0700_rc_urb_completion’:
    drivers/media/usb/dvb-usb/dib0700_core.c:679: warning: ‘protocol’ may be used uninitialized in this function

[sean addon: So after writing the patch and submitting it, I've bought the
             hardware on ebay. Without this patch you get random scancodes
             on nec repeats, which the patch indeed fixes.]

Signed-off-by: Sean Young <sean@mess.org>
Tested-by: Sean Young <sean@mess.org>
Cc: stable@vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 drivers/media/usb/dvb-usb/dib0700_core.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/media/usb/dvb-usb/dib0700_core.c b/drivers/media/usb/dvb-usb/dib0700_core.c
index 92d5408..47ce9d5 100644
--- a/drivers/media/usb/dvb-usb/dib0700_core.c
+++ b/drivers/media/usb/dvb-usb/dib0700_core.c
@@ -704,7 +704,7 @@ static void dib0700_rc_urb_completion(struct urb *purb)
 	struct dvb_usb_device *d = purb->context;
 	struct dib0700_rc_response *poll_reply;
 	enum rc_type protocol;
-	u32 uninitialized_var(keycode);
+	u32 keycode;
 	u8 toggle;
 
 	deb_info("%s()\n", __func__);
@@ -745,7 +745,8 @@ static void dib0700_rc_urb_completion(struct urb *purb)
 		    poll_reply->nec.data       == 0x00 &&
 		    poll_reply->nec.not_data   == 0xff) {
 			poll_reply->data_state = 2;
-			break;
+			rc_repeat(d->rc_dev);
+			goto resubmit;
 		}
 
 		if ((poll_reply->nec.data ^ poll_reply->nec.not_data) != 0xff) {
-- 
2.9.0

^ permalink raw reply related

* [PATCH v2 05/11] s390: pci: don't print uninitialized data for debugging
From: Arnd Bergmann @ 2016-11-10 16:44 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Sean Young, Trond Myklebust, linux-s390, Herbert Xu, x86,
	Sebastian Ott, Russell King, Javier Martinez Canillas,
	Ilya Dryomov, linux-snps-arc, linux-media, Arnd Bergmann,
	linux-kbuild, Jiri Kosina, Michal Marek, nios2-dev,
	Mauro Carvalho Chehab, linux-nfs, linux-kernel, Anna Schumaker,
	Luis R . Rodriguez, linux-crypto, Martin Schwidefsky,
	Ley Foon Tan, Andrew Morton
In-Reply-To: <20161110164454.293477-1-arnd@arndb.de>

gcc correctly warns about an incorrect use of the 'pa' variable
in case we pass an empty scatterlist to __s390_dma_map_sg:

arch/s390/pci/pci_dma.c: In function '__s390_dma_map_sg':
arch/s390/pci/pci_dma.c:309:13: warning: 'pa' may be used uninitialized in this function [-Wmaybe-uninitialized]

This adds a bogus initialization to the function to sanitize
the debug output.  I would have preferred a solution without
the initialization, but I only got the report from the
kbuild bot after turning on the warning again, and didn't
manage to reproduce it myself.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
---
 arch/s390/pci/pci_dma.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/s390/pci/pci_dma.c b/arch/s390/pci/pci_dma.c
index 7350c8b..6b2f72f 100644
--- a/arch/s390/pci/pci_dma.c
+++ b/arch/s390/pci/pci_dma.c
@@ -423,7 +423,7 @@ static int __s390_dma_map_sg(struct device *dev, struct scatterlist *sg,
 	dma_addr_t dma_addr_base, dma_addr;
 	int flags = ZPCI_PTE_VALID;
 	struct scatterlist *s;
-	unsigned long pa;
+	unsigned long pa = 0;
 	int ret;
 
 	size = PAGE_ALIGN(size);
-- 
2.9.0

^ permalink raw reply related

* [PATCH v2 04/11] nios2: fix timer initcall return value
From: Arnd Bergmann @ 2016-11-10 16:44 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Sean Young, Trond Myklebust, linux-s390, Herbert Xu, x86,
	Sebastian Ott, Russell King, Javier Martinez Canillas,
	Ilya Dryomov, linux-snps-arc, linux-media, Arnd Bergmann,
	linux-kbuild, Jiri Kosina, Michal Marek, nios2-dev,
	Mauro Carvalho Chehab, linux-nfs, linux-kernel, Anna Schumaker,
	Luis R . Rodriguez, linux-crypto, Martin Schwidefsky,
	Ley Foon Tan, Andrew Morton
In-Reply-To: <20161110164454.293477-1-arnd@arndb.de>

When called more than twice, the nios2_time_init() function
return an uninitialized value, as detected by gcc -Wmaybe-uninitialized

arch/nios2/kernel/time.c: warning: 'ret' may be used uninitialized in this function

This makes it return '0' here, matching the comment above the
function.

Acked-by: Ley Foon Tan <lftan@altera.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/nios2/kernel/time.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/nios2/kernel/time.c b/arch/nios2/kernel/time.c
index d9563dd..746bf5c 100644
--- a/arch/nios2/kernel/time.c
+++ b/arch/nios2/kernel/time.c
@@ -324,6 +324,7 @@ static int __init nios2_time_init(struct device_node *timer)
 		ret = nios2_clocksource_init(timer);
 		break;
 	default:
+		ret = 0;
 		break;
 	}
 
-- 
2.9.0

^ permalink raw reply related

* [PATCH v2 03/11] x86: apm: avoid uninitialized data
From: Arnd Bergmann @ 2016-11-10 16:44 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, Arnd Bergmann, Anna Schumaker, David S. Miller,
	Herbert Xu, Ilya Dryomov, Javier Martinez Canillas, Jiri Kosina,
	Jonathan Cameron, Ley Foon Tan, Luis R . Rodriguez,
	Martin Schwidefsky, Mauro Carvalho Chehab, Michal Marek,
	Russell King, Sean Young, Sebastian Ott, Trond Myklebust, x86,
	linux-kbuild
In-Reply-To: <20161110164454.293477-1-arnd@arndb.de>

apm_bios_call() can fail, and return a status in its argument
structure. If that status however is zero during a call from
apm_get_power_status(), we end up using data that may have
never been set, as reported by "gcc -Wmaybe-uninitialized":

arch/x86/kernel/apm_32.c: In function ‘apm’:
arch/x86/kernel/apm_32.c:1729:17: error: ‘bx’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
arch/x86/kernel/apm_32.c:1835:5: error: ‘cx’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
arch/x86/kernel/apm_32.c:1730:17: note: ‘cx’ was declared here
arch/x86/kernel/apm_32.c:1842:27: error: ‘dx’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
arch/x86/kernel/apm_32.c:1731:17: note: ‘dx’ was declared here

This changes the function to return "APM_NO_ERROR" here, which
makes the code more robust to broken BIOS versions, and avoids
the warning.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Jiri Kosina <jkosina@suse.cz>
Reviewed-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 arch/x86/kernel/apm_32.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/apm_32.c b/arch/x86/kernel/apm_32.c
index c7364bd..51287cd 100644
--- a/arch/x86/kernel/apm_32.c
+++ b/arch/x86/kernel/apm_32.c
@@ -1042,8 +1042,11 @@ static int apm_get_power_status(u_short *status, u_short *bat, u_short *life)
 
 	if (apm_info.get_power_status_broken)
 		return APM_32_UNSUPPORTED;
-	if (apm_bios_call(&call))
+	if (apm_bios_call(&call)) {
+		if (!call.err)
+			return APM_NO_ERROR;
 		return call.err;
+	}
 	*status = call.ebx;
 	*bat = call.ecx;
 	if (apm_info.get_power_status_swabinminutes) {
-- 
2.9.0


^ permalink raw reply related

* [PATCH v2 02/11] NFSv4.1: work around -Wmaybe-uninitialized warning
From: Arnd Bergmann @ 2016-11-10 16:44 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Sean Young, Trond Myklebust, linux-s390, Herbert Xu, x86,
	Sebastian Ott, Russell King, Javier Martinez Canillas,
	Ilya Dryomov, linux-snps-arc, linux-media, Arnd Bergmann,
	linux-kbuild, Jiri Kosina, Michal Marek, nios2-dev,
	Mauro Carvalho Chehab, linux-nfs, linux-kernel, Anna Schumaker,
	Luis R . Rodriguez, linux-crypto, Martin Schwidefsky,
	Ley Foon Tan, Andrew Morton
In-Reply-To: <20161110164454.293477-1-arnd@arndb.de>

A bugfix introduced a harmless gcc warning in nfs4_slot_seqid_in_use
if we enable -Wmaybe-uninitialized again:

fs/nfs/nfs4session.c:203:54: error: 'cur_seq' may be used uninitialized in this function [-Werror=maybe-uninitialized]

gcc is not smart enough to conclude that the IS_ERR/PTR_ERR pair
results in a nonzero return value here. Using PTR_ERR_OR_ZERO()
instead makes this clear to the compiler.

Fixes: e09c978aae5b ("NFSv4.1: Fix Oopsable condition in server callback races")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
First submitted on Aug 31, but ended up not getting applied then
as the warning was disabled in v4.8-rc

Anna Schumaker said at the kernel summit that she had applied
it and would send it for 4.9, but as of 2016-11-09 it has not
made it into linux-next.

 fs/nfs/nfs4session.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/fs/nfs/nfs4session.c b/fs/nfs/nfs4session.c
index b629730..150c5a1 100644
--- a/fs/nfs/nfs4session.c
+++ b/fs/nfs/nfs4session.c
@@ -178,12 +178,14 @@ static int nfs4_slot_get_seqid(struct nfs4_slot_table  *tbl, u32 slotid,
 	__must_hold(&tbl->slot_tbl_lock)
 {
 	struct nfs4_slot *slot;
+	int ret;
 
 	slot = nfs4_lookup_slot(tbl, slotid);
-	if (IS_ERR(slot))
-		return PTR_ERR(slot);
-	*seq_nr = slot->seq_nr;
-	return 0;
+	ret = PTR_ERR_OR_ZERO(slot);
+	if (!ret)
+		*seq_nr = slot->seq_nr;
+
+	return ret;
 }
 
 /*
-- 
2.9.0

^ permalink raw reply related

* [PATCH v2 01/11] Kbuild: enable -Wmaybe-uninitialized warning for "make W=1"
From: Arnd Bergmann @ 2016-11-10 16:44 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Sean Young, Trond Myklebust, linux-s390, Herbert Xu, x86,
	Sebastian Ott, Russell King, Javier Martinez Canillas,
	Ilya Dryomov, linux-snps-arc, linux-media, Arnd Bergmann,
	linux-kbuild, Jiri Kosina, Michal Marek, nios2-dev,
	Mauro Carvalho Chehab, linux-nfs, linux-kernel, Anna Schumaker,
	Luis R . Rodriguez, linux-crypto, Martin Schwidefsky,
	Ley Foon Tan, Andrew Morton
In-Reply-To: <20161110164454.293477-1-arnd@arndb.de>

Traditionally, we have always had warnings about uninitialized variables
enabled, as this is part of -Wall, and generally a good idea [1], but it
also always produced false positives, mainly because this is a variation
of the halting problem and provably impossible to get right in all cases
[2].

Various people have identified cases that are particularly bad for false
positives, and in commit e74fc973b6e5 ("Turn off -Wmaybe-uninitialized
when building with -Os"), I turned off the warning for any build that
was done with CC_OPTIMIZE_FOR_SIZE.  This drastically reduced the number
of false positive warnings in the default build but unfortunately had
the side effect of turning the warning off completely in 'allmodconfig'
builds, which in turn led to a lot of warnings (both actual bugs, and
remaining false positives) to go in unnoticed.

With commit 877417e6ffb9 ("Kbuild: change CC_OPTIMIZE_FOR_SIZE
definition") enabled the warning again for allmodconfig builds in v4.7
and in v4.8-rc1, I had finally managed to address all warnings I get in
an ARM allmodconfig build and most other maybe-uninitialized warnings
for ARM randconfig builds.

However, commit 6e8d666e9253 ("Disable "maybe-uninitialized" warning
globally") was merged at the same time and disabled it completely for
all configurations, because of false-positive warnings on x86 that I had
not addressed until then. This caused a lot of actual bugs to get merged
into mainline, and I sent several dozen patches for these during the v4.9
development cycle. Most of these are actual bugs, some are for correct
code that is safe because it is only called under external constraints
that make it impossible to run into the case that gcc sees, and in a
few cases gcc is just stupid and finds something that can obviously
never happen.

I have now done a few thousand randconfig builds on x86 and collected all
patches that I needed to address every single warning I got (I can provide
the combined patch for the other warnings if anyone is interested),
so I hope we can get the warning back and let people catch the actual
bugs earlier.

This reverts the change to disable the warning completely and for
now brings it back at the "make W=1" level, so we can get it merged
into mainline without introducing false positives. A follow-up
patch enables it on all levels unless some configuration option
turns it off because of false-positives.

Link: https://rusty.ozlabs.org/?p=232 [1]
Link: https://gcc.gnu.org/wiki/Better_Uninitialized_Warnings [2]
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 Makefile                   | 10 ++++++----
 arch/arc/Makefile          |  4 +++-
 scripts/Makefile.extrawarn |  3 +++
 scripts/Makefile.ubsan     |  4 ++++
 4 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/Makefile b/Makefile
index f97f786..06e2b73 100644
--- a/Makefile
+++ b/Makefile
@@ -370,7 +370,7 @@ LDFLAGS_MODULE  =
 CFLAGS_KERNEL	=
 AFLAGS_KERNEL	=
 LDFLAGS_vmlinux =
-CFLAGS_GCOV	= -fprofile-arcs -ftest-coverage -fno-tree-loop-im
+CFLAGS_GCOV	= -fprofile-arcs -ftest-coverage -fno-tree-loop-im -Wno-maybe-uninitialized
 CFLAGS_KCOV	:= $(call cc-option,-fsanitize-coverage=trace-pc,)
 
 
@@ -620,7 +620,6 @@ ARCH_CFLAGS :=
 include arch/$(SRCARCH)/Makefile
 
 KBUILD_CFLAGS	+= $(call cc-option,-fno-delete-null-pointer-checks,)
-KBUILD_CFLAGS	+= $(call cc-disable-warning,maybe-uninitialized,)
 KBUILD_CFLAGS	+= $(call cc-disable-warning,frame-address,)
 
 ifdef CONFIG_LD_DEAD_CODE_DATA_ELIMINATION
@@ -629,15 +628,18 @@ KBUILD_CFLAGS	+= $(call cc-option,-fdata-sections,)
 endif
 
 ifdef CONFIG_CC_OPTIMIZE_FOR_SIZE
-KBUILD_CFLAGS	+= -Os
+KBUILD_CFLAGS	+= -Os $(call cc-disable-warning,maybe-uninitialized,)
 else
 ifdef CONFIG_PROFILE_ALL_BRANCHES
-KBUILD_CFLAGS	+= -O2
+KBUILD_CFLAGS	+= -O2 $(call cc-disable-warning,maybe-uninitialized,)
 else
 KBUILD_CFLAGS   += -O2
 endif
 endif
 
+KBUILD_CFLAGS += $(call cc-ifversion, -lt, 0409, \
+			$(call cc-disable-warning,maybe-uninitialized,))
+
 # Tell gcc to never replace conditional load with a non-conditional one
 KBUILD_CFLAGS	+= $(call cc-option,--param=allow-store-data-races=0)
 
diff --git a/arch/arc/Makefile b/arch/arc/Makefile
index 864adad..25f81a1 100644
--- a/arch/arc/Makefile
+++ b/arch/arc/Makefile
@@ -68,7 +68,9 @@ cflags-$(CONFIG_ARC_DW2_UNWIND)		+= -fasynchronous-unwind-tables $(cfi)
 ifndef CONFIG_CC_OPTIMIZE_FOR_SIZE
 # Generic build system uses -O2, we want -O3
 # Note: No need to add to cflags-y as that happens anyways
-ARCH_CFLAGS += -O3
+#
+# Disable the false maybe-uninitialized warings gcc spits out at -O3
+ARCH_CFLAGS += -O3 $(call cc-disable-warning,maybe-uninitialized,)
 endif
 
 # small data is default for elf32 tool-chain. If not usable, disable it
diff --git a/scripts/Makefile.extrawarn b/scripts/Makefile.extrawarn
index 53449a6..7fc2c5a 100644
--- a/scripts/Makefile.extrawarn
+++ b/scripts/Makefile.extrawarn
@@ -36,6 +36,7 @@ warning-2 += -Wshadow
 warning-2 += $(call cc-option, -Wlogical-op)
 warning-2 += $(call cc-option, -Wmissing-field-initializers)
 warning-2 += $(call cc-option, -Wsign-compare)
+warning-2 += $(call cc-option, -Wmaybe-uninitialized)
 
 warning-3 := -Wbad-function-cast
 warning-3 += -Wcast-qual
@@ -59,6 +60,8 @@ endif
 KBUILD_CFLAGS += $(warning)
 else
 
+KBUILD_CFLAGS += $(call cc-disable-warning, maybe-uninitialized)
+
 ifeq ($(cc-name),clang)
 KBUILD_CFLAGS += $(call cc-disable-warning, initializer-overrides)
 KBUILD_CFLAGS += $(call cc-disable-warning, unused-value)
diff --git a/scripts/Makefile.ubsan b/scripts/Makefile.ubsan
index dd779c4..3b1b138 100644
--- a/scripts/Makefile.ubsan
+++ b/scripts/Makefile.ubsan
@@ -17,4 +17,8 @@ endif
 ifdef CONFIG_UBSAN_NULL
       CFLAGS_UBSAN += $(call cc-option, -fsanitize=null)
 endif
+
+      # -fsanitize=* options makes GCC less smart than usual and
+      # increase number of 'maybe-uninitialized false-positives
+      CFLAGS_UBSAN += $(call cc-option, -Wno-maybe-uninitialized)
 endif
-- 
2.9.0

^ permalink raw reply related

* [PATCH v2 00/11] getting back -Wmaybe-uninitialized
From: Arnd Bergmann @ 2016-11-10 16:44 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Sean Young, Trond Myklebust, linux-s390, Herbert Xu, x86,
	Sebastian Ott, Russell King, Javier Martinez Canillas,
	Ilya Dryomov, linux-snps-arc, linux-media, Arnd Bergmann,
	linux-kbuild, Jiri Kosina, Michal Marek, nios2-dev,
	Mauro Carvalho Chehab, linux-nfs, linux-kernel, Anna Schumaker,
	Luis R . Rodriguez, linux-crypto, Martin Schwidefsky,
	Ley Foon Tan, Andrew Morton

Hi Linus,

It took a while for some patches to make it into mainline through
maintainer trees, but the 28-patch series is now reduced to 10, with
one tiny patch added at the end.  I hope this can still make it into
v4.9. Aside from patches that are no longer required, I did these changes
compared to version 1:

- Dropped "iio: maxim_thermocouple: detect invalid storage size in
  read()", which is currently in linux-next as commit 32cb7d27e65d.
  This is the only remaining warning I see for a couple of corner
  cases (kbuild bot reports it on blackfin, kernelci bot and
  arm-soc bot both report it on arm64)

- Dropped "brcmfmac: avoid maybe-uninitialized warning in
  brcmf_cfg80211_start_ap", which is currently in net/master
  merge pending.

- Dropped two x86 patches, "x86: math-emu: possible uninitialized
  variable use" and "x86: mark target address as output in 'insb' asm"
  as they do not seem to trigger for a default build, and I got
  no feedback on them. Both of these are ancient issues and seem
  harmless, I will send them again to the x86 maintainers once
  the rest is merged.
  
- Dropped "rbd: false-postive gcc-4.9 -Wmaybe-uninitialized" based on
  feedback from Ilya Dryomov, who already has a different fix queued up
  for v4.10. The kbuild bot reports this as a warning for xtensa.
 
- Replaced "crypto: aesni: avoid -Wmaybe-uninitialized warning" with a
  simpler patch, this one always triggers but my first solution would not
  be safe for linux-4.9 any more at this point. I'll follow up with
  the larger patch as a cleanup for 4.10.
  
- Replaced "dib0700: fix nec repeat handling" with a better one,
  contributed by Sean Young.

Please merge these directly if you are happy with the result.

As the minimum, I'd hope to see the first patch get in soon,
but the individual bugfixes are hopefully now all appropriate
as well. If you see any regressions with the final patch, just
leave that one out and let me know what problems remain.

	Arnd

Arnd Bergmann (10):
  Kbuild: enable -Wmaybe-uninitialized warning for "make W=1"
  NFSv4.1: work around -Wmaybe-uninitialized warning
  x86: apm: avoid uninitialized data
  nios2: fix timer initcall return value
  s390: pci: don't print uninitialized data for debugging
  [media] rc: print correct variable for z8f0811
  crypto: aesni: shut up -Wmaybe-uninitialized warning
  infiniband: shut up a maybe-uninitialized warning
  pcmcia: fix return value of soc_pcmcia_regulator_set
  Kbuild: enable -Wmaybe-uninitialized warnings by default

Sean Young (1):
  [media] dib0700: fix nec repeat handling

 Makefile                                 | 10 +++---
 arch/arc/Makefile                        |  4 ++-
 arch/nios2/kernel/time.c                 |  1 +
 arch/s390/pci/pci_dma.c                  |  2 +-
 arch/x86/crypto/aesni-intel_glue.c       |  4 +--
 arch/x86/kernel/apm_32.c                 |  5 ++-
 drivers/infiniband/core/cma.c            | 54 +++++++++++++++++---------------
 drivers/media/i2c/ir-kbd-i2c.c           |  2 +-
 drivers/media/usb/dvb-usb/dib0700_core.c |  5 +--
 drivers/pcmcia/soc_common.c              |  2 +-
 fs/nfs/nfs4session.c                     | 10 +++---
 scripts/Makefile.extrawarn               |  1 +
 scripts/Makefile.ubsan                   |  4 +++
 13 files changed, 61 insertions(+), 43 deletions(-)

-- 
2.9.0

Cc: Anna Schumaker <anna.schumaker@netapp.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: Javier Martinez Canillas <javier@osg.samsung.com>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Jonathan Cameron <jic23@kernel.org>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Luis R. Rodriguez <mcgrof@kernel.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Michal Marek <mmarek@suse.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Sean Young <sean@mess.org>
Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
Cc: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: x86@kernel.org
Cc: linux-kbuild@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-snps-arc@lists.infradead.org
Cc: nios2-dev@lists.rocketboards.org
Cc: linux-s390@vger.kernel.org
Cc: linux-crypto@vger.kernel.org
Cc: linux-media@vger.kernel.org
Cc: linux-nfs@vger.kernel.org

^ permalink raw reply

* Re: [PATCH] nvmem: sunxi-sid: SID content is not a valid source of randomness
From: Corentin Labbe @ 2016-11-10 15:14 UTC (permalink / raw)
  To: Maxime Ripard, herbert
  Cc: srinivas.kandagatla, wens, linux-kernel, linux-arm-kernel,
	linux-crypto
In-Reply-To: <20161025132648.txeo3rw6yz5wutrg@lukather>

On Tue, Oct 25, 2016 at 03:26:48PM +0200, Maxime Ripard wrote:
> On Tue, Oct 25, 2016 at 07:38:55AM +0200, LABBE Corentin wrote:
> > On Mon, Oct 24, 2016 at 10:10:20PM +0200, Maxime Ripard wrote:
> > > On Sat, Oct 22, 2016 at 03:53:28PM +0200, Corentin Labbe wrote:
> > > > Since SID's content is constant over reboot,
> > > 
> > > That's not true, at least not across all the Allwinner SoCs, and
> > > especially not on the A10 and A20 that this driver supports.
> > > 
> > 
> > On my cubieboard2 (A20)
> > hexdump -C /sys/devices/platform/soc\@01c00000/1c23800.eeprom/sunxi-sid0/nvmem 
> > 00000000  16 51 66 83 80 48 50 72  56 54 48 48 03 c2 75 72  |.Qf..HPrVTHH..ur|
> > 00000010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> > *
> > 00000100  16 51 66 83 80 48 50 72  56 54 48 48 03 c2 75 72  |.Qf..HPrVTHH..ur|
> > 00000110  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> > *
> > 00000200
> > cubiedev ~ # reboot
> > cubiedev ~ # hexdump -C /sys/devices/platform/soc\@01c00000/1c23800.eeprom/sunxi-sid0/nvmem 
> > 00000000  16 51 66 83 80 48 50 72  56 54 48 48 03 c2 75 72  |.Qf..HPrVTHH..ur|
> > 00000010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> > *
> > 00000100  16 51 66 83 80 48 50 72  56 54 48 48 03 c2 75 72  |.Qf..HPrVTHH..ur|
> > 00000110  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> > *
> > 00000200
> > 
> > So clearly for me its constant.
> 
> It's constant across reboots, but not across devices. Each device have
> a different SID content, therefore it's a relevant source of entropy
> in the system.
> 

Not the 3 leading digit and not the tailing zeros which are the same accross device.
So only 50% of data are really different accross devices.
Perhaps a "random-range" property could be used ?

Herbert, does it is safe to add that 50% duplicate content via add_device_randomness() ?
Reading add_device_randomness doc, it seems finally it is safe, but if you could confirm it.

Regards

^ permalink raw reply

* Re: [PATCH 10/16] crypto: testmgr - Do not test internal algorithms
From: Marcelo Cerri @ 2016-11-10 11:32 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Linux Crypto Mailing List
In-Reply-To: <E1c1iKm-0004DQ-A4@gondolin.me.apana.org.au>

[-- Attachment #1: Type: text/plain, Size: 9874 bytes --]

I tested this patch and it's working fine.

-- 
Regards,
Marcelo

On Wed, Nov 02, 2016 at 07:19:12AM +0800, Herbert Xu wrote:
> Currently we manually filter out internal algorithms using a list
> in testmgr.  This is dangerous as internal algorithms cannot be
> safely used even by testmgr.  This patch ensures that they're never
> processed by testmgr at all.
> 
> This patch also removes an obsolete bypass for nivciphers which
> no longer exist.
> 
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
> ---
> 
>  crypto/algboss.c |    8 --
>  crypto/testmgr.c |  153 +++----------------------------------------------------
>  2 files changed, 11 insertions(+), 150 deletions(-)
> 
> diff --git a/crypto/algboss.c b/crypto/algboss.c
> index 6e39d9c..ccb85e1 100644
> --- a/crypto/algboss.c
> +++ b/crypto/algboss.c
> @@ -247,12 +247,8 @@ static int cryptomgr_schedule_test(struct crypto_alg *alg)
>  	memcpy(param->alg, alg->cra_name, sizeof(param->alg));
>  	type = alg->cra_flags;
>  
> -	/* This piece of crap needs to disappear into per-type test hooks. */
> -	if (!((type ^ CRYPTO_ALG_TYPE_BLKCIPHER) &
> -	      CRYPTO_ALG_TYPE_BLKCIPHER_MASK) && !(type & CRYPTO_ALG_GENIV) &&
> -	    ((alg->cra_flags & CRYPTO_ALG_TYPE_MASK) ==
> -	     CRYPTO_ALG_TYPE_BLKCIPHER ? alg->cra_blkcipher.ivsize :
> -					 alg->cra_ablkcipher.ivsize))
> +	/* Do not test internal algorithms. */
> +	if (type & CRYPTO_ALG_INTERNAL)
>  		type |= CRYPTO_ALG_TESTED;
>  
>  	param->type = type;
> diff --git a/crypto/testmgr.c b/crypto/testmgr.c
> index ded50b6..6ac4696 100644
> --- a/crypto/testmgr.c
> +++ b/crypto/testmgr.c
> @@ -1625,7 +1625,7 @@ static int alg_test_aead(const struct alg_test_desc *desc, const char *driver,
>  	struct crypto_aead *tfm;
>  	int err = 0;
>  
> -	tfm = crypto_alloc_aead(driver, type | CRYPTO_ALG_INTERNAL, mask);
> +	tfm = crypto_alloc_aead(driver, type, mask);
>  	if (IS_ERR(tfm)) {
>  		printk(KERN_ERR "alg: aead: Failed to load transform for %s: "
>  		       "%ld\n", driver, PTR_ERR(tfm));
> @@ -1654,7 +1654,7 @@ static int alg_test_cipher(const struct alg_test_desc *desc,
>  	struct crypto_cipher *tfm;
>  	int err = 0;
>  
> -	tfm = crypto_alloc_cipher(driver, type | CRYPTO_ALG_INTERNAL, mask);
> +	tfm = crypto_alloc_cipher(driver, type, mask);
>  	if (IS_ERR(tfm)) {
>  		printk(KERN_ERR "alg: cipher: Failed to load transform for "
>  		       "%s: %ld\n", driver, PTR_ERR(tfm));
> @@ -1683,7 +1683,7 @@ static int alg_test_skcipher(const struct alg_test_desc *desc,
>  	struct crypto_skcipher *tfm;
>  	int err = 0;
>  
> -	tfm = crypto_alloc_skcipher(driver, type | CRYPTO_ALG_INTERNAL, mask);
> +	tfm = crypto_alloc_skcipher(driver, type, mask);
>  	if (IS_ERR(tfm)) {
>  		printk(KERN_ERR "alg: skcipher: Failed to load transform for "
>  		       "%s: %ld\n", driver, PTR_ERR(tfm));
> @@ -1750,7 +1750,7 @@ static int alg_test_hash(const struct alg_test_desc *desc, const char *driver,
>  	struct crypto_ahash *tfm;
>  	int err;
>  
> -	tfm = crypto_alloc_ahash(driver, type | CRYPTO_ALG_INTERNAL, mask);
> +	tfm = crypto_alloc_ahash(driver, type, mask);
>  	if (IS_ERR(tfm)) {
>  		printk(KERN_ERR "alg: hash: Failed to load transform for %s: "
>  		       "%ld\n", driver, PTR_ERR(tfm));
> @@ -1778,7 +1778,7 @@ static int alg_test_crc32c(const struct alg_test_desc *desc,
>  	if (err)
>  		goto out;
>  
> -	tfm = crypto_alloc_shash(driver, type | CRYPTO_ALG_INTERNAL, mask);
> +	tfm = crypto_alloc_shash(driver, type, mask);
>  	if (IS_ERR(tfm)) {
>  		printk(KERN_ERR "alg: crc32c: Failed to load transform for %s: "
>  		       "%ld\n", driver, PTR_ERR(tfm));
> @@ -1820,7 +1820,7 @@ static int alg_test_cprng(const struct alg_test_desc *desc, const char *driver,
>  	struct crypto_rng *rng;
>  	int err;
>  
> -	rng = crypto_alloc_rng(driver, type | CRYPTO_ALG_INTERNAL, mask);
> +	rng = crypto_alloc_rng(driver, type, mask);
>  	if (IS_ERR(rng)) {
>  		printk(KERN_ERR "alg: cprng: Failed to load transform for %s: "
>  		       "%ld\n", driver, PTR_ERR(rng));
> @@ -1847,7 +1847,7 @@ static int drbg_cavs_test(struct drbg_testvec *test, int pr,
>  	if (!buf)
>  		return -ENOMEM;
>  
> -	drng = crypto_alloc_rng(driver, type | CRYPTO_ALG_INTERNAL, mask);
> +	drng = crypto_alloc_rng(driver, type, mask);
>  	if (IS_ERR(drng)) {
>  		printk(KERN_ERR "alg: drbg: could not allocate DRNG handle for "
>  		       "%s\n", driver);
> @@ -2041,7 +2041,7 @@ static int alg_test_kpp(const struct alg_test_desc *desc, const char *driver,
>  	struct crypto_kpp *tfm;
>  	int err = 0;
>  
> -	tfm = crypto_alloc_kpp(driver, type | CRYPTO_ALG_INTERNAL, mask);
> +	tfm = crypto_alloc_kpp(driver, type, mask);
>  	if (IS_ERR(tfm)) {
>  		pr_err("alg: kpp: Failed to load tfm for %s: %ld\n",
>  		       driver, PTR_ERR(tfm));
> @@ -2200,7 +2200,7 @@ static int alg_test_akcipher(const struct alg_test_desc *desc,
>  	struct crypto_akcipher *tfm;
>  	int err = 0;
>  
> -	tfm = crypto_alloc_akcipher(driver, type | CRYPTO_ALG_INTERNAL, mask);
> +	tfm = crypto_alloc_akcipher(driver, type, mask);
>  	if (IS_ERR(tfm)) {
>  		pr_err("alg: akcipher: Failed to load tfm for %s: %ld\n",
>  		       driver, PTR_ERR(tfm));
> @@ -2223,88 +2223,6 @@ static int alg_test_null(const struct alg_test_desc *desc,
>  /* Please keep this list sorted by algorithm name. */
>  static const struct alg_test_desc alg_test_descs[] = {
>  	{
> -		.alg = "__cbc-cast5-avx",
> -		.test = alg_test_null,
> -	}, {
> -		.alg = "__cbc-cast6-avx",
> -		.test = alg_test_null,
> -	}, {
> -		.alg = "__cbc-serpent-avx",
> -		.test = alg_test_null,
> -	}, {
> -		.alg = "__cbc-serpent-avx2",
> -		.test = alg_test_null,
> -	}, {
> -		.alg = "__cbc-serpent-sse2",
> -		.test = alg_test_null,
> -	}, {
> -		.alg = "__cbc-twofish-avx",
> -		.test = alg_test_null,
> -	}, {
> -		.alg = "__driver-cbc-aes-aesni",
> -		.test = alg_test_null,
> -		.fips_allowed = 1,
> -	}, {
> -		.alg = "__driver-cbc-camellia-aesni",
> -		.test = alg_test_null,
> -	}, {
> -		.alg = "__driver-cbc-camellia-aesni-avx2",
> -		.test = alg_test_null,
> -	}, {
> -		.alg = "__driver-cbc-cast5-avx",
> -		.test = alg_test_null,
> -	}, {
> -		.alg = "__driver-cbc-cast6-avx",
> -		.test = alg_test_null,
> -	}, {
> -		.alg = "__driver-cbc-serpent-avx",
> -		.test = alg_test_null,
> -	}, {
> -		.alg = "__driver-cbc-serpent-avx2",
> -		.test = alg_test_null,
> -	}, {
> -		.alg = "__driver-cbc-serpent-sse2",
> -		.test = alg_test_null,
> -	}, {
> -		.alg = "__driver-cbc-twofish-avx",
> -		.test = alg_test_null,
> -	}, {
> -		.alg = "__driver-ecb-aes-aesni",
> -		.test = alg_test_null,
> -		.fips_allowed = 1,
> -	}, {
> -		.alg = "__driver-ecb-camellia-aesni",
> -		.test = alg_test_null,
> -	}, {
> -		.alg = "__driver-ecb-camellia-aesni-avx2",
> -		.test = alg_test_null,
> -	}, {
> -		.alg = "__driver-ecb-cast5-avx",
> -		.test = alg_test_null,
> -	}, {
> -		.alg = "__driver-ecb-cast6-avx",
> -		.test = alg_test_null,
> -	}, {
> -		.alg = "__driver-ecb-serpent-avx",
> -		.test = alg_test_null,
> -	}, {
> -		.alg = "__driver-ecb-serpent-avx2",
> -		.test = alg_test_null,
> -	}, {
> -		.alg = "__driver-ecb-serpent-sse2",
> -		.test = alg_test_null,
> -	}, {
> -		.alg = "__driver-ecb-twofish-avx",
> -		.test = alg_test_null,
> -	}, {
> -		.alg = "__driver-gcm-aes-aesni",
> -		.test = alg_test_null,
> -		.fips_allowed = 1,
> -	}, {
> -		.alg = "__ghash-pclmulqdqni",
> -		.test = alg_test_null,
> -		.fips_allowed = 1,
> -	}, {
>  		.alg = "ansi_cprng",
>  		.test = alg_test_cprng,
>  		.suite = {
> @@ -2791,55 +2709,6 @@ static int alg_test_null(const struct alg_test_desc *desc,
>  			}
>  		}
>  	}, {
> -		.alg = "cryptd(__driver-cbc-aes-aesni)",
> -		.test = alg_test_null,
> -		.fips_allowed = 1,
> -	}, {
> -		.alg = "cryptd(__driver-cbc-camellia-aesni)",
> -		.test = alg_test_null,
> -	}, {
> -		.alg = "cryptd(__driver-cbc-camellia-aesni-avx2)",
> -		.test = alg_test_null,
> -	}, {
> -		.alg = "cryptd(__driver-cbc-serpent-avx2)",
> -		.test = alg_test_null,
> -	}, {
> -		.alg = "cryptd(__driver-ecb-aes-aesni)",
> -		.test = alg_test_null,
> -		.fips_allowed = 1,
> -	}, {
> -		.alg = "cryptd(__driver-ecb-camellia-aesni)",
> -		.test = alg_test_null,
> -	}, {
> -		.alg = "cryptd(__driver-ecb-camellia-aesni-avx2)",
> -		.test = alg_test_null,
> -	}, {
> -		.alg = "cryptd(__driver-ecb-cast5-avx)",
> -		.test = alg_test_null,
> -	}, {
> -		.alg = "cryptd(__driver-ecb-cast6-avx)",
> -		.test = alg_test_null,
> -	}, {
> -		.alg = "cryptd(__driver-ecb-serpent-avx)",
> -		.test = alg_test_null,
> -	}, {
> -		.alg = "cryptd(__driver-ecb-serpent-avx2)",
> -		.test = alg_test_null,
> -	}, {
> -		.alg = "cryptd(__driver-ecb-serpent-sse2)",
> -		.test = alg_test_null,
> -	}, {
> -		.alg = "cryptd(__driver-ecb-twofish-avx)",
> -		.test = alg_test_null,
> -	}, {
> -		.alg = "cryptd(__driver-gcm-aes-aesni)",
> -		.test = alg_test_null,
> -		.fips_allowed = 1,
> -	}, {
> -		.alg = "cryptd(__ghash-pclmulqdqni)",
> -		.test = alg_test_null,
> -		.fips_allowed = 1,
> -	}, {
>  		.alg = "ctr(aes)",
>  		.test = alg_test_skcipher,
>  		.fips_allowed = 1,
> @@ -3166,10 +3035,6 @@ static int alg_test_null(const struct alg_test_desc *desc,
>  		.fips_allowed = 1,
>  		.test = alg_test_null,
>  	}, {
> -		.alg = "ecb(__aes-aesni)",
> -		.test = alg_test_null,
> -		.fips_allowed = 1,
> -	}, {
>  		.alg = "ecb(aes)",
>  		.test = alg_test_skcipher,
>  		.fips_allowed = 1,
> --
> To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply

* Re: [PATCH 6/6] Add support for AEAD algos.
From: Harsh Jain @ 2016-11-10  5:00 UTC (permalink / raw)
  To: Stephan Mueller
  Cc: dan.carpenter, herbert, linux-crypto, jlulla, atul.gupta,
	yeshaswi, hariprasad
In-Reply-To: <25e12af5-190d-8204-0d5e-bf9a7fd190c9@chelsio.com>



On 08-11-2016 19:51, Harsh Jain wrote:
>
> On 08-11-2016 18:29, Stephan Mueller wrote:
>> Am Dienstag, 8. November 2016, 17:16:38 CET schrieb Harsh Jain:
>>
>> Hi Harsh,
>>
>>> On 08-11-2016 16:45, Stephan Mueller wrote:
>>>> Am Donnerstag, 27. Oktober 2016, 15:36:08 CET schrieb Harsh Jain:
>>>>
>>>> Hi Harsh,
>>>>
>>>>>>> +static void chcr_verify_tag(struct aead_request *req, u8 *input, int
>>>>>>> *err)
>>>>>>> +{
>>>>>>> +	u8 temp[SHA512_DIGEST_SIZE];
>>>>>>> +	struct crypto_aead *tfm = crypto_aead_reqtfm(req);
>>>>>>> +	int authsize = crypto_aead_authsize(tfm);
>>>>>>> +	struct cpl_fw6_pld *fw6_pld;
>>>>>>> +	int cmp = 0;
>>>>>>> +
>>>>>>> +	fw6_pld = (struct cpl_fw6_pld *)input;
>>>>>>> +	if ((get_aead_subtype(tfm) == CRYPTO_ALG_SUB_TYPE_AEAD_RFC4106) ||
>>>>>>> +	    (get_aead_subtype(tfm) == CRYPTO_ALG_SUB_TYPE_AEAD_GCM)) {
>>>>>>> +		cmp = memcmp(&fw6_pld->data[2], (fw6_pld + 1), authsize);
>>>>>>> +	} else {
>>>>>>> +
>>>>>>> +		sg_pcopy_to_buffer(req->src, sg_nents(req->src), temp,
>>>>>>> +				authsize, req->assoclen +
>>>>>>> +				req->cryptlen - authsize);
>>>>>> I am wondering whether the math is correct here in any case. It is
>>>>>> permissible that we have an AAD size of 0 and even a zero-sized
>>>>>> ciphertext. How is such scenario covered here?
>>>>> Here we are trying to copy user supplied tag to local buffer(temp) for
>>>>> decrypt operation only. relative index of tag in src sg list will not
>>>>> change when AAD is zero and in decrypt operation cryptlen > authsize.
>>>> I am just wondering where this is checked. Since all of these
>>>> implementations are directly accessible from unprivileged user space, we
>>>> should be careful.
>>> chcr_verify_tag() will be called when req->verify is set to "VERIFY_SW", 
>>> same will set in decrypt callback function of Algo(like chcr_aead_decrypt)
>>> only. It will ensure calling of chcr_verify_tag() in de-crypt operation
>>> only.
>> I think that limiting to the decryption path may not be enough. What happens 
>> if a caller sets some assoclen, but when invoking the decryption operation it 
>> provides input data that is smaller than the assoclen? The API allows this 
>> scenario.
> If I understand correctly, in this case passed sg list will be smaller. We should return with error -EINVAL at entry point only (like create_gcm_wr), control should not reach to chcr_verify_tag().
I had a look in software implementation for check related to aad len > src sg list.  I doubt same is not handled in software also. See  below
                In "crypto_authenc_encrypt" if assoclen passed to "scatterwalk_ffwd" is greater than src. It may panic with NULL pointer exception.

 I will add  this check in V2 of chcr driver.

>
>> Ciao
>> Stephan

^ permalink raw reply

* [PATCH 0/3] crypto: AF_ALG - AEAD memory handling fixes
From: Stephan Mueller @ 2016-11-10  3:30 UTC (permalink / raw)
  To: herbert; +Cc: mathew.j.martineau, linux-crypto

Hi Herbert,

The first patch is unchanged compared to a previous submission to the
mailing list. It is rolled into this patch set to have a common reference
for the AEAD user space interface changes. Therefore, please disregard
the old patch submission.

The third patch should go into stable as this fixes a kernel crash that
is present in current kernel versions.

The patch set can be tested with the HEAD branch of libkcapi found at [1].
If the patch set is accepted, the development branch will be released as
0.13.0.

[1] https://github.com/smuellerDD/libkcapi

Stephan Mueller (3):
  crypto: AF_ALG - fix AEAD tag memory handling
  crypto: AF_ALG - disregard AAD buffer space for output
  crypto: AF_ALG - fix AEAD AIO handling of zero buffer

 crypto/algif_aead.c | 215 ++++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 166 insertions(+), 49 deletions(-)

-- 
2.7.4

^ permalink raw reply

* [PATCH 1/3] crypto: AF_ALG - fix AEAD tag memory handling
From: Stephan Mueller @ 2016-11-10  3:31 UTC (permalink / raw)
  To: herbert; +Cc: mathew.j.martineau, linux-crypto
In-Reply-To: <3302668.EORF2jai5r@positron.chronox.de>

For encryption, the AEAD ciphers require AAD || PT as input and generate
AAD || CT || Tag as output and vice versa for decryption. Prior to this
patch, the AF_ALG interface for AEAD ciphers requires the buffer to be
present as input for encryption. Similarly, the output buffer for
decryption required the presence of the tag buffer too. This implies
that the kernel reads / writes data buffers from/to kernel space
even though this operation is not required.

This patch changes the AF_ALG AEAD interface to be consistent with the
in-kernel AEAD cipher requirements.

In addition, the code now handles the situation where the provided
output buffer is too small by reducing the size of the processed
input buffer accordingly. Due to this handling, he changes are
transparent to user space with one exception: the return code of recv
indicates the mount of output buffer. That output buffer has a different
size compared to before the patch which implies that the return code of
recv will also be different. For example, a decryption operation uses 16
bytes AAD, 16 bytes CT and 16 bytes tag, the AF_ALG AEAD interface
before showed a recv return code of 48 (bytes) whereas after this patch,
the return code is 32 since the tag is not returned any more.

Reported-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Stephan Mueller <smueller@chronox.de>
---
 crypto/algif_aead.c | 77 ++++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 55 insertions(+), 22 deletions(-)

diff --git a/crypto/algif_aead.c b/crypto/algif_aead.c
index 80a0f1a..c54bcb8 100644
--- a/crypto/algif_aead.c
+++ b/crypto/algif_aead.c
@@ -81,7 +81,11 @@ static inline bool aead_sufficient_data(struct aead_ctx *ctx)
 {
 	unsigned as = crypto_aead_authsize(crypto_aead_reqtfm(&ctx->aead_req));
 
-	return ctx->used >= ctx->aead_assoclen + as;
+	/*
+	 * The minimum amount of memory needed for an AEAD cipher is
+	 * the AAD and in case of decryption the tag.
+	 */
+	return ctx->used >= ctx->aead_assoclen + (ctx->enc ? 0 : as);
 }
 
 static void aead_reset_ctx(struct aead_ctx *ctx)
@@ -426,12 +430,15 @@ static int aead_recvmsg_async(struct socket *sock, struct msghdr *msg,
 			goto unlock;
 	}
 
-	used = ctx->used;
-	outlen = used;
-
 	if (!aead_sufficient_data(ctx))
 		goto unlock;
 
+	used = ctx->used;
+	if (ctx->enc)
+		outlen = used + as;
+	else
+		outlen = used - as;
+
 	req = sock_kmalloc(sk, reqlen, GFP_KERNEL);
 	if (unlikely(!req))
 		goto unlock;
@@ -445,7 +452,7 @@ static int aead_recvmsg_async(struct socket *sock, struct msghdr *msg,
 	aead_request_set_ad(req, ctx->aead_assoclen);
 	aead_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
 				  aead_async_cb, sk);
-	used -= ctx->aead_assoclen + (ctx->enc ? as : 0);
+	used -= ctx->aead_assoclen;
 
 	/* take over all tx sgls from ctx */
 	areq->tsgl = sock_kmalloc(sk, sizeof(*areq->tsgl) * sgl->cur,
@@ -461,7 +468,7 @@ static int aead_recvmsg_async(struct socket *sock, struct msghdr *msg,
 	areq->tsgls = sgl->cur;
 
 	/* create rx sgls */
-	while (iov_iter_count(&msg->msg_iter)) {
+	while (outlen > usedpages && iov_iter_count(&msg->msg_iter)) {
 		size_t seglen = min_t(size_t, iov_iter_count(&msg->msg_iter),
 				      (outlen - usedpages));
 
@@ -491,16 +498,20 @@ static int aead_recvmsg_async(struct socket *sock, struct msghdr *msg,
 
 		last_rsgl = rsgl;
 
-		/* we do not need more iovecs as we have sufficient memory */
-		if (outlen <= usedpages)
-			break;
-
 		iov_iter_advance(&msg->msg_iter, err);
 	}
-	err = -EINVAL;
+
 	/* ensure output buffer is sufficiently large */
-	if (usedpages < outlen)
-		goto free;
+	if (usedpages < outlen) {
+		size_t less = outlen - usedpages;
+
+		if (used < less) {
+			err = -EINVAL;
+			goto unlock;
+		}
+		used -= less;
+		outlen -= less;
+	}
 
 	aead_request_set_crypt(req, areq->tsgl, areq->first_rsgl.sgl.sg, used,
 			       areq->iv);
@@ -571,6 +582,7 @@ static int aead_recvmsg_sync(struct socket *sock, struct msghdr *msg, int flags)
 			goto unlock;
 	}
 
+	/* data length provided by caller via sendmsg/sendpage */
 	used = ctx->used;
 
 	/*
@@ -585,16 +597,27 @@ static int aead_recvmsg_sync(struct socket *sock, struct msghdr *msg, int flags)
 	if (!aead_sufficient_data(ctx))
 		goto unlock;
 
-	outlen = used;
+	/*
+	 * Calculate the minimum output buffer size holding the result of the
+	 * cipher operation. When encrypting data, the receiving buffer is
+	 * larger by the tag length compared to the input buffer as the
+	 * encryption operation generates the tag. For decryption, the input
+	 * buffer provides the tag which is consumed resulting in only the
+	 * plaintext without a buffer for the tag returned to the caller.
+	 */
+	if (ctx->enc)
+		outlen = used + as;
+	else
+		outlen = used - as;
 
 	/*
 	 * The cipher operation input data is reduced by the associated data
 	 * length as this data is processed separately later on.
 	 */
-	used -= ctx->aead_assoclen + (ctx->enc ? as : 0);
+	used -= ctx->aead_assoclen;
 
 	/* convert iovecs of output buffers into scatterlists */
-	while (iov_iter_count(&msg->msg_iter)) {
+	while (outlen > usedpages && iov_iter_count(&msg->msg_iter)) {
 		size_t seglen = min_t(size_t, iov_iter_count(&msg->msg_iter),
 				      (outlen - usedpages));
 
@@ -621,16 +644,26 @@ static int aead_recvmsg_sync(struct socket *sock, struct msghdr *msg, int flags)
 
 		last_rsgl = rsgl;
 
-		/* we do not need more iovecs as we have sufficient memory */
-		if (outlen <= usedpages)
-			break;
 		iov_iter_advance(&msg->msg_iter, err);
 	}
 
-	err = -EINVAL;
 	/* ensure output buffer is sufficiently large */
-	if (usedpages < outlen)
-		goto unlock;
+	if (usedpages < outlen) {
+		size_t less = outlen - usedpages;
+
+		if (used < less) {
+			err = -EINVAL;
+			goto unlock;
+		}
+
+		/*
+		 * Caller has smaller output buffer than needed, reduce
+		 * the input data length to be processed to fit the provided
+		 * output buffer.
+		 */
+		used -= less;
+		outlen -= less;
+	}
 
 	sg_mark_end(sgl->sg + sgl->cur - 1);
 	aead_request_set_crypt(&ctx->aead_req, sgl->sg, ctx->first_rsgl.sgl.sg,
-- 
2.7.4

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox