* Re: [PATCH 0/3] crypto: picoxcell - Cleanups removing non-DT code
From: Herbert Xu @ 2017-01-12 16:39 UTC (permalink / raw)
To: Javier Martinez Canillas
Cc: linux-kernel, Arnd Bergmann, Jamie Iles, David S. Miller,
linux-crypto, linux-arm-kernel
In-Reply-To: <1483376819-26726-1-git-send-email-javier@osg.samsung.com>
On Mon, Jan 02, 2017 at 02:06:56PM -0300, Javier Martinez Canillas wrote:
> Hello,
>
> This small series contains a couple of cleanups that removes some driver's code
> that isn't needed due the driver being for a DT-only platform.
>
> The changes were suggested by Arnd Bergmann as a response to a previous patch:
> https://lkml.org/lkml/2017/1/2/342
>
> Patch #1 allows the driver to be built when the COMPILE_TEST option is enabled.
> Patch #2 removes the platform ID table since isn't needed for DT-only drivers.
> Patch #3 removes a wrapper function that's also not needed if driver is DT-only.
All applied. Thanks.
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply
* Re: [PATCH] crypto: mediatek: don't return garbage err on successful return
From: Herbert Xu @ 2017-01-12 16:39 UTC (permalink / raw)
To: Colin King
Cc: David S . Miller, Matthias Brugger, Ryder Lee, linux-crypto,
linux-arm-kernel, linux-mediatek, linux-kernel
In-Reply-To: <20170103132122.26900-1-colin.king@canonical.com>
On Tue, Jan 03, 2017 at 01:21:22PM +0000, Colin King wrote:
> From: Colin Ian King <colin.king@canonical.com>
>
> In the case where keylen <= bs mtk_sha_setkey returns an uninitialized
> return value in err. Fix this by returning 0 instead of err.
>
> Issue detected by static analysis with cppcheck.
>
> Signed-off-by: Colin Ian King <colin.king@canonical.com>
Patch applied. Thanks.
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply
* Re: [PATCH v1 3/8] crypto:chcr- Fix key length for RFC4106
From: Herbert Xu @ 2017-01-12 16:42 UTC (permalink / raw)
To: Harsh Jain; +Cc: hariprasad, netdev, linux-crypto
In-Reply-To: <648404c5-4234-7895-d478-858433b0d093@chelsio.com>
On Thu, Jan 12, 2017 at 10:08:46PM +0530, Harsh Jain wrote:
>
> That case is already handled in next if condition.It will error out with -EINVAL in next condition.
>
> if (keylen == AES_KEYSIZE_128) {
Good point. Please split the patches according to whether they
should go into 4.10/4.11 and then resubmit.
Thanks,
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply
* Re: [PATCH v2 8/8] crypto/testmgr: Allocate only the required output size for hash tests
From: Herbert Xu @ 2017-01-12 16:44 UTC (permalink / raw)
To: Andy Lutomirski
Cc: Daniel Borkmann, Netdev, LKML, Linux Crypto Mailing List,
Jason A. Donenfeld, Hannes Frederic Sowa, Alexei Starovoitov,
Eric Dumazet, Eric Biggers, Tom Herbert, David S. Miller,
Ard Biesheuvel
In-Reply-To: <890f4bdb28a1cf72f6b802b220b35ebaf0f76bb9.1484090585.git.luto@kernel.org>
On Tue, Jan 10, 2017 at 03:24:46PM -0800, Andy Lutomirski wrote:
> There are some hashes (e.g. sha224) that have some internal trickery
> to make sure that only the correct number of output bytes are
> generated. If something goes wrong, they could potentially overrun
> the output buffer.
>
> Make the test more robust by allocating only enough space for the
> correct output size so that memory debugging will catch the error if
> the output is overrun.
>
> Tested by intentionally breaking sha224 to output all 256
> internally-generated bits while running on KASAN.
>
> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> Cc: Herbert Xu <herbert@gondor.apana.org.au>
> Signed-off-by: Andy Lutomirski <luto@kernel.org>
Patch applied. Thanks.
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply
* Re: [PATCH 2/2] crypto: mediatek - fix format string for 64-bit builds
From: Herbert Xu @ 2017-01-12 16:44 UTC (permalink / raw)
To: Arnd Bergmann
Cc: David S. Miller, Matthias Brugger, Ryder Lee, linux-crypto,
linux-arm-kernel, linux-mediatek, linux-kernel
In-Reply-To: <20170111135601.4047225-1-arnd@arndb.de>
On Wed, Jan 11, 2017 at 02:55:20PM +0100, Arnd Bergmann wrote:
> After I enabled COMPILE_TEST for non-ARM targets, I ran into these
> warnings:
>
> crypto/mediatek/mtk-aes.c: In function 'mtk_aes_info_map':
> crypto/mediatek/mtk-aes.c:224:28: error: format '%d' expects argument of type 'int', but argument 3 has type 'long unsigned int' [-Werror=format=]
> dev_err(cryp->dev, "dma %d bytes error\n", sizeof(*info));
> crypto/mediatek/mtk-sha.c:344:28: error: format '%d' expects argument of type 'int', but argument 3 has type 'long unsigned int' [-Werror=format=]
> crypto/mediatek/mtk-sha.c:550:21: error: format '%u' expects argument of type 'unsigned int', but argument 4 has type 'size_t {aka long unsigned int}' [-Werror=format=]
>
> The correct format for size_t is %zu, so use that in all three
> cases.
>
> Fixes: 785e5c616c84 ("crypto: mediatek - Add crypto driver support for some MediaTek chips")
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Patch applied. Thanks.
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply
* Re: [PATCH v2 0/7] crypto: ARM/arm64 - AES and ChaCha20 updates for v4.11
From: Herbert Xu @ 2017-01-12 16:45 UTC (permalink / raw)
To: Ard Biesheuvel; +Cc: linux-crypto, linux-arm-kernel
In-Reply-To: <1484152915-26517-1-git-send-email-ard.biesheuvel@linaro.org>
On Wed, Jan 11, 2017 at 04:41:48PM +0000, Ard Biesheuvel wrote:
> This adds ARM and arm64 implementations of ChaCha20, scalar AES and SIMD
> AES (using bit slicing). The SIMD algorithms in this series take advantage
> of the new skcipher walksize attribute to iterate over the input in the most
> efficient manner possible.
>
> Patch #1 adds a NEON implementation of ChaCha20 for ARM.
>
> Patch #2 adds a NEON implementation of ChaCha20 for arm64.
>
> Patch #3 modifies the existing NEON and ARMv8 Crypto Extensions implementations
> of AES-CTR to be available as a synchronous skcipher as well. This is intended
> for the mac80211 code, which uses synchronous encapsulations of ctr(aes)
> [ccm, gcm] in softirq context, during which arm64 supports use of SIMD code.
>
> Patch #4 adds a scalar implementation of AES for arm64, using the key schedule
> generation routines and lookup tables of the generic code in crypto/aes_generic.
>
> Patch #5 does the same for ARM, replacing existing scalar code that originated
> in the OpenSSL project, and contains redundant key schedule generation routines
> and lookup tables (and is slightly slower on modern cores)
>
> Patch #6 replaces the ARM bit sliced NEON code with a new implementation that
> has a number of advantages over the original code (which also originated in the
> OpenSSL project.) The performance should be identical.
>
> Patch #7 adds a port of the ARM bit-sliced AES code to arm64, in ECB, CBC, CTR
> and XTS modes.
>
> Due to the size of patch #7, it may be difficult to apply these patches from
> patchwork, so I pushed them here as well:
It seems to have made it.
All applied. Thanks.
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply
* Re: [PATCH 1/2] crypto: mediatek - remove ARM dependencies
From: Herbert Xu @ 2017-01-12 16:44 UTC (permalink / raw)
To: Arnd Bergmann
Cc: David S. Miller, Matthias Brugger, Ryder Lee, linux-crypto,
linux-kernel, linux-arm-kernel, linux-mediatek
In-Reply-To: <20170111135104.3961730-1-arnd@arndb.de>
On Wed, Jan 11, 2017 at 02:50:19PM +0100, Arnd Bergmann wrote:
> Building the mediatek driver on an older ARM architecture results in a
> harmless warning:
>
> warning: (ARCH_OMAP2PLUS_TYPICAL && CRYPTO_DEV_MEDIATEK) selects NEON which has unmet direct dependencies (VFPv3 && CPU_V7)
>
> We could add an explicit dependency on CPU_V7, but it seems nicer to
> open up the build to additional configurations. This replaces the ARM
> optimized algorithm selection with the normal one that all other drivers
> use, and that in turn lets us relax the dependency on ARM and drop
> a number of the unrelated 'select' statements.
>
> Obviously a real user would still select those other optimized drivers
> as a fallback, but as there is no strict dependency, we can leave that
> up to the user.
>
> Fixes: 785e5c616c84 ("crypto: mediatek - Add crypto driver support for some MediaTek chips")
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Patch applied. Thanks.
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply
* Re: [PATCH v2 0/7] crypto: ARM/arm64 - AES and ChaCha20 updates for v4.11
From: Ard Biesheuvel @ 2017-01-12 16:48 UTC (permalink / raw)
To: Herbert Xu
Cc: linux-crypto@vger.kernel.org,
linux-arm-kernel@lists.infradead.org
In-Reply-To: <20170112164504.GD20313@gondor.apana.org.au>
On 12 January 2017 at 16:45, Herbert Xu <herbert@gondor.apana.org.au> wrote:
> On Wed, Jan 11, 2017 at 04:41:48PM +0000, Ard Biesheuvel wrote:
>> This adds ARM and arm64 implementations of ChaCha20, scalar AES and SIMD
>> AES (using bit slicing). The SIMD algorithms in this series take advantage
>> of the new skcipher walksize attribute to iterate over the input in the most
>> efficient manner possible.
>>
>> Patch #1 adds a NEON implementation of ChaCha20 for ARM.
>>
>> Patch #2 adds a NEON implementation of ChaCha20 for arm64.
>>
>> Patch #3 modifies the existing NEON and ARMv8 Crypto Extensions implementations
>> of AES-CTR to be available as a synchronous skcipher as well. This is intended
>> for the mac80211 code, which uses synchronous encapsulations of ctr(aes)
>> [ccm, gcm] in softirq context, during which arm64 supports use of SIMD code.
>>
>> Patch #4 adds a scalar implementation of AES for arm64, using the key schedule
>> generation routines and lookup tables of the generic code in crypto/aes_generic.
>>
>> Patch #5 does the same for ARM, replacing existing scalar code that originated
>> in the OpenSSL project, and contains redundant key schedule generation routines
>> and lookup tables (and is slightly slower on modern cores)
>>
>> Patch #6 replaces the ARM bit sliced NEON code with a new implementation that
>> has a number of advantages over the original code (which also originated in the
>> OpenSSL project.) The performance should be identical.
>>
>> Patch #7 adds a port of the ARM bit-sliced AES code to arm64, in ECB, CBC, CTR
>> and XTS modes.
>>
>> Due to the size of patch #7, it may be difficult to apply these patches from
>> patchwork, so I pushed them here as well:
>
> It seems to have made it.
>
> All applied. Thanks.
Actually, patch #6 was the huge one not #7, and I don't see it in your tree yet.
https://git.kernel.org/cgit/linux/kernel/git/ardb/linux.git/commit/?h=crypto-arm-v4.11&id=cbf03b255f7c
The order does not matter, though, so could you please put it on top? Thanks.
--
Ard.
^ permalink raw reply
* [PATCH 0/4] n2rng: add support for m5/m7 rng register layout
From: Shannon Nelson @ 2017-01-12 18:52 UTC (permalink / raw)
To: linux-crypto; +Cc: sparclinux, herbert, linux-kernel, Shannon Nelson
Commit c1e9b3b0eea1 ("hwrng: n2 - Attach on T5/M5, T7/M7 SPARC CPUs")
added config strings to enable the random number generator in the sparc
m5 and m7 platforms. This worked fine for client LDoms, but not for the
primary LDom, or running on bare metal, because the actual rng hardware
layout changed and self-test would now fail, continually spewing error
messages on the console.
This patch series adds correct support for the new rng register layout,
and adds a limiter to the spewing of error messages.
Orabug: 25127795
Shannon Nelson (4):
n2rng: limit error spewage when self-test fails
n2rng: add device data descriptions
n2rng: support new hardware register layout
n2rng: update version info
drivers/char/hw_random/n2-drv.c | 204 +++++++++++++++++++++++++++++----------
drivers/char/hw_random/n2rng.h | 51 ++++++++--
2 files changed, 196 insertions(+), 59 deletions(-)
^ permalink raw reply
* [PATCH 1/4] n2rng: limit error spewage when self-test fails
From: Shannon Nelson @ 2017-01-12 18:52 UTC (permalink / raw)
To: linux-crypto; +Cc: sparclinux, herbert, linux-kernel, Shannon Nelson
In-Reply-To: <1484247169-245086-1-git-send-email-shannon.nelson@oracle.com>
If the self-test fails, it probably won't actually suddenly
start working. Currently, this causes an endless spew of
error messages on the console and in the logs, so this patch
adds a limiter to the test.
Reported-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
---
drivers/char/hw_random/n2-drv.c | 5 ++++-
1 files changed, 4 insertions(+), 1 deletions(-)
diff --git a/drivers/char/hw_random/n2-drv.c b/drivers/char/hw_random/n2-drv.c
index 3b06c1d..102560f 100644
--- a/drivers/char/hw_random/n2-drv.c
+++ b/drivers/char/hw_random/n2-drv.c
@@ -589,6 +589,7 @@ static void n2rng_work(struct work_struct *work)
{
struct n2rng *np = container_of(work, struct n2rng, work.work);
int err = 0;
+ static int retries = 4;
if (!(np->flags & N2RNG_FLAG_CONTROL)) {
err = n2rng_guest_check(np);
@@ -606,7 +607,9 @@ static void n2rng_work(struct work_struct *work)
dev_info(&np->op->dev, "RNG ready\n");
}
- if (err && !(np->flags & N2RNG_FLAG_SHUTDOWN))
+ if (--retries == 0)
+ dev_err(&np->op->dev, "Self-test retries failed, RNG not ready\n");
+ else if (err && !(np->flags & N2RNG_FLAG_SHUTDOWN))
schedule_delayed_work(&np->work, HZ * 2);
}
--
1.7.1
^ permalink raw reply related
* [PATCH 2/4] n2rng: add device data descriptions
From: Shannon Nelson @ 2017-01-12 18:52 UTC (permalink / raw)
To: linux-crypto; +Cc: sparclinux, herbert, linux-kernel, Shannon Nelson
In-Reply-To: <1484247169-245086-1-git-send-email-shannon.nelson@oracle.com>
Since we're going to need to keep track of more than just one
attribute of the hardware, we'll change the use of the data field
from the match struct from a single flag to a struct pointer.
This patch adds the struct template and initial descriptions.
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
---
drivers/char/hw_random/n2-drv.c | 47 ++++++++++++++++++++++++++++++++------
drivers/char/hw_random/n2rng.h | 15 ++++++++++++
2 files changed, 54 insertions(+), 8 deletions(-)
diff --git a/drivers/char/hw_random/n2-drv.c b/drivers/char/hw_random/n2-drv.c
index 102560f..74c26c7 100644
--- a/drivers/char/hw_random/n2-drv.c
+++ b/drivers/char/hw_random/n2-drv.c
@@ -625,24 +625,23 @@ static void n2rng_driver_version(void)
static int n2rng_probe(struct platform_device *op)
{
const struct of_device_id *match;
- int multi_capable;
int err = -ENOMEM;
struct n2rng *np;
match = of_match_device(n2rng_match, &op->dev);
if (!match)
return -EINVAL;
- multi_capable = (match->data != NULL);
n2rng_driver_version();
np = devm_kzalloc(&op->dev, sizeof(*np), GFP_KERNEL);
if (!np)
goto out;
np->op = op;
+ np->data = (struct n2rng_template *)match->data;
INIT_DELAYED_WORK(&np->work, n2rng_work);
- if (multi_capable)
+ if (np->data->multi_capable)
np->flags |= N2RNG_FLAG_MULTI;
err = -ENODEV;
@@ -673,8 +672,9 @@ static int n2rng_probe(struct platform_device *op)
dev_err(&op->dev, "VF RNG lacks rng-#units property\n");
goto out_hvapi_unregister;
}
- } else
+ } else {
np->num_units = 1;
+ }
dev_info(&op->dev, "Registered RNG HVAPI major %lu minor %lu\n",
np->hvapi_major, np->hvapi_minor);
@@ -731,30 +731,61 @@ static int n2rng_remove(struct platform_device *op)
return 0;
}
+static struct n2rng_template n2_template = {
+ .id = N2_n2_rng,
+ .multi_capable = 0,
+ .chip_version = 1,
+};
+
+static struct n2rng_template vf_template = {
+ .id = N2_vf_rng,
+ .multi_capable = 1,
+ .chip_version = 1,
+};
+
+static struct n2rng_template kt_template = {
+ .id = N2_kt_rng,
+ .multi_capable = 1,
+ .chip_version = 1,
+};
+
+static struct n2rng_template m4_template = {
+ .id = N2_m4_rng,
+ .multi_capable = 1,
+ .chip_version = 2,
+};
+
+static struct n2rng_template m7_template = {
+ .id = N2_m7_rng,
+ .multi_capable = 1,
+ .chip_version = 2,
+};
+
static const struct of_device_id n2rng_match[] = {
{
.name = "random-number-generator",
.compatible = "SUNW,n2-rng",
+ .data = &n2_template,
},
{
.name = "random-number-generator",
.compatible = "SUNW,vf-rng",
- .data = (void *) 1,
+ .data = &vf_template,
},
{
.name = "random-number-generator",
.compatible = "SUNW,kt-rng",
- .data = (void *) 1,
+ .data = &kt_template,
},
{
.name = "random-number-generator",
.compatible = "ORCL,m4-rng",
- .data = (void *) 1,
+ .data = &m4_template,
},
{
.name = "random-number-generator",
.compatible = "ORCL,m7-rng",
- .data = (void *) 1,
+ .data = &m7_template,
},
{},
};
diff --git a/drivers/char/hw_random/n2rng.h b/drivers/char/hw_random/n2rng.h
index f244ac8..e41e55a 100644
--- a/drivers/char/hw_random/n2rng.h
+++ b/drivers/char/hw_random/n2rng.h
@@ -60,6 +60,20 @@ extern unsigned long sun4v_rng_data_read_diag_v2(unsigned long data_ra,
extern unsigned long sun4v_rng_data_read(unsigned long data_ra,
unsigned long *tick_delta);
+enum n2rng_compat_id {
+ N2_n2_rng,
+ N2_vf_rng,
+ N2_kt_rng,
+ N2_m4_rng,
+ N2_m7_rng,
+};
+
+struct n2rng_template {
+ enum n2rng_compat_id id;
+ int multi_capable;
+ int chip_version;
+};
+
struct n2rng_unit {
u64 control[HV_RNG_NUM_CONTROL];
};
@@ -74,6 +88,7 @@ struct n2rng {
#define N2RNG_FLAG_SHUTDOWN 0x00000010 /* Driver unregistering */
#define N2RNG_FLAG_BUFFER_VALID 0x00000020 /* u32 buffer holds valid data */
+ struct n2rng_template *data;
int num_units;
struct n2rng_unit *units;
--
1.7.1
^ permalink raw reply related
* [PATCH 3/4] n2rng: support new hardware register layout
From: Shannon Nelson @ 2017-01-12 18:52 UTC (permalink / raw)
To: linux-crypto; +Cc: sparclinux, herbert, linux-kernel, Shannon Nelson
In-Reply-To: <1484247169-245086-1-git-send-email-shannon.nelson@oracle.com>
Add the new register layout constants and the requisite logic
for using them.
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
---
drivers/char/hw_random/n2-drv.c | 144 +++++++++++++++++++++++++++++----------
drivers/char/hw_random/n2rng.h | 36 +++++++---
2 files changed, 134 insertions(+), 46 deletions(-)
diff --git a/drivers/char/hw_random/n2-drv.c b/drivers/char/hw_random/n2-drv.c
index 74c26c7..f0bd5ee 100644
--- a/drivers/char/hw_random/n2-drv.c
+++ b/drivers/char/hw_random/n2-drv.c
@@ -302,26 +302,57 @@ static int n2rng_try_read_ctl(struct n2rng *np)
return n2rng_hv_err_trans(hv_err);
}
-#define CONTROL_DEFAULT_BASE \
- ((2 << RNG_CTL_ASEL_SHIFT) | \
- (N2RNG_ACCUM_CYCLES_DEFAULT << RNG_CTL_WAIT_SHIFT) | \
- RNG_CTL_LFSR)
-
-#define CONTROL_DEFAULT_0 \
- (CONTROL_DEFAULT_BASE | \
- (1 << RNG_CTL_VCO_SHIFT) | \
- RNG_CTL_ES1)
-#define CONTROL_DEFAULT_1 \
- (CONTROL_DEFAULT_BASE | \
- (2 << RNG_CTL_VCO_SHIFT) | \
- RNG_CTL_ES2)
-#define CONTROL_DEFAULT_2 \
- (CONTROL_DEFAULT_BASE | \
- (3 << RNG_CTL_VCO_SHIFT) | \
- RNG_CTL_ES3)
-#define CONTROL_DEFAULT_3 \
- (CONTROL_DEFAULT_BASE | \
- RNG_CTL_ES1 | RNG_CTL_ES2 | RNG_CTL_ES3)
+static u64 n2rng_control_default(struct n2rng *np, int ctl)
+{
+ u64 val = 0;
+
+ if (np->data->chip_version == 1) {
+ val = ((2 << RNG_v1_CTL_ASEL_SHIFT) |
+ (N2RNG_ACCUM_CYCLES_DEFAULT << RNG_v1_CTL_WAIT_SHIFT) |
+ RNG_CTL_LFSR);
+
+ switch (ctl) {
+ case 0:
+ val |= (1 << RNG_v1_CTL_VCO_SHIFT) | RNG_CTL_ES1;
+ break;
+ case 1:
+ val |= (2 << RNG_v1_CTL_VCO_SHIFT) | RNG_CTL_ES2;
+ break;
+ case 2:
+ val |= (3 << RNG_v1_CTL_VCO_SHIFT) | RNG_CTL_ES3;
+ break;
+ case 3:
+ val |= RNG_CTL_ES1 | RNG_CTL_ES2 | RNG_CTL_ES3;
+ break;
+ default:
+ break;
+ }
+
+ } else {
+ val = ((2 << RNG_v2_CTL_ASEL_SHIFT) |
+ (N2RNG_ACCUM_CYCLES_DEFAULT << RNG_v2_CTL_WAIT_SHIFT) |
+ RNG_CTL_LFSR);
+
+ switch (ctl) {
+ case 0:
+ val |= (1 << RNG_v2_CTL_VCO_SHIFT) | RNG_CTL_ES1;
+ break;
+ case 1:
+ val |= (2 << RNG_v2_CTL_VCO_SHIFT) | RNG_CTL_ES2;
+ break;
+ case 2:
+ val |= (3 << RNG_v2_CTL_VCO_SHIFT) | RNG_CTL_ES3;
+ break;
+ case 3:
+ val |= RNG_CTL_ES1 | RNG_CTL_ES2 | RNG_CTL_ES3;
+ break;
+ default:
+ break;
+ }
+ }
+
+ return val;
+}
static void n2rng_control_swstate_init(struct n2rng *np)
{
@@ -336,10 +367,10 @@ static void n2rng_control_swstate_init(struct n2rng *np)
for (i = 0; i < np->num_units; i++) {
struct n2rng_unit *up = &np->units[i];
- up->control[0] = CONTROL_DEFAULT_0;
- up->control[1] = CONTROL_DEFAULT_1;
- up->control[2] = CONTROL_DEFAULT_2;
- up->control[3] = CONTROL_DEFAULT_3;
+ up->control[0] = n2rng_control_default(np, 0);
+ up->control[1] = n2rng_control_default(np, 1);
+ up->control[2] = n2rng_control_default(np, 2);
+ up->control[3] = n2rng_control_default(np, 3);
}
np->hv_state = HV_RNG_STATE_UNCONFIGURED;
@@ -399,6 +430,7 @@ static int n2rng_data_read(struct hwrng *rng, u32 *data)
} else {
int err = n2rng_generic_read_data(ra);
if (!err) {
+ np->flags |= N2RNG_FLAG_BUFFER_VALID;
np->buffer = np->test_data >> 32;
*data = np->test_data & 0xffffffff;
len = 4;
@@ -487,9 +519,21 @@ static void n2rng_dump_test_buffer(struct n2rng *np)
static int n2rng_check_selftest_buffer(struct n2rng *np, unsigned long unit)
{
- u64 val = SELFTEST_VAL;
+ u64 val;
int err, matches, limit;
+ switch (np->data->id) {
+ case N2_n2_rng:
+ case N2_vf_rng:
+ case N2_kt_rng:
+ case N2_m4_rng: /* yes, m4 uses the old value */
+ val = RNG_v1_SELFTEST_VAL;
+ break;
+ default:
+ val = RNG_v2_SELFTEST_VAL;
+ break;
+ }
+
matches = 0;
for (limit = 0; limit < SELFTEST_LOOPS_MAX; limit++) {
matches += n2rng_test_buffer_find(np, val);
@@ -512,14 +556,32 @@ static int n2rng_check_selftest_buffer(struct n2rng *np, unsigned long unit)
static int n2rng_control_selftest(struct n2rng *np, unsigned long unit)
{
int err;
+ u64 base, base3;
+
+ switch (np->data->id) {
+ case N2_n2_rng:
+ case N2_vf_rng:
+ case N2_kt_rng:
+ base = RNG_v1_CTL_ASEL_NOOUT << RNG_v1_CTL_ASEL_SHIFT;
+ base3 = base | RNG_CTL_LFSR |
+ ((RNG_v1_SELFTEST_TICKS - 2) << RNG_v1_CTL_WAIT_SHIFT);
+ break;
+ case N2_m4_rng:
+ base = RNG_v2_CTL_ASEL_NOOUT << RNG_v2_CTL_ASEL_SHIFT;
+ base3 = base | RNG_CTL_LFSR |
+ ((RNG_v1_SELFTEST_TICKS - 2) << RNG_v2_CTL_WAIT_SHIFT);
+ break;
+ default:
+ base = RNG_v2_CTL_ASEL_NOOUT << RNG_v2_CTL_ASEL_SHIFT;
+ base3 = base | RNG_CTL_LFSR |
+ (RNG_v2_SELFTEST_TICKS << RNG_v2_CTL_WAIT_SHIFT);
+ break;
+ }
- np->test_control[0] = (0x2 << RNG_CTL_ASEL_SHIFT);
- np->test_control[1] = (0x2 << RNG_CTL_ASEL_SHIFT);
- np->test_control[2] = (0x2 << RNG_CTL_ASEL_SHIFT);
- np->test_control[3] = ((0x2 << RNG_CTL_ASEL_SHIFT) |
- RNG_CTL_LFSR |
- ((SELFTEST_TICKS - 2) << RNG_CTL_WAIT_SHIFT));
-
+ np->test_control[0] = base;
+ np->test_control[1] = base;
+ np->test_control[2] = base;
+ np->test_control[3] = base3;
err = n2rng_entropy_diag_read(np, unit, np->test_control,
HV_RNG_STATE_HEALTHCHECK,
@@ -557,11 +619,19 @@ static int n2rng_control_configure_units(struct n2rng *np)
struct n2rng_unit *up = &np->units[unit];
unsigned long ctl_ra = __pa(&up->control[0]);
int esrc;
- u64 base;
+ u64 base, shift;
- base = ((np->accum_cycles << RNG_CTL_WAIT_SHIFT) |
- (2 << RNG_CTL_ASEL_SHIFT) |
- RNG_CTL_LFSR);
+ if (np->data->chip_version == 1) {
+ base = ((np->accum_cycles << RNG_v1_CTL_WAIT_SHIFT) |
+ (RNG_v1_CTL_ASEL_NOOUT << RNG_v1_CTL_ASEL_SHIFT) |
+ RNG_CTL_LFSR);
+ shift = RNG_v1_CTL_VCO_SHIFT;
+ } else {
+ base = ((np->accum_cycles << RNG_v2_CTL_WAIT_SHIFT) |
+ (RNG_v2_CTL_ASEL_NOOUT << RNG_v2_CTL_ASEL_SHIFT) |
+ RNG_CTL_LFSR);
+ shift = RNG_v2_CTL_VCO_SHIFT;
+ }
/* XXX This isn't the best. We should fetch a bunch
* XXX of words using each entropy source combined XXX
@@ -570,7 +640,7 @@ static int n2rng_control_configure_units(struct n2rng *np)
*/
for (esrc = 0; esrc < 3; esrc++)
up->control[esrc] = base |
- (esrc << RNG_CTL_VCO_SHIFT) |
+ (esrc << shift) |
(RNG_CTL_ES1 << esrc);
up->control[3] = base |
diff --git a/drivers/char/hw_random/n2rng.h b/drivers/char/hw_random/n2rng.h
index e41e55a..6bad6cc 100644
--- a/drivers/char/hw_random/n2rng.h
+++ b/drivers/char/hw_random/n2rng.h
@@ -6,18 +6,34 @@
#ifndef _N2RNG_H
#define _N2RNG_H
-#define RNG_CTL_WAIT 0x0000000001fffe00ULL /* Minimum wait time */
-#define RNG_CTL_WAIT_SHIFT 9
-#define RNG_CTL_BYPASS 0x0000000000000100ULL /* VCO voltage source */
-#define RNG_CTL_VCO 0x00000000000000c0ULL /* VCO rate control */
-#define RNG_CTL_VCO_SHIFT 6
-#define RNG_CTL_ASEL 0x0000000000000030ULL /* Analog MUX select */
-#define RNG_CTL_ASEL_SHIFT 4
+/* ver1 devices - n2-rng, vf-rng, kt-rng */
+#define RNG_v1_CTL_WAIT 0x0000000001fffe00ULL /* Minimum wait time */
+#define RNG_v1_CTL_WAIT_SHIFT 9
+#define RNG_v1_CTL_BYPASS 0x0000000000000100ULL /* VCO voltage source */
+#define RNG_v1_CTL_VCO 0x00000000000000c0ULL /* VCO rate control */
+#define RNG_v1_CTL_VCO_SHIFT 6
+#define RNG_v1_CTL_ASEL 0x0000000000000030ULL /* Analog MUX select */
+#define RNG_v1_CTL_ASEL_SHIFT 4
+#define RNG_v1_CTL_ASEL_NOOUT 2
+
+/* these are the same in v2 as in v1 */
#define RNG_CTL_LFSR 0x0000000000000008ULL /* Use LFSR or plain shift */
#define RNG_CTL_ES3 0x0000000000000004ULL /* Enable entropy source 3 */
#define RNG_CTL_ES2 0x0000000000000002ULL /* Enable entropy source 2 */
#define RNG_CTL_ES1 0x0000000000000001ULL /* Enable entropy source 1 */
+/* ver2 devices - m4-rng, m7-rng */
+#define RNG_v2_CTL_WAIT 0x0000000007fff800ULL /* Minimum wait time */
+#define RNG_v2_CTL_WAIT_SHIFT 12
+#define RNG_v2_CTL_BYPASS 0x0000000000000400ULL /* VCO voltage source */
+#define RNG_v2_CTL_VCO 0x0000000000000300ULL /* VCO rate control */
+#define RNG_v2_CTL_VCO_SHIFT 9
+#define RNG_v2_CTL_PERF 0x0000000000000180ULL /* Perf */
+#define RNG_v2_CTL_ASEL 0x0000000000000070ULL /* Analog MUX select */
+#define RNG_v2_CTL_ASEL_SHIFT 4
+#define RNG_v2_CTL_ASEL_NOOUT 7
+
+
#define HV_FAST_RNG_GET_DIAG_CTL 0x130
#define HV_FAST_RNG_CTL_READ 0x131
#define HV_FAST_RNG_CTL_WRITE 0x132
@@ -112,8 +128,10 @@ struct n2rng {
u64 scratch_control[HV_RNG_NUM_CONTROL];
-#define SELFTEST_TICKS 38859
-#define SELFTEST_VAL ((u64)0xB8820C7BD387E32C)
+#define RNG_v1_SELFTEST_TICKS 38859
+#define RNG_v1_SELFTEST_VAL ((u64)0xB8820C7BD387E32C)
+#define RNG_v2_SELFTEST_TICKS 64
+#define RNG_v2_SELFTEST_VAL ((u64)0xffffffffffffffff)
#define SELFTEST_POLY ((u64)0x231DCEE91262B8A3)
#define SELFTEST_MATCH_GOAL 6
#define SELFTEST_LOOPS_MAX 40000
--
1.7.1
^ permalink raw reply related
* [PATCH 4/4] n2rng: update version info
From: Shannon Nelson @ 2017-01-12 18:52 UTC (permalink / raw)
To: linux-crypto; +Cc: sparclinux, herbert, linux-kernel, Shannon Nelson
In-Reply-To: <1484247169-245086-1-git-send-email-shannon.nelson@oracle.com>
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
---
drivers/char/hw_random/n2-drv.c | 8 ++++----
1 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/char/hw_random/n2-drv.c b/drivers/char/hw_random/n2-drv.c
index f0bd5ee..31cbdbb 100644
--- a/drivers/char/hw_random/n2-drv.c
+++ b/drivers/char/hw_random/n2-drv.c
@@ -21,11 +21,11 @@
#define DRV_MODULE_NAME "n2rng"
#define PFX DRV_MODULE_NAME ": "
-#define DRV_MODULE_VERSION "0.2"
-#define DRV_MODULE_RELDATE "July 27, 2011"
+#define DRV_MODULE_VERSION "0.3"
+#define DRV_MODULE_RELDATE "Jan 7, 2017"
static char version[] =
- DRV_MODULE_NAME ".c:v" DRV_MODULE_VERSION " (" DRV_MODULE_RELDATE ")\n";
+ DRV_MODULE_NAME " v" DRV_MODULE_VERSION " (" DRV_MODULE_RELDATE ")\n";
MODULE_AUTHOR("David S. Miller (davem@davemloft.net)");
MODULE_DESCRIPTION("Niagara2 RNG driver");
@@ -765,7 +765,7 @@ static int n2rng_probe(struct platform_device *op)
"multi-unit-capable" : "single-unit"),
np->num_units);
- np->hwrng.name = "n2rng";
+ np->hwrng.name = DRV_MODULE_NAME;
np->hwrng.data_read = n2rng_data_read;
np->hwrng.priv = (unsigned long) np;
--
1.7.1
^ permalink raw reply related
* [cryptodev:master 43/44] arch/arm/crypto/aes-cipher-core.S:21: Error: selected processor does not support `tt .req ip' in ARM mode
From: kbuild test robot @ 2017-01-12 19:04 UTC (permalink / raw)
To: Ard Biesheuvel; +Cc: kbuild-all, linux-crypto, Herbert Xu
[-- Attachment #1: Type: text/plain, Size: 7214 bytes --]
tree: https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git master
head: 1abee99eafab67fb1c98f9ecfc43cd5735384a86
commit: 81edb42629758bacdf813dd5e4542ae26e3ad73a [43/44] crypto: arm/aes - replace scalar AES cipher
config: arm-multi_v7_defconfig (attached as .config)
compiler: arm-linux-gnueabi-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
git checkout 81edb42629758bacdf813dd5e4542ae26e3ad73a
# save the attached .config to linux build tree
make.cross ARCH=arm
All errors (new ones prefixed by >>):
arch/arm/crypto/aes-cipher-core.S: Assembler messages:
>> arch/arm/crypto/aes-cipher-core.S:21: Error: selected processor does not support `tt .req ip' in ARM mode
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `movw tt,#:lower16:crypto_ft_tab'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `movt tt,#:upper16:crypto_ft_tab'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r8,[tt,r8,lsl#2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t0,[tt,t0,lsl#2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r9,[tt,r9,lsl#2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t1,[tt,t1,lsl#2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t2,[tt,t2,lsl#2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r10,[tt,r10,lsl#2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t0,[tt,t0,lsl#2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r11,[tt,r11,lsl#2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r10,[tt,r10,lsl#2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t0,[tt,t0,lsl#2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r11,[tt,r11,lsl#2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t1,[tt,t1,lsl#2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t2,[tt,t2,lsl#2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r5,[tt,r5,lsl#2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t0,[tt,t0,lsl#2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r6,[tt,r6,lsl#2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r4,[tt,r4,lsl#2]'
vim +21 arch/arm/crypto/aes-cipher-core.S
15 .align 5
16
17 rk .req r0
18 rounds .req r1
19 in .req r2
20 out .req r3
> 21 tt .req ip
22
23 t0 .req lr
24 t1 .req r2
25 t2 .req r3
26
27 .macro __select, out, in, idx
28 .if __LINUX_ARM_ARCH__ < 7
29 and \out, \in, #0xff << (8 * \idx)
30 .else
31 ubfx \out, \in, #(8 * \idx), #8
32 .endif
33 .endm
34
35 .macro __load, out, in, idx
36 .if __LINUX_ARM_ARCH__ < 7 && \idx > 0
37 ldr \out, [tt, \in, lsr #(8 * \idx) - 2]
38 .else
39 ldr \out, [tt, \in, lsl #2]
40 .endif
41 .endm
42
43 .macro __hround, out0, out1, in0, in1, in2, in3, t3, t4, enc
44 __select \out0, \in0, 0
45 __select t0, \in1, 1
46 __load \out0, \out0, 0
47 __load t0, t0, 1
48
49 .if \enc
50 __select \out1, \in1, 0
51 __select t1, \in2, 1
52 .else
53 __select \out1, \in3, 0
54 __select t1, \in0, 1
55 .endif
56 __load \out1, \out1, 0
57 __select t2, \in2, 2
58 __load t1, t1, 1
59 __load t2, t2, 2
60
61 eor \out0, \out0, t0, ror #24
62
63 __select t0, \in3, 3
64 .if \enc
65 __select \t3, \in3, 2
66 __select \t4, \in0, 3
67 .else
68 __select \t3, \in1, 2
69 __select \t4, \in2, 3
70 .endif
71 __load \t3, \t3, 2
72 __load t0, t0, 3
73 __load \t4, \t4, 3
74
75 eor \out1, \out1, t1, ror #24
76 eor \out0, \out0, t2, ror #16
77 ldm rk!, {t1, t2}
78 eor \out1, \out1, \t3, ror #16
79 eor \out0, \out0, t0, ror #8
80 eor \out1, \out1, \t4, ror #8
81 eor \out0, \out0, t1
82 eor \out1, \out1, t2
83 .endm
84
85 .macro fround, out0, out1, out2, out3, in0, in1, in2, in3
86 __hround \out0, \out1, \in0, \in1, \in2, \in3, \out2, \out3, 1
87 __hround \out2, \out3, \in2, \in3, \in0, \in1, \in1, \in2, 1
88 .endm
89
90 .macro iround, out0, out1, out2, out3, in0, in1, in2, in3
91 __hround \out0, \out1, \in0, \in3, \in2, \in1, \out2, \out3, 0
92 __hround \out2, \out3, \in2, \in1, \in0, \in3, \in1, \in0, 0
93 .endm
94
95 .macro __rev, out, in
96 .if __LINUX_ARM_ARCH__ < 6
97 lsl t0, \in, #24
98 and t1, \in, #0xff00
99 and t2, \in, #0xff0000
100 orr \out, t0, \in, lsr #24
101 orr \out, \out, t1, lsl #8
102 orr \out, \out, t2, lsr #8
103 .else
104 rev \out, \in
105 .endif
106 .endm
107
108 .macro __adrl, out, sym, c
109 .if __LINUX_ARM_ARCH__ < 7
110 ldr\c \out, =\sym
111 .else
112 movw\c \out, #:lower16:\sym
113 movt\c \out, #:upper16:\sym
114 .endif
115 .endm
116
117 .macro do_crypt, round, ttab, ltab
118 push {r3-r11, lr}
119
120 ldr r4, [in]
121 ldr r5, [in, #4]
122 ldr r6, [in, #8]
123 ldr r7, [in, #12]
124
125 ldm rk!, {r8-r11}
126
127 #ifdef CONFIG_CPU_BIG_ENDIAN
128 __rev r4, r4
129 __rev r5, r5
130 __rev r6, r6
131 __rev r7, r7
132 #endif
133
134 eor r4, r4, r8
135 eor r5, r5, r9
136 eor r6, r6, r10
137 eor r7, r7, r11
138
139 __adrl tt, \ttab
140
141 tst rounds, #2
142 bne 1f
143
144 0: \round r8, r9, r10, r11, r4, r5, r6, r7
145 \round r4, r5, r6, r7, r8, r9, r10, r11
146
147 1: subs rounds, rounds, #4
148 \round r8, r9, r10, r11, r4, r5, r6, r7
149 __adrl tt, \ltab, ls
150 \round r4, r5, r6, r7, r8, r9, r10, r11
151 bhi 0b
152
153 #ifdef CONFIG_CPU_BIG_ENDIAN
154 __rev r4, r4
155 __rev r5, r5
156 __rev r6, r6
157 __rev r7, r7
158 #endif
159
160 ldr out, [sp]
161
162 str r4, [out]
163 str r5, [out, #4]
164 str r6, [out, #8]
165 str r7, [out, #12]
166
167 pop {r3-r11, pc}
168
169 .align 3
170 .ltorg
171 .endm
172
173 ENTRY(__aes_arm_encrypt)
> 174 do_crypt fround, crypto_ft_tab, crypto_fl_tab
175 ENDPROC(__aes_arm_encrypt)
176
177 ENTRY(__aes_arm_decrypt)
> 178 do_crypt iround, crypto_it_tab, crypto_il_tab
179 ENDPROC(__aes_arm_decrypt)
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 39910 bytes --]
^ permalink raw reply
* Re: x86-64: Maintain 16-byte stack alignment
From: Linus Torvalds @ 2017-01-12 19:51 UTC (permalink / raw)
To: Josh Poimboeuf
Cc: Andy Lutomirski, Herbert Xu, Linux Kernel Mailing List,
Linux Crypto Mailing List, Ingo Molnar, Thomas Gleixner,
Andy Lutomirski, Ard Biesheuvel
In-Reply-To: <20170112140215.rh247gwk55fjzmg7@treble>
[-- Attachment #1: Type: text/plain, Size: 3363 bytes --]
On Thu, Jan 12, 2017 at 6:02 AM, Josh Poimboeuf <jpoimboe@redhat.com> wrote:
>
> Just to clarify, I think you're asking if, for versions of gcc which
> don't support -mpreferred-stack-boundary=3, objtool can analyze all C
> functions to ensure their stacks are 16-byte aligned.
>
> It's certainly possible, but I don't see how that solves the problem.
> The stack will still be misaligned by entry code. Or am I missing
> something?
I think the argument is that we *could* try to align things, if we
just had some tool that actually then verified that we aren't missing
anything.
I'm not entirely happy with checking the generated code, though,
because as Ingo says, you have a 50:50 chance of just getting it right
by mistake. So I'd much rather have some static tool that checks
things at a code level (ie coccinelle or sparse).
Almost totally untested "sparse" patch appended. The problem with
sparse, obviously, is that few enough people run it, and it gives a
lot of other warnings. But maybe Herbert can test whether this would
actually have caught his situation, doing something like an
allmodconfig build with "C=2" to force a sparse run on everything, and
redirecting the warnings to stderr.
But this patch does seem to give a warning for the patch that Herbert
had, and that caused problems.
And in fact it seems to find a few other possible problems (most, but
not all, in crypto). This run was with the broken chacha20 patch
applied, to verify that I get a warning for that case:
arch/x86/crypto/chacha20_glue.c:70:13: warning: symbol 'state' has
excessive alignment (16)
arch/x86/crypto/aesni-intel_glue.c:724:12: warning: symbol 'iv' has
excessive alignment (16)
arch/x86/crypto/aesni-intel_glue.c:803:12: warning: symbol 'iv' has
excessive alignment (16)
crypto/shash.c:82:12: warning: symbol 'ubuf' has excessive alignment (16)
crypto/shash.c:118:12: warning: symbol 'ubuf' has excessive alignment (16)
drivers/char/hw_random/via-rng.c:89:14: warning: symbol 'buf' has
excessive alignment (16)
net/bridge/netfilter/ebtables.c:1809:31: warning: symbol 'tinfo'
has excessive alignment (64)
drivers/crypto/padlock-sha.c:85:14: warning: symbol 'buf' has
excessive alignment (16)
drivers/crypto/padlock-sha.c:147:14: warning: symbol 'buf' has
excessive alignment (16)
drivers/crypto/padlock-sha.c:304:12: warning: symbol 'buf' has
excessive alignment (16)
drivers/crypto/padlock-sha.c:388:12: warning: symbol 'buf' has
excessive alignment (16)
net/openvswitch/actions.c:797:33: warning: symbol 'ovs_rt' has
excessive alignment (64)
drivers/net/ethernet/neterion/vxge/vxge-config.c:1006:38: warning:
symbol 'vpath' has excessive alignment (64)
although I think at least some of these happen to be ok.
There are a few places that clearly don't care about exact alignment,
and use "__attribute__((aligned))" without any specific alignment
value.
It's just sparse that thinks that implies 16-byte alignment (it
doesn't, really - it's unspecified, and is telling gcc to use "maximum
useful alignment", so who knows _what_ gcc will assume).
But some of them may well be real issues - if the alignment is about
correctness rather than anything else.
Anyway, the advantage of this kind of source-level check is that it
should really catch things regardless of "luck" wrt alignment.
Linus
[-- Attachment #2: patch.diff --]
[-- Type: text/plain, Size: 887 bytes --]
flow.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/flow.c b/flow.c
index 7db9548..c876869 100644
--- a/flow.c
+++ b/flow.c
@@ -601,6 +601,20 @@ static void simplify_one_symbol(struct entrypoint *ep, struct symbol *sym)
unsigned long mod;
int all, stores, complex;
+ /*
+ * Warn about excessive local variable alignment.
+ *
+ * This needs to be linked up with some flag to enable
+ * it, and specify the alignment. The 'max_int_alignment'
+ * just happens to be what we want for the kernel for x86-64.
+ */
+ mod = sym->ctype.modifiers;
+ if (!(mod & (MOD_NONLOCAL | MOD_STATIC))) {
+ unsigned int alignment = sym->ctype.alignment;
+ if (alignment > max_int_alignment)
+ warning(sym->pos, "symbol '%s' has excessive alignment (%u)", show_ident(sym->ident), alignment);
+ }
+
/* Never used as a symbol? */
pseudo = sym->pseudo;
if (!pseudo)
^ permalink raw reply related
* Re: x86-64: Maintain 16-byte stack alignment
From: Andy Lutomirski @ 2017-01-12 20:08 UTC (permalink / raw)
To: Linus Torvalds
Cc: Josh Poimboeuf, Herbert Xu, Linux Kernel Mailing List,
Linux Crypto Mailing List, Ingo Molnar, Thomas Gleixner,
Andy Lutomirski, Ard Biesheuvel
In-Reply-To: <CA+55aFxP+V6Wbq5Xw_NOksiWouEMg4gjBJgeGa-qFyxDMnTmcA@mail.gmail.com>
On Thu, Jan 12, 2017 at 11:51 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Thu, Jan 12, 2017 at 6:02 AM, Josh Poimboeuf <jpoimboe@redhat.com> wrote:
>>
>> Just to clarify, I think you're asking if, for versions of gcc which
>> don't support -mpreferred-stack-boundary=3, objtool can analyze all C
>> functions to ensure their stacks are 16-byte aligned.
>>
>> It's certainly possible, but I don't see how that solves the problem.
>> The stack will still be misaligned by entry code. Or am I missing
>> something?
>
> I think the argument is that we *could* try to align things, if we
> just had some tool that actually then verified that we aren't missing
> anything.
>
> I'm not entirely happy with checking the generated code, though,
> because as Ingo says, you have a 50:50 chance of just getting it right
> by mistake. So I'd much rather have some static tool that checks
> things at a code level (ie coccinelle or sparse).
What I meant was checking the entry code to see if it aligns stack
frames, and good luck getting sparse to do that. Hmm, getting 16-byte
alignment for real may actually be entirely a lost cause. After all,
I think we have some inline functions that do asm volatile ("call
..."), and I don't see any credible way of forcing alignment short of
generating an entirely new stack frame and aligning that. Ick. This
whole situation stinks, and I wish that the gcc developers had been
less daft here in the first place or that we'd noticed and gotten it
fixed much longer ago.
Can we come up with a macro like STACK_ALIGN_16 that turns into
__aligned__(32) on bad gcc versions and combine that with your sparse
patch?
^ permalink raw reply
* Re: x86-64: Maintain 16-byte stack alignment
From: Josh Poimboeuf @ 2017-01-12 20:15 UTC (permalink / raw)
To: Andy Lutomirski
Cc: Linus Torvalds, Herbert Xu, Linux Kernel Mailing List,
Linux Crypto Mailing List, Ingo Molnar, Thomas Gleixner,
Andy Lutomirski, Ard Biesheuvel
In-Reply-To: <CALCETrVz-wEFVUwrpS8-Ln9SWnsF5KxkqJC-Br6wJ+e0LGM9UA@mail.gmail.com>
On Thu, Jan 12, 2017 at 12:08:07PM -0800, Andy Lutomirski wrote:
> On Thu, Jan 12, 2017 at 11:51 AM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> > On Thu, Jan 12, 2017 at 6:02 AM, Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> >>
> >> Just to clarify, I think you're asking if, for versions of gcc which
> >> don't support -mpreferred-stack-boundary=3, objtool can analyze all C
> >> functions to ensure their stacks are 16-byte aligned.
> >>
> >> It's certainly possible, but I don't see how that solves the problem.
> >> The stack will still be misaligned by entry code. Or am I missing
> >> something?
> >
> > I think the argument is that we *could* try to align things, if we
> > just had some tool that actually then verified that we aren't missing
> > anything.
> >
> > I'm not entirely happy with checking the generated code, though,
> > because as Ingo says, you have a 50:50 chance of just getting it right
> > by mistake. So I'd much rather have some static tool that checks
> > things at a code level (ie coccinelle or sparse).
>
> What I meant was checking the entry code to see if it aligns stack
> frames, and good luck getting sparse to do that. Hmm, getting 16-byte
> alignment for real may actually be entirely a lost cause. After all,
> I think we have some inline functions that do asm volatile ("call
> ..."), and I don't see any credible way of forcing alignment short of
> generating an entirely new stack frame and aligning that.
Actually we already found all such cases and fixed them by forcing a new
stack frame, thanks to objtool. For example, see 55a76b59b5fe.
> Ick. This
> whole situation stinks, and I wish that the gcc developers had been
> less daft here in the first place or that we'd noticed and gotten it
> fixed much longer ago.
>
> Can we come up with a macro like STACK_ALIGN_16 that turns into
> __aligned__(32) on bad gcc versions and combine that with your sparse
> patch?
--
Josh
^ permalink raw reply
* Re: [cryptodev:master 43/44] arch/arm/crypto/aes-cipher-core.S:21: Error: selected processor does not support `tt .req ip' in ARM mode
From: Ard Biesheuvel @ 2017-01-12 20:44 UTC (permalink / raw)
To: kbuild test robot; +Cc: kbuild-all, linux-crypto@vger.kernel.org, Herbert Xu
In-Reply-To: <201701130326.19FpToYy%fengguang.wu@intel.com>
Hi Arnd,
On 12 January 2017 at 19:04, kbuild test robot <fengguang.wu@intel.com> wrote:
> tree: https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git master
> head: 1abee99eafab67fb1c98f9ecfc43cd5735384a86
> commit: 81edb42629758bacdf813dd5e4542ae26e3ad73a [43/44] crypto: arm/aes - replace scalar AES cipher
> config: arm-multi_v7_defconfig (attached as .config)
> compiler: arm-linux-gnueabi-gcc (Debian 6.1.1-9) 6.1.1 20160705
> reproduce:
> wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
> chmod +x ~/bin/make.cross
> git checkout 81edb42629758bacdf813dd5e4542ae26e3ad73a
> # save the attached .config to linux build tree
> make.cross ARCH=arm
>
> All errors (new ones prefixed by >>):
>
> arch/arm/crypto/aes-cipher-core.S: Assembler messages:
>>> arch/arm/crypto/aes-cipher-core.S:21: Error: selected processor does not support `tt .req ip' in ARM mode
Did you ever see this error? This is very odd: .req simply declares an
alias for a register name, and this works fine locally
>>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `movw tt,#:lower16:crypto_ft_tab'
>>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `movt tt,#:upper16:crypto_ft_tab'
>>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r8,[tt,r8,lsl#2]'
>>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t0,[tt,t0,lsl#2]'
>>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r9,[tt,r9,lsl#2]'
>>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t1,[tt,t1,lsl#2]'
>>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t2,[tt,t2,lsl#2]'
>>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r10,[tt,r10,lsl#2]'
>>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t0,[tt,t0,lsl#2]'
>>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r11,[tt,r11,lsl#2]'
>>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r10,[tt,r10,lsl#2]'
>>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t0,[tt,t0,lsl#2]'
>>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r11,[tt,r11,lsl#2]'
>>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t1,[tt,t1,lsl#2]'
>>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t2,[tt,t2,lsl#2]'
>>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r5,[tt,r5,lsl#2]'
>>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t0,[tt,t0,lsl#2]'
>>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r6,[tt,r6,lsl#2]'
>>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r4,[tt,r4,lsl#2]'
>
> vim +21 arch/arm/crypto/aes-cipher-core.S
>
> 15 .align 5
> 16
> 17 rk .req r0
> 18 rounds .req r1
> 19 in .req r2
> 20 out .req r3
> > 21 tt .req ip
> 22
> 23 t0 .req lr
> 24 t1 .req r2
> 25 t2 .req r3
> 26
> 27 .macro __select, out, in, idx
> 28 .if __LINUX_ARM_ARCH__ < 7
> 29 and \out, \in, #0xff << (8 * \idx)
> 30 .else
> 31 ubfx \out, \in, #(8 * \idx), #8
> 32 .endif
> 33 .endm
> 34
> 35 .macro __load, out, in, idx
> 36 .if __LINUX_ARM_ARCH__ < 7 && \idx > 0
> 37 ldr \out, [tt, \in, lsr #(8 * \idx) - 2]
> 38 .else
> 39 ldr \out, [tt, \in, lsl #2]
> 40 .endif
> 41 .endm
> 42
> 43 .macro __hround, out0, out1, in0, in1, in2, in3, t3, t4, enc
> 44 __select \out0, \in0, 0
> 45 __select t0, \in1, 1
> 46 __load \out0, \out0, 0
> 47 __load t0, t0, 1
> 48
> 49 .if \enc
> 50 __select \out1, \in1, 0
> 51 __select t1, \in2, 1
> 52 .else
> 53 __select \out1, \in3, 0
> 54 __select t1, \in0, 1
> 55 .endif
> 56 __load \out1, \out1, 0
> 57 __select t2, \in2, 2
> 58 __load t1, t1, 1
> 59 __load t2, t2, 2
> 60
> 61 eor \out0, \out0, t0, ror #24
> 62
> 63 __select t0, \in3, 3
> 64 .if \enc
> 65 __select \t3, \in3, 2
> 66 __select \t4, \in0, 3
> 67 .else
> 68 __select \t3, \in1, 2
> 69 __select \t4, \in2, 3
> 70 .endif
> 71 __load \t3, \t3, 2
> 72 __load t0, t0, 3
> 73 __load \t4, \t4, 3
> 74
> 75 eor \out1, \out1, t1, ror #24
> 76 eor \out0, \out0, t2, ror #16
> 77 ldm rk!, {t1, t2}
> 78 eor \out1, \out1, \t3, ror #16
> 79 eor \out0, \out0, t0, ror #8
> 80 eor \out1, \out1, \t4, ror #8
> 81 eor \out0, \out0, t1
> 82 eor \out1, \out1, t2
> 83 .endm
> 84
> 85 .macro fround, out0, out1, out2, out3, in0, in1, in2, in3
> 86 __hround \out0, \out1, \in0, \in1, \in2, \in3, \out2, \out3, 1
> 87 __hround \out2, \out3, \in2, \in3, \in0, \in1, \in1, \in2, 1
> 88 .endm
> 89
> 90 .macro iround, out0, out1, out2, out3, in0, in1, in2, in3
> 91 __hround \out0, \out1, \in0, \in3, \in2, \in1, \out2, \out3, 0
> 92 __hround \out2, \out3, \in2, \in1, \in0, \in3, \in1, \in0, 0
> 93 .endm
> 94
> 95 .macro __rev, out, in
> 96 .if __LINUX_ARM_ARCH__ < 6
> 97 lsl t0, \in, #24
> 98 and t1, \in, #0xff00
> 99 and t2, \in, #0xff0000
> 100 orr \out, t0, \in, lsr #24
> 101 orr \out, \out, t1, lsl #8
> 102 orr \out, \out, t2, lsr #8
> 103 .else
> 104 rev \out, \in
> 105 .endif
> 106 .endm
> 107
> 108 .macro __adrl, out, sym, c
> 109 .if __LINUX_ARM_ARCH__ < 7
> 110 ldr\c \out, =\sym
> 111 .else
> 112 movw\c \out, #:lower16:\sym
> 113 movt\c \out, #:upper16:\sym
> 114 .endif
> 115 .endm
> 116
> 117 .macro do_crypt, round, ttab, ltab
> 118 push {r3-r11, lr}
> 119
> 120 ldr r4, [in]
> 121 ldr r5, [in, #4]
> 122 ldr r6, [in, #8]
> 123 ldr r7, [in, #12]
> 124
> 125 ldm rk!, {r8-r11}
> 126
> 127 #ifdef CONFIG_CPU_BIG_ENDIAN
> 128 __rev r4, r4
> 129 __rev r5, r5
> 130 __rev r6, r6
> 131 __rev r7, r7
> 132 #endif
> 133
> 134 eor r4, r4, r8
> 135 eor r5, r5, r9
> 136 eor r6, r6, r10
> 137 eor r7, r7, r11
> 138
> 139 __adrl tt, \ttab
> 140
> 141 tst rounds, #2
> 142 bne 1f
> 143
> 144 0: \round r8, r9, r10, r11, r4, r5, r6, r7
> 145 \round r4, r5, r6, r7, r8, r9, r10, r11
> 146
> 147 1: subs rounds, rounds, #4
> 148 \round r8, r9, r10, r11, r4, r5, r6, r7
> 149 __adrl tt, \ltab, ls
> 150 \round r4, r5, r6, r7, r8, r9, r10, r11
> 151 bhi 0b
> 152
> 153 #ifdef CONFIG_CPU_BIG_ENDIAN
> 154 __rev r4, r4
> 155 __rev r5, r5
> 156 __rev r6, r6
> 157 __rev r7, r7
> 158 #endif
> 159
> 160 ldr out, [sp]
> 161
> 162 str r4, [out]
> 163 str r5, [out, #4]
> 164 str r6, [out, #8]
> 165 str r7, [out, #12]
> 166
> 167 pop {r3-r11, pc}
> 168
> 169 .align 3
> 170 .ltorg
> 171 .endm
> 172
> 173 ENTRY(__aes_arm_encrypt)
> > 174 do_crypt fround, crypto_ft_tab, crypto_fl_tab
> 175 ENDPROC(__aes_arm_encrypt)
> 176
> 177 ENTRY(__aes_arm_decrypt)
> > 178 do_crypt iround, crypto_it_tab, crypto_il_tab
> 179 ENDPROC(__aes_arm_decrypt)
>
> ---
> 0-DAY kernel test infrastructure Open Source Technology Center
> https://lists.01.org/pipermail/kbuild-all Intel Corporation
^ permalink raw reply
* Re: x86-64: Maintain 16-byte stack alignment
From: Josh Poimboeuf @ 2017-01-12 20:55 UTC (permalink / raw)
To: Andy Lutomirski
Cc: Linus Torvalds, Herbert Xu, Linux Kernel Mailing List,
Linux Crypto Mailing List, Ingo Molnar, Thomas Gleixner,
Andy Lutomirski, Ard Biesheuvel
In-Reply-To: <20170112201511.yj5ekqmj76r2yv6t@treble>
On Thu, Jan 12, 2017 at 02:15:11PM -0600, Josh Poimboeuf wrote:
> On Thu, Jan 12, 2017 at 12:08:07PM -0800, Andy Lutomirski wrote:
> > On Thu, Jan 12, 2017 at 11:51 AM, Linus Torvalds
> > <torvalds@linux-foundation.org> wrote:
> > > On Thu, Jan 12, 2017 at 6:02 AM, Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> > >>
> > >> Just to clarify, I think you're asking if, for versions of gcc which
> > >> don't support -mpreferred-stack-boundary=3, objtool can analyze all C
> > >> functions to ensure their stacks are 16-byte aligned.
> > >>
> > >> It's certainly possible, but I don't see how that solves the problem.
> > >> The stack will still be misaligned by entry code. Or am I missing
> > >> something?
> > >
> > > I think the argument is that we *could* try to align things, if we
> > > just had some tool that actually then verified that we aren't missing
> > > anything.
> > >
> > > I'm not entirely happy with checking the generated code, though,
> > > because as Ingo says, you have a 50:50 chance of just getting it right
> > > by mistake. So I'd much rather have some static tool that checks
> > > things at a code level (ie coccinelle or sparse).
> >
> > What I meant was checking the entry code to see if it aligns stack
> > frames, and good luck getting sparse to do that. Hmm, getting 16-byte
> > alignment for real may actually be entirely a lost cause. After all,
> > I think we have some inline functions that do asm volatile ("call
> > ..."), and I don't see any credible way of forcing alignment short of
> > generating an entirely new stack frame and aligning that.
>
> Actually we already found all such cases and fixed them by forcing a new
> stack frame, thanks to objtool. For example, see 55a76b59b5fe.
>
> > Ick. This
> > whole situation stinks, and I wish that the gcc developers had been
> > less daft here in the first place or that we'd noticed and gotten it
> > fixed much longer ago.
> >
> > Can we come up with a macro like STACK_ALIGN_16 that turns into
> > __aligned__(32) on bad gcc versions and combine that with your sparse
> > patch?
This could work. Only concerns I'd have are:
- Are there (or will there be in the future) any asm functions which
assume a 16-byte aligned stack? (Seems unlikely. Stack alignment is
common in the crypto code but they do the alignment manually.)
- Who's going to run sparse all the time to catch unauthorized users of
__aligned__(16)?
--
Josh
^ permalink raw reply
* Re: x86-64: Maintain 16-byte stack alignment
From: Linus Torvalds @ 2017-01-12 21:40 UTC (permalink / raw)
To: Josh Poimboeuf
Cc: Andy Lutomirski, Herbert Xu, Linux Kernel Mailing List,
Linux Crypto Mailing List, Ingo Molnar, Thomas Gleixner,
Andy Lutomirski, Ard Biesheuvel
In-Reply-To: <20170112205504.gb6z2w52mektyc73@treble>
On Thu, Jan 12, 2017 at 12:55 PM, Josh Poimboeuf <jpoimboe@redhat.com> wrote:
>
> - Who's going to run sparse all the time to catch unauthorized users of
> __aligned__(16)?
Well, considering that we apparently only have a small handful of
existing users without anybody having ever run any tool at all, I
don't think this is necessarily a huge problem.
One of the build servers could easily add the "make C=2" case to a
build test, and just grep the error reports for the 'excessive
alignment' string. The zero-day build bot already does much fancier
things.
So I don't think it would necessarily be all that hard to get a clean
build, and just say "if you need aligned stack space, you have to do
it yourself by hand".
That saId, if we now always enable frame pointers on x86 (and it has
gotten more and more difficult to avoid it), then the 16-byte
alignment would fairly natural.
The 8-byte alignment mainly makes sense when the basic call sequence
just adds 8 bytes, and you have functions without frames (that still
call other functions).
Linus
^ permalink raw reply
* [cryptodev:master 43/44] arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr tt,=crypto_ft_tab'
From: kbuild test robot @ 2017-01-12 23:28 UTC (permalink / raw)
To: Ard Biesheuvel; +Cc: kbuild-all, linux-crypto, Herbert Xu
[-- Attachment #1: Type: text/plain, Size: 8989 bytes --]
tree: https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git master
head: 1abee99eafab67fb1c98f9ecfc43cd5735384a86
commit: 81edb42629758bacdf813dd5e4542ae26e3ad73a [43/44] crypto: arm/aes - replace scalar AES cipher
config: arm-allmodconfig (attached as .config)
compiler: arm-linux-gnueabi-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
git checkout 81edb42629758bacdf813dd5e4542ae26e3ad73a
# save the attached .config to linux build tree
make.cross ARCH=arm
All errors (new ones prefixed by >>):
arch/arm/crypto/aes-cipher-core.S: Assembler messages:
arch/arm/crypto/aes-cipher-core.S:21: Error: selected processor does not support `tt .req ip' in ARM mode
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr tt,=crypto_ft_tab'
arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r8,[tt,r8,lsl#2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t0,[tt,t0,lsr#(8*1)-2]'
arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r9,[tt,r9,lsl#2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t1,[tt,t1,lsr#(8*1)-2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t2,[tt,t2,lsr#(8*2)-2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r10,[tt,r10,lsr#(8*2)-2]'
arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t0,[tt,t0,lsr#(8*3)-2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r11,[tt,r11,lsr#(8*3)-2]'
arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r10,[tt,r10,lsl#2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t0,[tt,t0,lsr#(8*1)-2]'
arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r11,[tt,r11,lsl#2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t1,[tt,t1,lsr#(8*1)-2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t2,[tt,t2,lsr#(8*2)-2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r5,[tt,r5,lsr#(8*2)-2]'
arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t0,[tt,t0,lsr#(8*3)-2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r6,[tt,r6,lsr#(8*3)-2]'
arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r4,[tt,r4,lsl#2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t0,[tt,t0,lsr#(8*1)-2]'
arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r5,[tt,r5,lsl#2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t1,[tt,t1,lsr#(8*1)-2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t2,[tt,t2,lsr#(8*2)-2]'
arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r6,[tt,r6,lsr#(8*2)-2]'
arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t0,[tt,t0,lsr#(8*3)-2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r7,[tt,r7,lsr#(8*3)-2]'
arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r6,[tt,r6,lsl#2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t0,[tt,t0,lsr#(8*1)-2]'
arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r7,[tt,r7,lsl#2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t1,[tt,t1,lsr#(8*1)-2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t2,[tt,t2,lsr#(8*2)-2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r9,[tt,r9,lsr#(8*2)-2]'
arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t0,[tt,t0,lsr#(8*3)-2]'
arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r10,[tt,r10,lsr#(8*3)-2]'
arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r8,[tt,r8,lsl#2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t0,[tt,t0,lsr#(8*1)-2]'
arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r9,[tt,r9,lsl#2]'
vim +174 arch/arm/crypto/aes-cipher-core.S
15 .align 5
16
17 rk .req r0
18 rounds .req r1
19 in .req r2
20 out .req r3
> 21 tt .req ip
22
23 t0 .req lr
24 t1 .req r2
25 t2 .req r3
26
27 .macro __select, out, in, idx
28 .if __LINUX_ARM_ARCH__ < 7
29 and \out, \in, #0xff << (8 * \idx)
30 .else
31 ubfx \out, \in, #(8 * \idx), #8
32 .endif
33 .endm
34
35 .macro __load, out, in, idx
36 .if __LINUX_ARM_ARCH__ < 7 && \idx > 0
37 ldr \out, [tt, \in, lsr #(8 * \idx) - 2]
38 .else
39 ldr \out, [tt, \in, lsl #2]
40 .endif
41 .endm
42
43 .macro __hround, out0, out1, in0, in1, in2, in3, t3, t4, enc
44 __select \out0, \in0, 0
45 __select t0, \in1, 1
46 __load \out0, \out0, 0
47 __load t0, t0, 1
48
49 .if \enc
50 __select \out1, \in1, 0
51 __select t1, \in2, 1
52 .else
53 __select \out1, \in3, 0
54 __select t1, \in0, 1
55 .endif
56 __load \out1, \out1, 0
57 __select t2, \in2, 2
58 __load t1, t1, 1
59 __load t2, t2, 2
60
61 eor \out0, \out0, t0, ror #24
62
63 __select t0, \in3, 3
64 .if \enc
65 __select \t3, \in3, 2
66 __select \t4, \in0, 3
67 .else
68 __select \t3, \in1, 2
69 __select \t4, \in2, 3
70 .endif
71 __load \t3, \t3, 2
72 __load t0, t0, 3
73 __load \t4, \t4, 3
74
75 eor \out1, \out1, t1, ror #24
76 eor \out0, \out0, t2, ror #16
77 ldm rk!, {t1, t2}
78 eor \out1, \out1, \t3, ror #16
79 eor \out0, \out0, t0, ror #8
80 eor \out1, \out1, \t4, ror #8
81 eor \out0, \out0, t1
82 eor \out1, \out1, t2
83 .endm
84
85 .macro fround, out0, out1, out2, out3, in0, in1, in2, in3
86 __hround \out0, \out1, \in0, \in1, \in2, \in3, \out2, \out3, 1
87 __hround \out2, \out3, \in2, \in3, \in0, \in1, \in1, \in2, 1
88 .endm
89
90 .macro iround, out0, out1, out2, out3, in0, in1, in2, in3
91 __hround \out0, \out1, \in0, \in3, \in2, \in1, \out2, \out3, 0
92 __hround \out2, \out3, \in2, \in1, \in0, \in3, \in1, \in0, 0
93 .endm
94
95 .macro __rev, out, in
96 .if __LINUX_ARM_ARCH__ < 6
97 lsl t0, \in, #24
98 and t1, \in, #0xff00
99 and t2, \in, #0xff0000
100 orr \out, t0, \in, lsr #24
101 orr \out, \out, t1, lsl #8
102 orr \out, \out, t2, lsr #8
103 .else
104 rev \out, \in
105 .endif
106 .endm
107
108 .macro __adrl, out, sym, c
109 .if __LINUX_ARM_ARCH__ < 7
110 ldr\c \out, =\sym
111 .else
112 movw\c \out, #:lower16:\sym
113 movt\c \out, #:upper16:\sym
114 .endif
115 .endm
116
117 .macro do_crypt, round, ttab, ltab
118 push {r3-r11, lr}
119
120 ldr r4, [in]
121 ldr r5, [in, #4]
122 ldr r6, [in, #8]
123 ldr r7, [in, #12]
124
125 ldm rk!, {r8-r11}
126
127 #ifdef CONFIG_CPU_BIG_ENDIAN
128 __rev r4, r4
129 __rev r5, r5
130 __rev r6, r6
131 __rev r7, r7
132 #endif
133
134 eor r4, r4, r8
135 eor r5, r5, r9
136 eor r6, r6, r10
137 eor r7, r7, r11
138
139 __adrl tt, \ttab
140
141 tst rounds, #2
142 bne 1f
143
144 0: \round r8, r9, r10, r11, r4, r5, r6, r7
145 \round r4, r5, r6, r7, r8, r9, r10, r11
146
147 1: subs rounds, rounds, #4
148 \round r8, r9, r10, r11, r4, r5, r6, r7
149 __adrl tt, \ltab, ls
150 \round r4, r5, r6, r7, r8, r9, r10, r11
151 bhi 0b
152
153 #ifdef CONFIG_CPU_BIG_ENDIAN
154 __rev r4, r4
155 __rev r5, r5
156 __rev r6, r6
157 __rev r7, r7
158 #endif
159
160 ldr out, [sp]
161
162 str r4, [out]
163 str r5, [out, #4]
164 str r6, [out, #8]
165 str r7, [out, #12]
166
167 pop {r3-r11, pc}
168
169 .align 3
170 .ltorg
171 .endm
172
173 ENTRY(__aes_arm_encrypt)
> 174 do_crypt fround, crypto_ft_tab, crypto_fl_tab
175 ENDPROC(__aes_arm_encrypt)
176
177 ENTRY(__aes_arm_decrypt)
> 178 do_crypt iround, crypto_it_tab, crypto_il_tab
179 ENDPROC(__aes_arm_decrypt)
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 60405 bytes --]
^ permalink raw reply
* Re: x86-64: Maintain 16-byte stack alignment
From: Andy Lutomirski @ 2017-01-13 1:46 UTC (permalink / raw)
To: Josh Poimboeuf
Cc: Linus Torvalds, Herbert Xu, Linux Kernel Mailing List,
Linux Crypto Mailing List, Ingo Molnar, Thomas Gleixner,
Andy Lutomirski, Ard Biesheuvel
In-Reply-To: <20170112201511.yj5ekqmj76r2yv6t@treble>
On Thu, Jan 12, 2017 at 12:15 PM, Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> On Thu, Jan 12, 2017 at 12:08:07PM -0800, Andy Lutomirski wrote:
>> On Thu, Jan 12, 2017 at 11:51 AM, Linus Torvalds
>> <torvalds@linux-foundation.org> wrote:
>> > On Thu, Jan 12, 2017 at 6:02 AM, Josh Poimboeuf <jpoimboe@redhat.com> wrote:
>> >>
>> >> Just to clarify, I think you're asking if, for versions of gcc which
>> >> don't support -mpreferred-stack-boundary=3, objtool can analyze all C
>> >> functions to ensure their stacks are 16-byte aligned.
>> >>
>> >> It's certainly possible, but I don't see how that solves the problem.
>> >> The stack will still be misaligned by entry code. Or am I missing
>> >> something?
>> >
>> > I think the argument is that we *could* try to align things, if we
>> > just had some tool that actually then verified that we aren't missing
>> > anything.
>> >
>> > I'm not entirely happy with checking the generated code, though,
>> > because as Ingo says, you have a 50:50 chance of just getting it right
>> > by mistake. So I'd much rather have some static tool that checks
>> > things at a code level (ie coccinelle or sparse).
>>
>> What I meant was checking the entry code to see if it aligns stack
>> frames, and good luck getting sparse to do that. Hmm, getting 16-byte
>> alignment for real may actually be entirely a lost cause. After all,
>> I think we have some inline functions that do asm volatile ("call
>> ..."), and I don't see any credible way of forcing alignment short of
>> generating an entirely new stack frame and aligning that.
>
> Actually we already found all such cases and fixed them by forcing a new
> stack frame, thanks to objtool. For example, see 55a76b59b5fe.
What I mean is: what guarantees that the stack is properly aligned for
the subroutine call? gcc promises to set up a stack frame, but does
it promise that rsp will be properly aligned to call a C function?
^ permalink raw reply
* RE: [PATCH v8 1/1] crypto: add virtio-crypto driver
From: Gonglei (Arei) @ 2017-01-13 1:56 UTC (permalink / raw)
To: Michael S. Tsirkin, Christian Borntraeger
Cc: linux-kernel@vger.kernel.org, qemu-devel@nongnu.org,
virtio-dev@lists.oasis-open.org,
virtualization@lists.linux-foundation.org,
linux-crypto@vger.kernel.org, davem@davemloft.net,
herbert@gondor.apana.org.au, Huangweidong (C), Claudio Fontana,
Luonengjun, Hanweidong (Randy), Xuquan (Quan Xu),
Wanzongshun (Vincent), stefanha@redhat.com,
"Zhoujian (jay, Euler)" <
In-Reply-To: <20170112161729-mutt-send-email-mst@kernel.org>
>
> On Thu, Jan 12, 2017 at 03:10:25PM +0100, Christian Borntraeger wrote:
> > On 01/10/2017 01:56 PM, Christian Borntraeger wrote:
> > > On 01/10/2017 01:36 PM, Gonglei (Arei) wrote:
> > >> Hi,
> > >>
> > >>>
> > >>> On 12/15/2016 03:03 AM, Gonglei wrote:
> > >>> [...]
> > >>>> +
> > >>>> +static struct crypto_alg virtio_crypto_algs[] = { {
> > >>>> + .cra_name = "cbc(aes)",
> > >>>> + .cra_driver_name = "virtio_crypto_aes_cbc",
> > >>>> + .cra_priority = 501,
> > >>>
> > >>>
> > >>> This is still higher than the hardware-accelerators (like intel aesni or the
> > >>> s390 cpacf functions or the arm hw). aesni and s390/cpacf are supported
> by the
> > >>> hardware virtualization and available to the guests. I do not see a way
> how
> > >>> virtio
> > >>> crypto can be faster than that (in the end it might be cpacf/aesni +
> overhead)
> > >>> instead it will very likely be slower.
> > >>> So we should use a number that is higher than software implementations
> but
> > >>> lower than the hw ones.
> > >>>
> > >>> Just grepping around, the software ones seem be be around 100 and the
> > >>> hardware
> > >>> ones around 200-400. So why was 150 not enough?
> > >>>
> > >> I didn't find a documentation about how we use the priority, and I assumed
> > >> people use virtio-crypto will configure hardware accelerators in the
> > >> host. So I choosed the number which bigger than aesni's priority.
> > >
> > > Yes, but the aesni driver will only bind if there is HW support in the guest.
> > > And if aesni is available in the guest (or the s390 aes function from cpacf)
> > > it will always be faster than the same in the host via virtio.So your priority
> > > should be smaller.
> >
> >
> > any opinion on this?
>
> Going forward, we might add an emulated aesni device and that might
> become slower than virtio. OTOH if or when this happens, we can solve it
> by adding a priority or a feature flag to virtio to raise its priority.
>
> So I think I agree with Christian here, let's lower the priority.
> Gonglei, could you send a patch like this?
>
OK, will do.
Thanks,
-Gonglei
^ permalink raw reply
* Re: x86-64: Maintain 16-byte stack alignment
From: Josh Poimboeuf @ 2017-01-13 3:11 UTC (permalink / raw)
To: Andy Lutomirski
Cc: Linus Torvalds, Herbert Xu, Linux Kernel Mailing List,
Linux Crypto Mailing List, Ingo Molnar, Thomas Gleixner,
Andy Lutomirski, Ard Biesheuvel
In-Reply-To: <CALCETrXom8aY2XhpAyOtAwQQYF7wftBHJE_px1xr0iRmcYEJoA@mail.gmail.com>
On Thu, Jan 12, 2017 at 05:46:55PM -0800, Andy Lutomirski wrote:
> On Thu, Jan 12, 2017 at 12:15 PM, Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> > On Thu, Jan 12, 2017 at 12:08:07PM -0800, Andy Lutomirski wrote:
> >> On Thu, Jan 12, 2017 at 11:51 AM, Linus Torvalds
> >> <torvalds@linux-foundation.org> wrote:
> >> > On Thu, Jan 12, 2017 at 6:02 AM, Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> >> >>
> >> >> Just to clarify, I think you're asking if, for versions of gcc which
> >> >> don't support -mpreferred-stack-boundary=3, objtool can analyze all C
> >> >> functions to ensure their stacks are 16-byte aligned.
> >> >>
> >> >> It's certainly possible, but I don't see how that solves the problem.
> >> >> The stack will still be misaligned by entry code. Or am I missing
> >> >> something?
> >> >
> >> > I think the argument is that we *could* try to align things, if we
> >> > just had some tool that actually then verified that we aren't missing
> >> > anything.
> >> >
> >> > I'm not entirely happy with checking the generated code, though,
> >> > because as Ingo says, you have a 50:50 chance of just getting it right
> >> > by mistake. So I'd much rather have some static tool that checks
> >> > things at a code level (ie coccinelle or sparse).
> >>
> >> What I meant was checking the entry code to see if it aligns stack
> >> frames, and good luck getting sparse to do that. Hmm, getting 16-byte
> >> alignment for real may actually be entirely a lost cause. After all,
> >> I think we have some inline functions that do asm volatile ("call
> >> ..."), and I don't see any credible way of forcing alignment short of
> >> generating an entirely new stack frame and aligning that.
> >
> > Actually we already found all such cases and fixed them by forcing a new
> > stack frame, thanks to objtool. For example, see 55a76b59b5fe.
>
> What I mean is: what guarantees that the stack is properly aligned for
> the subroutine call? gcc promises to set up a stack frame, but does
> it promise that rsp will be properly aligned to call a C function?
Yes, I did an experiment and you're right. I had naively assumed that
all stack frames would be aligned.
--
Josh
^ permalink raw reply
* Re: [RFC PATCH 5/6] crypto: aesni-intel - Add bulk request support
From: Eric Biggers @ 2017-01-13 3:19 UTC (permalink / raw)
To: Ondrej Mosnacek
Cc: Herbert Xu, linux-crypto, dm-devel, Mike Snitzer, Milan Broz,
Mikulas Patocka, Binoy Jayan
In-Reply-To: <c32a28630157c619ac2a7c851be586e72f193c68.1484215956.git.omosnacek@gmail.com>
On Thu, Jan 12, 2017 at 01:59:57PM +0100, Ondrej Mosnacek wrote:
> This patch implements bulk request handling in the AES-NI crypto drivers.
> The major advantage of this is that with bulk requests, the kernel_fpu_*
> functions (which are usually quite slow) are now called only once for the whole
> request.
>
Hi Ondrej,
To what extent does the performance benefit of this patchset result from just
the reduced numbers of calls to kernel_fpu_begin() and kernel_fpu_end()?
If it's most of the benefit, would it make any sense to optimize
kernel_fpu_begin() and kernel_fpu_end() instead?
And if there are other examples besides kernel_fpu_begin/kernel_fpu_end where
the bulk API would provide a significant performance boost, can you mention
them?
Interestingly, the arm64 equivalent to kernel_fpu_begin()
(kernel_neon_begin_partial() in arch/arm64/kernel/fpsimd.c) appears to have an
optimization where the SIMD registers aren't saved if they were already saved.
I wonder why something similar isn't done on x86.
Eric
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox