* [PATCH 1/2] dt-bindings: zx296718-clk: add compatible for audio clock controller
From: Rob Herring @ 2016-12-12 17:10 UTC
To: linux-arm-kernel
In-Reply-To: <1481189157-8995-1-git-send-email-shawnguo@kernel.org>
On Thu, Dec 08, 2016 at 05:25:56PM +0800, Shawn Guo wrote:
> From: Shawn Guo <shawn.guo@linaro.org>
>
> Add the compatible string for the zx296718 audio clock controller.
>
> Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
> ---
> Documentation/devicetree/bindings/clock/zx296718-clk.txt | 3 +++
> 1 file changed, 3 insertions(+)
Acked-by: Rob Herring <robh@kernel.org>
* [PATCH] dt-bindings: Document the hi3660 reset bindings
From: Rob Herring @ 2016-12-12 17:20 UTC
To: linux-arm-kernel
In-Reply-To: <1481249504-7942-1-git-send-email-zhangfei.gao@linaro.org>
On Fri, Dec 09, 2016 at 10:11:44AM +0800, Zhangfei Gao wrote:
> Add DT bindings documentation for hi3660 SoC reset controller.
>
> Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
> ---
> .../bindings/reset/hisilicon,hi3660-reset.txt | 43 ++++++++++++++++++++++
> 1 file changed, 43 insertions(+)
> create mode 100644 Documentation/devicetree/bindings/reset/hisilicon,hi3660-reset.txt
Acked-by: Rob Herring <robh@kernel.org>
* [PATCH] clk: bcm: Fix 'maybe-uninitialized' warning in bcm2835_clock_choose_div_and_prate()
From: Eric Anholt @ 2016-12-12 17:24 UTC
To: linux-arm-kernel
In-Reply-To: <1481529653-28133-1-git-send-email-boris.brezillon@free-electrons.com>
Boris Brezillon <boris.brezillon@free-electrons.com> writes:
> best_rate is reported as potentially uninitialized by gcc.
>
> Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
> Fixes: 155e8b3b0ee3 ("clk: bcm: Support rate change propagation on bcm2835 clocks")
> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Eric Anholt <eric@anholt.net>
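For readers unfamiliar with this class of warning: gcc emits
'maybe-uninitialized' when it cannot prove that a variable is assigned on
every path before it is read, which is common for a "best so far" value
that is only set inside a loop. The sketch below is a hypothetical
stand-alone helper (pick_best_rate() and its parameters are illustrative
assumptions, not the bcm2835 driver code) showing the usual remedy of
giving the variable an explicit initial value.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not taken from the driver: return the candidate
 * rate closest to 'target', or 0 when there are no candidates.
 */
static unsigned long pick_best_rate(const unsigned long *rates, size_t n,
				    unsigned long target)
{
	/*
	 * Explicit initial value. Without it, the n == 0 path would
	 * return an indeterminate value, and gcc may warn even when the
	 * caller guarantees n > 0, because it cannot see that guarantee.
	 */
	unsigned long best_rate = 0;
	unsigned long best_diff = ~0UL;

	for (size_t i = 0; i < n; i++) {
		unsigned long diff = rates[i] > target ? rates[i] - target
						       : target - rates[i];

		if (diff < best_diff) {
			best_diff = diff;
			best_rate = rates[i];
		}
	}

	return best_rate;
}
```

Initializing at the declaration keeps the fix local and does not change
behaviour on any path that already assigned the variable.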
* [PATCH 2/4] dt-bindings: mfd: Remove TPS65217 interrupts
From: Rob Herring @ 2016-12-12 17:25 UTC
To: linux-arm-kernel
In-Reply-To: <20161209062833.5768-3-woogyom.kim@gmail.com>
On Fri, Dec 09, 2016 at 03:28:31PM +0900, Milo Kim wrote:
> Interrupt numbers are from the datasheet, so no need to keep them in
> the ABI. Use the number in the DT file.
I don't see the purpose of ripping this out. The headers have always
been for convenience, regardless of whether the values come from the
datasheet or not.
> Signed-off-by: Milo Kim <woogyom.kim@gmail.com>
> ---
> arch/arm/boot/dts/am335x-bone-common.dtsi | 8 +++-----
> include/dt-bindings/mfd/tps65217.h | 26 --------------------------
> 2 files changed, 3 insertions(+), 31 deletions(-)
> delete mode 100644 include/dt-bindings/mfd/tps65217.h
* [PATCH 3/4] dt-bindings: power/supply: Update TPS65217 properties
From: Rob Herring @ 2016-12-12 17:26 UTC
To: linux-arm-kernel
In-Reply-To: <20161209062833.5768-4-woogyom.kim@gmail.com>
On Fri, Dec 09, 2016 at 03:28:32PM +0900, Milo Kim wrote:
> Add interrupt specifiers for USB and AC charger input. Interrupt numbers
> are from the datasheet.
> Fix wrong property for compatible string.
>
> Signed-off-by: Milo Kim <woogyom.kim@gmail.com>
> ---
> .../devicetree/bindings/power/supply/tps65217_charger.txt | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
Acked-by: Rob Herring <robh@kernel.org>
* [PATCH 4/4] dt-bindings: input: Specify the interrupt number of TPS65217 power button
From: Rob Herring @ 2016-12-12 17:27 UTC
To: linux-arm-kernel
In-Reply-To: <20161209062833.5768-5-woogyom.kim@gmail.com>
On Fri, Dec 09, 2016 at 03:28:33PM +0900, Milo Kim wrote:
> Specify the power button interrupt number which is from the datasheet.
>
> Signed-off-by: Milo Kim <woogyom.kim@gmail.com>
> ---
> Documentation/devicetree/bindings/input/tps65218-pwrbutton.txt | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
Acked-by: Rob Herring <robh@kernel.org>
* [PATCH] ARM: dts: vexpress: Support GICC_DIR operations
From: Marc Zyngier @ 2016-12-12 17:35 UTC
To: linux-arm-kernel
In-Reply-To: <20161210201351.25894-1-christoffer.dall@linaro.org>
[+Sudeep]
On 10/12/16 20:13, Christoffer Dall wrote:
> The GICv2 CPU interface registers span across 8K, not 4K as indicated in
> the DT. Only the GICC_DIR register is located after the initial 4K
> boundary, leaving a functional system but without support for separately
> EOI'ing and deactivating interrupts.
>
> After this change the system supports split priority drop and interrupt
> deactivation.
>
> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
> ---
> arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts b/arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts
> index 0205c97..2e0cf39 100644
> --- a/arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts
> +++ b/arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts
> @@ -126,7 +126,7 @@
> #address-cells = <0>;
> interrupt-controller;
> reg = <0 0x2c001000 0 0x1000>,
> - <0 0x2c002000 0 0x1000>,
> + <0 0x2c002000 0 0x2000>,
> <0 0x2c004000 0 0x2000>,
> <0 0x2c006000 0 0x2000>;
> interrupts = <1 9 0xf04>;
>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
M.
--
Jazz is not dead. It just smells funny...
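The arithmetic behind the one-line DT change above can be checked by
hand: in the GICv2 architecture, GICC_DIR sits at offset 0x1000 from the
CPU interface base, so a 0x1000-byte reg window ends just before it. The
sketch below illustrates that check; reg_covers() is a hypothetical
helper introduced here for illustration, not a kernel API.

```c
#include <stdbool.h>
#include <stdint.h>

/* GICv2: GICC_DIR is a 32-bit register at CPU interface base + 0x1000. */
#define GICC_DIR_OFFSET	0x1000u

/*
 * Hypothetical helper: does a 'width'-byte register at 'offset' fit
 * inside a reg window of 'size' bytes? Written to avoid overflow in
 * offset + width.
 */
static bool reg_covers(uint64_t size, uint64_t offset, uint64_t width)
{
	return offset <= size && width <= size - offset;
}
```

With the old size, reg_covers(0x1000, GICC_DIR_OFFSET, 4) is false; with
the new size, reg_covers(0x2000, GICC_DIR_OFFSET, 4) is true.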
* [PATCH] crypto: arm64/aes: reimplement bit-sliced ARM/NEON implementation for arm64
From: Ard Biesheuvel @ 2016-12-12 17:45 UTC
To: linux-arm-kernel
This is a reimplementation of the NEON version of the bit-sliced AES
algorithm. This code is heavily based on Andy Polyakov's OpenSSL version
for ARM, which is also available in the kernel. This is an alternative for
the existing NEON implementation for arm64 authored by me, which suffers
from poor performance due to its reliance on the pathologically slow four
register variant of the tbl/tbx NEON instruction.
This version is about 30% (*) faster than the generic C code, but only in
cases where the input can be 8x interleaved (this is a fundamental property
of bit slicing). For this reason, only the chaining modes ECB, XTS and CTR
are implemented. (The significance of ECB is that it could potentially be
used by other chaining modes.)
(*) Measured on Cortex-A57. Note that this is still an order of magnitude
slower than the implementations that use the dedicated AES instructions
introduced in ARMv8, but those are part of an optional extension, and so
it is good to have a fallback.
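To illustrate why bit slicing wants eight blocks at a time: the state is
rearranged so that each register holds one bit position of every byte,
which takes three rounds of masked bit swaps across eight registers. The
C sketch below models this on single 64-bit words rather than 128-bit
NEON registers. swapmove() follows the same operation sequence as the
swapmove_2x macro in the patch, and bitslice8() follows the pairing
order of the bitslice macro, indexed by the macro's own parameter names
x0..x7. Both function names are illustrative assumptions, not part of
the patch.

```c
#include <stdint.h>

/*
 * Swap the bits of *a selected by 'mask' with the bits of *b selected
 * by 'mask << n'.
 */
static void swapmove(uint64_t *a, uint64_t *b, unsigned int n, uint64_t mask)
{
	uint64_t t = ((*b >> n) ^ *a) & mask;

	*a ^= t;
	*b ^= t << n;
}

/*
 * Three swap stages (shift 1, 2, 4) over eight words. Each stage pairs
 * words whose indices differ in one bit, so the whole permutation is
 * its own inverse.
 */
static void bitslice8(uint64_t x[8])
{
	const uint64_t m1 = 0x5555555555555555ULL;
	const uint64_t m2 = 0x3333333333333333ULL;
	const uint64_t m4 = 0x0f0f0f0f0f0f0f0fULL;

	for (int i = 0; i < 8; i += 2)
		swapmove(&x[i], &x[i + 1], 1, m1);

	for (int i = 0; i < 8; i += 4) {
		swapmove(&x[i], &x[i + 2], 2, m2);
		swapmove(&x[i + 1], &x[i + 3], 2, m2);
	}

	for (int i = 0; i < 4; i++)
		swapmove(&x[i], &x[i + 4], 4, m4);
}
```

Because the permutation is an involution, applying bitslice8() twice
restores the original words, which is consistent with the assembly
reusing the same bitslice macro on entry and on exit.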
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm64/crypto/Kconfig | 6 +
arch/arm64/crypto/Makefile | 3 +
arch/arm64/crypto/aes-neonbs-core.S | 905 ++++++++++++++++++++++++++++++++++++
arch/arm64/crypto/aes-neonbs-glue.c | 300 ++++++++++++
4 files changed, 1214 insertions(+)
create mode 100644 arch/arm64/crypto/aes-neonbs-core.S
create mode 100644 arch/arm64/crypto/aes-neonbs-glue.c
diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig
index 450a85df041a..cd0e7a6146b7 100644
--- a/arch/arm64/crypto/Kconfig
+++ b/arch/arm64/crypto/Kconfig
@@ -72,4 +72,10 @@ config CRYPTO_CRC32_ARM64
depends on ARM64
select CRYPTO_HASH
+config CRYPTO_AES_NEON_BS
+ tristate "AES in ECB/CBC/CTR/XTS modes using bit-sliced NEON algorithm"
+ depends on KERNEL_MODE_NEON
+ select CRYPTO_BLKCIPHER
+ select CRYPTO_AES
+
endif
diff --git a/arch/arm64/crypto/Makefile b/arch/arm64/crypto/Makefile
index aa8888d7b744..11d20714ec48 100644
--- a/arch/arm64/crypto/Makefile
+++ b/arch/arm64/crypto/Makefile
@@ -41,6 +41,9 @@ sha256-arm64-y := sha256-glue.o sha256-core.o
obj-$(CONFIG_CRYPTO_SHA512_ARM64) += sha512-arm64.o
sha512-arm64-y := sha512-glue.o sha512-core.o
+obj-$(CONFIG_CRYPTO_AES_NEON_BS) += aes-neon-bs.o
+aes-neon-bs-y := aes-neonbs-core.o aes-neonbs-glue.o
+
AFLAGS_aes-ce.o := -DINTERLEAVE=4
AFLAGS_aes-neon.o := -DINTERLEAVE=4
diff --git a/arch/arm64/crypto/aes-neonbs-core.S b/arch/arm64/crypto/aes-neonbs-core.S
new file mode 100644
index 000000000000..d027c276cc75
--- /dev/null
+++ b/arch/arm64/crypto/aes-neonbs-core.S
@@ -0,0 +1,905 @@
+/*
+ * Bit sliced AES using NEON instructions
+ *
+ * Copyright (C) 2016 Linaro Ltd <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+/*
+ * The algorithm implemented here is described in detail by the paper
+ * 'Faster and Timing-Attack Resistant AES-GCM' by Emilia Kaesper and
+ * Peter Schwabe (https://eprint.iacr.org/2009/129.pdf)
+ *
+ * This implementation is based primarily on the OpenSSL implementation
+ * for 32-bit ARM written by Andy Polyakov <appro@openssl.org>
+ */
+
+#include <linux/linkage.h>
+#include <asm/assembler.h>
+
+ .text
+
+ rounds .req x11
+ bskey .req x12
+
+ .macro in_bs_ch, b0, b1, b2, b3, b4, b5, b6, b7
+ eor \b2, \b2, \b1
+ eor \b5, \b5, \b6
+ eor \b3, \b3, \b0
+ eor \b6, \b6, \b2
+ eor \b5, \b5, \b0
+ eor \b6, \b6, \b3
+ eor \b3, \b3, \b7
+ eor \b7, \b7, \b5
+ eor \b3, \b3, \b4
+ eor \b4, \b4, \b5
+ eor \b2, \b2, \b7
+ eor \b3, \b3, \b1
+ eor \b1, \b1, \b5
+ .endm
+
+ .macro out_bs_ch, b0, b1, b2, b3, b4, b5, b6, b7
+ eor \b0, \b0, \b6
+ eor \b1, \b1, \b4
+ eor \b4, \b4, \b6
+ eor \b2, \b2, \b0
+ eor \b6, \b6, \b1
+ eor \b1, \b1, \b5
+ eor \b5, \b5, \b3
+ eor \b3, \b3, \b7
+ eor \b7, \b7, \b5
+ eor \b2, \b2, \b5
+ eor \b4, \b4, \b7
+ .endm
+
+ .macro inv_in_bs_ch, b6, b1, b2, b4, b7, b0, b3, b5
+ eor \b1, \b1, \b7
+ eor \b4, \b4, \b7
+ eor \b7, \b7, \b5
+ eor \b1, \b1, \b3
+ eor \b2, \b2, \b5
+ eor \b3, \b3, \b7
+ eor \b6, \b6, \b1
+ eor \b2, \b2, \b0
+ eor \b5, \b5, \b3
+ eor \b4, \b4, \b6
+ eor \b0, \b0, \b6
+ eor \b1, \b1, \b4
+ .endm
+
+ .macro inv_out_bs_ch, b6, b5, b0, b3, b7, b1, b4, b2
+ eor \b1, \b1, \b5
+ eor \b2, \b2, \b7
+ eor \b3, \b3, \b1
+ eor \b4, \b4, \b5
+ eor \b7, \b7, \b5
+ eor \b3, \b3, \b4
+ eor \b5, \b5, \b0
+ eor \b3, \b3, \b7
+ eor \b6, \b6, \b2
+ eor \b2, \b2, \b1
+ eor \b6, \b6, \b3
+ eor \b3, \b3, \b0
+ eor \b5, \b5, \b6
+ .endm
+
+ .macro mul_gf4, x0, x1, y0, y1, t0, t1
+ eor \t0, \y0, \y1
+ and \t0, \t0, \x0
+ eor \x0, \x0, \x1
+ and \t1, \x1, \y0
+ and \x0, \x0, \y1
+ eor \x1, \t1, \t0
+ eor \x0, \x0, \t1
+ .endm
+
+ .macro mul_gf4_n, x0, x1, y0, y1, t0
+ eor \t0, \y0, \y1
+ and \t0, \t0, \x0
+ eor \x0, \x0, \x1
+ and \x1, \x1, \y0
+ and \x0, \x0, \y1
+ eor \x1, \x1, \x0
+ eor \x0, \x0, \t0
+ .endm
+
+ .macro mul_gf4_n_gf4, x0, x1, y0, y1, t0, x2, x3, y2, y3, t1
+ eor \t0, \y0, \y1
+ eor \t1, \y2, \y3
+ and \t0, \t0, \x0
+ and \t1, \t1, \x2
+ eor \x0, \x0, \x1
+ eor \x2, \x2, \x3
+ and \x1, \x1, \y0
+ and \x3, \x3, \y2
+ and \x0, \x0, \y1
+ and \x2, \x2, \y3
+ eor \x1, \x1, \x0
+ eor \x2, \x2, \x3
+ eor \x0, \x0, \t0
+ eor \x3, \x3, \t1
+ .endm
+
+ .macro mul_gf16_2, x0, x1, x2, x3, x4, x5, x6, x7, \
+ y0, y1, y2, y3, t0, t1, t2, t3
+ eor \t0, \x0, \x2
+ eor \t1, \x1, \x3
+ mul_gf4 \x0, \x1, \y0, \y1, \t2, \t3
+ eor \y0, \y0, \y2
+ eor \y1, \y1, \y3
+ mul_gf4_n_gf4 \t0, \t1, \y0, \y1, \t3, \x2, \x3, \y2, \y3, \t2
+ eor \x0, \x0, \t0
+ eor \x2, \x2, \t0
+ eor \x1, \x1, \t1
+ eor \x3, \x3, \t1
+ eor \t0, \x4, \x6
+ eor \t1, \x5, \x7
+ mul_gf4_n_gf4 \t0, \t1, \y0, \y1, \t3, \x6, \x7, \y2, \y3, \t2
+ eor \y0, \y0, \y2
+ eor \y1, \y1, \y3
+ mul_gf4 \x4, \x5, \y0, \y1, \t2, \t3
+ eor \x4, \x4, \t0
+ eor \x6, \x6, \t0
+ eor \x5, \x5, \t1
+ eor \x7, \x7, \t1
+ .endm
+
+ .macro inv_gf256, x0, x1, x2, x3, x4, x5, x6, x7, \
+ t0, t1, t2, t3, s0, s1, s2, s3
+ eor \t3, \x4, \x6
+ eor \t2, \x5, \x7
+ eor \t1, \x1, \x3
+ eor \s1, \x7, \x6
+ mov \t0, \t2
+ eor \s0, \x0, \x2
+ orr \t2, \t2, \t1
+ eor \s3, \t3, \t0
+ and \s2, \t3, \s0
+ orr \t3, \t3, \s0
+ eor \s0, \s0, \t1
+ and \t0, \t0, \t1
+ eor \t1, \x3, \x2
+ and \s3, \s3, \s0
+ and \s1, \s1, \t1
+ eor \t1, \x4, \x5
+ eor \s0, \x1, \x0
+ eor \t3, \t3, \s1
+ eor \t2, \t2, \s1
+ and \s1, \t1, \s0
+ orr \t1, \t1, \s0
+ eor \t3, \t3, \s3
+ eor \t0, \t0, \s1
+ eor \t2, \t2, \s2
+ eor \t1, \t1, \s3
+ eor \t0, \t0, \s2
+ and \s0, \x7, \x3
+ eor \t1, \t1, \s2
+ and \s1, \x6, \x2
+ and \s2, \x5, \x1
+ orr \s3, \x4, \x0
+ eor \t3, \t3, \s0
+ eor \t1, \t1, \s2
+ eor \t0, \t0, \s3
+ eor \t2, \t2, \s1
+ and \s2, \t3, \t1
+ mov \s0, \t0
+ eor \s1, \t2, \s2
+ eor \s3, \t0, \s2
+ eor \s2, \t0, \s2
+ bsl \s1, \t1, \t0
+ bsl \s3, \t3, \t2
+ eor \t3, \t3, \t2
+ bsl \s0, \s1, \s2
+ bsl \t0, \s2, \s1
+ and \s2, \s0, \s3
+ eor \t1, \t1, \t0
+ eor \s2, \s2, \t3
+ mul_gf16_2 \x0, \x1, \x2, \x3, \x4, \x5, \x6, \x7, \
+ \s3, \s2, \s1, \t1, \s0, \t0, \t2, \t3
+ .endm
+
+ .macro sbox, b0, b1, b2, b3, b4, b5, b6, b7, \
+ t0, t1, t2, t3, s0, s1, s2, s3
+ in_bs_ch \b0\().16b, \b1\().16b, \b2\().16b, \b3\().16b, \
+ \b4\().16b, \b5\().16b, \b6\().16b, \b7\().16b
+ inv_gf256 \b6\().16b, \b5\().16b, \b0\().16b, \b3\().16b, \
+ \b7\().16b, \b1\().16b, \b4\().16b, \b2\().16b, \
+ \t0\().16b, \t1\().16b, \t2\().16b, \t3\().16b, \
+ \s0\().16b, \s1\().16b, \s2\().16b, \s3\().16b
+ out_bs_ch \b7\().16b, \b1\().16b, \b4\().16b, \b2\().16b, \
+ \b6\().16b, \b5\().16b, \b0\().16b, \b3\().16b
+ .endm
+
+ .macro inv_sbox, b0, b1, b2, b3, b4, b5, b6, b7, \
+ t0, t1, t2, t3, s0, s1, s2, s3
+ inv_in_bs_ch \b0\().16b, \b1\().16b, \b2\().16b, \b3\().16b, \
+ \b4\().16b, \b5\().16b, \b6\().16b, \b7\().16b
+ inv_gf256 \b5\().16b, \b1\().16b, \b2\().16b, \b6\().16b, \
+ \b3\().16b, \b7\().16b, \b0\().16b, \b4\().16b, \
+ \t0\().16b, \t1\().16b, \t2\().16b, \t3\().16b, \
+ \s0\().16b, \s1\().16b, \s2\().16b, \s3\().16b
+ inv_out_bs_ch \b3\().16b, \b7\().16b, \b0\().16b, \b4\().16b, \
+ \b5\().16b, \b1\().16b, \b2\().16b, \b6\().16b
+ .endm
+
+ .macro enc_next_rk
+ ldp q16, q17, [bskey], #32
+ ldp q18, q19, [bskey], #32
+ ldp q20, q21, [bskey], #32
+ ldp q22, q23, [bskey], #32
+ .endm
+
+ .macro dec_next_rk
+ ldp q16, q17, [bskey, #-128]!
+ ldp q18, q19, [bskey, #32]
+ ldp q20, q21, [bskey, #64]
+ ldp q22, q23, [bskey, #96]
+ .endm
+
+ .macro add_round_key, x0, x1, x2, x3, x4, x5, x6, x7
+ eor \x0\().16b, \x0\().16b, v16.16b
+ eor \x1\().16b, \x1\().16b, v17.16b
+ eor \x2\().16b, \x2\().16b, v18.16b
+ eor \x3\().16b, \x3\().16b, v19.16b
+ eor \x4\().16b, \x4\().16b, v20.16b
+ eor \x5\().16b, \x5\().16b, v21.16b
+ eor \x6\().16b, \x6\().16b, v22.16b
+ eor \x7\().16b, \x7\().16b, v23.16b
+ .endm
+
+ .macro shift_rows, x0, x1, x2, x3, x4, x5, x6, x7, mask
+ tbl \x0\().16b, {\x0\().16b}, \mask\().16b
+ tbl \x1\().16b, {\x1\().16b}, \mask\().16b
+ tbl \x2\().16b, {\x2\().16b}, \mask\().16b
+ tbl \x3\().16b, {\x3\().16b}, \mask\().16b
+ tbl \x4\().16b, {\x4\().16b}, \mask\().16b
+ tbl \x5\().16b, {\x5\().16b}, \mask\().16b
+ tbl \x6\().16b, {\x6\().16b}, \mask\().16b
+ tbl \x7\().16b, {\x7\().16b}, \mask\().16b
+ .endm
+
+ .macro mix_cols, x0, x1, x2, x3, x4, x5, x6, x7, \
+ t0, t1, t2, t3, t4, t5, t6, t7, inv
+ ext \t0\().16b, \x0\().16b, \x0\().16b, #12
+ ext \t1\().16b, \x1\().16b, \x1\().16b, #12
+ eor \x0\().16b, \x0\().16b, \t0\().16b
+ ext \t2\().16b, \x2\().16b, \x2\().16b, #12
+ eor \x1\().16b, \x1\().16b, \t1\().16b
+ ext \t3\().16b, \x3\().16b, \x3\().16b, #12
+ eor \x2\().16b, \x2\().16b, \t2\().16b
+ ext \t4\().16b, \x4\().16b, \x4\().16b, #12
+ eor \x3\().16b, \x3\().16b, \t3\().16b
+ ext \t5\().16b, \x5\().16b, \x5\().16b, #12
+ eor \x4\().16b, \x4\().16b, \t4\().16b
+ ext \t6\().16b, \x6\().16b, \x6\().16b, #12
+ eor \x5\().16b, \x5\().16b, \t5\().16b
+ ext \t7\().16b, \x7\().16b, \x7\().16b, #12
+ eor \x6\().16b, \x6\().16b, \t6\().16b
+ eor \t1\().16b, \t1\().16b, \x0\().16b
+ eor \x7\().16b, \x7\().16b, \t7\().16b
+ ext \x0\().16b, \x0\().16b, \x0\().16b, #8
+ eor \t2\().16b, \t2\().16b, \x1\().16b
+ eor \t0\().16b, \t0\().16b, \x7\().16b
+ eor \t1\().16b, \t1\().16b, \x7\().16b
+ ext \x1\().16b, \x1\().16b, \x1\().16b, #8
+ eor \t5\().16b, \t5\().16b, \x4\().16b
+ eor \x0\().16b, \x0\().16b, \t0\().16b
+ eor \t6\().16b, \t6\().16b, \x5\().16b
+ eor \x1\().16b, \x1\().16b, \t1\().16b
+ ext \t0\().16b, \x4\().16b, \x4\().16b, #8
+ eor \t4\().16b, \t4\().16b, \x3\().16b
+ ext \t1\().16b, \x5\().16b, \x5\().16b, #8
+ eor \t7\().16b, \t7\().16b, \x6\().16b
+ ext \x4\().16b, \x3\().16b, \x3\().16b, #8
+ eor \t3\().16b, \t3\().16b, \x2\().16b
+ ext \x5\().16b, \x7\().16b, \x7\().16b, #8
+ eor \t4\().16b, \t4\().16b, \x7\().16b
+ ext \x3\().16b, \x6\().16b, \x6\().16b, #8
+ eor \t3\().16b, \t3\().16b, \x7\().16b
+ ext \x6\().16b, \x2\().16b, \x2\().16b, #8
+ eor \x7\().16b, \t1\().16b, \t5\().16b
+ .ifb \inv
+ eor \x2\().16b, \t0\().16b, \t4\().16b
+ eor \x4\().16b, \x4\().16b, \t3\().16b
+ eor \x5\().16b, \x5\().16b, \t7\().16b
+ eor \x3\().16b, \x3\().16b, \t6\().16b
+ eor \x6\().16b, \x6\().16b, \t2\().16b
+ .else
+ eor \t3\().16b, \t3\().16b, \x4\().16b
+ eor \x5\().16b, \x5\().16b, \t7\().16b
+ eor \x2\().16b, \x3\().16b, \t6\().16b
+ eor \x3\().16b, \t0\().16b, \t4\().16b
+ eor \x4\().16b, \x6\().16b, \t2\().16b
+ mov \x6\().16b, \t3\().16b
+ .endif
+ .endm
+
+ .macro inv_mix_cols, x0, x1, x2, x3, x4, x5, x6, x7, \
+ t0, t1, t2, t3, t4, t5, t6, t7
+ ext \t0\().16b, \x0\().16b, \x0\().16b, #8
+ ext \t6\().16b, \x6\().16b, \x6\().16b, #8
+ ext \t7\().16b, \x7\().16b, \x7\().16b, #8
+ eor \t0\().16b, \t0\().16b, \x0\().16b
+ ext \t1\().16b, \x1\().16b, \x1\().16b, #8
+ eor \t6\().16b, \t6\().16b, \x6\().16b
+ ext \t2\().16b, \x2\().16b, \x2\().16b, #8
+ eor \t7\().16b, \t7\().16b, \x7\().16b
+ ext \t3\().16b, \x3\().16b, \x3\().16b, #8
+ eor \t1\().16b, \t1\().16b, \x1\().16b
+ ext \t4\().16b, \x4\().16b, \x4\().16b, #8
+ eor \t2\().16b, \t2\().16b, \x2\().16b
+ ext \t5\().16b, \x5\().16b, \x5\().16b, #8
+ eor \t3\().16b, \t3\().16b, \x3\().16b
+ eor \t4\().16b, \t4\().16b, \x4\().16b
+ eor \t5\().16b, \t5\().16b, \x5\().16b
+ eor \x0\().16b, \x0\().16b, \t6\().16b
+ eor \x1\().16b, \x1\().16b, \t6\().16b
+ eor \x2\().16b, \x2\().16b, \t0\().16b
+ eor \x4\().16b, \x4\().16b, \t2\().16b
+ eor \x3\().16b, \x3\().16b, \t1\().16b
+ eor \x1\().16b, \x1\().16b, \t7\().16b
+ eor \x2\().16b, \x2\().16b, \t7\().16b
+ eor \x4\().16b, \x4\().16b, \t6\().16b
+ eor \x5\().16b, \x5\().16b, \t3\().16b
+ eor \x3\().16b, \x3\().16b, \t6\().16b
+ eor \x6\().16b, \x6\().16b, \t4\().16b
+ eor \x4\().16b, \x4\().16b, \t7\().16b
+ eor \x5\().16b, \x5\().16b, \t7\().16b
+ eor \x7\().16b, \x7\().16b, \t5\().16b
+ mix_cols \x0, \x1, \x2, \x3, \x4, \x5, \x6, \x7, \
+ \t0, \t1, \t2, \t3, \t4, \t5, \t6, \t7, 1
+ .endm
+
+ .macro swapmove_2x, a0, b0, a1, b1, n, mask, t0, t1
+ ushr \t0\().2d, \b0\().2d, #\n
+ ushr \t1\().2d, \b1\().2d, #\n
+ eor \t0\().16b, \t0\().16b, \a0\().16b
+ eor \t1\().16b, \t1\().16b, \a1\().16b
+ and \t0\().16b, \t0\().16b, \mask\().16b
+ and \t1\().16b, \t1\().16b, \mask\().16b
+ eor \a0\().16b, \a0\().16b, \t0\().16b
+ shl \t0\().2d, \t0\().2d, #\n
+ eor \a1\().16b, \a1\().16b, \t1\().16b
+ shl \t1\().2d, \t1\().2d, #\n
+ eor \b0\().16b, \b0\().16b, \t0\().16b
+ eor \b1\().16b, \b1\().16b, \t1\().16b
+ .endm
+
+ .macro bitslice, x7, x6, x5, x4, x3, x2, x1, x0, t0, t1, t2, t3
+ movi \t0\().16b, #0x55
+ movi \t1\().16b, #0x33
+ swapmove_2x \x0, \x1, \x2, \x3, 1, \t0, \t2, \t3
+ swapmove_2x \x4, \x5, \x6, \x7, 1, \t0, \t2, \t3
+ movi \t0\().16b, #0x0f
+ swapmove_2x \x0, \x2, \x1, \x3, 2, \t1, \t2, \t3
+ swapmove_2x \x4, \x6, \x5, \x7, 2, \t1, \t2, \t3
+ swapmove_2x \x0, \x4, \x1, \x5, 4, \t0, \t2, \t3
+ swapmove_2x \x2, \x6, \x3, \x7, 4, \t0, \t2, \t3
+ .endm
+
+
+ .align 6
+M0: .octa 0x0004080c0105090d02060a0e03070b0f
+
+M0SR: .octa 0x0004080c05090d010a0e02060f03070b
+SR: .octa 0x0f0e0d0c0a09080b0504070600030201
+SRM0: .octa 0x01060b0c0207080d0304090e00050a0f
+
+M0ISR: .octa 0x0004080c0d0105090a0e0206070b0f03
+ISR: .octa 0x0f0e0d0c080b0a090504070602010003
+ISRM0: .octa 0x0306090c00070a0d01040b0e0205080f
+
+ /*
+ * void aesbs_convert_key(u8 out[], u32 const rk[], int rounds)
+ */
+ENTRY(aesbs_convert_key)
+ ld1 {v7.4s}, [x1], #16 // load round 0 key
+ ld1 {v17.4s}, [x1], #16 // load round 1 key
+
+ movi v8.16b, #0x01 // bit masks
+ movi v9.16b, #0x02
+ movi v10.16b, #0x04
+ movi v11.16b, #0x08
+ movi v12.16b, #0x10
+ movi v13.16b, #0x20
+ movi v14.16b, #0x40
+ movi v15.16b, #0x80
+ ldr q16, M0
+
+ sub x2, x2, #1
+ str q7, [x0], #16 // save round 0 key
+
+.Lkey_loop:
+ tbl v7.16b, {v17.16b}, v16.16b
+ ld1 {v17.4s}, [x1], #16 // load next round key
+
+ cmtst v0.16b, v7.16b, v8.16b
+ cmtst v1.16b, v7.16b, v9.16b
+ cmtst v2.16b, v7.16b, v10.16b
+ cmtst v3.16b, v7.16b, v11.16b
+ cmtst v4.16b, v7.16b, v12.16b
+ cmtst v5.16b, v7.16b, v13.16b
+ cmtst v6.16b, v7.16b, v14.16b
+ cmtst v7.16b, v7.16b, v15.16b
+ not v0.16b, v0.16b
+ not v1.16b, v1.16b
+ not v5.16b, v5.16b
+ not v6.16b, v6.16b
+
+ subs x2, x2, #1
+ stp q2, q3, [x0, #32]
+ stp q4, q5, [x0, #64]
+ stp q6, q7, [x0, #96]
+ stp q0, q1, [x0], #128
+ b.ne .Lkey_loop
+
+ movi v7.16b, #0x63 // compose .L63
+ eor v17.16b, v17.16b, v7.16b
+ str q17, [x0]
+ ret
+ENDPROC(aesbs_convert_key)
+
+ .align 4
+aesbs_encrypt8:
+ ldr q9, [bskey], #16 // round 0 key
+ ldr q8, M0SR
+ ldr q24, SR
+
+ eor v10.16b, v0.16b, v9.16b // xor with round0 key
+ eor v11.16b, v1.16b, v9.16b
+ tbl v0.16b, {v10.16b}, v8.16b
+ eor v12.16b, v2.16b, v9.16b
+ tbl v1.16b, {v11.16b}, v8.16b
+ eor v13.16b, v3.16b, v9.16b
+ tbl v2.16b, {v12.16b}, v8.16b
+ eor v14.16b, v4.16b, v9.16b
+ tbl v3.16b, {v13.16b}, v8.16b
+ eor v15.16b, v5.16b, v9.16b
+ tbl v4.16b, {v14.16b}, v8.16b
+ eor v10.16b, v6.16b, v9.16b
+ tbl v5.16b, {v15.16b}, v8.16b
+ eor v11.16b, v7.16b, v9.16b
+ tbl v6.16b, {v10.16b}, v8.16b
+ tbl v7.16b, {v11.16b}, v8.16b
+
+ bitslice v0, v1, v2, v3, v4, v5, v6, v7, v8, v9, v10, v11
+
+ sub rounds, rounds, #1
+ b .Lenc_sbox
+
+.Lenc_loop:
+ shift_rows v0, v1, v2, v3, v4, v5, v6, v7, v24
+.Lenc_sbox:
+ sbox v0, v1, v2, v3, v4, v5, v6, v7, v8, v9, v10, v11, v12, \
+ v13, v14, v15
+ subs rounds, rounds, #1
+ b.cc .Lenc_done
+
+ enc_next_rk
+
+ mix_cols v0, v1, v4, v6, v3, v7, v2, v5, v8, v9, v10, v11, v12, \
+ v13, v14, v15
+
+ add_round_key v0, v1, v2, v3, v4, v5, v6, v7
+
+ b.ne .Lenc_loop
+ ldr q24, SRM0
+ b .Lenc_loop
+
+.Lenc_done:
+ ldr q12, [bskey] // last round key
+
+ bitslice v0, v1, v4, v6, v3, v7, v2, v5, v8, v9, v10, v11
+
+ eor v0.16b, v0.16b, v12.16b
+ eor v1.16b, v1.16b, v12.16b
+ eor v4.16b, v4.16b, v12.16b
+ eor v6.16b, v6.16b, v12.16b
+ eor v3.16b, v3.16b, v12.16b
+ eor v7.16b, v7.16b, v12.16b
+ eor v2.16b, v2.16b, v12.16b
+ eor v5.16b, v5.16b, v12.16b
+ ret
+ENDPROC(aesbs_encrypt8)
+
+ .align 4
+aesbs_decrypt8:
+ lsl x9, rounds, #7
+ add bskey, bskey, x9
+
+ ldr q9, [bskey, #-112]! // round 0 key
+ ldr q8, M0ISR
+ ldr q24, ISR
+
+ eor v10.16b, v0.16b, v9.16b // xor with round0 key
+ eor v11.16b, v1.16b, v9.16b
+ tbl v0.16b, {v10.16b}, v8.16b
+ eor v12.16b, v2.16b, v9.16b
+ tbl v1.16b, {v11.16b}, v8.16b
+ eor v13.16b, v3.16b, v9.16b
+ tbl v2.16b, {v12.16b}, v8.16b
+ eor v14.16b, v4.16b, v9.16b
+ tbl v3.16b, {v13.16b}, v8.16b
+ eor v15.16b, v5.16b, v9.16b
+ tbl v4.16b, {v14.16b}, v8.16b
+ eor v10.16b, v6.16b, v9.16b
+ tbl v5.16b, {v15.16b}, v8.16b
+ eor v11.16b, v7.16b, v9.16b
+ tbl v6.16b, {v10.16b}, v8.16b
+ tbl v7.16b, {v11.16b}, v8.16b
+
+ bitslice v0, v1, v2, v3, v4, v5, v6, v7, v8, v9, v10, v11
+
+ sub rounds, rounds, #1
+ b .Ldec_sbox
+
+.Ldec_loop:
+ shift_rows v0, v1, v2, v3, v4, v5, v6, v7, v24
+.Ldec_sbox:
+ inv_sbox v0, v1, v2, v3, v4, v5, v6, v7, v8, v9, v10, v11, v12, \
+ v13, v14, v15
+ subs rounds, rounds, #1
+ b.cc .Ldec_done
+
+ dec_next_rk
+
+ add_round_key v0, v1, v6, v4, v2, v7, v3, v5
+
+ inv_mix_cols v0, v1, v6, v4, v2, v7, v3, v5, v8, v9, v10, v11, v12, \
+ v13, v14, v15
+
+ b.ne .Ldec_loop
+ ldr q24, ISRM0
+ b .Ldec_loop
+.Ldec_done:
+ ldr q12, [bskey, #-16] // last round key
+
+ bitslice v0, v1, v6, v4, v2, v7, v3, v5, v8, v9, v10, v11
+
+ eor v0.16b, v0.16b, v12.16b
+ eor v1.16b, v1.16b, v12.16b
+ eor v6.16b, v6.16b, v12.16b
+ eor v4.16b, v4.16b, v12.16b
+ eor v2.16b, v2.16b, v12.16b
+ eor v7.16b, v7.16b, v12.16b
+ eor v3.16b, v3.16b, v12.16b
+ eor v5.16b, v5.16b, v12.16b
+ ret
+ENDPROC(aesbs_decrypt8)
+
+ /*
+ * aesbs_ecb_encrypt(u8 out[], u8 const in[], u8 const rk[], int rounds,
+ * int blocks)
+ * aesbs_ecb_decrypt(u8 out[], u8 const in[], u8 const rk[], int rounds,
+ * int blocks)
+ */
+ .macro __ecb_crypt, do8, o0, o1, o2, o3, o4, o5, o6, o7
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+
+99: mov x5, #1
+ lsl x5, x5, x4
+ subs w4, w4, #8
+ csel x4, x4, xzr, pl
+ csel x5, x5, xzr, mi
+
+ ld1 {v0.16b}, [x1], #16
+ tbnz x5, #1, 0f
+ ld1 {v1.16b}, [x1], #16
+ tbnz x5, #2, 0f
+ ld1 {v2.16b}, [x1], #16
+ tbnz x5, #3, 0f
+ ld1 {v3.16b}, [x1], #16
+ tbnz x5, #4, 0f
+ ld1 {v4.16b}, [x1], #16
+ tbnz x5, #5, 0f
+ ld1 {v5.16b}, [x1], #16
+ tbnz x5, #6, 0f
+ ld1 {v6.16b}, [x1], #16
+ tbnz x5, #7, 0f
+ ld1 {v7.16b}, [x1], #16
+
+0: mov bskey, x2
+ mov rounds, x3
+ bl \do8
+
+ st1 {\o0\().16b}, [x0], #16
+ tbnz x5, #1, 1f
+ st1 {\o1\().16b}, [x0], #16
+ tbnz x5, #2, 1f
+ st1 {\o2\().16b}, [x0], #16
+ tbnz x5, #3, 1f
+ st1 {\o3\().16b}, [x0], #16
+ tbnz x5, #4, 1f
+ st1 {\o4\().16b}, [x0], #16
+ tbnz x5, #5, 1f
+ st1 {\o5\().16b}, [x0], #16
+ tbnz x5, #6, 1f
+ st1 {\o6\().16b}, [x0], #16
+ tbnz x5, #7, 1f
+ st1 {\o7\().16b}, [x0], #16
+
+ cbnz x4, 99b
+
+1: ldp x29, x30, [sp], #16
+ ret
+ .endm
+
+ .align 4
+ENTRY(aesbs_ecb_encrypt)
+ __ecb_crypt aesbs_encrypt8, v0, v1, v4, v6, v3, v7, v2, v5
+ENDPROC(aesbs_ecb_encrypt)
+
+ .align 4
+ENTRY(aesbs_ecb_decrypt)
+ __ecb_crypt aesbs_decrypt8, v0, v1, v6, v4, v2, v7, v3, v5
+ENDPROC(aesbs_ecb_decrypt)
+
+ .macro next_tweak, out, in, const, tmp
+ sshr \tmp\().2d, \in\().2d, #63
+ and \tmp\().16b, \tmp\().16b, \const\().16b
+ add \out\().2d, \in\().2d, \in\().2d
+ ext \tmp\().16b, \tmp\().16b, \tmp\().16b, #8
+ eor \out\().16b, \out\().16b, \tmp\().16b
+ .endm
+
+ .align 4
+.Lxts_mul_x:
+CPU_LE( .quad 1, 0x87 )
+CPU_BE( .quad 0x87, 1 )
+
+ /*
+ * aesbs_xts_encrypt(u8 out[], u8 const in[], u8 const rk[], int rounds,
+ * int blocks, u8 iv[])
+ * aesbs_xts_decrypt(u8 out[], u8 const in[], u8 const rk[], int rounds,
+ * int blocks, u8 iv[])
+ */
+__xts_crypt8:
+ mov x6, #1
+ lsl x6, x6, x4
+ subs w4, w4, #8
+ csel x4, x4, xzr, pl
+ csel x6, x6, xzr, mi
+
+ ld1 {v0.16b}, [x1], #16
+ next_tweak v26, v25, v30, v31
+ eor v0.16b, v0.16b, v25.16b
+ tbnz x6, #1, 0f
+
+ ld1 {v1.16b}, [x1], #16
+ next_tweak v27, v26, v30, v31
+ eor v1.16b, v1.16b, v26.16b
+ tbnz x6, #2, 0f
+
+ ld1 {v2.16b}, [x1], #16
+ next_tweak v28, v27, v30, v31
+ eor v2.16b, v2.16b, v27.16b
+ tbnz x6, #3, 0f
+
+ ld1 {v3.16b}, [x1], #16
+ next_tweak v29, v28, v30, v31
+ eor v3.16b, v3.16b, v28.16b
+ tbnz x6, #4, 0f
+
+ ld1 {v4.16b}, [x1], #16
+ str q29, [sp, #16]
+ eor v4.16b, v4.16b, v29.16b
+ next_tweak v29, v29, v30, v31
+ tbnz x6, #5, 0f
+
+ ld1 {v5.16b}, [x1], #16
+ str q29, [sp, #32]
+ eor v5.16b, v5.16b, v29.16b
+ next_tweak v29, v29, v30, v31
+ tbnz x6, #6, 0f
+
+ ld1 {v6.16b}, [x1], #16
+ str q29, [sp, #48]
+ eor v6.16b, v6.16b, v29.16b
+ next_tweak v29, v29, v30, v31
+ tbnz x6, #7, 0f
+
+ ld1 {v7.16b}, [x1], #16
+ str q29, [sp, #64]
+ eor v7.16b, v7.16b, v29.16b
+ next_tweak v29, v29, v30, v31
+
+0: mov bskey, x2
+ mov rounds, x3
+ br x7
+ENDPROC(__xts_crypt8)
+
+ .macro __xts_crypt, do8, o0, o1, o2, o3, o4, o5, o6, o7
+ stp x29, x30, [sp, #-80]!
+ mov x29, sp
+
+ ldr q30, .Lxts_mul_x
+ ld1 {v25.16b}, [x5]
+
+99: adr x7, \do8
+ bl __xts_crypt8
+
+ ldp q16, q17, [sp, #16]
+ ldp q18, q19, [sp, #48]
+
+ eor \o0\().16b, \o0\().16b, v25.16b
+ eor \o1\().16b, \o1\().16b, v26.16b
+ eor \o2\().16b, \o2\().16b, v27.16b
+ eor \o3\().16b, \o3\().16b, v28.16b
+
+ st1 {\o0\().16b}, [x0], #16
+ mov v25.16b, v26.16b
+ tbnz x6, #1, 1f
+ st1 {\o1\().16b}, [x0], #16
+ mov v25.16b, v27.16b
+ tbnz x6, #2, 1f
+ st1 {\o2\().16b}, [x0], #16
+ mov v25.16b, v28.16b
+ tbnz x6, #3, 1f
+ st1 {\o3\().16b}, [x0], #16
+ mov v25.16b, v29.16b
+ tbnz x6, #4, 1f
+
+ eor \o4\().16b, \o4\().16b, v16.16b
+ eor \o5\().16b, \o5\().16b, v17.16b
+ eor \o6\().16b, \o6\().16b, v18.16b
+ eor \o7\().16b, \o7\().16b, v19.16b
+
+ st1 {\o4\().16b}, [x0], #16
+ tbnz x6, #5, 1f
+ st1 {\o5\().16b}, [x0], #16
+ tbnz x6, #6, 1f
+ st1 {\o6\().16b}, [x0], #16
+ tbnz x6, #7, 1f
+ st1 {\o7\().16b}, [x0], #16
+
+ cbnz x4, 99b
+
+1: st1 {v25.16b}, [x5]
+ ldp x29, x30, [sp], #80
+ ret
+ .endm
+
+ENTRY(aesbs_xts_encrypt)
+ __xts_crypt aesbs_encrypt8, v0, v1, v4, v6, v3, v7, v2, v5
+ENDPROC(aesbs_xts_encrypt)
+
+ENTRY(aesbs_xts_decrypt)
+ __xts_crypt aesbs_decrypt8, v0, v1, v6, v4, v2, v7, v3, v5
+ENDPROC(aesbs_xts_decrypt)
+
+ .macro next_ctr, v
+ mov \v\().d[1], x8
+ mov \v\().d[0], x7
+ adds x8, x8, #1
+ adc x7, x7, xzr
+ rev64 \v\().16b, \v\().16b
+ .endm
+
+ /*
+ * aesbs_ctr_encrypt(u8 out[], u8 const in[], u8 const rk[],
+ * int rounds, int blocks, u8 iv[], bool final)
+ */
+ENTRY(aesbs_ctr_encrypt)
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+
+ add x4, x4, x6 // do one extra block if final
+
+ ldp x7, x8, [x5]
+ ld1 {v0.16b}, [x5]
+CPU_LE( rev x7, x7 )
+CPU_LE( rev x8, x8 )
+ adds x8, x8, #1
+ adc x7, x7, xzr
+
+99: mov x9, #1
+ lsl x9, x9, x4
+ subs w4, w4, #8
+ csel x4, x4, xzr, pl
+ csel x9, x9, xzr, le
+
+ tbnz x9, #1, 0f
+
+ next_ctr v1
+ tbnz x9, #2, 0f
+
+ next_ctr v2
+ tbnz x9, #3, 0f
+
+ next_ctr v3
+ tbnz x9, #4, 0f
+
+ next_ctr v4
+ tbnz x9, #5, 0f
+
+ next_ctr v5
+ tbnz x9, #6, 0f
+
+ next_ctr v6
+ tbnz x9, #7, 0f
+
+ next_ctr v7
+
+0: mov bskey, x2
+ mov rounds, x3
+ bl aesbs_encrypt8
+
+ lsr x9, x9, x6 // disregard the final block
+ tbnz x9, #0, 0f
+
+ ld1 {v8.16b}, [x1], #16
+ eor v0.16b, v0.16b, v8.16b
+ st1 {v0.16b}, [x0], #16
+ tbnz x9, #1, 1f
+
+ ld1 {v9.16b}, [x1], #16
+ eor v1.16b, v1.16b, v9.16b
+ st1 {v1.16b}, [x0], #16
+ tbnz x9, #2, 2f
+
+ ld1 {v10.16b}, [x1], #16
+ eor v4.16b, v4.16b, v10.16b
+ st1 {v4.16b}, [x0], #16
+ tbnz x9, #3, 3f
+
+ ld1 {v11.16b}, [x1], #16
+ eor v6.16b, v6.16b, v11.16b
+ st1 {v6.16b}, [x0], #16
+ tbnz x9, #4, 4f
+
+ ld1 {v12.16b}, [x1], #16
+ eor v3.16b, v3.16b, v12.16b
+ st1 {v3.16b}, [x0], #16
+ tbnz x9, #5, 5f
+
+ ld1 {v13.16b}, [x1], #16
+ eor v7.16b, v7.16b, v13.16b
+ st1 {v7.16b}, [x0], #16
+ tbnz x9, #6, 6f
+
+ ld1 {v14.16b}, [x1], #16
+ eor v2.16b, v2.16b, v14.16b
+ st1 {v2.16b}, [x0], #16
+ tbnz x9, #7, 7f
+
+ ld1 {v15.16b}, [x1], #16
+ eor v5.16b, v5.16b, v15.16b
+ st1 {v5.16b}, [x0], #16
+
+ next_ctr v0
+ cbnz x4, 99b
+
+0: st1 {v0.16b}, [x5]
+8: ldp x29, x30, [sp], #16
+ ret
+
+ /*
+ * If we are handling the tail of the input (x6 == 1), return the
+ * final keystream block back to the caller via the IV buffer.
+ */
+1: cbz x6, 8b
+ st1 {v1.16b}, [x5]
+ b 8b
+2: cbz x6, 8b
+ st1 {v4.16b}, [x5]
+ b 8b
+3: cbz x6, 8b
+ st1 {v6.16b}, [x5]
+ b 8b
+4: cbz x6, 8b
+ st1 {v3.16b}, [x5]
+ b 8b
+5: cbz x6, 8b
+ st1 {v7.16b}, [x5]
+ b 8b
+6: cbz x6, 8b
+ st1 {v2.16b}, [x5]
+ b 8b
+7: cbz x6, 8b
+ st1 {v5.16b}, [x5]
+ b 8b
+ENDPROC(aesbs_ctr_encrypt)
diff --git a/arch/arm64/crypto/aes-neonbs-glue.c b/arch/arm64/crypto/aes-neonbs-glue.c
new file mode 100644
index 000000000000..57982172563c
--- /dev/null
+++ b/arch/arm64/crypto/aes-neonbs-glue.c
@@ -0,0 +1,300 @@
+/*
+ * Bit sliced AES using NEON instructions
+ *
+ * Copyright (C) 2016 Linaro Ltd <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <asm/neon.h>
+#include <crypto/aes.h>
+#include <crypto/internal/simd.h>
+#include <crypto/internal/skcipher.h>
+#include <crypto/xts.h>
+#include <linux/module.h>
+
+MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
+MODULE_LICENSE("GPL v2");
+
+asmlinkage void aesbs_ecb_encrypt(u8 out[], u8 const in[], u8 const rk[],
+ int rounds, int blocks);
+asmlinkage void aesbs_ecb_decrypt(u8 out[], u8 const in[], u8 const rk[],
+ int rounds, int blocks);
+
+asmlinkage void aesbs_xts_encrypt(u8 out[], u8 const in[], u8 const rk[],
+ int rounds, int blocks, u8 iv[]);
+asmlinkage void aesbs_xts_decrypt(u8 out[], u8 const in[], u8 const rk[],
+ int rounds, int blocks, u8 iv[]);
+
+asmlinkage void aesbs_ctr_encrypt(u8 out[], u8 const in[], u8 const rk[],
+ int rounds, int blocks, u8 iv[], bool final);
+
+asmlinkage void aesbs_convert_key(u8 out[], u32 const rk[], int rounds);
+
+struct aesbs_key {
+ u8 key[13 * (8 * AES_BLOCK_SIZE) + 32];
+};
+
+struct aesbs_ctx {
+ struct aesbs_key bskey;
+ int rounds;
+};
+
+struct aesbs_xts_ctx {
+ struct aesbs_key bskey;
+ struct crypto_cipher *tweak_tfm;
+ int rounds;
+};
+
+static int aesbs_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
+ unsigned int key_len)
+{
+ struct aesbs_ctx *ctx = crypto_skcipher_ctx(tfm);
+ struct crypto_aes_ctx rk;
+ int err;
+
+ err = crypto_aes_expand_key(&rk, in_key, key_len);
+ if (err)
+ return err;
+
+ ctx->rounds = 6 + key_len / 4;
+
+ kernel_neon_begin();
+ aesbs_convert_key(ctx->bskey.key, rk.key_enc, ctx->rounds);
+ kernel_neon_end();
+
+ return 0;
+}
+
+static int xts_init(struct crypto_skcipher *tfm)
+{
+ struct aesbs_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
+
+ ctx->tweak_tfm = crypto_alloc_cipher("aes", 0, CRYPTO_ALG_ASYNC);
+ if (IS_ERR(ctx->tweak_tfm))
+ return PTR_ERR(ctx->tweak_tfm);
+
+ return 0;
+}
+
+static void xts_exit(struct crypto_skcipher *tfm)
+{
+ struct aesbs_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
+
+ crypto_free_cipher(ctx->tweak_tfm);
+}
+
+static int aesbs_xts_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
+ unsigned int key_len)
+{
+ struct aesbs_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
+ struct crypto_aes_ctx rk;
+ int err;
+
+ err = xts_verify_key(tfm, in_key, key_len);
+ if (err)
+ return err;
+
+ err = crypto_cipher_setkey(ctx->tweak_tfm, in_key + key_len / 2,
+ key_len / 2);
+ if (err)
+ return err;
+
+ err = crypto_aes_expand_key(&rk, in_key, key_len / 2);
+ if (err)
+ return err;
+
+ ctx->rounds = 6 + key_len / 8;
+
+ kernel_neon_begin();
+ aesbs_convert_key(ctx->bskey.key, rk.key_enc, ctx->rounds);
+ kernel_neon_end();
+
+ return 0;
+}
+
+static int __ecb_crypt(struct skcipher_request *req,
+ void (*fn)(u8 out[], u8 const in[], u8 const rk[],
+ int rounds, int blocks))
+{
+ struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+ struct aesbs_ctx *ctx = crypto_skcipher_ctx(tfm);
+ struct skcipher_walk walk;
+ int err;
+
+ err = skcipher_walk_virt(&walk, req, true);
+
+ kernel_neon_begin();
+ while (walk.nbytes >= AES_BLOCK_SIZE) {
+ unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE;
+
+ if (walk.nbytes < walk.total)
+ blocks = round_down(blocks,
+ walk.chunksize / AES_BLOCK_SIZE);
+
+ fn(walk.dst.virt.addr, walk.src.virt.addr, ctx->bskey.key,
+ ctx->rounds, blocks);
+ err = skcipher_walk_done(&walk,
+ walk.nbytes - blocks * AES_BLOCK_SIZE);
+ }
+ kernel_neon_end();
+
+ return err;
+}
+
+static int ecb_encrypt(struct skcipher_request *req)
+{
+ return __ecb_crypt(req, aesbs_ecb_encrypt);
+}
+
+static int ecb_decrypt(struct skcipher_request *req)
+{
+ return __ecb_crypt(req, aesbs_ecb_decrypt);
+}
+
+static int __xts_crypt(struct skcipher_request *req,
+ void (*fn)(u8 out[], u8 const in[], u8 const rk[],
+ int rounds, int blocks, u8 iv[]))
+{
+ struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+ struct aesbs_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
+ struct skcipher_walk walk;
+ int err;
+
+ err = skcipher_walk_virt(&walk, req, true);
+
+ crypto_cipher_encrypt_one(ctx->tweak_tfm, walk.iv, walk.iv);
+
+ kernel_neon_begin();
+ while (walk.nbytes >= AES_BLOCK_SIZE) {
+ unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE;
+
+ if (walk.nbytes < walk.total)
+ blocks = round_down(blocks,
+ walk.chunksize / AES_BLOCK_SIZE);
+
+ fn(walk.dst.virt.addr, walk.src.virt.addr, ctx->bskey.key,
+ ctx->rounds, blocks, walk.iv);
+ err = skcipher_walk_done(&walk,
+ walk.nbytes - blocks * AES_BLOCK_SIZE);
+ }
+ kernel_neon_end();
+
+ return err;
+}
+
+static int xts_encrypt(struct skcipher_request *req)
+{
+ return __xts_crypt(req, aesbs_xts_encrypt);
+}
+
+static int xts_decrypt(struct skcipher_request *req)
+{
+ return __xts_crypt(req, aesbs_xts_decrypt);
+}
+
+static int ctr_encrypt(struct skcipher_request *req)
+{
+ struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+ struct aesbs_ctx *ctx = crypto_skcipher_ctx(tfm);
+ struct skcipher_walk walk;
+ int err;
+
+ err = skcipher_walk_virt(&walk, req, true);
+
+ kernel_neon_begin();
+ while (walk.nbytes > 0) {
+ unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE;
+ bool final = (walk.total % AES_BLOCK_SIZE) != 0;
+
+ if (walk.nbytes < walk.total) {
+ blocks = round_down(blocks,
+ walk.chunksize / AES_BLOCK_SIZE);
+ final = false;
+ }
+
+ aesbs_ctr_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
+ ctx->bskey.key, ctx->rounds, blocks, walk.iv,
+ final);
+
+ if (final) {
+ u8 *dst = walk.dst.virt.addr + blocks * AES_BLOCK_SIZE;
+ u8 *src = walk.src.virt.addr + blocks * AES_BLOCK_SIZE;
+
+ if (dst != src)
+ memcpy(dst, src, walk.total % AES_BLOCK_SIZE);
+ crypto_xor(dst, walk.iv, walk.total % AES_BLOCK_SIZE);
+
+ err = skcipher_walk_done(&walk, 0);
+ break;
+ }
+ err = skcipher_walk_done(&walk,
+ walk.nbytes - blocks * AES_BLOCK_SIZE);
+ }
+ kernel_neon_end();
+
+ return err;
+}
+
+static struct skcipher_alg aes_algs[] = { {
+ .base.cra_name = "ecb(aes)",
+ .base.cra_driver_name = "ecb-aes-neonbs",
+ .base.cra_priority = 200,
+ .base.cra_blocksize = AES_BLOCK_SIZE,
+ .base.cra_ctxsize = sizeof(struct aesbs_ctx),
+ .base.cra_module = THIS_MODULE,
+
+ .min_keysize = AES_MIN_KEY_SIZE,
+ .max_keysize = AES_MAX_KEY_SIZE,
+ .chunksize = 8 * AES_BLOCK_SIZE,
+ .setkey = aesbs_setkey,
+ .encrypt = ecb_encrypt,
+ .decrypt = ecb_decrypt,
+}, {
+ .base.cra_name = "xts(aes)",
+ .base.cra_driver_name = "xts-aes-neonbs",
+ .base.cra_priority = 200,
+ .base.cra_blocksize = AES_BLOCK_SIZE,
+ .base.cra_ctxsize = sizeof(struct aesbs_xts_ctx),
+ .base.cra_module = THIS_MODULE,
+
+ .min_keysize = 2 * AES_MIN_KEY_SIZE,
+ .max_keysize = 2 * AES_MAX_KEY_SIZE,
+ .chunksize = 8 * AES_BLOCK_SIZE,
+ .ivsize = AES_BLOCK_SIZE,
+ .setkey = aesbs_xts_setkey,
+ .encrypt = xts_encrypt,
+ .decrypt = xts_decrypt,
+ .init = xts_init,
+ .exit = xts_exit,
+}, {
+ .base.cra_name = "ctr(aes)",
+ .base.cra_driver_name = "ctr-aes-neonbs",
+ .base.cra_priority = 200,
+ .base.cra_blocksize = 1,
+ .base.cra_ctxsize = sizeof(struct aesbs_ctx),
+ .base.cra_module = THIS_MODULE,
+
+ .min_keysize = AES_MIN_KEY_SIZE,
+ .max_keysize = AES_MAX_KEY_SIZE,
+ .chunksize = 8 * AES_BLOCK_SIZE,
+ .ivsize = AES_BLOCK_SIZE,
+ .setkey = aesbs_setkey,
+ .encrypt = ctr_encrypt,
+ .decrypt = ctr_encrypt,
+} };
+
+static int __init aes_init(void)
+{
+ return crypto_register_skciphers(aes_algs, ARRAY_SIZE(aes_algs));
+}
+
+static void aes_exit(void)
+{
+ crypto_unregister_skciphers(aes_algs, ARRAY_SIZE(aes_algs));
+}
+
+module_init(aes_init);
+module_exit(aes_exit);
--
2.7.4
^ permalink raw reply related
* [PATCH] watchdog: bcm2835_wdt: set WDOG_HW_RUNNING bit when appropriate
From: Eric Anholt @ 2016-12-12 17:46 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1481536123-9279-1-git-send-email-rasmus.villemoes@prevas.dk>
Rasmus Villemoes <rasmus.villemoes@prevas.dk> writes:
> A bootloader may start the watchdog device before handing control to
> the kernel - in that case, we should tell the kernel about it so the
> watchdog framework can keep it alive until userspace opens
> /dev/watchdog0.
I don't believe our current bootloaders (the closed firmware or u-boot)
set up the watchdog, but this seems reasonable since they might want to
later.
Acked-by: Eric Anholt <eric@anholt.net>
^ permalink raw reply
* [PATCH v4] arm64: fpsimd: improve stacking logic in non-interruptible context
From: Ard Biesheuvel @ 2016-12-12 17:55 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <20161212103512.GE1574@e103592.cambridge.arm.com>
On 12 December 2016 at 10:35, Dave Martin <Dave.Martin@arm.com> wrote:
> On Fri, Dec 09, 2016 at 08:57:20PM +0000, Ard Biesheuvel wrote:
>> On 9 December 2016 at 19:29, Dave Martin <Dave.Martin@arm.com> wrote:
>> > On Fri, Dec 09, 2016 at 06:21:55PM +0000, Catalin Marinas wrote:
>> >> On Fri, Dec 09, 2016 at 04:46:32PM +0000, Ard Biesheuvel wrote:
>> >> > void kernel_neon_begin_partial(u32 num_regs)
>> >> > {
>> >> > - if (in_interrupt()) {
>> >> > - struct fpsimd_partial_state *s = this_cpu_ptr(
>> >> > - in_irq() ? &hardirq_fpsimdstate : &softirq_fpsimdstate);
>> >> > + struct fpsimd_partial_state *s;
>> >> > + int level;
>> >> > +
>> >> > + preempt_disable();
>> >> > +
>> >> > + level = this_cpu_inc_return(kernel_neon_nesting_level);
>> >> > + BUG_ON(level > 3);
>> >> > +
>> >> > + if (level > 1) {
>> >> > + s = this_cpu_ptr(nested_fpsimdstate);
>> >> >
>> >> > - BUG_ON(num_regs > 32);
>> >> > - fpsimd_save_partial_state(s, roundup(num_regs, 2));
>> >> > + WARN_ON_ONCE(num_regs > 32);
>> >> > + num_regs = min(roundup(num_regs, 2), 32U);
>> >> > +
>> >> > + fpsimd_save_partial_state(&s[level - 2], num_regs);
>> >> > } else {
>> >> > /*
>> >> > * Save the userland FPSIMD state if we have one and if we
>> >> > @@ -241,7 +256,6 @@ void kernel_neon_begin_partial(u32 num_regs)
>> >> > * that there is no longer userland FPSIMD state in the
>> >> > * registers.
>> >> > */
>> >> > - preempt_disable();
>> >> > if (current->mm &&
>> >> > !test_and_set_thread_flag(TIF_FOREIGN_FPSTATE))
>> >> > fpsimd_save_state(&current->thread.fpsimd_state);
>> >>
>> >> I wonder whether we could actually do this saving and flag/level setting
>> >> in reverse to simplify the races. Something like your previous patch but
>> >> only set TIF_FOREIGN_FPSTATE after saving:
>> >>
>> >> level = this_cpu_read(kernel_neon_nesting_level);
>> >> if (level > 0) {
>> >> ...
>> >> fpsimd_save_partial_state();
>> >> } else {
>> >> if (!test_thread_flag(TIF_FOREIGN_FPSTATE))
>> >> fpsimd_save_state();
>> >> set_thread_flag(TIF_FOREIGN_FPSTATE);
>> >> }
>> >> this_cpu_inc(kernel_neon_nesting_level);
>> >>
>> >> There is a risk of extra saving if we get an interrupt after
>> >> test_thread_flag() and before set_thread_flag() but I don't think this
>> >> would corrupt any state, just writing things twice.
>> >
>> > I would worry that we can save two states over the same buffer and then
>> > restore an uninitialised buffer in this case unless we are careful.
>> > Because the level-dependent code is now misbracketed by the inc/dec,
>> > a preempting call races with the outer call and uses the same value.
>> >
>> > I guess we could do
>> >
>> > if (!test_thread_flag(TIF_FOREIGN_FPSTATE))
>> > fpsimd_save_state();
>> > clear_thread_flag(TIF_FOREIGN_FPSTATE);
>> >
>> > at the start unconditionally, before the _inc_return().
>> >
>> > The task state may then get saved in the middle of being saved, but
>> > as you say it shouldn't have changed in the meantime.
>>
>> It /will/ have changed in the meantime: when the interrupted context
>> is resumed, it will happily proceed with saving the state where it
>> left off, but now the register file contains whatever was left after
>> the interrupt handler is done with the NEON.
>
> Hmmm, true. The NEON regs will have been restored by kernel_neon_end()
> in the inner context, but the extra SVE bits won't have been.
>
Even worse: both the interrupter and the interruptee think they are
preserving the userland context, so once the interrupter is done, it
will not restore the context as it found it. The interruptee will then
proceed and write whatever is left in those registers into the saved
state.
>>
>> > The nested
>> > save code may then do a partial save of the same state on top of that
>> > which could get restored at the inner kernel_neon_end() call.
>> >
>>
>> I'm afraid the only way to deal with this correctly is to treat the
>> whole sequence as a critical section, which means execute it with
>> interrupts disabled.
>
> Or we make the KERNEL_MODE_NEON code SVE-aware, which is where I started
> off. In that case, we do SVE (partial) save/restore whenever
> kernel_mode_neon() is called with live SVE state. The change here is
> that we would consider that there is always live SVE state until the
> fpsimd_save_state() actually finishes at the outer level. We may want
> to delay setting of TIF_FOREIGN_FPSTATE for that purpose.
>
> This means you do take an additional latency hit if you want to use NEON
> in an interrupting context and there happens to be live SVE state. It's
> a consequence of the architecture though -- I don't think there's any
> way to get around it. We can still scale the cost by implementing
> sve_save_partial_state() or something equivalent.
>
> Your original inc()+save() ... restore()+dec() seems sound enough if
> viewed this way. Unless I'm missing something?
>
I think having a small critical section is not so bad. Let me send out
a v5 so we can discuss ...
^ permalink raw reply
* [PATCH v5] arm64: fpsimd: improve stacking logic in non-interruptible context
From: Ard Biesheuvel @ 2016-12-12 17:56 UTC (permalink / raw)
To: linux-arm-kernel
Currently, we allow kernel mode NEON in softirq or hardirq context by
stacking and unstacking a slice of the NEON register file for each call
to kernel_neon_begin() and kernel_neon_end(), respectively.
Given that
a) a CPU typically spends most of its time in userland, during which time
no kernel mode NEON in process context is in progress,
b) a CPU spends most of its time in the kernel doing other things than
kernel mode NEON when it gets interrupted to perform kernel mode NEON
in softirq context
the stacking and subsequent unstacking is only necessary if we are
interrupting a thread while it is performing kernel mode NEON in process
context, which means that in all other cases, we can simply preserve the
userland FPSIMD state once, and only restore it upon return to userland,
even if we are being invoked from softirq or hardirq context.
So instead of checking whether we are running in interrupt context, keep
track of the level of nested kernel mode NEON calls in progress, and only
perform the eager stack/unstack if the level exceeds 1.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm64/kernel/fpsimd.c | 64 +++++++++++++++++++++++++++++++++-------------
1 file changed, 46 insertions(+), 18 deletions(-)
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 394c61db5566..c19363775436 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -220,45 +220,73 @@ void fpsimd_flush_task_state(struct task_struct *t)
#ifdef CONFIG_KERNEL_MODE_NEON
-static DEFINE_PER_CPU(struct fpsimd_partial_state, hardirq_fpsimdstate);
-static DEFINE_PER_CPU(struct fpsimd_partial_state, softirq_fpsimdstate);
+/*
+ * Although unlikely, it is possible for three kernel mode NEON contexts to
+ * be live at the same time: process context, softirq context and hardirq
+ * context. So while the userland context is stashed in the thread's fpsimd
+ * state structure, we need two additional levels of storage.
+ */
+static DEFINE_PER_CPU(struct fpsimd_partial_state, nested_fpsimdstate[2]);
+static DEFINE_PER_CPU(int, kernel_neon_nesting_level);
/*
* Kernel-side NEON support functions
*/
void kernel_neon_begin_partial(u32 num_regs)
{
- if (in_interrupt()) {
- struct fpsimd_partial_state *s = this_cpu_ptr(
- in_irq() ? &hardirq_fpsimdstate : &softirq_fpsimdstate);
+ struct fpsimd_partial_state *s;
+ int level;
- BUG_ON(num_regs > 32);
- fpsimd_save_partial_state(s, roundup(num_regs, 2));
- } else {
+ preempt_disable();
+
+ if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) {
/*
* Save the userland FPSIMD state if we have one and if we
* haven't done so already. Clear fpsimd_last_state to indicate
* that there is no longer userland FPSIMD state in the
* registers.
*/
- preempt_disable();
- if (current->mm &&
- !test_and_set_thread_flag(TIF_FOREIGN_FPSTATE))
- fpsimd_save_state(&current->thread.fpsimd_state);
+ if (current->mm) {
+ unsigned long flags;
+
+ local_irq_save(flags);
+ if (!test_and_set_thread_flag(TIF_FOREIGN_FPSTATE))
+ fpsimd_save_state(&current->thread.fpsimd_state);
+ local_irq_restore(flags);
+ } else {
+ set_thread_flag(TIF_FOREIGN_FPSTATE);
+ }
this_cpu_write(fpsimd_last_state, NULL);
}
+
+ level = this_cpu_inc_return(kernel_neon_nesting_level);
+ BUG_ON(level > 3);
+
+ if (level > 1) {
+ s = this_cpu_ptr(nested_fpsimdstate);
+
+ WARN_ON_ONCE(num_regs > 32);
+ num_regs = min(roundup(num_regs, 2), 32U);
+
+ fpsimd_save_partial_state(&s[level - 2], num_regs);
+ }
}
EXPORT_SYMBOL(kernel_neon_begin_partial);
void kernel_neon_end(void)
{
- if (in_interrupt()) {
- struct fpsimd_partial_state *s = this_cpu_ptr(
- in_irq() ? &hardirq_fpsimdstate : &softirq_fpsimdstate);
- fpsimd_load_partial_state(s);
- } else {
- preempt_enable();
+ struct fpsimd_partial_state *s;
+ int level;
+
+ level = this_cpu_read(kernel_neon_nesting_level);
+ BUG_ON(level < 1);
+
+ if (level > 1) {
+ s = this_cpu_ptr(nested_fpsimdstate);
+ fpsimd_load_partial_state(&s[level - 2]);
}
+ this_cpu_dec(kernel_neon_nesting_level);
+ preempt_enable();
}
EXPORT_SYMBOL(kernel_neon_end);
--
2.7.4
^ permalink raw reply related
* [PATCH] media: platform: exynos4-is: constify v4l2_subdev_* structures
From: Bhumika Goyal @ 2016-12-12 18:03 UTC (permalink / raw)
To: linux-arm-kernel
The v4l2_subdev_{core/pad/video}_ops structures are stored in fields of
the v4l2_subdev_ops structure that are pointers to const. Also, the
v4l2_subdev_ops structure is passed to a function whose argument is
const-qualified. As these structures are never modified, declare them
as const.
Done using Coccinelle (one of the scripts used):
@r1 disable optional_qualifier @
identifier i;
position p;
@@
static struct v4l2_subdev_ops i@p = {...};
@ok1@
identifier r1.i;
position p;
expression e1;
@@
v4l2_subdev_init(e1,&i@p)
@bad@
position p!={r1.p,ok1.p};
identifier r1.i;
@@
i@p
@depends on !bad disable optional_qualifier@
identifier r1.i;
@@
+const
struct v4l2_subdev_ops i;
File size before:
text data bss dec hex filename
16830 1064 0 17894 45e6 platform/exynos4-is/fimc-capture.o
7787 704 20 8511 213f platform/exynos4-is/mipi-csis.o
File size after:
text data bss dec hex filename
17022 880 0 17902 45ee platform/exynos4-is/fimc-capture.o
8299 192 20 8511 213f platform/exynos4-is/mipi-csis.o
Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
---
drivers/media/platform/exynos4-is/fimc-capture.c | 4 ++--
drivers/media/platform/exynos4-is/mipi-csis.c | 8 ++++----
2 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/media/platform/exynos4-is/fimc-capture.c b/drivers/media/platform/exynos4-is/fimc-capture.c
index 964f4a6..a5e729c 100644
--- a/drivers/media/platform/exynos4-is/fimc-capture.c
+++ b/drivers/media/platform/exynos4-is/fimc-capture.c
@@ -1695,7 +1695,7 @@ static int fimc_subdev_set_selection(struct v4l2_subdev *sd,
return 0;
}
-static struct v4l2_subdev_pad_ops fimc_subdev_pad_ops = {
+static const struct v4l2_subdev_pad_ops fimc_subdev_pad_ops = {
.enum_mbus_code = fimc_subdev_enum_mbus_code,
.get_selection = fimc_subdev_get_selection,
.set_selection = fimc_subdev_set_selection,
@@ -1703,7 +1703,7 @@ static int fimc_subdev_set_selection(struct v4l2_subdev *sd,
.set_fmt = fimc_subdev_set_fmt,
};
-static struct v4l2_subdev_ops fimc_subdev_ops = {
+static const struct v4l2_subdev_ops fimc_subdev_ops = {
.pad = &fimc_subdev_pad_ops,
};
diff --git a/drivers/media/platform/exynos4-is/mipi-csis.c b/drivers/media/platform/exynos4-is/mipi-csis.c
index befd9fc..f819b29 100644
--- a/drivers/media/platform/exynos4-is/mipi-csis.c
+++ b/drivers/media/platform/exynos4-is/mipi-csis.c
@@ -649,23 +649,23 @@ static int s5pcsis_log_status(struct v4l2_subdev *sd)
return 0;
}
-static struct v4l2_subdev_core_ops s5pcsis_core_ops = {
+static const struct v4l2_subdev_core_ops s5pcsis_core_ops = {
.s_power = s5pcsis_s_power,
.log_status = s5pcsis_log_status,
};
-static struct v4l2_subdev_pad_ops s5pcsis_pad_ops = {
+static const struct v4l2_subdev_pad_ops s5pcsis_pad_ops = {
.enum_mbus_code = s5pcsis_enum_mbus_code,
.get_fmt = s5pcsis_get_fmt,
.set_fmt = s5pcsis_set_fmt,
};
-static struct v4l2_subdev_video_ops s5pcsis_video_ops = {
+static const struct v4l2_subdev_video_ops s5pcsis_video_ops = {
.s_rx_buffer = s5pcsis_s_rx_buffer,
.s_stream = s5pcsis_s_stream,
};
-static struct v4l2_subdev_ops s5pcsis_subdev_ops = {
+static const struct v4l2_subdev_ops s5pcsis_subdev_ops = {
.core = &s5pcsis_core_ops,
.pad = &s5pcsis_pad_ops,
.video = &s5pcsis_video_ops,
--
1.9.1
^ permalink raw reply related
* [RFC v3 PATCH 00/25] Allow NOMMU for MULTIPLATFORM
From: Afzal Mohammed @ 2016-12-12 18:15 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <87fulus094.fsf@dell.be.48ers.dk>
Hi,
On Sun, Dec 11, 2016 at 09:01:59PM +0100, Peter Korsgaard wrote:
> When you select a cortex-A variant, then we enable MMU support by
> default, but you can disable it under toolchain options (Enable MMU) and
> then the flat binary option is available.
Thank you, Peter Korsgaard, that did the trick: I am able to boot to a
prompt! Logs are at the end.
> Hmm, I'm not sure why a cortex-M toolchain wouldn't work on cortex-A, I
> thought the 'M' instruction set was a pure subset of the 'A'.
On Mon, Dec 12, 2016 at 09:28:03AM +0000, Vladimir Murzin wrote:
> M-class toolchain should just work with A-class; you don't even need to
> disable MMU to try it out after d782e42 ("ARM: 8594/1: enable binfmt_flat on
> systems with an MMU").
Earlier, I had made the silly mistake of not enabling flat binary
support in the kernel.
But even after enabling it, it didn't work (I don't know why); the boot
ended instead with,
Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004
The exit code probably refers to an interrupted system call.
On Mon, Dec 12, 2016 at 08:07:16AM +0100, mickael guene wrote:
> You can find an R toolchain here:
> https://github.com/mickael-guene/fdpic_manifest/releases/download/v7-r-1.0.1/toolset-v7-r-1.0.1-0-gbdcc6a7c-armv7-r.tgz
>
> It's an fdpic toolset for cortex-r cpu class. gcc version is
> quite old (4.7).
>
> Note also that generated code may crash on class A cpu due to
> generation of udiv/sdiv which is optional for class A.
> (cortex a15 is ok but not a9).
>
> Hope it helps
On Mon, Dec 12, 2016 at 10:44:45AM +0100, mickael guene wrote:
> At the end of https://github.com/mickael-guene/fdpic_manifest you can
> find a set of patches to apply for kernel fdpic support. Unfortunately
> they are quite old ... But I did some tests in May for
> stm32f469-disco platform and I have attached patches against more
> recent kernel.
Thanks Mickael.
Earlier I had tried syncing the repo; the download kept getting
interrupted, though persisting would have fetched it fully. But on
looking at the kernel patches in parallel, I pushed the plan aside for
the time being, as the context of the changes was very different from
the kernel version (4.9-rc7) used here.
But it seems the attached patches can be applied without much difficulty.
As I have already reached the prompt, I will keep note of these details;
they might help later.
And Vladimir, Thanks.
Regards
afzal
[ 0.000000] Booting Linux on physical CPU 0x0
[ 0.000000] Linux version 4.9.0-rc7-00026-g7a142ca8231b (afzal@debian) (gcc version 6.2.0 (GCC) ) #26 Mon Dec 12 22:32:33 IST 2016
[ 0.000000] CPU: ARMv7 Processor [412fc09a] revision 10 (ARMv7), cr=00c50478
[ 0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
[ 0.000000] OF: fdt:Machine model: TI AM437x Industrial Development Kit
[ 0.000000] bootconsole [earlycon0] enabled
[ 0.000000] AM437x ES1.2 (sgx neon)
[ 0.000000] Built 1 zonelists in Zone order, mobility grouping on. Total pages: 260096
[ 0.000000] Kernel command line: console=ttyO0,115200n8 earlyprintk
[ 0.000000] PID hash table entries: 4096 (order: 2, 16384 bytes)
[ 0.000000] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
[ 0.000000] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
[ 0.000000] Memory: 1029196K/1048576K available (6562K kernel code, 523K rwdata, 2096K rodata, 712K init, 274K bss, 19380K reserved, 0K cma-reserved)
[ 0.000000] Virtual kernel memory layout:
[ 0.000000] vector : 0x80000000 - 0x80001000 ( 4 kB)
[ 0.000000] fixmap : 0xffc00000 - 0xfff00000 (3072 kB)
[ 0.000000] vmalloc : 0x00000000 - 0xffffffff (4095 MB)
[ 0.000000] lowmem : 0x80000000 - 0xc0000000 (1024 MB)
[ 0.000000] modules : 0x80000000 - 0xc0000000 (1024 MB)
[ 0.000000] .text : 0x80008000 - 0x80670b88 (6563 kB)
[ 0.000000] .init : 0x8087e000 - 0x80930000 ( 712 kB)
[ 0.000000] .data : 0x80930000 - 0x809b2f60 ( 524 kB)
[ 0.000000] .bss : 0x809b2f60 - 0x809f7a9c ( 275 kB)
[ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
[ 0.000000] NR_IRQS:16 nr_irqs:16 16
[ 0.000000] OMAP clockevent source: timer1 at 32786 Hz
[ 0.000259] sched_clock: 64 bits at 500MHz, resolution 2ns, wraps every 4398046511103ns
[ 0.009660] clocksource: arm_global_timer: mask: 0xffffffffffffffff max_cycles: 0xe6a171a037, max_idle_ns: 881590485102 ns
[ 0.022315] Switching to timer-based delay loop, resolution 2ns
[ 0.141364] clocksource: 32k_counter: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 58327039986419 ns
[ 0.152415] OMAP clocksource: 32k_counter at 32768 Hz
[ 0.231362] Console: colour dummy device 80x30
[ 0.236920] Calibrating delay loop (skipped), value calculated using timer frequency.. 1000.00 BogoMIPS (lpj=5000000)
[ 0.249062] pid_max: default: 32768 minimum: 301
[ 0.256668] Mount-cache hash table entries: 2048 (order: 1, 8192 bytes)
[ 0.264524] Mountpoint-cache hash table entries: 2048 (order: 1, 8192 bytes)
[ 0.323801] devtmpfs: initialized
[ 0.935615] VFP support v0.3: implementor 41 architecture 3 part 30 variant 9 rev 4
[ 0.951495] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
[ 0.963940] pinctrl core: initialized pinctrl subsystem
[ 1.014378] NET: Registered protocol family 16
[ 2.111659] cpuidle: using governor menu
[ 2.176928] omap_l3_noc 44000000.ocp: L3 debug error: target 8 mod:0 (unclearable)
[ 2.186840] omap_l3_noc 44000000.ocp: L3 application error: target 8 mod:0 (unclearable)
[ 2.494565] OMAP GPIO hardware version 0.1
[ 2.883468] platform 53701000.des: Cannot lookup hwmod 'des'
[ 2.900195] platform 48310000.rng: Cannot lookup hwmod 'rng'
[ 3.046777] hw-breakpoint: found 5 (+1 reserved) breakpoint and 1 watchpoint registers.
[ 3.055998] hw-breakpoint: maximum watchpoint size is 4 bytes.
[ 3.072570] omap4_sram_init:Unable to allocate sram needed to handle errata I688
[ 3.080942] omap4_sram_init:Unable to get sram pool needed to handle errata I688
[ 4.016395] edma 49000000.edma: TI EDMA DMA engine driver
[ 4.042166] V3_3D: supplied by V24_0D
[ 4.056616] VDD_COREREG: supplied by V24_0D
[ 4.072516] VDD_CORE: supplied by VDD_COREREG
[ 4.088252] V1_8DREG: supplied by V24_0D
[ 4.103897] V1_8D: supplied by V1_8DREG
[ 4.118796] V1_5DREG: supplied by V24_0D
[ 4.134236] V1_5D: supplied by V1_5DREG
[ 4.288700] vgaarb: loaded
[ 4.326444] SCSI subsystem initialized
[ 4.345255] usbcore: registered new interface driver usbfs
[ 4.354345] usbcore: registered new interface driver hub
[ 4.362195] usbcore: registered new device driver usb
[ 4.383412] omap_i2c 44e0b000.i2c: could not find pctldev for node /ocp@44000000/l4_wkup@44c00000/scm@210000/pinmux@800/i2c0_pins_default, deferring probe
[ 4.400047] omap_i2c 4819c000.i2c: could not find pctldev for node /ocp@44000000/l4_wkup@44c00000/scm@210000/pinmux@800/i2c2_pins_default, deferring probe
[ 4.420788] pps_core: LinuxPPS API ver. 1 registered
[ 4.426744] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>
[ 4.437776] PTP clock support registered
[ 4.449669] EDAC MC: Ver: 3.0.0
[ 4.507254] clocksource: Switched to clocksource arm_global_timer
[ 4.891236] NET: Registered protocol family 2
[ 4.920504] TCP established hash table entries: 8192 (order: 3, 32768 bytes)
[ 4.934239] TCP bind hash table entries: 8192 (order: 3, 32768 bytes)
[ 4.947220] TCP: Hash tables configured (established 8192 bind 8192)
[ 4.956856] UDP hash table entries: 512 (order: 1, 8192 bytes)
[ 4.965035] UDP-Lite hash table entries: 512 (order: 1, 8192 bytes)
[ 4.976320] NET: Registered protocol family 1
[ 4.988215] RPC: Registered named UNIX socket transport module.
[ 4.994956] RPC: Registered udp transport module.
[ 5.000656] RPC: Registered tcp transport module.
[ 5.006103] RPC: Registered tcp NFSv4.1 backchannel transport module.
[ 6.371750] workingset: timestamp_bits=30 max_order=18 bucket_order=0
[ 6.835038] squashfs: version 4.0 (2009/01/31) Phillip Lougher
[ 6.888403] NFS: Registering the id_resolver key type
[ 6.894459] Key type id_resolver registered
[ 6.899596] Key type id_legacy registered
[ 6.905432] ntfs: driver 2.1.32 [Flags: R/O].
[ 6.961089] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 248)
[ 6.969706] io scheduler noop registered
[ 6.974328] io scheduler deadline registered
[ 6.989359] io scheduler cfq registered (default)
[ 7.085244] pinctrl-single 44e10800.pinmux: 199 pins at pa 44e10800 size 796
[ 9.483420] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
[ 9.593419] omap_uart 44e09000.serial: no wakeirq for uart0
[ 9.600384] omap_uart 44e09000.serial: No clock speed specified: using default: 48000000
[ 9.612215] 44e09000.serial: ttyO0 at MMIO 0x44e09000 (irq = 29, base_baud = 3000000) is a OMAP UART0
[ 9.623241] console [ttyO0] enabled
[ 9.623241] console [ttyO0] enabled
[ 9.631603] bootconsole [earlycon0] disabled
[ 9.631603] bootconsole [earlycon0] disabled
[ 9.657952] STMicroelectronics ASC driver initialized
[ 9.703627] omap_rng 48310000.rng: _od_fail_runtime_resume: FIXME: missing hwmod/omap_dev info
[ 9.714158] omap_rng 48310000.rng: Failed to runtime_get device: -19
[ 9.722078] omap_rng 48310000.rng: initialization failed.
[ 10.149265] brd: module loaded
[ 10.379026] loop: module loaded
[ 10.549954] libphy: Fixed MDIO Bus: probed
[ 10.621367] CAN device driver interface
[ 10.696022] e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-k
[ 10.703201] e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
[ 10.713426] igb: Intel(R) Gigabit Ethernet Network Driver - version 5.4.0-k
[ 10.721901] igb: Copyright (c) 2007-2014 Intel Corporation.
[ 10.937883] davinci_mdio 4a101000.mdio: davinci mdio revision 1.6
[ 10.945298] davinci_mdio 4a101000.mdio: detected phy mask fffffffe
[ 10.963397] libphy: 4a101000.mdio: probed
[ 10.968744] davinci_mdio 4a101000.mdio: phy[0]: device 4a101000.mdio:00, driver Micrel KSZ9031 Gigabit PHY
[ 11.002569] cpsw 4a100000.ethernet: Detected MACID = c4:be:84:cc:f8:b2
[ 11.059455] pegasus: v0.9.3 (2013/04/25), Pegasus/Pegasus II USB Ethernet driver
[ 11.070237] usbcore: registered new interface driver pegasus
[ 11.079659] usbcore: registered new interface driver asix
[ 11.088196] usbcore: registered new interface driver ax88179_178a
[ 11.097524] usbcore: registered new interface driver cdc_ether
[ 11.107046] usbcore: registered new interface driver smsc75xx
[ 11.116880] usbcore: registered new interface driver smsc95xx
[ 11.125813] usbcore: registered new interface driver net1080
[ 11.134794] usbcore: registered new interface driver cdc_subset
[ 11.143940] usbcore: registered new interface driver zaurus
[ 11.153558] usbcore: registered new interface driver cdc_ncm
[ 11.224727] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[ 11.232713] ehci-pci: EHCI PCI platform driver
[ 11.240224] ehci-platform: EHCI generic platform driver
[ 11.256349] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
[ 11.264046] ohci-pci: OHCI PCI platform driver
[ 11.271558] ohci-platform: OHCI generic platform driver
[ 11.284180] ohci-omap3: OHCI OMAP3 driver
[ 11.325702] usbcore: registered new interface driver usb-storage
[ 11.400750] mousedev: PS/2 mouse device common for all mice
[ 11.447930] i2c /dev entries driver
[ 11.615616] sdhci: Secure Digital Host Controller Interface driver
[ 11.623170] sdhci: Copyright(c) Pierre Ossman
[ 11.647533] omap_hsmmc 48060000.mmc: Got CD GPIO
[ 11.734930] Synopsys Designware Multimedia Card Interface Driver
[ 11.764020] sdhci-pltfm: SDHCI platform and OF driver helper
[ 11.812878] ledtrig-cpu: registered to indicate activity on CPUs
[ 11.827153] usbcore: registered new interface driver usbhid
[ 11.834026] usbhid: USB HID core driver
[ 11.882277] NET: Registered protocol family 10
[ 11.934153] sit: IPv6, IPv4 and MPLS over IPv4 tunneling driver
[ 11.971848] NET: Registered protocol family 17
[ 11.977629] can: controller area network core (rev 20120528 abi 9)
[ 11.987031] NET: Registered protocol family 29
[ 11.992658] can: raw protocol (rev 20120528)
[ 11.998072] can: broadcast manager protocol (rev 20161123 t)
[ 12.004962] can: netlink gateway (rev 20130117) max_hops=1
[ 12.025921] Key type dns_resolver registered
[ 12.035682] omap_voltage_late_init: Voltage driver support not added
[ 12.048436] ThumbEE CPU extension supported.
[ 12.280999] mmc0: host does not support reading read-only switch, assuming write-enable
[ 12.293776] mmc0: new high speed SDHC card at address 0002
[ 12.318060] at24 0-0050: 32768 byte 24c256 EEPROM, writable, 64 bytes/write
[ 12.340744] mmcblk0: mmc0:0002 00000 3.66 GiB
[ 12.366577] omap_i2c 44e0b000.i2c: bus 0 rev0.12 at 400 kHz
[ 12.393428] mmcblk0: p1 p2
[ 12.433995] omap_i2c 4819c000.i2c: bus 2 rev0.12 at 100 kHz
[ 12.464867] input: gpio_keys as /devices/platform/gpio_keys/input/input0
[ 12.479679] hctosys: unable to open rtc device (rtc0)
[ 12.564936] Freeing unused kernel memory: 712K (8087e000 - 80930000)
[ 12.572725] This architecture does not have kernel memory protection.
Initializing random number generator... [ 14.422674] random: dd: uninitialized urandom read (512 bytes read)
done.
Welcome to Buildroot
buildroot login: root
Jan 1 00:00:16 login[81]: root login on 'ttyO0'
~ # uname -a
Linux buildroot 4.9.0-rc7-00026-g7a142ca8231b #26 Mon Dec 12 22:32:33 IST 2016 armv7l GNU/Linux
~ #
^ permalink raw reply
* [RFT PATCH] ARM64: dts: meson-gxbb: Add reserved memory zone and usable memory range
From: Heinrich Schuchardt @ 2016-12-12 18:23 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <20161212101801.28491-1-narmstrong@baylibre.com>
On 12/12/2016 11:18 AM, Neil Armstrong wrote:
> The Amlogic Meson GXBB secure monitor uses part of the memory space, this
> patch adds these reserved zones and redefines the usable memory range for
> each board.
>
> Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
> ---
> arch/arm64/boot/dts/amlogic/meson-gx-p23x-q20x.dtsi | 2 +-
> arch/arm64/boot/dts/amlogic/meson-gx.dtsi | 21 +++++++++++++++++++++
> .../boot/dts/amlogic/meson-gxbb-nexbox-a95x.dts | 2 +-
> arch/arm64/boot/dts/amlogic/meson-gxbb-odroidc2.dts | 2 +-
> arch/arm64/boot/dts/amlogic/meson-gxbb-p20x.dtsi | 2 +-
> .../boot/dts/amlogic/meson-gxbb-vega-s95-meta.dts | 2 +-
> .../boot/dts/amlogic/meson-gxbb-vega-s95-pro.dts | 2 +-
> .../boot/dts/amlogic/meson-gxbb-vega-s95-telos.dts | 2 +-
> .../boot/dts/amlogic/meson-gxl-nexbox-a95x.dts | 2 +-
> .../arm64/boot/dts/amlogic/meson-gxl-s905x-p212.dts | 2 +-
> arch/arm64/boot/dts/amlogic/meson-gxm-nexbox-a1.dts | 2 +-
> 11 files changed, 31 insertions(+), 10 deletions(-)
>
I added your patch to next-20161212.
My kernel config is available as
https://github.com/xypron/kernel-odroid-c2/blob/5ec4be0c1b45297bbcbc1ce3d3d787e45dac66b6/config/config-next-20161212
To build the same kernel just run ./build-dpkg.sh (or make) on
https://github.com/xypron/kernel-odroid-c2/tree/5ec4be0c1b45297bbcbc1ce3d3d787e45dac66b6
Free showed 0x2301000 less total memory available than next-20161209
without the patch.
When git cloning linux-next I got the following error on Hardkernel
Odroid C2:
[ 811.602365] Bad mode in Error handler detected on CPU2, code 0xbf000000 -- SError
[ 811.604205] CPU: 2 PID: 1447 Comm: git Not tainted 4.9.0-next-20161212-r005-arm64 #1
[ 811.611876] Hardware name: Hardkernel ODROID-C2 (DT)
[ 811.616793] task: ffff8000745c5780 task.stack: ffff800072d3c000
[ 811.622660] PC is at 0xaaaad3770f28
[ 811.626107] LR is at 0xffffab54e53c
[ 811.629558] pc : [<0000aaaad3770f28>] lr : [<0000ffffab54e53c>] pstate: 20000000
[ 811.636888] sp : 0000ffffd3a1d950
[ 811.640166] x29: 0000ffffd3a1d950 x28: 0000ffff9853a050
[ 811.645427] x27: 00000000000ffc5e x26: 0000ffff8fe00020
[ 811.650688] x25: 0000ffffd3a1da98 x24: 0000000000000000
[ 811.655949] x23: 0000aaaad3770f28 x22: 0000000000000010
[ 811.661211] x21: 0000ffff9809bae0 x20: 000000000003de04
[ 811.666472] x19: 0000ffff8fe00010 x18: 0000000023c57c32
[ 811.671733] x17: 0000ffffab58f988 x16: 0000ffffab660008
[ 811.676994] x15: 00000000000006dc x14: 0000000000000000
[ 811.682255] x13: 00000000002549ea x12: 0000000029555c36
[ 811.687517] x11: 00000000002549eb x10: 0000000029555c36
[ 811.692778] x9 : 00000000002549ea x8 : 0000000029555c36
[ 811.698039] x7 : 00000000002549e9 x6 : 0000000029555c36
[ 811.703300] x5 : 0000ffff98d54b40 x4 : 0000ffff8f93c030
[ 811.708562] x3 : 00000000ffffffff x2 : 0000000000000000
[ 811.713823] x1 : 0000ffff9853a050 x0 : 0000ffff9809bae0
[ 811.720561] Internal error: Attempting to execute userspace memory: 8600000f [#1] PREEMPT SMP
[ 811.729004] Modules linked in: meson_rng rng_core ip_tables x_tables ipv6 realtek
[ 811.736422] CPU: 2 PID: 1447 Comm: git Not tainted 4.9.0-next-20161212-r005-arm64 #1
[ 811.744097] Hardware name: Hardkernel ODROID-C2 (DT)
[ 811.749014] task: ffff8000745c5780 task.stack: ffff800072d3c000
[ 811.754879] PC is at 0xffffab54e53c
[ 811.758328] LR is at 0xffffab54e53c
[ 811.761779] pc : [<0000ffffab54e53c>] lr : [<0000ffffab54e53c>] pstate: 600003c5
[ 811.769109] sp : ffff800072d3fec0
[ 811.772387] x29: 0000000000000000 x28: ffff8000745c5780
[ 811.777648] x27: 00000000000ffc5e x26: 0000ffff8fe00020
[ 811.782909] x25: 0000ffffd3a1da98 x24: 0000000000000000
[ 811.788171] x23: 0000000020000000 x22: 0000aaaad3770f28
[ 811.793432] x21: ffffffffffffffff x20: 000080006e538000
[ 811.798693] x19: 0000000000000000 x18: 0000000000000010
[ 811.803954] x17: 0000ffffab58f988 x16: 0000ffffab660008
[ 811.809215] x15: 0000000000000006 x14: ffff000088b2eabf
[ 811.814477] x13: ffff000008b2eacd x12: 0000000000000105
[ 811.819738] x11: 0000000000000002 x10: 0000000000000106
[ 811.824999] x9 : ffff800072d3fb40 x8 : 00000000000af8ec
[ 811.830260] x7 : 0000000000000000 x6 : 0000000000000a65
[ 811.835522] x5 : 000000000a660a65 x4 : 0000000000000000
[ 811.840783] x3 : 0000000000000002 x2 : 0000000000000a66
[ 811.846044] x1 : ffff8000745c5780 x0 : 0000000000000000
[ 811.852773] Process git (pid: 1447, stack limit = 0xffff800072d3c000)
[ 811.859156] Stack: (0xffff800072d3fec0 to 0xffff800072d40000)
[ 811.864849] fec0: 0000ffff9809bae0 0000ffff9853a050 0000000000000000 00000000ffffffff
[ 811.872611] fee0: 0000ffff8f93c030 0000ffff98d54b40 0000000029555c36 00000000002549e9
[ 811.880374] ff00: 0000000029555c36 00000000002549ea 0000000029555c36 00000000002549eb
[ 811.888136] ff20: 0000000029555c36 00000000002549ea 0000000000000000 00000000000006dc
[ 811.895898] ff40: 0000ffffab660008 0000ffffab58f988 0000000023c57c32 0000ffff8fe00010
[ 811.903661] ff60: 000000000003de04 0000ffff9809bae0 0000000000000010 0000aaaad3770f28
[ 811.911423] ff80: 0000000000000000 0000ffffd3a1da98 0000ffff8fe00020 00000000000ffc5e
[ 811.919186] ffa0: 0000ffff9853a050 0000ffffd3a1d950 0000ffffab54e53c 0000ffffd3a1d950
[ 811.926949] ffc0: 0000aaaad3770f28 0000000020000000 0000000000000000 ffffffffffffffff
[ 811.934711] ffe0: 0000000000000000 0000000000000000 3136363920746e61 3064613364666464
[ 811.942473] Call trace:
[ 811.944888] Exception stack(0xffff800072d3fcf0 to 0xffff800072d3fe20)
[ 811.951270] fce0: 0000000000000000 0001000000000000
[ 811.959034] fd00: ffff800072d3fec0 0000ffffab54e53c ffff8000731ab640 0000000000000000
[ 811.966796] fd20: 0000000000000004 ffff000008ab9818 ffff8000745c5780 000000000808540c
[ 811.974559] fd40: ffff800072d3fd90 ffff0000080c8858 ffff800072d3fe40 ffff8000745c5780
[ 811.982321] fd60: 0000000000000004 00000000000003c0 ffff800072d3fe40 0000000000000000
[ 811.990084] fd80: 0000ffffd3a1da98 0000ffff8fe00020 0000000000000000 ffff8000745c5780
[ 811.997846] fda0: 0000000000000a66 0000000000000002 0000000000000000 000000000a660a65
[ 812.005609] fdc0: 0000000000000a65 0000000000000000 00000000000af8ec ffff800072d3fb40
[ 812.013371] fde0: 0000000000000106 0000000000000002 0000000000000105 ffff000008b2eacd
[ 812.021134] fe00: ffff000088b2eabf 0000000000000006 0000ffffab660008 0000ffffab58f988
[ 812.028896] [<0000ffffab54e53c>] 0xffffab54e53c
[ 812.033382] Code: aa1c03e1 aa1503e0 8b16027a d63f02e0 (7100001f)
[ 812.039501] ---[ end trace e791f586be1831bb ]---
^ permalink raw reply
* [PATCHv4 00/15] clk: ti: add support for hwmod clocks
From: Michael Turquette @ 2016-12-12 18:25 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <7371ef35-5d95-bf78-4c97-c61091a1fa4b@ti.com>
Quoting Tero Kristo (2016-12-02 00:15:53)
> On 29/10/16 02:37, Stephen Boyd wrote:
> > On 10/28, Tero Kristo wrote:
> >> Eventually that should happen. However, we have plenty of legacy
> >> code still in place which depend on clk_get functionality within
> >> kernel. The major contributing factor is the hwmod codebase, for
> >> which we have plans to:
> >>
> >> - get this clock driver merged
> >> - implement a new interconnect driver for OMAP family SoCs
> >> - interconnect driver will use DT handles for fetching clocks,
> >> rather than clock aliases
> >> - reset handling will be implemented as part of the interconnect
> >> driver somehow (no prototype / clear plans for that as of yet)
> >> - all the hwmod stuff can be dropped
> >>
> >> The clock alias handling is still needed as a transition phase until
> >> all the above is done, then we can start dropping them. Basically
> >> anything that is using omap_hwmod depends on the clock aliases right
> >> now.
> >
> > Ok, sounds good. Thanks.
>
> Stephen, any final comments on this series? I guess it's too late to push
> for 4.10, but I would like to get this merged early for 4.11 window.
Hi Tero,
No final comments from me. I needed to go back and forth with Tony about
the clockdomain modeling, but it seems sensible to create clock
providers from the clock domains if you want to pass those struct clk
objects down to the drivers.
One thing I wasn't able to follow exactly in the code is how the
clockdomains are linking parent clocks from cm1, cm2, etc to the clock
domains. Are the clockdomain providers calling clk_get() on the clocks
that it *consumes*, or are the clockdomain providers never calling
clk_get() on those clocks and just establishing the tree hierarchy at
clk_register() time?
Unless Stephen has any more review comments we can merge this into a
clk-next based on v4.10-rc1 when that drops.
Regards,
Mike
>
> -Tero
^ permalink raw reply
* [PATCH] PCI: mvebu: Handle changes to the bridge windows while enabled
From: Jason Gunthorpe @ 2016-12-12 18:30 UTC (permalink / raw)
To: linux-arm-kernel
The PCI core will write to the bridge window config multiple times
while they are enabled. This can lead to mbus failures like:
mvebu_mbus: cannot add window '4:e8', conflicts with another window
mvebu-pcie mbus:pex@e0000000: Could not create MBus window at [mem 0xe0000000-0xe00fffff]: -22
For me this is happening during a hotplug cycle. The PCI core is
not changing the values, just writing them twice while active.
The patch addresses the general case of any change to an active window,
but not atomically. The code is slightly refactored so io and mem
can share more of the window logic.
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
---
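Editorial note, not part of the patch: the compare/delete/add flow that
the commit message describes can be modelled in userspace roughly as
below. This is only a sketch under stated assumptions. The struct is a
simplified copy of mvebu_pcie_window, and the `adds`/`dels` counters are
hypothetical stand-ins for the real mvebu_pcie_add_windows() and
mvebu_pcie_del_windows() calls.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct mvebu_pcie_window from the patch. */
struct window {
	uint64_t base;
	uint64_t remap;
	size_t size;
};

/* Hypothetical counters replacing the real MBus add/del calls. */
int adds, dels;

void set_window(const struct window *desired, struct window *cur)
{
	/* Nothing to do if the current window already matches the request. */
	if (desired->base == cur->base && desired->remap == cur->remap &&
	    desired->size == cur->size)
		return;

	/* Tear down the old window first; the switch-over is not atomic. */
	if (cur->size != 0) {
		dels++;
		cur->size = 0;
		cur->base = 0;
	}

	/* A zero-sized request means the window stays disabled. */
	if (desired->size == 0)
		return;

	adds++;
	*cur = *desired;
}
```

In this model, calling set_window() twice with the same `desired` value
performs a single add, which is the double-write case described above.
A later call with a zero-sized `desired` performs one delete and no add.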
drivers/pci/host/pci-mvebu.c | 101 +++++++++++++++++++++++++------------------
1 file changed, 60 insertions(+), 41 deletions(-)
diff --git a/drivers/pci/host/pci-mvebu.c b/drivers/pci/host/pci-mvebu.c
index 307f81d6b479af..af724731b22f53 100644
--- a/drivers/pci/host/pci-mvebu.c
+++ b/drivers/pci/host/pci-mvebu.c
@@ -133,6 +133,12 @@ struct mvebu_pcie {
int nports;
};
+struct mvebu_pcie_window {
+ phys_addr_t base;
+ phys_addr_t remap;
+ size_t size;
+};
+
/* Structure representing one PCIe interface */
struct mvebu_pcie_port {
char *name;
@@ -150,10 +156,8 @@ struct mvebu_pcie_port {
struct mvebu_sw_pci_bridge bridge;
struct device_node *dn;
struct mvebu_pcie *pcie;
- phys_addr_t memwin_base;
- size_t memwin_size;
- phys_addr_t iowin_base;
- size_t iowin_size;
+ struct mvebu_pcie_window memwin;
+ struct mvebu_pcie_window iowin;
u32 saved_pcie_stat;
};
@@ -379,23 +383,45 @@ static void mvebu_pcie_add_windows(struct mvebu_pcie_port *port,
}
}
+static void mvebu_pcie_set_window(struct mvebu_pcie_port *port,
+ unsigned int target, unsigned int attribute,
+ const struct mvebu_pcie_window *desired,
+ struct mvebu_pcie_window *cur)
+{
+ if (desired->base == cur->base && desired->remap == cur->remap &&
+ desired->size == cur->size)
+ return;
+
+ if (cur->size != 0) {
+ mvebu_pcie_del_windows(port, cur->base, cur->size);
+ cur->size = 0;
+ cur->base = 0;
+
+ /*
+ * If something tries to change the window while it is enabled
+ * the change will not be done atomically. That would be
+ * difficult to do in the general case.
+ */
+ }
+
+ if (desired->size == 0)
+ return;
+
+ mvebu_pcie_add_windows(port, target, attribute, desired->base,
+ desired->size, desired->remap);
+ *cur = *desired;
+}
+
static void mvebu_pcie_handle_iobase_change(struct mvebu_pcie_port *port)
{
- phys_addr_t iobase;
+ struct mvebu_pcie_window desired = {};
/* Are the new iobase/iolimit values invalid? */
if (port->bridge.iolimit < port->bridge.iobase ||
port->bridge.iolimitupper < port->bridge.iobaseupper ||
!(port->bridge.command & PCI_COMMAND_IO)) {
-
- /* If a window was configured, remove it */
- if (port->iowin_base) {
- mvebu_pcie_del_windows(port, port->iowin_base,
- port->iowin_size);
- port->iowin_base = 0;
- port->iowin_size = 0;
- }
-
+ mvebu_pcie_set_window(port, port->io_target, port->io_attr,
+ &desired, &port->iowin);
return;
}
@@ -412,32 +438,27 @@ static void mvebu_pcie_handle_iobase_change(struct mvebu_pcie_port *port)
* specifications. iobase is the bus address, port->iowin_base
* is the CPU address.
*/
- iobase = ((port->bridge.iobase & 0xF0) << 8) |
- (port->bridge.iobaseupper << 16);
- port->iowin_base = port->pcie->io.start + iobase;
- port->iowin_size = ((0xFFF | ((port->bridge.iolimit & 0xF0) << 8) |
- (port->bridge.iolimitupper << 16)) -
- iobase) + 1;
-
- mvebu_pcie_add_windows(port, port->io_target, port->io_attr,
- port->iowin_base, port->iowin_size,
- iobase);
+ desired.remap = ((port->bridge.iobase & 0xF0) << 8) |
+ (port->bridge.iobaseupper << 16);
+ desired.base = port->pcie->io.start + desired.remap;
+ desired.size = ((0xFFF | ((port->bridge.iolimit & 0xF0) << 8) |
+ (port->bridge.iolimitupper << 16)) -
+ desired.remap) +
+ 1;
+
+ mvebu_pcie_set_window(port, port->io_target, port->io_attr, &desired,
+ &port->iowin);
}
static void mvebu_pcie_handle_membase_change(struct mvebu_pcie_port *port)
{
+ struct mvebu_pcie_window desired = {.remap = MVEBU_MBUS_NO_REMAP};
+
/* Are the new membase/memlimit values invalid? */
if (port->bridge.memlimit < port->bridge.membase ||
!(port->bridge.command & PCI_COMMAND_MEMORY)) {
-
- /* If a window was configured, remove it */
- if (port->memwin_base) {
- mvebu_pcie_del_windows(port, port->memwin_base,
- port->memwin_size);
- port->memwin_base = 0;
- port->memwin_size = 0;
- }
-
+ mvebu_pcie_set_window(port, port->mem_target, port->mem_attr,
+ &desired, &port->memwin);
return;
}
@@ -447,14 +468,12 @@ static void mvebu_pcie_handle_membase_change(struct mvebu_pcie_port *port)
* window to setup, according to the PCI-to-PCI bridge
* specifications.
*/
- port->memwin_base = ((port->bridge.membase & 0xFFF0) << 16);
- port->memwin_size =
- (((port->bridge.memlimit & 0xFFF0) << 16) | 0xFFFFF) -
- port->memwin_base + 1;
-
- mvebu_pcie_add_windows(port, port->mem_target, port->mem_attr,
- port->memwin_base, port->memwin_size,
- MVEBU_MBUS_NO_REMAP);
+ desired.base = ((port->bridge.membase & 0xFFF0) << 16);
+ desired.size = (((port->bridge.memlimit & 0xFFF0) << 16) | 0xFFFFF) -
+ desired.base + 1;
+
+ mvebu_pcie_set_window(port, port->mem_target, port->mem_attr, &desired,
+ &port->memwin);
}
/*
--
2.7.4
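Editorial note, not part of the patch: the base/size arithmetic used in
the two handlers above can be checked in isolation with a sketch like
the following. The function and struct names are hypothetical. For the
I/O window the sketch returns the bus-side address (the patch's `remap`
field); the driver additionally adds port->pcie->io.start to obtain the
CPU address.

```c
#include <stdint.h>

/* Decoded window: start address and length in bytes. */
struct decoded_window {
	uint64_t base;
	uint64_t size;
};

/*
 * Memory window: bits 15:4 of membase/memlimit supply address bits 31:20.
 * The limit is inclusive, so its low 20 bits are filled with ones.
 */
struct decoded_window decode_mem_window(uint16_t membase, uint16_t memlimit)
{
	struct decoded_window w;
	uint64_t limit;

	w.base = (uint64_t)(membase & 0xFFF0) << 16;
	limit = ((uint64_t)(memlimit & 0xFFF0) << 16) | 0xFFFFF;
	w.size = limit - w.base + 1;
	return w;
}

/*
 * I/O window: bits 7:4 of iobase/iolimit supply address bits 15:12 and the
 * "upper" registers supply bits 31:16. The limit is inclusive at 4 KiB
 * granularity, so its low 12 bits are filled with ones.
 */
struct decoded_window decode_io_window(uint8_t iobase, uint16_t iobaseupper,
				       uint8_t iolimit, uint16_t iolimitupper)
{
	struct decoded_window w;
	uint64_t limit;

	w.base = ((uint64_t)(iobase & 0xF0) << 8) |
		 ((uint64_t)iobaseupper << 16);
	limit = 0xFFF | ((uint64_t)(iolimit & 0xF0) << 8) |
		((uint64_t)iolimitupper << 16);
	w.size = limit - w.base + 1;
	return w;
}
```

As an assumed example, membase = memlimit = 0xe000 decodes to a base of
0xe0000000 and a size of 0x100000, which is the same range that appears
in the MBus error message quoted in the commit log.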
^ permalink raw reply related
* [PATCH V7 0/8] Add support for privileged mappings
From: Sricharan R @ 2016-12-12 18:38 UTC (permalink / raw)
To: linux-arm-kernel
This series is a resend of the V5 that Mitch sent some time back [2].
All the patches are the same and I have just rebased them. I redid patch [3],
as it does not apply on this code base, and added a couple more patches
[4], [5] from Robin that add the privileged attribute to the armv7s format
and revert the arm-smmuv3 change.
The following patch to the ARM SMMU driver:
commit d346180e70b91b3d5a1ae7e5603e65593d4622bc
Author: Robin Murphy <robin.murphy@arm.com>
Date: Tue Jan 26 18:06:34 2016 +0000
iommu/arm-smmu: Treat all device transactions as unprivileged
started forcing all SMMU transactions to come through as "unprivileged".
The rationale given was that:
(1) There is no way in the IOMMU API to even request privileged
mappings.
(2) It's difficult to implement a DMA mapper that correctly models the
ARM VMSAv8 behavior of unprivileged-writeable =>
privileged-execute-never.
This series rectifies (1) by introducing an IOMMU API for privileged
mappings and implements it in io-pgtable-arm.
This series rectifies (2) by introducing a new dma attribute
(DMA_ATTR_PRIVILEGED) for users of the DMA API that need privileged
mappings which are inaccessible to lesser-privileged execution levels, and
implements it in the arm64 IOMMU DMA mapper. The one known user (pl330.c)
is converted over to the new attribute.
Jordan and Jeremy can provide more info on the use case if needed, but the
high level is that it's a security feature to prevent attacks such as [1].
Note that I tested this on arm64 with arm-smmuv2 and the short-descriptor
changes, but I do not have a platform to test it with arm-smmuv3.
[1] https://github.com/robclark/kilroy
[2] https://lkml.org/lkml/2016/7/27/590
[3] https://patchwork.kernel.org/patch/9250493/
[4] http://www.linux-arm.org/git?p=linux-rm.git;a=commit;h=1291bd74f05d31da1dab3df02987cba5bd25849b
[5] http://www.linux-arm.org/git?p=linux-rm.git;a=commit;h=a79c1c6333f26849dba418cd92de26b60f5954f3
Changelog:
v6..v7
- Added a couple more patches, picked up acks, updated commit log
v5..v6
- Rebased all the patches and redid 6/6 as it does not apply in
this code base.
v4..v5
- Simplified patch 4/6 (suggested by Robin Murphy).
v3..v4
- Rebased and reworked on linux next due to the dma attrs rework going
on over there. Patches changed: 3/6, 4/6, and 5/6.
v2..v3
- Incorporated feedback from Robin:
* Various comments and re-wordings.
* Use existing bit definitions for IOMMU_PRIV implementation
in io-pgtable-arm.
* Renamed and redocumented dma_direction_to_prot.
* Don't worry about executability in new DMA attr.
v1..v2
- Added a new DMA attribute to make executable privileged mappings
work, and use that in the pl330 driver (suggested by Will).
Jeremy Gebben (1):
iommu/io-pgtable-arm: add support for the IOMMU_PRIV flag
Mitchel Humpherys (4):
iommu: add IOMMU_PRIV attribute
common: DMA-mapping: add DMA_ATTR_PRIVILEGED attribute
arm64/dma-mapping: Implement DMA_ATTR_PRIVILEGED
dmaengine: pl330: Make sure microcode is privileged
Robin Murphy (2):
iommu/io-pgtable-arm-v7s: Add support for the IOMMU_PRIV flag
iommu/arm-smmu: Revert "iommu/arm-smmu: Set PRIVCFG in stage 1 STEs"
Sricharan R (1):
iommu/arm-smmu: Set privileged attribute to 'default' instead of
'unprivileged'
Documentation/DMA-attributes.txt | 10 ++++++++++
arch/arm64/mm/dma-mapping.c | 6 +++---
drivers/dma/pl330.c | 5 +++--
drivers/iommu/arm-smmu-v3.c | 7 +------
drivers/iommu/arm-smmu.c | 2 +-
drivers/iommu/dma-iommu.c | 10 ++++++++--
drivers/iommu/io-pgtable-arm-v7s.c | 6 +++++-
drivers/iommu/io-pgtable-arm.c | 5 ++++-
include/linux/dma-iommu.h | 3 ++-
include/linux/dma-mapping.h | 7 +++++++
include/linux/iommu.h | 1 +
11 files changed, 45 insertions(+), 17 deletions(-)
--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
^ permalink raw reply
* [PATCH V7 1/8] iommu: add IOMMU_PRIV attribute
From: Sricharan R @ 2016-12-12 18:38 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1481567927-14791-1-git-send-email-sricharan@codeaurora.org>
From: Mitchel Humpherys <mitchelh@codeaurora.org>
Add the IOMMU_PRIV attribute, which is used to indicate privileged
mappings.
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
Acked-by: Will Deacon <will.deacon@arm.com>
---
include/linux/iommu.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index f2960e4..bf22131 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -31,6 +31,7 @@
#define IOMMU_CACHE (1 << 2) /* DMA cache coherency */
#define IOMMU_NOEXEC (1 << 3)
#define IOMMU_MMIO (1 << 4) /* e.g. things like MSI doorbells */
+#define IOMMU_PRIV (1 << 5) /* privileged */
struct iommu_ops;
struct iommu_group;
^ permalink raw reply related
* [PATCH V7 2/8] iommu/io-pgtable-arm: add support for the IOMMU_PRIV flag
From: Sricharan R @ 2016-12-12 18:38 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1481567927-14791-1-git-send-email-sricharan@codeaurora.org>
From: Jeremy Gebben <jgebben@codeaurora.org>
Allow the creation of privileged mode mappings, for stage 1 only.
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Jeremy Gebben <jgebben@codeaurora.org>
---
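Editorial note, not part of the patch: the effect of the change on the
stage 1 access-permission bits can be sketched in userspace as below.
This is an illustration only. The PTE_* bit positions are assumptions
chosen for the example, IOMMU_READ and IOMMU_WRITE are assumed to be
bits 0 and 1, and IOMMU_PRIV is taken from patch 1/8. Memory-attribute
handling (IOMMU_MMIO, IOMMU_CACHE) is left out.

```c
#include <stdint.h>

/* IOMMU API protection flags (READ/WRITE values assumed, PRIV from 1/8). */
#define IOMMU_READ	(1 << 0)
#define IOMMU_WRITE	(1 << 1)
#define IOMMU_PRIV	(1 << 5)

/* Illustrative PTE bits; positions are assumptions for this sketch. */
#define PTE_AP_UNPRIV	((uint64_t)1 << 6)
#define PTE_AP_RDONLY	((uint64_t)2 << 6)
#define PTE_NG		((uint64_t)1 << 11)

uint64_t stage1_prot_to_pte(int prot)
{
	/* UNPRIV is no longer set unconditionally. */
	uint64_t pte = PTE_NG;

	if (!(prot & IOMMU_WRITE) && (prot & IOMMU_READ))
		pte |= PTE_AP_RDONLY;

	/* Only mappings that did not ask for IOMMU_PRIV stay unprivileged. */
	if (!(prot & IOMMU_PRIV))
		pte |= PTE_AP_UNPRIV;

	return pte;
}
```

Existing callers that never pass IOMMU_PRIV get the same PTE bits as
before, so the default behaviour is unchanged.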
drivers/iommu/io-pgtable-arm.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index f5c90e1..69ba83a 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -350,11 +350,14 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data,
if (data->iop.fmt == ARM_64_LPAE_S1 ||
data->iop.fmt == ARM_32_LPAE_S1) {
- pte = ARM_LPAE_PTE_AP_UNPRIV | ARM_LPAE_PTE_nG;
+ pte = ARM_LPAE_PTE_nG;
if (!(prot & IOMMU_WRITE) && (prot & IOMMU_READ))
pte |= ARM_LPAE_PTE_AP_RDONLY;
+ if (!(prot & IOMMU_PRIV))
+ pte |= ARM_LPAE_PTE_AP_UNPRIV;
+
if (prot & IOMMU_MMIO)
pte |= (ARM_LPAE_MAIR_ATTR_IDX_DEV
<< ARM_LPAE_PTE_ATTRINDX_SHIFT);
^ permalink raw reply related
* [PATCH V7 3/8] iommu/io-pgtable-arm-v7s: Add support for the IOMMU_PRIV flag
From: Sricharan R @ 2016-12-12 18:38 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1481567927-14791-1-git-send-email-sricharan@codeaurora.org>
From: Robin Murphy <robin.murphy@arm.com>
The short-descriptor format also allows privileged-only mappings, so
let's wire it up.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Sricharan R <sricharan@codeaurora.org>
---
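Editorial note, not part of the patch: the patch touches both conversion
directions, so a privileged mapping should survive a round trip from
prot flags to PTE bits and back. A rough userspace model, assuming
placeholder bit positions for the V7S_* macros and ignoring the
memory-attribute bits and the no-permissions quirk:

```c
#include <stdint.h>

/* IOMMU API protection flags (READ/WRITE values assumed, PRIV from 1/8). */
#define IOMMU_READ	(1 << 0)
#define IOMMU_WRITE	(1 << 1)
#define IOMMU_PRIV	(1 << 5)

/* Placeholder access-permission bits; positions are assumptions. */
#define V7S_AF		(1u << 0)
#define V7S_AP_UNPRIV	(1u << 1)
#define V7S_AP_RDONLY	(1u << 2)

uint32_t v7s_prot_to_pte(int prot)
{
	uint32_t pte = V7S_AF;

	if (!(prot & IOMMU_PRIV))
		pte |= V7S_AP_UNPRIV;
	if (!(prot & IOMMU_WRITE))
		pte |= V7S_AP_RDONLY;
	return pte;
}

int v7s_pte_to_prot(uint32_t pte)
{
	/* Every valid mapping is treated as readable. */
	int prot = IOMMU_READ;

	if (!(pte & V7S_AP_RDONLY))
		prot |= IOMMU_WRITE;
	if (!(pte & V7S_AP_UNPRIV))
		prot |= IOMMU_PRIV;
	return prot;
}
```

With these definitions, converting IOMMU_READ | IOMMU_WRITE | IOMMU_PRIV
to a PTE and back returns the same three flags.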
drivers/iommu/io-pgtable-arm-v7s.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
index f50e51c..1177782 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -265,7 +265,9 @@ static arm_v7s_iopte arm_v7s_prot_to_pte(int prot, int lvl,
if (!(prot & IOMMU_MMIO))
pte |= ARM_V7S_ATTR_TEX(1);
if (ap) {
- pte |= ARM_V7S_PTE_AF | ARM_V7S_PTE_AP_UNPRIV;
+ pte |= ARM_V7S_PTE_AF;
+ if (!(prot & IOMMU_PRIV))
+ pte |= ARM_V7S_PTE_AP_UNPRIV;
if (!(prot & IOMMU_WRITE))
pte |= ARM_V7S_PTE_AP_RDONLY;
}
@@ -288,6 +290,8 @@ static int arm_v7s_pte_to_prot(arm_v7s_iopte pte, int lvl)
if (!(attr & ARM_V7S_PTE_AP_RDONLY))
prot |= IOMMU_WRITE;
+ if (!(attr & ARM_V7S_PTE_AP_UNPRIV))
+ prot |= IOMMU_PRIV;
if ((attr & (ARM_V7S_TEX_MASK << ARM_V7S_TEX_SHIFT)) == 0)
prot |= IOMMU_MMIO;
else if (pte & ARM_V7S_ATTR_C)
^ permalink raw reply related
* [PATCH V7 4/8] common: DMA-mapping: add DMA_ATTR_PRIVILEGED attribute
From: Sricharan R @ 2016-12-12 18:38 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1481567927-14791-1-git-send-email-sricharan@codeaurora.org>
From: Mitchel Humpherys <mitchelh@codeaurora.org>
This patch adds the DMA_ATTR_PRIVILEGED attribute to the DMA-mapping
subsystem.
Some advanced peripherals such as remote processors and GPUs perform
accesses to DMA buffers in both privileged "supervisor" and unprivileged
"user" modes. This attribute is used to indicate to the DMA-mapping
subsystem that the buffer is fully accessible at the elevated privilege
level (and ideally inaccessible or at least read-only at the
lesser-privileged levels).
Cc: linux-doc@vger.kernel.org
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
---
Documentation/DMA-attributes.txt | 10 ++++++++++
include/linux/dma-mapping.h | 7 +++++++
2 files changed, 17 insertions(+)
diff --git a/Documentation/DMA-attributes.txt b/Documentation/DMA-attributes.txt
index 98bf7ac..44c6bc4 100644
--- a/Documentation/DMA-attributes.txt
+++ b/Documentation/DMA-attributes.txt
@@ -143,3 +143,13 @@ So, this provides a way for drivers to avoid those error messages on calls
where allocation failures are not a problem, and shouldn't bother the logs.
NOTE: At the moment DMA_ATTR_NO_WARN is only implemented on PowerPC.
+
+DMA_ATTR_PRIVILEGED
+------------------------------
+
+Some advanced peripherals such as remote processors and GPUs perform
+accesses to DMA buffers in both privileged "supervisor" and unprivileged
+"user" modes. This attribute is used to indicate to the DMA-mapping
+subsystem that the buffer is fully accessible at the elevated privilege
+level (and ideally inaccessible or at least read-only at the
+lesser-privileged levels).
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 6f3e6ca..ee31ea1 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -63,6 +63,13 @@
#define DMA_ATTR_NO_WARN (1UL << 8)
/*
+ * DMA_ATTR_PRIVILEGED: used to indicate that the buffer is fully
+ * accessible at an elevated privilege level (and ideally inaccessible or
+ * at least read-only at lesser-privileged levels).
+ */
+#define DMA_ATTR_PRIVILEGED (1UL << 9)
+
+/*
* A dma_addr_t can hold any valid DMA or bus address for the platform.
* It can be given to a device to use as a DMA source or target. A CPU cannot
* reference a dma_addr_t directly because there may be translation between
^ permalink raw reply related
* [PATCH V7 5/8] arm64/dma-mapping: Implement DMA_ATTR_PRIVILEGED
From: Sricharan R @ 2016-12-12 18:38 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1481567927-14791-1-git-send-email-sricharan@codeaurora.org>
From: Mitchel Humpherys <mitchelh@codeaurora.org>
The newly added DMA_ATTR_PRIVILEGED is useful for creating mappings that
are only accessible to privileged DMA engines. Implement it in
dma-iommu.c so that the ARM64 DMA IOMMU mapper can make use of it.
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
---
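Editorial note, not part of the patch: a standalone model of how the
renamed helper folds the new attribute into the IOMMU protection flags.
This is a sketch under assumptions. The enum and flag values are
redefined locally so that it builds in userspace, and the
DMA_ATTR_PRIVILEGED bit used here is only an illustrative value.

```c
#include <stdbool.h>

/* Local copies so the sketch builds outside the kernel tree. */
enum dma_data_direction {
	DMA_BIDIRECTIONAL = 0,
	DMA_TO_DEVICE = 1,
	DMA_FROM_DEVICE = 2,
	DMA_NONE = 3,
};

#define IOMMU_READ	(1 << 0)
#define IOMMU_WRITE	(1 << 1)
#define IOMMU_CACHE	(1 << 2)
#define IOMMU_PRIV	(1 << 5)

/* Illustrative bit; the real value comes from dma-mapping.h. */
#define DMA_ATTR_PRIVILEGED	(1UL << 9)

int dma_info_to_prot(enum dma_data_direction dir, bool coherent,
		     unsigned long attrs)
{
	int prot = coherent ? IOMMU_CACHE : 0;

	/* New in this patch: propagate the privileged attribute. */
	if (attrs & DMA_ATTR_PRIVILEGED)
		prot |= IOMMU_PRIV;

	switch (dir) {
	case DMA_BIDIRECTIONAL:
		return prot | IOMMU_READ | IOMMU_WRITE;
	case DMA_TO_DEVICE:
		return prot | IOMMU_READ;
	case DMA_FROM_DEVICE:
		return prot | IOMMU_WRITE;
	default:
		return 0;
	}
}
```

Callers that pass attrs == 0 get the same flags that
dma_direction_to_prot() returned before, so only mappings that
explicitly request DMA_ATTR_PRIVILEGED change.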
arch/arm64/mm/dma-mapping.c | 6 +++---
drivers/iommu/dma-iommu.c | 10 ++++++++--
include/linux/dma-iommu.h | 3 ++-
3 files changed, 13 insertions(+), 6 deletions(-)
diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 401f79a..ae76ead 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -557,7 +557,7 @@ static void *__iommu_alloc_attrs(struct device *dev, size_t size,
unsigned long attrs)
{
bool coherent = is_device_dma_coherent(dev);
- int ioprot = dma_direction_to_prot(DMA_BIDIRECTIONAL, coherent);
+ int ioprot = dma_info_to_prot(DMA_BIDIRECTIONAL, coherent, attrs);
size_t iosize = size;
void *addr;
@@ -711,7 +711,7 @@ static dma_addr_t __iommu_map_page(struct device *dev, struct page *page,
unsigned long attrs)
{
bool coherent = is_device_dma_coherent(dev);
- int prot = dma_direction_to_prot(dir, coherent);
+ int prot = dma_info_to_prot(dir, coherent, attrs);
dma_addr_t dev_addr = iommu_dma_map_page(dev, page, offset, size, prot);
if (!iommu_dma_mapping_error(dev, dev_addr) &&
@@ -769,7 +769,7 @@ static int __iommu_map_sg_attrs(struct device *dev, struct scatterlist *sgl,
__iommu_sync_sg_for_device(dev, sgl, nelems, dir);
return iommu_dma_map_sg(dev, sgl, nelems,
- dma_direction_to_prot(dir, coherent));
+ dma_info_to_prot(dir, coherent, attrs));
}
static void __iommu_unmap_sg_attrs(struct device *dev,
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index d2a7a46..756d5e0 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -182,16 +182,22 @@ int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
EXPORT_SYMBOL(iommu_dma_init_domain);
/**
- * dma_direction_to_prot - Translate DMA API directions to IOMMU API page flags
+ * dma_info_to_prot - Translate DMA API directions and attributes to IOMMU API
+ * page flags.
* @dir: Direction of DMA transfer
* @coherent: Is the DMA master cache-coherent?
+ * @attrs: DMA attributes for the mapping
*
* Return: corresponding IOMMU API page protection flags
*/
-int dma_direction_to_prot(enum dma_data_direction dir, bool coherent)
+int dma_info_to_prot(enum dma_data_direction dir, bool coherent,
+ unsigned long attrs)
{
int prot = coherent ? IOMMU_CACHE : 0;
+ if (attrs & DMA_ATTR_PRIVILEGED)
+ prot |= IOMMU_PRIV;
+
switch (dir) {
case DMA_BIDIRECTIONAL:
return prot | IOMMU_READ | IOMMU_WRITE;
diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
index 32c5890..a203181 100644
--- a/include/linux/dma-iommu.h
+++ b/include/linux/dma-iommu.h
@@ -34,7 +34,8 @@ int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
u64 size, struct device *dev);
/* General helpers for DMA-API <-> IOMMU-API interaction */
-int dma_direction_to_prot(enum dma_data_direction dir, bool coherent);
+int dma_info_to_prot(enum dma_data_direction dir, bool coherent,
+ unsigned long attrs);
/*
* These implement the bulk of the relevant DMA mapping callbacks, but require
^ permalink raw reply related
* [PATCH V7 6/8] dmaengine: pl330: Make sure microcode is privileged
From: Sricharan R @ 2016-12-12 18:38 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1481567927-14791-1-git-send-email-sricharan@codeaurora.org>
From: Mitchel Humpherys <mitchelh@codeaurora.org>
The PL330 is hard-wired such that instruction fetches on both the
manager and channel threads go out onto the bus with the "privileged"
bit set. This can become troublesome once there is an IOMMU or other
form of memory protection downstream, since those will typically be
programmed by the DMA mapping subsystem in the expectation of normal
unprivileged transactions (such as the PL330 channel threads' own data
accesses as currently configured by this driver).
To avoid the case of, say, an IOMMU blocking an unexpected privileged
transaction with a permission fault, use the newly-introduced
DMA_ATTR_PRIVILEGED attribute for the mapping of our microcode buffer.
That way the DMA layer can do whatever it needs to do to make things
continue to work as expected on more complex systems.
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Vinod Koul <vinod.koul@intel.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
[rm: remove now-redundant local variable, clarify commit message]
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
drivers/dma/pl330.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/dma/pl330.c b/drivers/dma/pl330.c
index 030fe05..1e5ae0c 100644
--- a/drivers/dma/pl330.c
+++ b/drivers/dma/pl330.c
@@ -1859,9 +1859,10 @@ static int dmac_alloc_resources(struct pl330_dmac *pl330)
* Alloc MicroCode buffer for 'chans' Channel threads.
* A channel's buffer offset is (Channel_Id * MCODE_BUFF_PERCHAN)
*/
- pl330->mcode_cpu = dma_alloc_coherent(pl330->ddma.dev,
+ pl330->mcode_cpu = dma_alloc_attrs(pl330->ddma.dev,
chans * pl330->mcbufsz,
- &pl330->mcode_bus, GFP_KERNEL);
+ &pl330->mcode_bus, GFP_KERNEL,
+ DMA_ATTR_PRIVILEGED);
if (!pl330->mcode_cpu) {
dev_err(pl330->ddma.dev, "%s:%d Can't allocate memory!\n",
__func__, __LINE__);
--
^ permalink raw reply related
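The failure mode described in the commit message above can be illustrated with a toy permission check. This is a deliberately simplified model and not how any particular IOMMU behaves: it assumes a mapping is usable only by transactions whose privilege level matches it, whereas real rules depend on the page table format. All sk_ names and values are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical prot flags, for illustration only. */
#define SK_PROT_READ	(1 << 0)
#define SK_PROT_WRITE	(1 << 1)
#define SK_PROT_PRIV	(1 << 5)

/*
 * Toy rule: the transaction's privilege level must match the mapping,
 * and the access type (read or write) must be permitted by it.
 * Returns true if the access goes through, false for a permission fault.
 */
bool sk_iommu_allows(int prot, bool is_write, bool privileged)
{
	bool mapping_is_priv = (prot & SK_PROT_PRIV) != 0;

	if (privileged != mapping_is_priv)
		return false;

	return (prot & (is_write ? SK_PROT_WRITE : SK_PROT_READ)) != 0;
}
```

Under this model, a privileged instruction fetch (is_write = false, privileged = true) faults on a mapping created with only SK_PROT_READ, and succeeds once SK_PROT_PRIV is added. That is the situation the microcode buffer is in without, and then with, DMA_ATTR_PRIVILEGED.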
* [PATCH V7 7/8] iommu/arm-smmu: Set privileged attribute to 'default' instead of 'unprivileged'
From: Sricharan R @ 2016-12-12 18:38 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1481567927-14791-1-git-send-email-sricharan@codeaurora.org>
Currently the driver marks all device transactions as UNPRIVILEGED,
but there are cases where an IOMMU master wants to keep privileged
(supervisor) and unprivileged (user) accesses isolated from each other.
So don't override the privilege setting to unprivileged; instead set it
to default, i.e. use the incoming attribute, and let it be controlled by
the pagetable settings.
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Sricharan R <sricharan@codeaurora.org>
---
drivers/iommu/arm-smmu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index eaa8f44..8bb0eea 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1213,7 +1213,7 @@ static int arm_smmu_domain_add_master(struct arm_smmu_domain *smmu_domain,
continue;
s2cr[idx].type = type;
- s2cr[idx].privcfg = S2CR_PRIVCFG_UNPRIV;
+ s2cr[idx].privcfg = S2CR_PRIVCFG_DEFAULT;
s2cr[idx].cbndx = cbndx;
arm_smmu_write_s2cr(smmu, idx);
}
--
^ permalink raw reply related
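To make the one-line change above more concrete, here is a small userspace sketch of how a privcfg choice might be packed into a 32-bit S2CR value. The field positions, masks and enum values are assumptions made for illustration and should be checked against the SMMU architecture specification and drivers/iommu/arm-smmu.c; the sk_ names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed encodings of the PRIVCFG field, for illustration only. */
enum sk_privcfg {
	SK_PRIVCFG_DEFAULT = 0,	/* pass the incoming attribute through */
	SK_PRIVCFG_UNPRIV  = 2,	/* force unprivileged */
	SK_PRIVCFG_PRIV    = 3,	/* force privileged */
};

/* Assumed field layout, for illustration only. */
#define SK_S2CR_CBNDX_SHIFT	0
#define SK_S2CR_CBNDX_MASK	0xffu
#define SK_S2CR_TYPE_SHIFT	16
#define SK_S2CR_TYPE_MASK	0x3u
#define SK_S2CR_PRIVCFG_SHIFT	24
#define SK_S2CR_PRIVCFG_MASK	0x3u

/* Pack type, context bank index and privcfg into one register value. */
uint32_t sk_pack_s2cr(unsigned int type, unsigned int cbndx,
		      enum sk_privcfg privcfg)
{
	return ((type & SK_S2CR_TYPE_MASK) << SK_S2CR_TYPE_SHIFT) |
	       ((cbndx & SK_S2CR_CBNDX_MASK) << SK_S2CR_CBNDX_SHIFT) |
	       (((unsigned int)privcfg & SK_S2CR_PRIVCFG_MASK) <<
		SK_S2CR_PRIVCFG_SHIFT);
}

/* Extract the privcfg field again, e.g. for debugging. */
enum sk_privcfg sk_s2cr_privcfg(uint32_t reg)
{
	return (enum sk_privcfg)((reg >> SK_S2CR_PRIVCFG_SHIFT) &
				 SK_S2CR_PRIVCFG_MASK);
}
```

With the default encoding the field is simply left at zero, so the privilege level seen by the page table walk is whatever the master sent, which is what allows IOMMU_PRIV mappings to have any effect.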
* [PATCH V7 8/8] iommu/arm-smmu: Revert "iommu/arm-smmu: Set PRIVCFG in stage 1 STEs"
From: Sricharan R @ 2016-12-12 18:38 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1481567927-14791-1-git-send-email-sricharan@codeaurora.org>
From: Robin Murphy <robin.murphy@arm.com>
Now that proper privileged mappings can be requested via IOMMU_PRIV,
unconditionally overriding the incoming PRIVCFG becomes the wrong thing
to do, so stop it.
This reverts commit df5e1a0f2a2d779ad467a691203bcbc74d75690e.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
drivers/iommu/arm-smmu-v3.c | 7 +------
1 file changed, 1 insertion(+), 6 deletions(-)
diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 257a6a3..0eca0553 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -269,9 +269,6 @@
#define STRTAB_STE_1_SHCFG_INCOMING 1UL
#define STRTAB_STE_1_SHCFG_SHIFT 44
-#define STRTAB_STE_1_PRIVCFG_UNPRIV 2UL
-#define STRTAB_STE_1_PRIVCFG_SHIFT 48
-
#define STRTAB_STE_2_S2VMID_SHIFT 0
#define STRTAB_STE_2_S2VMID_MASK 0xffffUL
#define STRTAB_STE_2_VTCR_SHIFT 32
@@ -1073,9 +1070,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_device *smmu, u32 sid,
#ifdef CONFIG_PCI_ATS
STRTAB_STE_1_EATS_TRANS << STRTAB_STE_1_EATS_SHIFT |
#endif
- STRTAB_STE_1_STRW_NSEL1 << STRTAB_STE_1_STRW_SHIFT |
- STRTAB_STE_1_PRIVCFG_UNPRIV <<
- STRTAB_STE_1_PRIVCFG_SHIFT);
+ STRTAB_STE_1_STRW_NSEL1 << STRTAB_STE_1_STRW_SHIFT);
if (smmu->features & ARM_SMMU_FEAT_STALLS)
dst[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD);
--
^ permalink raw reply related