Linux-ARM-Kernel Archive on lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v5 1/7] MFD: add bindings for STM32 General Purpose Timer driver
From: Benjamin Gaignard @ 2016-12-09 12:40 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161209085331.GB3625@dell.home>

2016-12-09 9:53 GMT+01:00 Lee Jones <lee.jones@linaro.org>:
> Sorry to do this Ben.  Not much to do now though!
>
>> Add bindings information for STM32 General Purpose Timer
>>
>> version 2:
>> - rename stm32-mfd-timer to stm32-gptimer
>> - only keep one compatible string
>>
>> Signed-off-by: Benjamin Gaignard <benjamin.gaignard@st.com>
>> ---
>>  .../bindings/mfd/stm32-general-purpose-timer.txt   | 39 ++++++++++++++++++++++
>>  1 file changed, 39 insertions(+)
>>  create mode 100644 Documentation/devicetree/bindings/mfd/stm32-general-purpose-timer.txt
>>
>> diff --git a/Documentation/devicetree/bindings/mfd/stm32-general-purpose-timer.txt b/Documentation/devicetree/bindings/mfd/stm32-general-purpose-timer.txt
>> new file mode 100644
>> index 0000000..ce67755
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/mfd/stm32-general-purpose-timer.txt
>> @@ -0,0 +1,39 @@
>> +STM32 General Purpose Timer driver bindings
>
> This is a great place to describe what we're *actually* trying to
> achieve with this driver, and the worst place to use the term "general
> purpose", since this IP is so much more than that.
>
>
> "STM32 Timers
>
> This IP provides 3 types of timer along with PWM functionality.
>
> [...]"
>
> ... then go on to explain what the 3 types are and how they can be
> used.
>
>> +Required parameters:
>> +- compatible: must be "st,stm32-gptimer"
>
> I vehemently disagree that this entire IP is a GP Timer.  It contains
> GP timers, sure, but it also contains Advanced and Basic timers.
>
> IMHO this compatible should be "st,stm32-timers".
>
> And the file name of both this and the *.c file should reflect that
> too.

I will do those name changes in v6.

>
> Remainder looks nice.
>
>> +- reg:                       Physical base address and length of the controller's
>> +                     registers.
>> +- clock-names:               Set to "clk_int".
>> +- clocks:            Phandle to the clock used by the timer module.
>> +                     For Clk properties, please refer to ../clock/clock-bindings.txt
>> +
>> +Optional parameters:
>> +- resets:            Phandle to the parent reset controller.
>> +                     See ../reset/st,stm32-rcc.txt
>> +
>> +Optional subnodes:
>> +- pwm:                       See ../pwm/pwm-stm32.txt
>> +- timer:             See ../iio/timer/stm32-timer-trigger.txt
>> +
>> +Example:
>> +     timers at 40010000 {
>> +             #address-cells = <1>;
>> +             #size-cells = <0>;
>> +             compatible = "st,stm32-gptimer";
>> +             reg = <0x40010000 0x400>;
>> +             clocks = <&rcc 0 160>;
>> +             clock-names = "clk_int";
>> +
>> +             pwm at 0 {
>
> Out of interest, do you use the "@0", "@1" for anything now?

No I don't use them so I will remove them in v6

>
>> +                     compatible = "st,stm32-pwm";
>> +                     pinctrl-0       = <&pwm1_pins>;
>> +                     pinctrl-names   = "default";
>> +             };
>> +
>> +             timer at 0 {
>> +                     compatible = "st,stm32-timer-trigger";
>> +                     reg = <0>;
>> +             };
>> +     };
>
> --
> Lee Jones
> Linaro STMicroelectronics Landing Team Lead
> Linaro.org ? Open source software for ARM SoCs
> Follow Linaro: Facebook | Twitter | Blog



-- 
Benjamin Gaignard

Graphic Study Group

Linaro.org ? Open source software for ARM SoCs

Follow Linaro: Facebook | Twitter | Blog

^ permalink raw reply

* [PATCH v2 0/2] mmc: host: s3cmci: add device tree support
From: Sergio Prado @ 2016-12-09 13:14 UTC (permalink / raw)
  To: linux-arm-kernel

This series adds support for configuring Samsung's S3C24XX MMC/SD/SDIO
controller via device tree.

Tested on FriendlyARM mini2440, based on s3c2440 SoC.

Changes since v1:
- pinctrl description removed from DT binding
- unit and label on DT binding renamed to mmc

Sergio Prado (2):
  dt-bindings: mmc: add DT binding for S3C24XX MMC/SD/SDIO controller
  mmc: host: s3cmci: allow probing from device tree

 .../devicetree/bindings/mmc/samsung,s3cmci.txt     |  34 +++++
 drivers/mmc/host/s3cmci.c                          | 155 +++++++++++++++++----
 2 files changed, 165 insertions(+), 24 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/mmc/samsung,s3cmci.txt

-- 
1.9.1

^ permalink raw reply

* [PATCH v2 1/2] dt-bindings: mmc: add DT binding for S3C24XX MMC/SD/SDIO controller
From: Sergio Prado @ 2016-12-09 13:14 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1481289284-19919-1-git-send-email-sergio.prado@e-labworks.com>

Adds the device tree bindings description for Samsung S3C24XX
MMC/SD/SDIO controller, used as a connectivity interface with external
MMC, SD and SDIO storage mediums.

Signed-off-by: Sergio Prado <sergio.prado@e-labworks.com>
Acked-by: Rob Herring <robh@kernel.org>
---
 .../devicetree/bindings/mmc/samsung,s3cmci.txt     | 34 ++++++++++++++++++++++
 1 file changed, 34 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/mmc/samsung,s3cmci.txt

diff --git a/Documentation/devicetree/bindings/mmc/samsung,s3cmci.txt b/Documentation/devicetree/bindings/mmc/samsung,s3cmci.txt
new file mode 100644
index 000000000000..d09dbf4b3824
--- /dev/null
+++ b/Documentation/devicetree/bindings/mmc/samsung,s3cmci.txt
@@ -0,0 +1,34 @@
+* Samsung's S3C24XX MMC/SD/SDIO controller device tree bindings
+
+Samsung's S3C24XX MMC/SD/SDIO controller is used as a connectivity interface
+with external MMC, SD and SDIO storage mediums.
+
+This file documents differences between the core mmc properties described by
+mmc.txt and the properties used by the Samsung S3C24XX MMC/SD/SDIO controller
+implementation.
+
+Required SoC Specific Properties:
+- compatible: should be one of the following
+  - "samsung,s3c2410-sdi": for controllers compatible with s3c2410
+  - "samsung,s3c2412-sdi": for controllers compatible with s3c2412
+  - "samsung,s3c2440-sdi": for controllers compatible with s3c2440
+- clocks: Should reference the controller clock
+- clock-names: Should contain "sdi"
+
+Example:
+	mmc0: mmc at 5a000000 {
+		compatible = "samsung,s3c2440-sdi";
+		pinctrl-names = "default";
+		pinctrl-0 = <&sdi_pins>;
+		reg = <0x5a000000 0x100000>;
+		interrupts = <0 0 21 3>;
+		clocks = <&clocks PCLK_SDI>;
+		clock-names = "sdi";
+		bus-width = <4>;
+		cd-gpios = <&gpg 8 GPIO_ACTIVE_LOW>;
+		wp-gpios = <&gph 8 GPIO_ACTIVE_LOW>;
+	};
+
+	Note: This example shows both SoC specific and board specific properties
+	in a single device node. The properties can be actually be separated
+	into SoC specific node and board specific node.
-- 
1.9.1

^ permalink raw reply related

* [PATCH v2 2/2] mmc: host: s3cmci: allow probing from device tree
From: Sergio Prado @ 2016-12-09 13:14 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1481289284-19919-1-git-send-email-sergio.prado@e-labworks.com>

Allows configuring Samsung S3C24XX MMC/SD/SDIO controller using a device
tree.

Signed-off-by: Sergio Prado <sergio.prado@e-labworks.com>
---
 drivers/mmc/host/s3cmci.c | 155 +++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 131 insertions(+), 24 deletions(-)

diff --git a/drivers/mmc/host/s3cmci.c b/drivers/mmc/host/s3cmci.c
index 932a4b1fed33..bfeb90e8ffee 100644
--- a/drivers/mmc/host/s3cmci.c
+++ b/drivers/mmc/host/s3cmci.c
@@ -23,6 +23,9 @@
 #include <linux/gpio.h>
 #include <linux/irq.h>
 #include <linux/io.h>
+#include <linux/of.h>
+#include <linux/of_device.h>
+#include <linux/of_gpio.h>
 
 #include <plat/gpio-cfg.h>
 #include <mach/dma.h>
@@ -127,6 +130,22 @@ enum dbg_channels {
 	dbg_conf  = (1 << 8),
 };
 
+struct s3cmci_drv_data {
+	int is2440;
+};
+
+static const struct s3cmci_drv_data s3c2410_s3cmci_drv_data = {
+	.is2440 = 0,
+};
+
+static const struct s3cmci_drv_data s3c2412_s3cmci_drv_data = {
+	.is2440 = 1,
+};
+
+static const struct s3cmci_drv_data s3c2440_s3cmci_drv_data = {
+	.is2440 = 1,
+};
+
 static const int dbgmap_err   = dbg_fail;
 static const int dbgmap_info  = dbg_info | dbg_conf;
 static const int dbgmap_debug = dbg_err | dbg_debug;
@@ -1241,8 +1260,9 @@ static void s3cmci_set_ios(struct mmc_host *mmc, struct mmc_ios *ios)
 	case MMC_POWER_ON:
 	case MMC_POWER_UP:
 		/* Configure GPE5...GPE10 pins in SD mode */
-		s3c_gpio_cfgall_range(S3C2410_GPE(5), 6, S3C_GPIO_SFN(2),
-				      S3C_GPIO_PULL_NONE);
+		if (!host->pdev->dev.of_node)
+			s3c_gpio_cfgall_range(S3C2410_GPE(5), 6, S3C_GPIO_SFN(2),
+					      S3C_GPIO_PULL_NONE);
 
 		if (host->pdata->set_power)
 			host->pdata->set_power(ios->power_mode, ios->vdd);
@@ -1254,7 +1274,8 @@ static void s3cmci_set_ios(struct mmc_host *mmc, struct mmc_ios *ios)
 
 	case MMC_POWER_OFF:
 	default:
-		gpio_direction_output(S3C2410_GPE(5), 0);
+		if (!host->pdev->dev.of_node)
+			gpio_direction_output(S3C2410_GPE(5), 0);
 
 		if (host->is2440)
 			mci_con |= S3C2440_SDICON_SDRESET;
@@ -1544,21 +1565,12 @@ static inline void s3cmci_debugfs_remove(struct s3cmci_host *host) { }
 
 #endif /* CONFIG_DEBUG_FS */
 
-static int s3cmci_probe(struct platform_device *pdev)
+static int s3cmci_probe_pdata(struct s3cmci_host *host)
 {
-	struct s3cmci_host *host;
-	struct mmc_host	*mmc;
-	int ret;
-	int is2440;
-	int i;
+	struct platform_device *pdev = host->pdev;
+	int i, ret;
 
-	is2440 = platform_get_device_id(pdev)->driver_data;
-
-	mmc = mmc_alloc_host(sizeof(struct s3cmci_host), &pdev->dev);
-	if (!mmc) {
-		ret = -ENOMEM;
-		goto probe_out;
-	}
+	host->is2440 = platform_get_device_id(pdev)->driver_data;
 
 	for (i = S3C2410_GPE(5); i <= S3C2410_GPE(10); i++) {
 		ret = gpio_request(i, dev_name(&pdev->dev));
@@ -1568,14 +1580,90 @@ static int s3cmci_probe(struct platform_device *pdev)
 			for (i--; i >= S3C2410_GPE(5); i--)
 				gpio_free(i);
 
-			goto probe_free_host;
+			return ret;
 		}
 	}
 
+	return 0;
+}
+
+static int s3cmci_probe_dt(struct s3cmci_host *host)
+{
+	struct platform_device *pdev = host->pdev;
+	struct s3c24xx_mci_pdata *pdata;
+	const struct s3cmci_drv_data *drvdata;
+	struct mmc_host *mmc = host->mmc;
+	int gpio, ret;
+
+	drvdata = of_device_get_match_data(&pdev->dev);
+	if (!drvdata)
+		return -ENODEV;
+
+	host->is2440 = drvdata->is2440;
+
+	ret = mmc_of_parse(mmc);
+	if (ret)
+		return ret;
+
+	pdata = devm_kzalloc(&pdev->dev, sizeof(*pdata), GFP_KERNEL);
+	if (!pdata)
+		return -ENOMEM;
+
+	pdata->ocr_avail = mmc->ocr_avail;
+
+	if (mmc->caps2 & MMC_CAP2_NO_WRITE_PROTECT)
+		pdata->no_wprotect = 1;
+
+	if (mmc->caps & MMC_CAP_NEEDS_POLL)
+		pdata->no_detect = 1;
+
+	if (mmc->caps2 & MMC_CAP2_RO_ACTIVE_HIGH)
+		pdata->wprotect_invert = 1;
+
+	if (mmc->caps2 & MMC_CAP2_CD_ACTIVE_HIGH)
+		pdata->detect_invert = 1;
+
+	gpio = of_get_named_gpio(pdev->dev.of_node, "cd-gpios", 0);
+	if (gpio_is_valid(gpio)) {
+		pdata->gpio_detect = gpio;
+		gpio_free(gpio);
+	}
+
+	gpio = of_get_named_gpio(pdev->dev.of_node, "wp-gpios", 0);
+	if (gpio_is_valid(gpio)) {
+		pdata->gpio_wprotect = gpio;
+		gpio_free(gpio);
+	}
+
+	pdev->dev.platform_data = pdata;
+
+	return 0;
+}
+
+static int s3cmci_probe(struct platform_device *pdev)
+{
+	struct s3cmci_host *host;
+	struct mmc_host	*mmc;
+	int ret;
+	int i;
+
+	mmc = mmc_alloc_host(sizeof(struct s3cmci_host), &pdev->dev);
+	if (!mmc) {
+		ret = -ENOMEM;
+		goto probe_out;
+	}
+
 	host = mmc_priv(mmc);
 	host->mmc 	= mmc;
 	host->pdev	= pdev;
-	host->is2440	= is2440;
+
+	if (pdev->dev.of_node)
+		ret = s3cmci_probe_dt(host);
+	else
+		ret = s3cmci_probe_pdata(host);
+
+	if (ret)
+		goto probe_free_host;
 
 	host->pdata = pdev->dev.platform_data;
 	if (!host->pdata) {
@@ -1586,7 +1674,7 @@ static int s3cmci_probe(struct platform_device *pdev)
 	spin_lock_init(&host->complete_lock);
 	tasklet_init(&host->pio_tasklet, pio_tasklet, (unsigned long) host);
 
-	if (is2440) {
+	if (host->is2440) {
 		host->sdiimsk	= S3C2440_SDIIMSK;
 		host->sdidata	= S3C2440_SDIDATA;
 		host->clk_div	= 1;
@@ -1789,8 +1877,9 @@ static int s3cmci_probe(struct platform_device *pdev)
 	release_mem_region(host->mem->start, resource_size(host->mem));
 
  probe_free_gpio:
-	for (i = S3C2410_GPE(5); i <= S3C2410_GPE(10); i++)
-		gpio_free(i);
+	if (!pdev->dev.of_node)
+		for (i = S3C2410_GPE(5); i <= S3C2410_GPE(10); i++)
+			gpio_free(i);
 
  probe_free_host:
 	mmc_free_host(mmc);
@@ -1837,9 +1926,9 @@ static int s3cmci_remove(struct platform_device *pdev)
 	if (!pd->no_detect)
 		gpio_free(pd->gpio_detect);
 
-	for (i = S3C2410_GPE(5); i <= S3C2410_GPE(10); i++)
-		gpio_free(i);
-
+	if (!pdev->dev.of_node)
+		for (i = S3C2410_GPE(5); i <= S3C2410_GPE(10); i++)
+			gpio_free(i);
 
 	iounmap(host->base);
 	release_mem_region(host->mem->start, resource_size(host->mem));
@@ -1848,6 +1937,23 @@ static int s3cmci_remove(struct platform_device *pdev)
 	return 0;
 }
 
+static const struct of_device_id s3cmci_dt_match[] = {
+	{
+		.compatible = "samsung,s3c2410-sdi",
+		.data = &s3c2410_s3cmci_drv_data,
+	},
+	{
+		.compatible = "samsung,s3c2412-sdi",
+		.data = &s3c2412_s3cmci_drv_data,
+	},
+	{
+		.compatible = "samsung,s3c2440-sdi",
+		.data = &s3c2440_s3cmci_drv_data,
+	},
+	{ /* sentinel */ },
+};
+MODULE_DEVICE_TABLE(of, sdhci_s3c_dt_match);
+
 static const struct platform_device_id s3cmci_driver_ids[] = {
 	{
 		.name	= "s3c2410-sdi",
@@ -1867,6 +1973,7 @@ static int s3cmci_remove(struct platform_device *pdev)
 static struct platform_driver s3cmci_driver = {
 	.driver	= {
 		.name	= "s3c-sdi",
+		.of_match_table = s3cmci_dt_match,
 	},
 	.id_table	= s3cmci_driver_ids,
 	.probe		= s3cmci_probe,
-- 
1.9.1

^ permalink raw reply related

* [PATCH v2] arm64: mm: Fix memmap to be initialized for the entire section
From: Yisheng Xie @ 2016-12-09 13:15 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <584AA257.3080608@linaro.org>



On 2016/12/9 20:23, Hanjun Guo wrote:
> On 12/09/2016 08:19 PM, Ard Biesheuvel wrote:
>> On 9 December 2016 at 12:14, Yisheng Xie <xieyisheng1@huawei.com> wrote:
>>> Hi Robert,
>>> We have merged your patch to 4.9.0-rc8, however we still meet the similar problem
>>> on our D05 board:
>>>
>>
>> To be clear: does this issue also occur on D05 *without* the patch?
> 
> It boots ok on D05 without this patch.
> 
> But I think the problem Robert described in the commit message is
> still there, just not triggered in the boot. we met this problem
> when having LTP stress memory test and hit the same BUG_ON.
> 
That's right. for D05's case, when trigger the BUG_ON on:
1863         VM_BUG_ON(page_zone(start_page) != page_zone(end_page));

the end_page is not nomap, but BIOS reserved. here is the log I got:

[    0.000000] efi:   0x00003fc00000-0x00003fffffff [Reserved           |   |  |  |  |  |  |  |   |  |  |  |  ]
[...]
[    5.081443] move_freepages: phys(start_page: 0x20000000,end_page:0x3fff0000)

For invalid pages, their zone and node information is not initialized, and it
do have risk to trigger the BUG_ON, so I have a silly question,
why not just change the BUG_ON:
-----------
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6de9440..af199b8 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1860,12 +1860,13 @@ int move_freepages(struct zone *zone,
         * Remove at a later date when no bug reports exist related to
         * grouping pages by mobility
         */
-       VM_BUG_ON(page_zone(start_page) != page_zone(end_page));
+       VM_BUG_ON(early_pfn_valid(start_page) && early_pfn_valid(end_page) &&
+                       page_zone(start_page) != page_zone(end_page));
 #endif

        for (page = start_page; page <= end_page;) {
                /* Make sure we are not inadvertently changing nodes */
-               VM_BUG_ON_PAGE(page_to_nid(page) != zone_to_nid(zone), page);
+               VM_BUG_ON_PAGE(early_pfn_valid(page) && (page_to_nid(page) != zone_to_nid(zone)), page);

                if (!pfn_valid_within(page_to_pfn(page))) {
                        page++;

Thanks,
Yisheng Xie

> Thanks
> Hanjun
> 
> .
> 

^ permalink raw reply related

* [PATCH v8 02/16] arm: use new phandle allocation functions
From: Marc Zyngier @ 2016-12-09 13:26 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161104173203.21168-3-andre.przywara@arm.com>

On 04/11/16 17:31, Andre Przywara wrote:
> To refer to the GIC FDT node, we used to pass the GIC phandle to most
> of the functions dealing with FDT nodes.
> Since we now have a global phandle reference, use that to refer to the
> GIC handle in various places and get rid of the now unneeded parameter
> passing.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>

Despite the comments on the previous patch, this one is a good cleanup.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply

* [PATCH] crypto: arm/aes-neonbs - process 8 blocks in parallel if we can
From: Ard Biesheuvel @ 2016-12-09 13:47 UTC (permalink / raw)
  To: linux-arm-kernel

The bit-sliced NEON implementation of AES only performs optimally if
it can process 8 blocks of input in parallel. This is due to the nature
of bit slicing, where the n-th bit of each byte of AES state of each input
block is collected into NEON register 'n', for registers q0 - q7.

This implies that the amount of work for the transform is fixed,
regardless of whether we are handling just one block or 8 in parallel.

So let's try a bit harder to iterate over the input in suitably sized
chunks, by increasing the chunksize to 8 * AES_BLOCK_SIZE, and tweaking
the loops to only process multiples of the chunk size, unless we are
handling the last chunk in the input stream.

Note that the skcipher walk API guarantees that a step in the walk never
returns less that 'chunksize' bytes if there are at least that many bytes
of input still available. However, it does *not* guarantee that those steps
produce an exact multiple of the chunk size.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm/crypto/aesbs-glue.c | 68 +++++++++++++++++++++++++-------------------
 1 file changed, 38 insertions(+), 30 deletions(-)

diff --git a/arch/arm/crypto/aesbs-glue.c b/arch/arm/crypto/aesbs-glue.c
index d8e06de72ef3..938d1e1bf9a3 100644
--- a/arch/arm/crypto/aesbs-glue.c
+++ b/arch/arm/crypto/aesbs-glue.c
@@ -121,39 +121,26 @@ static int aesbs_cbc_encrypt(struct skcipher_request *req)
 	return crypto_cbc_encrypt_walk(req, aesbs_encrypt_one);
 }
 
-static inline void aesbs_decrypt_one(struct crypto_skcipher *tfm,
-				     const u8 *src, u8 *dst)
-{
-	struct aesbs_cbc_ctx *ctx = crypto_skcipher_ctx(tfm);
-
-	AES_decrypt(src, dst, &ctx->dec.rk);
-}
-
 static int aesbs_cbc_decrypt(struct skcipher_request *req)
 {
 	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
 	struct aesbs_cbc_ctx *ctx = crypto_skcipher_ctx(tfm);
 	struct skcipher_walk walk;
-	unsigned int nbytes;
 	int err;
 
-	for (err = skcipher_walk_virt(&walk, req, false);
-	     (nbytes = walk.nbytes); err = skcipher_walk_done(&walk, nbytes)) {
-		u32 blocks = nbytes / AES_BLOCK_SIZE;
-		u8 *dst = walk.dst.virt.addr;
-		u8 *src = walk.src.virt.addr;
-		u8 *iv = walk.iv;
-
-		if (blocks >= 8) {
-			kernel_neon_begin();
-			bsaes_cbc_encrypt(src, dst, nbytes, &ctx->dec, iv);
-			kernel_neon_end();
-			nbytes %= AES_BLOCK_SIZE;
-			continue;
-		}
+	err = skcipher_walk_virt(&walk, req, false);
+
+	while (walk.nbytes) {
+		unsigned int nbytes = walk.nbytes;
+
+		if (nbytes < walk.total)
+			nbytes = round_down(nbytes, walk.chunksize);
 
-		nbytes = crypto_cbc_decrypt_blocks(&walk, tfm,
-						   aesbs_decrypt_one);
+		kernel_neon_begin();
+		bsaes_cbc_encrypt(walk.src.virt.addr, walk.dst.virt.addr,
+				  nbytes, &ctx->dec, walk.iv);
+		kernel_neon_end();
+		err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
 	}
 	return err;
 }
@@ -186,6 +173,12 @@ static int aesbs_ctr_encrypt(struct skcipher_request *req)
 		__be32 *ctr = (__be32 *)walk.iv;
 		u32 headroom = UINT_MAX - be32_to_cpu(ctr[3]);
 
+		if (walk.nbytes < walk.total) {
+			blocks = round_down(blocks,
+					    walk.chunksize / AES_BLOCK_SIZE);
+			tail = walk.nbytes - blocks * AES_BLOCK_SIZE;
+		}
+
 		/* avoid 32 bit counter overflow in the NEON code */
 		if (unlikely(headroom < blocks)) {
 			blocks = headroom + 1;
@@ -198,6 +191,9 @@ static int aesbs_ctr_encrypt(struct skcipher_request *req)
 		kernel_neon_end();
 		inc_be128_ctr(ctr, blocks);
 
+		if (tail > 0 && tail < AES_BLOCK_SIZE)
+			break;
+
 		err = skcipher_walk_done(&walk, tail);
 	}
 	if (walk.nbytes) {
@@ -227,11 +223,16 @@ static int aesbs_xts_encrypt(struct skcipher_request *req)
 	AES_encrypt(walk.iv, walk.iv, &ctx->twkey);
 
 	while (walk.nbytes) {
+		unsigned int nbytes = walk.nbytes;
+
+		if (nbytes < walk.total)
+			nbytes = round_down(nbytes, walk.chunksize);
+
 		kernel_neon_begin();
 		bsaes_xts_encrypt(walk.src.virt.addr, walk.dst.virt.addr,
-				  walk.nbytes, &ctx->enc, walk.iv);
+				  nbytes, &ctx->enc, walk.iv);
 		kernel_neon_end();
-		err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE);
+		err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
 	}
 	return err;
 }
@@ -249,11 +250,16 @@ static int aesbs_xts_decrypt(struct skcipher_request *req)
 	AES_encrypt(walk.iv, walk.iv, &ctx->twkey);
 
 	while (walk.nbytes) {
+		unsigned int nbytes = walk.nbytes;
+
+		if (nbytes < walk.total)
+			nbytes = round_down(nbytes, walk.chunksize);
+
 		kernel_neon_begin();
 		bsaes_xts_decrypt(walk.src.virt.addr, walk.dst.virt.addr,
-				  walk.nbytes, &ctx->dec, walk.iv);
+				  nbytes, &ctx->dec, walk.iv);
 		kernel_neon_end();
-		err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE);
+		err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
 	}
 	return err;
 }
@@ -272,6 +278,7 @@ static struct skcipher_alg aesbs_algs[] = { {
 	.min_keysize	= AES_MIN_KEY_SIZE,
 	.max_keysize	= AES_MAX_KEY_SIZE,
 	.ivsize		= AES_BLOCK_SIZE,
+	.chunksize	= 8 * AES_BLOCK_SIZE,
 	.setkey		= aesbs_cbc_set_key,
 	.encrypt	= aesbs_cbc_encrypt,
 	.decrypt	= aesbs_cbc_decrypt,
@@ -289,7 +296,7 @@ static struct skcipher_alg aesbs_algs[] = { {
 	.min_keysize	= AES_MIN_KEY_SIZE,
 	.max_keysize	= AES_MAX_KEY_SIZE,
 	.ivsize		= AES_BLOCK_SIZE,
-	.chunksize	= AES_BLOCK_SIZE,
+	.chunksize	= 8 * AES_BLOCK_SIZE,
 	.setkey		= aesbs_ctr_set_key,
 	.encrypt	= aesbs_ctr_encrypt,
 	.decrypt	= aesbs_ctr_encrypt,
@@ -307,6 +314,7 @@ static struct skcipher_alg aesbs_algs[] = { {
 	.min_keysize	= 2 * AES_MIN_KEY_SIZE,
 	.max_keysize	= 2 * AES_MAX_KEY_SIZE,
 	.ivsize		= AES_BLOCK_SIZE,
+	.chunksize	= 8 * AES_BLOCK_SIZE,
 	.setkey		= aesbs_xts_set_key,
 	.encrypt	= aesbs_xts_encrypt,
 	.decrypt	= aesbs_xts_decrypt,
-- 
2.7.4

^ permalink raw reply related

* [PATCH v6 0/8] Add PWM and IIO timer drivers for STM32
From: Benjamin Gaignard @ 2016-12-09 14:15 UTC (permalink / raw)
  To: linux-arm-kernel

version 6:
- rename stm32-gptimer in stm32-timers.
- change "st,stm32-gptimer" compatible to "st,stm32-timers".
- modify "st,breakinput" parameter in pwm part.
- split DT patch in 2

version 5:
- fix comments done on version 4
- rebased on kernel 4.9-rc8
- change nodes names and re-order then by addresses

version 4:
- fix comments done on version 3
- don't use interrupts anymore in IIO timer
- detect hardware capabilities at probe time to simplify binding

version 3:
- no change on mfd and pwm divers patches
- add cross reference between bindings
- change compatible to "st,stm32-timer-trigger"
- fix attributes access rights
- use string instead of int for master_mode and slave_mode
- document device attributes in sysfs-bus-iio-timer-stm32
- update DT with the new compatible

version 2:
- keep only one compatible per driver
- use DT parameters to describe hardware block configuration:
  - pwm channels, complementary output, counter size, break input
  - triggers accepted and create by IIO timers
- change DT to limite use of reference to the node
- interrupt is now in IIO timer driver
- rename stm32-mfd-timer to stm32-timers (for general purpose timer)

The following patches enable PWM and IIO Timer features for STM32 platforms.

Those two features are mixed into the registers of the same hardware block
(named general purpose timer) which lead to introduce a multifunctions driver 
on the top of them to be able to share the registers.

In STM32f4 14 instances of timer hardware block exist, even if they all have
the same register mapping they could have a different number of pwm channels
and/or different triggers capabilities. We use various parameters in DT to 
describe the differences between hardware blocks

The MFD (stm32-timers.c) takes care of clock and register mapping
by using regmap. stm32_timers structure is provided to its sub-node to
share those information.

PWM driver is implemented into pwm-stm32.c. Depending of the instance we may
have up to 4 channels, sometime with complementary outputs or 32 bits counter
instead of 16 bits. Some hardware blocks may also have a break input function
which allows to stop pwm depending of a level, defined in devicetree, on an
external pin.

IIO timer driver (stm32-timer-trigger.c and stm32-timer-trigger.h) define a list
of hardware triggers usable by hardware blocks like ADC, DAC or other timers. 

The matrix of possible connections between blocks is quite complex so we use 
trigger names and is_stm32_iio_timer_trigger() function to be sure that
triggers are valid and configure the IPs.

At run time IIO timer hardware blocks can configure (through "master_mode" 
IIO device attribute) which internal signal (counter enable, reset,
comparison block, etc...) is used to generate the trigger.

By using "slave_mode" IIO device attribute timer can also configure on which
event (level, rising edge) of the block is enabled.

Since we can use trigger from one hardware to control an other block, we can
use a pwm to control an other one. The following example shows how to configure
pwm1 and pwm3 to make pwm3 generate pulse only when pwm1 pulse level is high.

/sys/bus/iio/devices # ls
iio:device0  iio:device1  trigger0     trigger1

configure timer1 to use pwm1 channel 0 as output trigger
/sys/bus/iio/devices # echo 'OC1REF' > iio\:device0/master_mode
configure timer3 to enable only when input is high
/sys/bus/iio/devices # echo 'gated' > iio\:device1/slave_mode
/sys/bus/iio/devices # cat trigger0/name
tim1_trgo
configure timer2 to use timer1 trigger is input
/sys/bus/iio/devices # echo "tim1_trgo" > iio\:device1/trigger/current_trigger

configure pwm3 channel 0 to generate a signal with a period of 100ms and a
duty cycle of 50%
/sys/devices/platform/soc/40000400.timers/40000400.timers:pwm/pwm/pwmchip4 # echo 0 > export
/sys/devices/platform/soc/40000400.timers/40000400.timers:pwm/pwm/pwmchip4 # echo 100000000 > pwm0/period
/sys/devices/platform/soc/40000400.timers/40000400.timers:pwm/pwm/pwmchip4 # echo 50000000 > pwm0/duty_cycle
/sys/devices/platform/soc/40000400.timers/40000400.timers:pwm/pwm/pwmchip4 # echo 1 > pwm0/enable
here pwm3 channel 0, as expected, doesn't start because has to be triggered by
pwm1 channel 0

configure pwm1 channel 0 to generate a signal with a period of 1s and a
duty cycle of 50%
/sys/devices/platform/soc/40010000.timers/40010000.timers:pwm/pwm/pwmchip0 # echo 0 > export
/sys/devices/platform/soc/40010000.timers/40010000.timers:pwm/pwm/pwmchip0 # echo 1000000000 > pwm0/period
/sys/devices/platform/soc/40010000.timers/40010000.timers:pwm/pwm/pwmchip0 # echo 500000000 > pwm0/duty_cycle
/sys/devices/platform/soc/40010000.timers/40010000.timers:pwm/pwm/pwmchip0 # echo 1 > pwm0/enable 
finally pwm1 starts and pwm3 only generates pulse when pwm1 signal is high

An other example to use a timer as source of clock for another device.
Here timer1 is used a source clock for pwm3:

/sys/bus/iio/devices # echo 100000 > trigger0/sampling_frequency 
/sys/bus/iio/devices # echo "tim1_trgo" > iio\:device1/trigger/current_trigger 
/sys/bus/iio/devices # echo 'external_clock' > iio\:device1/slave_mode
/sys/devices/platform/soc/40000400.timers/40000400.timers:pwm/pwm/pwmchip4 # echo 0 > export 
/sys/devices/platform/soc/40000400.timers/40000400.timers:pwm/pwm/pwmchip4 # echo 1000000 > pwm0/period 
/sys/devices/platform/soc/40000400.timers/40000400.timers:pwm/pwm/pwmchip4 # echo 500000 > pwm0/duty_cycle 
/sys/devices/platform/soc/40000400.timers/40000400.timers:pwm/pwm/pwmchip4 # echo 1 > pwm0/enable

Benjamin Gaignard (8):
  MFD: add bindings for STM32 Timers driver
  MFD: add STM32 Timers driver
  PWM: add pwm-stm32 DT bindings
  PWM: add PWM driver for STM32 plaftorm
  IIO: add bindings for STM32 timer trigger driver
  IIO: add STM32 timer trigger driver
  ARM: dts: stm32: add Timers driver for stm32f429 MCU
  ARM: dts: stm32: Enable pw1 and pwm3 for stm32f469-disco

 .../ABI/testing/sysfs-bus-iio-timer-stm32          |  55 +++
 .../bindings/iio/timer/stm32-timer-trigger.txt     |  23 +
 .../devicetree/bindings/mfd/stm32-timers.txt       |  46 ++
 .../devicetree/bindings/pwm/pwm-stm32.txt          |  33 ++
 arch/arm/boot/dts/stm32f429.dtsi                   | 275 ++++++++++++
 arch/arm/boot/dts/stm32f469-disco.dts              |  28 ++
 drivers/iio/Kconfig                                |   2 +-
 drivers/iio/Makefile                               |   1 +
 drivers/iio/timer/Kconfig                          |  13 +
 drivers/iio/timer/Makefile                         |   1 +
 drivers/iio/timer/stm32-timer-trigger.c            | 466 +++++++++++++++++++++
 drivers/iio/trigger/Kconfig                        |   1 -
 drivers/mfd/Kconfig                                |  11 +
 drivers/mfd/Makefile                               |   2 +
 drivers/mfd/stm32-timers.c                         |  80 ++++
 drivers/pwm/Kconfig                                |   9 +
 drivers/pwm/Makefile                               |   1 +
 drivers/pwm/pwm-stm32.c                            | 434 +++++++++++++++++++
 include/linux/iio/timer/stm32-timer-trigger.h      |  62 +++
 include/linux/mfd/stm32-timers.h                   |  71 ++++
 20 files changed, 1612 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-bus-iio-timer-stm32
 create mode 100644 Documentation/devicetree/bindings/iio/timer/stm32-timer-trigger.txt
 create mode 100644 Documentation/devicetree/bindings/mfd/stm32-timers.txt
 create mode 100644 Documentation/devicetree/bindings/pwm/pwm-stm32.txt
 create mode 100644 drivers/iio/timer/Kconfig
 create mode 100644 drivers/iio/timer/Makefile
 create mode 100644 drivers/iio/timer/stm32-timer-trigger.c
 create mode 100644 drivers/mfd/stm32-timers.c
 create mode 100644 drivers/pwm/pwm-stm32.c
 create mode 100644 include/linux/iio/timer/stm32-timer-trigger.h
 create mode 100644 include/linux/mfd/stm32-timers.h

-- 
1.9.1

^ permalink raw reply

* [PATCH v6 1/8] MFD: add bindings for STM32 Timers driver
From: Benjamin Gaignard @ 2016-12-09 14:15 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1481292919-26587-1-git-send-email-benjamin.gaignard@st.com>

Add bindings information for STM32 Timers

version 6:
- rename stm32-gtimer to stm32-timers
- change compatible
- add description about the IPs

version 2:
- rename stm32-mfd-timer to stm32-gptimer
- only keep one compatible string

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@st.com>
---
 .../devicetree/bindings/mfd/stm32-timers.txt       | 46 ++++++++++++++++++++++
 1 file changed, 46 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/mfd/stm32-timers.txt

diff --git a/Documentation/devicetree/bindings/mfd/stm32-timers.txt b/Documentation/devicetree/bindings/mfd/stm32-timers.txt
new file mode 100644
index 0000000..b30868e
--- /dev/null
+++ b/Documentation/devicetree/bindings/mfd/stm32-timers.txt
@@ -0,0 +1,46 @@
+STM32 Timers driver bindings
+
+This IP provides 3 types of timer along with PWM functionality:
+- advanced-control timers consist of a 16-bit auto-reload counter driven by a programmable
+  prescaler, break input feature, PWM outputs and complementary PWM ouputs channels.
+- general-purpose timers consist of a 16-bit or 32-bit auto-reload counter driven by a
+  programmable prescaler and PWM outputs.
+- basic timers consist of a 16-bit auto-reload counter driven by a programmable prescaler.
+
+Required parameters:
+- compatible: must be "st,stm32-timers"
+
+- reg:			Physical base address and length of the controller's
+			registers.
+- clock-names: 		Set to "clk_int".
+- clocks: 		Phandle to the clock used by the timer module.
+			For Clk properties, please refer to ../clock/clock-bindings.txt
+
+Optional parameters:
+- resets:		Phandle to the parent reset controller.
+			See ../reset/st,stm32-rcc.txt
+
+Optional subnodes:
+- pwm:			See ../pwm/pwm-stm32.txt
+- timer:		See ../iio/timer/stm32-timer-trigger.txt
+
+Example:
+	timers at 40010000 {
+		#address-cells = <1>;
+		#size-cells = <0>;
+		compatible = "st,stm32-timers";
+		reg = <0x40010000 0x400>;
+		clocks = <&rcc 0 160>;
+		clock-names = "clk_int";
+
+		pwm {
+			compatible = "st,stm32-pwm";
+			pinctrl-0	= <&pwm1_pins>;
+			pinctrl-names	= "default";
+		};
+
+		timer {
+			compatible = "st,stm32-timer-trigger";
+			reg = <0>;
+		};
+	};
-- 
1.9.1

^ permalink raw reply related

* [PATCH v6 2/8] MFD: add STM32 Timers driver
From: Benjamin Gaignard @ 2016-12-09 14:15 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1481292919-26587-1-git-send-email-benjamin.gaignard@st.com>

This hardware block could at used at same time for PWM generation
and IIO timers.
PWM and IIO timer configuration are mixed in the same registers
so we need a multi fonction driver to be able to share those registers.

version 6:
- rename files to stm32-timers
- rename functions to stm32_timers_xxx

version 5:
- fix Lee comments about detect function
- add missing dependency on REGMAP_MMIO

version 4:
- add a function to detect Auto Reload Register (ARR) size
- rename the structure shared with other drivers

version 2:
- rename driver "stm32-gptimer" to be align with SoC documentation
- only keep one compatible
- use of_platform_populate() instead of devm_mfd_add_devices()

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@st.com>
---
 drivers/mfd/Kconfig              | 11 ++++++
 drivers/mfd/Makefile             |  2 +
 drivers/mfd/stm32-timers.c       | 80 ++++++++++++++++++++++++++++++++++++++++
 include/linux/mfd/stm32-timers.h | 71 +++++++++++++++++++++++++++++++++++
 4 files changed, 164 insertions(+)
 create mode 100644 drivers/mfd/stm32-timers.c
 create mode 100644 include/linux/mfd/stm32-timers.h

diff --git a/drivers/mfd/Kconfig b/drivers/mfd/Kconfig
index c6df644..4ec1906 100644
--- a/drivers/mfd/Kconfig
+++ b/drivers/mfd/Kconfig
@@ -1607,6 +1607,17 @@ config MFD_STW481X
 	  in various ST Microelectronics and ST-Ericsson embedded
 	  Nomadik series.
 
+config MFD_STM32_TIMERS
+	tristate "Support for STM32 Timers"
+	depends on (ARCH_STM32 && OF) || COMPILE_TEST
+	select MFD_CORE
+	select REGMAP
+	select REGMAP_MMIO
+	help
+	  Select this option to enable STM32 timers driver used
+	  for PWM and IIO Timer. This driver allow to share the
+	  registers between the others drivers.
+
 menu "Multimedia Capabilities Port drivers"
 	depends on ARCH_SA1100
 
diff --git a/drivers/mfd/Makefile b/drivers/mfd/Makefile
index 9834e66..11a52f8 100644
--- a/drivers/mfd/Makefile
+++ b/drivers/mfd/Makefile
@@ -211,3 +211,5 @@ obj-$(CONFIG_INTEL_SOC_PMIC)	+= intel-soc-pmic.o
 obj-$(CONFIG_MFD_MT6397)	+= mt6397-core.o
 
 obj-$(CONFIG_MFD_ALTERA_A10SR)	+= altera-a10sr.o
+
+obj-$(CONFIG_MFD_STM32_TIMERS) 	+= stm32-timers.o
diff --git a/drivers/mfd/stm32-timers.c b/drivers/mfd/stm32-timers.c
new file mode 100644
index 0000000..68d115e
--- /dev/null
+++ b/drivers/mfd/stm32-timers.c
@@ -0,0 +1,80 @@
+/*
+ * Copyright (C) STMicroelectronics 2016
+ *
+ * Author: Benjamin Gaignard <benjamin.gaignard@st.com>
+ *
+ * License terms:  GNU General Public License (GPL), version 2
+ */
+
+#include <linux/mfd/stm32-timers.h>
+#include <linux/module.h>
+#include <linux/of_platform.h>
+#include <linux/reset.h>
+
+static const struct regmap_config stm32_timers_regmap_cfg = {
+	.reg_bits = 32,
+	.val_bits = 32,
+	.reg_stride = sizeof(u32),
+	.max_register = 0x400,
+};
+
+static void stm32_timers_get_arr_size(struct stm32_timers *ddata)
+{
+	/*
+	 * Only the available bits will be written so when readback
+	 * we get the maximum value of auto reload register
+	 */
+	regmap_write(ddata->regmap, TIM_ARR, ~0L);
+	regmap_read(ddata->regmap, TIM_ARR, &ddata->max_arr);
+	regmap_write(ddata->regmap, TIM_ARR, 0x0);
+}
+
+static int stm32_timers_probe(struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct stm32_timers *ddata;
+	struct resource *res;
+	void __iomem *mmio;
+
+	ddata = devm_kzalloc(dev, sizeof(*ddata), GFP_KERNEL);
+	if (!ddata)
+		return -ENOMEM;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	mmio = devm_ioremap_resource(dev, res);
+	if (IS_ERR(mmio))
+		return PTR_ERR(mmio);
+
+	ddata->regmap = devm_regmap_init_mmio_clk(dev, "clk_int", mmio,
+						  &stm32_timers_regmap_cfg);
+	if (IS_ERR(ddata->regmap))
+		return PTR_ERR(ddata->regmap);
+
+	ddata->clk = devm_clk_get(dev, NULL);
+	if (IS_ERR(ddata->clk))
+		return PTR_ERR(ddata->clk);
+
+	stm32_timers_get_arr_size(ddata);
+
+	platform_set_drvdata(pdev, ddata);
+
+	return of_platform_populate(pdev->dev.of_node, NULL, NULL, &pdev->dev);
+}
+
+static const struct of_device_id stm32_timers_of_match[] = {
+	{ .compatible = "st,stm32-timers", },
+	{ /* end node */ },
+};
+MODULE_DEVICE_TABLE(of, stm32_timers_of_match);
+
+static struct platform_driver stm32_timers_driver = {
+	.probe = stm32_timers_probe,
+	.driver	= {
+		.name = "stm32-timers",
+		.of_match_table = stm32_timers_of_match,
+	},
+};
+module_platform_driver(stm32_timers_driver);
+
+MODULE_DESCRIPTION("STMicroelectronics STM32 Timers");
+MODULE_LICENSE("GPL v2");
diff --git a/include/linux/mfd/stm32-timers.h b/include/linux/mfd/stm32-timers.h
new file mode 100644
index 0000000..d030004
--- /dev/null
+++ b/include/linux/mfd/stm32-timers.h
@@ -0,0 +1,71 @@
+/*
+ * Copyright (C) STMicroelectronics 2016
+ *
+ * Author: Benjamin Gaignard <benjamin.gaignard@st.com>
+ *
+ * License terms:  GNU General Public License (GPL), version 2
+ */
+
+#ifndef _LINUX_STM32_GPTIMER_H_
+#define _LINUX_STM32_GPTIMER_H_
+
+#include <linux/clk.h>
+#include <linux/regmap.h>
+
+#define TIM_CR1		0x00	/* Control Register 1      */
+#define TIM_CR2		0x04	/* Control Register 2      */
+#define TIM_SMCR	0x08	/* Slave mode control reg  */
+#define TIM_DIER	0x0C	/* DMA/interrupt register  */
+#define TIM_SR		0x10	/* Status register	   */
+#define TIM_EGR		0x14	/* Event Generation Reg    */
+#define TIM_CCMR1	0x18	/* Capt/Comp 1 Mode Reg    */
+#define TIM_CCMR2	0x1C	/* Capt/Comp 2 Mode Reg    */
+#define TIM_CCER	0x20	/* Capt/Comp Enable Reg    */
+#define TIM_PSC		0x28	/* Prescaler               */
+#define TIM_ARR		0x2c	/* Auto-Reload Register    */
+#define TIM_CCR1	0x34	/* Capt/Comp Register 1    */
+#define TIM_CCR2	0x38	/* Capt/Comp Register 2    */
+#define TIM_CCR3	0x3C	/* Capt/Comp Register 3    */
+#define TIM_CCR4	0x40	/* Capt/Comp Register 4    */
+#define TIM_BDTR	0x44	/* Break and Dead-Time Reg */
+
+#define TIM_CR1_CEN	BIT(0)	/* Counter Enable	   */
+#define TIM_CR1_ARPE	BIT(7)	/* Auto-reload Preload Ena */
+#define TIM_CR2_MMS	(BIT(4) | BIT(5) | BIT(6)) /* Master mode selection */
+#define TIM_SMCR_SMS	(BIT(0) | BIT(1) | BIT(2)) /* Slave mode selection */
+#define TIM_SMCR_TS	(BIT(4) | BIT(5) | BIT(6)) /* Trigger selection */
+#define TIM_DIER_UIE	BIT(0)	/* Update interrupt	   */
+#define TIM_SR_UIF	BIT(0)	/* Update interrupt flag   */
+#define TIM_EGR_UG	BIT(0)	/* Update Generation       */
+#define TIM_CCMR_PE	BIT(3)	/* Channel Preload Enable  */
+#define TIM_CCMR_M1	(BIT(6) | BIT(5))  /* Channel PWM Mode 1 */
+#define TIM_CCER_CC1E	BIT(0)	/* Capt/Comp 1  out Ena    */
+#define TIM_CCER_CC1P	BIT(1)	/* Capt/Comp 1  Polarity   */
+#define TIM_CCER_CC1NE	BIT(2)	/* Capt/Comp 1N out Ena    */
+#define TIM_CCER_CC1NP	BIT(3)	/* Capt/Comp 1N Polarity   */
+#define TIM_CCER_CC2E	BIT(4)	/* Capt/Comp 2  out Ena    */
+#define TIM_CCER_CC3E	BIT(8)	/* Capt/Comp 3  out Ena    */
+#define TIM_CCER_CC4E	BIT(12)	/* Capt/Comp 4  out Ena    */
+#define TIM_CCER_CCXE	(BIT(0) | BIT(4) | BIT(8) | BIT(12))
+#define TIM_BDTR_BKE	BIT(12) /* Break input enable	   */
+#define TIM_BDTR_BKP	BIT(13) /* Break input polarity	   */
+#define TIM_BDTR_AOE	BIT(14)	/* Automatic Output Enable */
+#define TIM_BDTR_MOE	BIT(15)	/* Main Output Enable      */
+#define TIM_BDTR_BKF	(BIT(16) | BIT(17) | BIT(18) | BIT(19))
+#define TIM_BDTR_BK2F	(BIT(20) | BIT(21) | BIT(22) | BIT(23))
+#define TIM_BDTR_BK2E	BIT(24) /* Break 2 input enable	   */
+#define TIM_BDTR_BK2P	BIT(25) /* Break 2 input polarity  */
+
+#define MAX_TIM_PSC		0xFFFF
+#define TIM_CR2_MMS_SHIFT	4
+#define TIM_SMCR_TS_SHIFT	4
+#define TIM_BDTR_BKF_MASK	0xF
+#define TIM_BDTR_BKF_SHIFT	16
+#define TIM_BDTR_BK2F_SHIFT	20
+
+struct stm32_timers {
+	struct clk *clk;
+	struct regmap *regmap;
+	u32 max_arr;
+};
+#endif
-- 
1.9.1

^ permalink raw reply related

* [PATCH v6 3/8] PWM: add pwm-stm32 DT bindings
From: Benjamin Gaignard @ 2016-12-09 14:15 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1481292919-26587-1-git-send-email-benjamin.gaignard@st.com>

Define bindings for pwm-stm32

version 6:
- change st,breakinput parameter format to make it usuable on stm32f7 too.

version 2:
- use parameters instead of compatible of handle the hardware configuration

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@st.com>
---
 .../devicetree/bindings/pwm/pwm-stm32.txt          | 33 ++++++++++++++++++++++
 1 file changed, 33 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/pwm/pwm-stm32.txt

diff --git a/Documentation/devicetree/bindings/pwm/pwm-stm32.txt b/Documentation/devicetree/bindings/pwm/pwm-stm32.txt
new file mode 100644
index 0000000..866f222
--- /dev/null
+++ b/Documentation/devicetree/bindings/pwm/pwm-stm32.txt
@@ -0,0 +1,33 @@
+STMicroelectronics STM32 Timers PWM bindings
+
+Must be a sub-node of an STM32 Timers device tree node.
+See ../mfd/stm32-timers.txt for details about the parent node.
+
+Required parameters:
+- compatible:		Must be "st,stm32-pwm".
+- pinctrl-names: 	Set to "default".
+- pinctrl-0: 		List of phandles pointing to pin configuration nodes for PWM module.
+			For Pinctrl properties see ../pinctrl/pinctrl-bindings.txt
+
+Optional parameters:
+- st,breakinput:	Arrays of three u32 <index level filter> to describe break input configurations.
+			"index" indicates on which break input the configuration should be applied.
+			"level" gives the active level (0=low or 1=high) for this configuration.
+			"filter" gives the filtering value to be applied.
+
+Example:
+	timers at 40010000 {
+		#address-cells = <1>;
+		#size-cells = <0>;
+		compatible = "st,stm32-timers";
+		reg = <0x40010000 0x400>;
+		clocks = <&rcc 0 160>;
+		clock-names = "clk_int";
+
+		pwm {
+			compatible = "st,stm32-pwm";
+			pinctrl-0	= <&pwm1_pins>;
+			pinctrl-names	= "default";
+			st,breakinput = <0 1 5>;
+		};
+	};
-- 
1.9.1

^ permalink raw reply related

* [PATCH v6 4/8] PWM: add PWM driver for STM32 plaftorm
From: Benjamin Gaignard @ 2016-12-09 14:15 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1481292919-26587-1-git-send-email-benjamin.gaignard@st.com>

This driver adds support for PWM driver on STM32 platform.
The SoC have multiple instances of the hardware IP and each
of them could have small differences: number of channels,
complementary output, auto reload register size...

version 6:
- change "st,breakinput" parameter to make it usuable for stm32f7 too.

version 4:
- detect at probe time hardware capabilities
- fix comments done on v2 and v3
- use PWM atomic ops

version 2:
- only keep one comptatible
- use DT parameters to discover hardware block configuration

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@st.com>
---
 drivers/pwm/Kconfig     |   9 +
 drivers/pwm/Makefile    |   1 +
 drivers/pwm/pwm-stm32.c | 434 ++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 444 insertions(+)
 create mode 100644 drivers/pwm/pwm-stm32.c

diff --git a/drivers/pwm/Kconfig b/drivers/pwm/Kconfig
index bf01288..f769b2a 100644
--- a/drivers/pwm/Kconfig
+++ b/drivers/pwm/Kconfig
@@ -388,6 +388,15 @@ config PWM_STI
 	  To compile this driver as a module, choose M here: the module
 	  will be called pwm-sti.
 
+config PWM_STM32
+	tristate "STMicroelectronics STM32 PWM"
+	depends on (ARCH_STM32 && OF && MFD_STM32_TIMERS) || COMPILE_TEST
+	help
+	  Generic PWM framework driver for STM32 SoCs.
+
+	  To compile this driver as a module, choose M here: the module
+	  will be called pwm-stm32.
+
 config PWM_STMPE
 	bool "STMPE expander PWM export"
 	depends on MFD_STMPE
diff --git a/drivers/pwm/Makefile b/drivers/pwm/Makefile
index 1194c54..5aa9308 100644
--- a/drivers/pwm/Makefile
+++ b/drivers/pwm/Makefile
@@ -37,6 +37,7 @@ obj-$(CONFIG_PWM_ROCKCHIP)	+= pwm-rockchip.o
 obj-$(CONFIG_PWM_SAMSUNG)	+= pwm-samsung.o
 obj-$(CONFIG_PWM_SPEAR)		+= pwm-spear.o
 obj-$(CONFIG_PWM_STI)		+= pwm-sti.o
+obj-$(CONFIG_PWM_STM32)		+= pwm-stm32.o
 obj-$(CONFIG_PWM_STMPE)		+= pwm-stmpe.o
 obj-$(CONFIG_PWM_SUN4I)		+= pwm-sun4i.o
 obj-$(CONFIG_PWM_TEGRA)		+= pwm-tegra.o
diff --git a/drivers/pwm/pwm-stm32.c b/drivers/pwm/pwm-stm32.c
new file mode 100644
index 0000000..fcf0a78
--- /dev/null
+++ b/drivers/pwm/pwm-stm32.c
@@ -0,0 +1,434 @@
+/*
+ * Copyright (C) STMicroelectronics 2016
+ *
+ * Author: Gerald Baeza <gerald.baeza@st.com>
+ *
+ * License terms: GNU General Public License (GPL), version 2
+ *
+ * Inspired by timer-stm32.c from Maxime Coquelin
+ *             pwm-atmel.c from Bo Shen
+ */
+
+#include <linux/mfd/stm32-timers.h>
+#include <linux/module.h>
+#include <linux/platform_device.h>
+#include <linux/pwm.h>
+#include <linux/of.h>
+
+#define CCMR_CHANNEL_SHIFT 8
+#define CCMR_CHANNEL_MASK  0xFF
+#define MAX_BREAKINPUT 2
+
+struct stm32_pwm {
+	struct pwm_chip chip;
+	struct device *dev;
+	struct clk *clk;
+	struct regmap *regmap;
+	unsigned int caps;
+	unsigned int npwm;
+	u32 max_arr;
+	bool have_complementary_output;
+};
+
+struct stm32_breakinput {
+	u32 index;
+	u32 level;
+	u32 filter;
+};
+
+static inline struct stm32_pwm *to_stm32_pwm_dev(struct pwm_chip *chip)
+{
+	return container_of(chip, struct stm32_pwm, chip);
+}
+
+static u32 active_channels(struct stm32_pwm *dev)
+{
+	u32 ccer;
+
+	regmap_read(dev->regmap, TIM_CCER, &ccer);
+
+	return ccer & TIM_CCER_CCXE;
+}
+
+static int write_ccrx(struct stm32_pwm *dev, struct pwm_device *pwm,
+		      u32 value)
+{
+	switch (pwm->hwpwm) {
+	case 0:
+		return regmap_write(dev->regmap, TIM_CCR1, value);
+	case 1:
+		return regmap_write(dev->regmap, TIM_CCR2, value);
+	case 2:
+		return regmap_write(dev->regmap, TIM_CCR3, value);
+	case 3:
+		return regmap_write(dev->regmap, TIM_CCR4, value);
+	}
+	return -EINVAL;
+}
+
+static int stm32_pwm_config(struct pwm_chip *chip, struct pwm_device *pwm,
+			    int duty_ns, int period_ns)
+{
+	struct stm32_pwm *priv = to_stm32_pwm_dev(chip);
+	unsigned long long prd, div, dty;
+	unsigned int prescaler = 0;
+	u32 ccmr, mask, shift;
+
+	/* Period and prescaler values depends on clock rate */
+	div = (unsigned long long)clk_get_rate(priv->clk) * period_ns;
+
+	do_div(div, NSEC_PER_SEC);
+	prd = div;
+
+	while (div > priv->max_arr) {
+		prescaler++;
+		div = prd;
+		do_div(div, (prescaler + 1));
+	}
+
+	prd = div;
+
+	if (prescaler > MAX_TIM_PSC) {
+		dev_err(chip->dev, "prescaler exceeds the maximum value\n");
+		return -EINVAL;
+	}
+
+	/*
+	 * All channels share the same prescaler and counter so when two
+	 * channels are active at the same we can't change them
+	 */
+	if (active_channels(priv) & ~(1 << pwm->hwpwm * 4)) {
+		u32 psc, arr;
+
+		regmap_read(priv->regmap, TIM_PSC, &psc);
+		regmap_read(priv->regmap, TIM_ARR, &arr);
+
+		if ((psc != prescaler) || (arr != prd - 1))
+			return -EBUSY;
+	}
+
+	regmap_write(priv->regmap, TIM_PSC, prescaler);
+	regmap_write(priv->regmap, TIM_ARR, prd - 1);
+	regmap_update_bits(priv->regmap, TIM_CR1, TIM_CR1_ARPE, TIM_CR1_ARPE);
+
+	/* Calculate the duty cycles */
+	dty = prd * duty_ns;
+	do_div(dty, period_ns);
+
+	write_ccrx(priv, pwm, dty);
+
+	/* Configure output mode */
+	shift = (pwm->hwpwm & 0x1) * CCMR_CHANNEL_SHIFT;
+	ccmr = (TIM_CCMR_PE | TIM_CCMR_M1) << shift;
+	mask = CCMR_CHANNEL_MASK << shift;
+
+	if (pwm->hwpwm < 2)
+		regmap_update_bits(priv->regmap, TIM_CCMR1, mask, ccmr);
+	else
+		regmap_update_bits(priv->regmap, TIM_CCMR2, mask, ccmr);
+
+	regmap_update_bits(priv->regmap, TIM_BDTR,
+			   TIM_BDTR_MOE | TIM_BDTR_AOE,
+			   TIM_BDTR_MOE | TIM_BDTR_AOE);
+
+	return 0;
+}
+
+static int stm32_pwm_set_polarity(struct pwm_chip *chip, struct pwm_device *pwm,
+				  enum pwm_polarity polarity)
+{
+	u32 mask;
+	struct stm32_pwm *priv = to_stm32_pwm_dev(chip);
+
+	mask = TIM_CCER_CC1P << (pwm->hwpwm * 4);
+	if (priv->have_complementary_output)
+		mask |= TIM_CCER_CC1NP << (pwm->hwpwm * 4);
+
+	regmap_update_bits(priv->regmap, TIM_CCER, mask,
+			   polarity == PWM_POLARITY_NORMAL ? 0 : mask);
+
+	return 0;
+}
+
+static int stm32_pwm_enable(struct pwm_chip *chip, struct pwm_device *pwm)
+{
+	u32 mask;
+	struct stm32_pwm *priv = to_stm32_pwm_dev(chip);
+
+	clk_enable(priv->clk);
+
+	/* Enable channel */
+	mask = TIM_CCER_CC1E << (pwm->hwpwm * 4);
+	if (priv->have_complementary_output)
+		mask |= TIM_CCER_CC1NE << (pwm->hwpwm * 4);
+
+	regmap_update_bits(priv->regmap, TIM_CCER, mask, mask);
+
+	/* Make sure that registers are updated */
+	regmap_update_bits(priv->regmap, TIM_EGR, TIM_EGR_UG, TIM_EGR_UG);
+
+	/* Enable controller */
+	regmap_update_bits(priv->regmap, TIM_CR1, TIM_CR1_CEN, TIM_CR1_CEN);
+
+	return 0;
+}
+
+static void stm32_pwm_disable(struct pwm_chip *chip, struct pwm_device *pwm)
+{
+	u32 mask;
+	struct stm32_pwm *priv = to_stm32_pwm_dev(chip);
+
+	/* Disable channel */
+	mask = TIM_CCER_CC1E << (pwm->hwpwm * 4);
+	if (priv->have_complementary_output)
+		mask |= TIM_CCER_CC1NE << (pwm->hwpwm * 4);
+
+	regmap_update_bits(priv->regmap, TIM_CCER, mask, 0);
+
+	/* When all channels are disabled, we can disable the controller */
+	if (!active_channels(priv))
+		regmap_update_bits(priv->regmap, TIM_CR1, TIM_CR1_CEN, 0);
+
+	clk_disable(priv->clk);
+}
+
+static int stm32_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm,
+			   struct pwm_state *state)
+{
+	struct pwm_state curstate;
+	bool enabled;
+	int ret;
+
+	pwm_get_state(pwm, &curstate);
+	enabled = curstate.enabled;
+
+	if (enabled && !state->enabled) {
+		stm32_pwm_disable(chip, pwm);
+		return 0;
+	}
+
+	if (state->polarity != curstate.polarity && enabled)
+		stm32_pwm_set_polarity(chip, pwm, state->polarity);
+
+	ret = stm32_pwm_config(chip, pwm, state->duty_cycle, state->period);
+	if (ret)
+		return ret;
+
+	if (!enabled && state->enabled)
+		ret = stm32_pwm_enable(chip, pwm);
+
+	return ret;
+}
+
+static const struct pwm_ops stm32pwm_ops = {
+	.owner = THIS_MODULE,
+	.apply = stm32_pwm_apply,
+};
+
+static int stm32_pwm_set_breakinput(struct stm32_pwm *priv,
+				    int level, int filter)
+{
+	u32 bdtr = TIM_BDTR_BKE;
+
+	if (level)
+		bdtr |= TIM_BDTR_BKP;
+
+	bdtr |= (filter & TIM_BDTR_BKF_MASK) << TIM_BDTR_BKF_SHIFT;
+
+	regmap_update_bits(priv->regmap,
+			   TIM_BDTR, TIM_BDTR_BKE | TIM_BDTR_BKP | TIM_BDTR_BKF,
+			   bdtr);
+
+	regmap_read(priv->regmap, TIM_BDTR, &bdtr);
+
+	return (bdtr & TIM_BDTR_BKE) ? 0 : -EINVAL;
+}
+
+static int stm32_pwm_set_breakinput2(struct stm32_pwm *priv,
+				     int level, int filter)
+{
+	u32 bdtr = TIM_BDTR_BK2E;
+
+	if (level)
+		bdtr |= TIM_BDTR_BK2P;
+
+	bdtr |= (filter & TIM_BDTR_BKF_MASK) << TIM_BDTR_BK2F_SHIFT;
+
+	regmap_update_bits(priv->regmap,
+			   TIM_BDTR, TIM_BDTR_BK2E |
+			   TIM_BDTR_BK2P |
+			   TIM_BDTR_BK2F,
+			   bdtr);
+
+	regmap_read(priv->regmap, TIM_BDTR, &bdtr);
+
+	return (bdtr & TIM_BDTR_BK2E) ? 0 : -EINVAL;
+}
+
+static int stm32_pwm_apply_breakinputs(struct stm32_pwm *priv,
+				       struct device_node *np)
+{
+	struct stm32_breakinput breakinput[MAX_BREAKINPUT];
+	int nb, ret, i, array_size;
+
+	nb = of_property_count_elems_of_size(np, "st,breakinput",
+					     sizeof(struct stm32_breakinput));
+
+	/*
+	 * Because "st,breakinput" parameter is optional do not make probe
+	 * failed if it doesn't exist.
+	 */
+	if (nb <= 0)
+		return 0;
+
+	if (nb > MAX_BREAKINPUT)
+		return -EINVAL;
+
+	array_size = nb * sizeof(struct stm32_breakinput) / sizeof(u32);
+	ret = of_property_read_u32_array(np, "st,breakinput",
+					 &breakinput[0].index, array_size);
+	if (ret)
+		return ret;
+
+	for (i = 0; i < nb && !ret; i++) {
+		switch (breakinput[i].index) {
+		case 0:
+		{
+			ret = stm32_pwm_set_breakinput(priv,
+						       breakinput[i].level,
+						       breakinput[i].filter);
+			break;
+		}
+		case 1:
+		{
+			ret = stm32_pwm_set_breakinput2(priv,
+							breakinput[i].level,
+							breakinput[i].filter);
+
+			break;
+		}
+		default:
+		{
+			ret = -EINVAL;
+			break;
+		}
+		}
+	}
+
+	return ret;
+}
+
+static void stm32_pwm_detect_complementary(struct stm32_pwm *priv)
+{
+	u32 ccer;
+
+	/*
+	 * If complementary bit doesn't exist writing 1 will have no
+	 * effect so we can detect it.
+	 */
+	regmap_update_bits(priv->regmap,
+			   TIM_CCER, TIM_CCER_CC1NE, TIM_CCER_CC1NE);
+	regmap_read(priv->regmap, TIM_CCER, &ccer);
+	regmap_update_bits(priv->regmap, TIM_CCER, TIM_CCER_CCXE, 0);
+
+	priv->have_complementary_output = (ccer != 0);
+}
+
+static void stm32_pwm_detect_channels(struct stm32_pwm *priv)
+{
+	u32 ccer;
+
+	/*
+	 * If channels enable bits don't exist writing 1 will have no
+	 * effect so we can detect and count them.
+	 */
+	regmap_update_bits(priv->regmap,
+			   TIM_CCER, TIM_CCER_CCXE, TIM_CCER_CCXE);
+	regmap_read(priv->regmap, TIM_CCER, &ccer);
+	regmap_update_bits(priv->regmap, TIM_CCER, TIM_CCER_CCXE, 0);
+
+	if (ccer & TIM_CCER_CC1E)
+		priv->npwm++;
+
+	if (ccer & TIM_CCER_CC2E)
+		priv->npwm++;
+
+	if (ccer & TIM_CCER_CC3E)
+		priv->npwm++;
+
+	if (ccer & TIM_CCER_CC4E)
+		priv->npwm++;
+}
+
+static int stm32_pwm_probe(struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct device_node *np = dev->of_node;
+	struct stm32_timers *ddata = dev_get_drvdata(pdev->dev.parent);
+	struct stm32_pwm *priv;
+	int ret;
+
+	priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
+	if (!priv)
+		return -ENOMEM;
+
+	priv->regmap = ddata->regmap;
+	priv->clk = ddata->clk;
+	priv->max_arr = ddata->max_arr;
+
+	if (!priv->regmap || !priv->clk)
+		return -EINVAL;
+
+	ret = stm32_pwm_apply_breakinputs(priv, np);
+	if (ret)
+		return ret;
+
+	stm32_pwm_detect_complementary(priv);
+	stm32_pwm_detect_channels(priv);
+
+	priv->chip.base = -1;
+	priv->chip.dev = dev;
+	priv->chip.ops = &stm32pwm_ops;
+	priv->chip.npwm = priv->npwm;
+
+	ret = pwmchip_add(&priv->chip);
+	if (ret < 0)
+		return ret;
+
+	platform_set_drvdata(pdev, priv);
+
+	return 0;
+}
+
+static int stm32_pwm_remove(struct platform_device *pdev)
+{
+	struct stm32_pwm *priv = platform_get_drvdata(pdev);
+	unsigned int i;
+
+	for (i = 0; i < priv->npwm; i++)
+		pwm_disable(&priv->chip.pwms[i]);
+
+	pwmchip_remove(&priv->chip);
+
+	return 0;
+}
+
+static const struct of_device_id stm32_pwm_of_match[] = {
+	{ .compatible = "st,stm32-pwm",	},
+	{ /* end node */ },
+};
+MODULE_DEVICE_TABLE(of, stm32_pwm_of_match);
+
+static struct platform_driver stm32_pwm_driver = {
+	.probe	= stm32_pwm_probe,
+	.remove	= stm32_pwm_remove,
+	.driver	= {
+		.name = "stm32-pwm",
+		.of_match_table = stm32_pwm_of_match,
+	},
+};
+module_platform_driver(stm32_pwm_driver);
+
+MODULE_ALIAS("platform: stm32-pwm");
+MODULE_DESCRIPTION("STMicroelectronics STM32 PWM driver");
+MODULE_LICENSE("GPL v2");
-- 
1.9.1

^ permalink raw reply related

* [PATCH v6 5/8] IIO: add bindings for STM32 timer trigger driver
From: Benjamin Gaignard @ 2016-12-09 14:15 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1481292919-26587-1-git-send-email-benjamin.gaignard@st.com>

Define bindings for STM32 timer trigger

version 4:
- remove triggers enumeration from DT
- add reg parameter

version 3:
- change file name
- add cross reference with mfd bindings

version 2:
- only keep one compatible
- add DT parameters to set lists of the triggers:
  one list describe the triggers created by the device
  another one give the triggers accepted by the device

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@st.com>
---
 .../bindings/iio/timer/stm32-timer-trigger.txt     | 23 ++++++++++++++++++++++
 1 file changed, 23 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/iio/timer/stm32-timer-trigger.txt

diff --git a/Documentation/devicetree/bindings/iio/timer/stm32-timer-trigger.txt b/Documentation/devicetree/bindings/iio/timer/stm32-timer-trigger.txt
new file mode 100644
index 0000000..36a6c4a
--- /dev/null
+++ b/Documentation/devicetree/bindings/iio/timer/stm32-timer-trigger.txt
@@ -0,0 +1,23 @@
+STMicroelectronics STM32 Timers IIO timer bindings
+
+Must be a sub-node of an STM32 Timers device tree node.
+See ../mfd/stm32-timers.txt for details about the parent node.
+
+Required parameters:
+- compatible:	Must be "st,stm32-timer-trigger".
+- reg:		Define triggers configuration of the hardware IP.
+
+Example:
+	timers at 40010000 {
+		#address-cells = <1>;
+		#size-cells = <0>;
+		compatible = "st,stm32-timers";
+		reg = <0x40010000 0x400>;
+		clocks = <&rcc 0 160>;
+		clock-names = "clk_int";
+
+		timer {
+			compatible = "st,stm32-timer-trigger";
+			reg = <0>;
+		};
+	};
-- 
1.9.1

^ permalink raw reply related

* [PATCH v6 6/8] IIO: add STM32 timer trigger driver
From: Benjamin Gaignard @ 2016-12-09 14:15 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1481292919-26587-1-git-send-email-benjamin.gaignard@st.com>

Timers IPs can be used to generate triggers for other IPs like
DAC, ADC or other timers.
Each trigger may result of timer internals signals like counter enable,
reset or edge, this configuration could be done through "master_mode"
device attribute.

A timer device could be triggered by other timers, we use the trigger
name and is_stm32_iio_timer_trigger() function to distinguish them
and configure IP input switch.

Timer may also decide on which event (edge, level) they could
be activated by a trigger, this configuration is done by writing in
"slave_mode" device attribute.

Since triggers could also be used by DAC or ADC their names are defined
in include/ nux/iio/timer/stm32-timer-trigger.h so those IPs will be able
to configure themselves in valid_trigger function

Trigger have a "sampling_frequency" attribute which allow to configure
timer sampling frequency without using PWM interface

version 5:
- simplify tables of triggers
- only create an IIO device when needed

version 4:
- get triggers configuration from "reg" in DT
- add tables of triggers
- sampling frequency is enable/disable when writing in trigger
  sampling_frequency attribute
- no more use of interruptions

version 3:
- change compatible to "st,stm32-timer-trigger"
- fix attributes access right
- use string instead of int for master_mode and slave_mode
- document device attributes in sysfs-bus-iio-timer-stm32

version 2:
- keep only one compatible
- use st,input-triggers-names and st,output-triggers-names
  to know which triggers are accepted and/or create by the device

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@st.com>
---
 .../ABI/testing/sysfs-bus-iio-timer-stm32          |  55 +++
 drivers/iio/Kconfig                                |   2 +-
 drivers/iio/Makefile                               |   1 +
 drivers/iio/timer/Kconfig                          |  13 +
 drivers/iio/timer/Makefile                         |   1 +
 drivers/iio/timer/stm32-timer-trigger.c            | 466 +++++++++++++++++++++
 drivers/iio/trigger/Kconfig                        |   1 -
 include/linux/iio/timer/stm32-timer-trigger.h      |  62 +++
 8 files changed, 599 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-bus-iio-timer-stm32
 create mode 100644 drivers/iio/timer/Kconfig
 create mode 100644 drivers/iio/timer/Makefile
 create mode 100644 drivers/iio/timer/stm32-timer-trigger.c
 create mode 100644 include/linux/iio/timer/stm32-timer-trigger.h

diff --git a/Documentation/ABI/testing/sysfs-bus-iio-timer-stm32 b/Documentation/ABI/testing/sysfs-bus-iio-timer-stm32
new file mode 100644
index 0000000..26583dd
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-bus-iio-timer-stm32
@@ -0,0 +1,55 @@
+What:		/sys/bus/iio/devices/iio:deviceX/master_mode_available
+KernelVersion:	4.10
+Contact:	benjamin.gaignard at st.com
+Description:
+		Reading returns the list possible master modes which are:
+		- "reset"     :	The UG bit from the TIMx_EGR register is used as trigger output (TRGO).
+		- "enable"    : The Counter Enable signal CNT_EN is used as trigger output.
+		- "update"    : The update event is selected as trigger output.
+				For instance a master timer can then be used as a prescaler for a slave timer.
+		- "compare_pulse" : The trigger output send a positive pulse when the CC1IF flag is to be set.
+		- "OC1REF"    : OC1REF signal is used as trigger output.
+		- "OC2REF"    : OC2REF signal is used as trigger output.
+		- "OC3REF"    : OC3REF signal is used as trigger output.
+		- "OC4REF"    : OC4REF signal is used as trigger output.
+
+What:		/sys/bus/iio/devices/iio:deviceX/master_mode
+KernelVersion:	4.10
+Contact:	benjamin.gaignard at st.com
+Description:
+		Reading returns the current master modes.
+		Writing set the master mode
+
+What:		/sys/bus/iio/devices/iio:deviceX/slave_mode_available
+KernelVersion:	4.10
+Contact:	benjamin.gaignard at st.com
+Description:
+		Reading returns the list possible slave modes which are:
+		- "disabled"  : The prescaler is clocked directly by the internal clock.
+		- "encoder_1" : Counter counts up/down on TI2FP1 edge depending on TI1FP2 level.
+		- "encoder_2" : Counter counts up/down on TI1FP2 edge depending on TI2FP1 level.
+		- "encoder_3" : Counter counts up/down on both TI1FP1 and TI2FP2 edges depending
+				on the level of the other input.
+		- "reset"     : Rising edge of the selected trigger input reinitializes the counter
+				and generates an update of the registers.
+		- "gated"     : The counter clock is enabled when the trigger input is high.
+				The counter stops (but is not reset) as soon as the trigger becomes low.
+				Both start and stop of the counter are controlled.
+		- "trigger"   : The counter starts at a rising edge of the trigger TRGI (but it is not
+				reset). Only the start of the counter is controlled.
+		- "external_clock": Rising edges of the selected trigger (TRGI) clock the counter.
+
+What:		/sys/bus/iio/devices/iio:deviceX/slave_mode
+KernelVersion:	4.10
+Contact:	benjamin.gaignard at st.com
+Description:
+		Reading returns the current slave mode.
+		Writing set the slave mode
+
+What:		/sys/bus/iio/devices/triggerX/sampling_frequency
+KernelVersion:	4.10
+Contact:	benjamin.gaignard at st.com
+Description:
+		Reading returns the current sampling frequency.
+		Writing an value different of 0 set and start sampling.
+		Writing 0 stop sampling.
diff --git a/drivers/iio/Kconfig b/drivers/iio/Kconfig
index 6743b18..2de2a80 100644
--- a/drivers/iio/Kconfig
+++ b/drivers/iio/Kconfig
@@ -90,5 +90,5 @@ source "drivers/iio/potentiometer/Kconfig"
 source "drivers/iio/pressure/Kconfig"
 source "drivers/iio/proximity/Kconfig"
 source "drivers/iio/temperature/Kconfig"
-
+source "drivers/iio/timer/Kconfig"
 endif # IIO
diff --git a/drivers/iio/Makefile b/drivers/iio/Makefile
index 87e4c43..b797c08 100644
--- a/drivers/iio/Makefile
+++ b/drivers/iio/Makefile
@@ -32,4 +32,5 @@ obj-y += potentiometer/
 obj-y += pressure/
 obj-y += proximity/
 obj-y += temperature/
+obj-y += timer/
 obj-y += trigger/
diff --git a/drivers/iio/timer/Kconfig b/drivers/iio/timer/Kconfig
new file mode 100644
index 0000000..e3c21f2
--- /dev/null
+++ b/drivers/iio/timer/Kconfig
@@ -0,0 +1,13 @@
+#
+# Timers drivers
+
+menu "Timers"
+
+config IIO_STM32_TIMER_TRIGGER
+	tristate "STM32 Timer Trigger"
+	depends on (ARCH_STM32 && OF && MFD_STM32_TIMERS) || COMPILE_TEST
+	select IIO_TRIGGERED_EVENT
+	help
+	  Select this option to enable STM32 Timer Trigger
+
+endmenu
diff --git a/drivers/iio/timer/Makefile b/drivers/iio/timer/Makefile
new file mode 100644
index 0000000..4ad95ec9
--- /dev/null
+++ b/drivers/iio/timer/Makefile
@@ -0,0 +1 @@
+obj-$(CONFIG_IIO_STM32_TIMER_TRIGGER) += stm32-timer-trigger.o
diff --git a/drivers/iio/timer/stm32-timer-trigger.c b/drivers/iio/timer/stm32-timer-trigger.c
new file mode 100644
index 0000000..8d16e8f
--- /dev/null
+++ b/drivers/iio/timer/stm32-timer-trigger.c
@@ -0,0 +1,466 @@
+/*
+ * Copyright (C) STMicroelectronics 2016
+ *
+ * Author: Benjamin Gaignard <benjamin.gaignard@st.com>
+ *
+ * License terms:  GNU General Public License (GPL), version 2
+ */
+
+#include <linux/iio/iio.h>
+#include <linux/iio/sysfs.h>
+#include <linux/iio/timer/stm32-timer-trigger.h>
+#include <linux/iio/trigger.h>
+#include <linux/iio/triggered_event.h>
+#include <linux/interrupt.h>
+#include <linux/mfd/stm32-timers.h>
+#include <linux/module.h>
+#include <linux/platform_device.h>
+
+#define MAX_TRIGGERS 6
+#define MAX_VALIDS 5
+
+/* List the triggers created by each timer */
+static const void *triggers_table[][MAX_TRIGGERS] = {
+	{ TIM1_TRGO, TIM1_CH1, TIM1_CH2, TIM1_CH3, TIM1_CH4,},
+	{ TIM2_TRGO, TIM2_CH1, TIM2_CH2, TIM2_CH3, TIM2_CH4,},
+	{ TIM3_TRGO, TIM3_CH1, TIM3_CH2, TIM3_CH3, TIM3_CH4,},
+	{ TIM4_TRGO, TIM4_CH1, TIM4_CH2, TIM4_CH3, TIM4_CH4,},
+	{ TIM5_TRGO, TIM5_CH1, TIM5_CH2, TIM5_CH3, TIM5_CH4,},
+	{ TIM6_TRGO,},
+	{ TIM7_TRGO,},
+	{ TIM8_TRGO, TIM8_CH1, TIM8_CH2, TIM8_CH3, TIM8_CH4,},
+	{ TIM9_TRGO, TIM9_CH1, TIM9_CH2,},
+	{ TIM12_TRGO, TIM12_CH1, TIM12_CH2,},
+};
+
+/* List the triggers accepted by each timer */
+static const void *valids_table[][MAX_VALIDS] = {
+	{ TIM5_TRGO, TIM2_TRGO, TIM4_TRGO, TIM3_TRGO,},
+	{ TIM1_TRGO, TIM8_TRGO, TIM3_TRGO, TIM4_TRGO,},
+	{ TIM1_TRGO, TIM8_TRGO, TIM5_TRGO, TIM4_TRGO,},
+	{ TIM1_TRGO, TIM2_TRGO, TIM3_TRGO, TIM8_TRGO,},
+	{ TIM2_TRGO, TIM3_TRGO, TIM4_TRGO, TIM8_TRGO,},
+	{ }, /* timer 6 */
+	{ }, /* timer 7 */
+	{ TIM1_TRGO, TIM2_TRGO, TIM4_TRGO, TIM5_TRGO,},
+	{ TIM2_TRGO, TIM3_TRGO,},
+	{ TIM4_TRGO, TIM5_TRGO,},
+};
+
+struct stm32_timer_trigger {
+	struct device *dev;
+	struct regmap *regmap;
+	struct clk *clk;
+	u32 max_arr;
+	const void *triggers;
+	const void *valids;
+};
+
+static int stm32_timer_start(struct stm32_timer_trigger *priv,
+			     unsigned int frequency)
+{
+	unsigned long long prd, div;
+	int prescaler = 0;
+	u32 ccer, cr1;
+
+	/* Period and prescaler values depends of clock rate */
+	div = (unsigned long long)clk_get_rate(priv->clk);
+
+	do_div(div, frequency);
+
+	prd = div;
+
+	/*
+	 * Increase prescaler value until we get a result that fit
+	 * with auto reload register maximum value.
+	 */
+	while (div > priv->max_arr) {
+		prescaler++;
+		div = prd;
+		do_div(div, (prescaler + 1));
+	}
+	prd = div;
+
+	if (prescaler > MAX_TIM_PSC) {
+		dev_err(priv->dev, "prescaler exceeds the maximum value\n");
+		return -EINVAL;
+	}
+
+	/* Check if nobody else use the timer */
+	regmap_read(priv->regmap, TIM_CCER, &ccer);
+	if (ccer & TIM_CCER_CCXE)
+		return -EBUSY;
+
+	regmap_read(priv->regmap, TIM_CR1, &cr1);
+	if (!(cr1 & TIM_CR1_CEN))
+		clk_enable(priv->clk);
+
+	regmap_write(priv->regmap, TIM_PSC, prescaler);
+	regmap_write(priv->regmap, TIM_ARR, prd - 1);
+	regmap_update_bits(priv->regmap, TIM_CR1, TIM_CR1_ARPE, TIM_CR1_ARPE);
+
+	/* Force master mode to update mode */
+	regmap_update_bits(priv->regmap, TIM_CR2, TIM_CR2_MMS, 0x20);
+
+	/* Make sure that registers are updated */
+	regmap_update_bits(priv->regmap, TIM_EGR, TIM_EGR_UG, TIM_EGR_UG);
+
+	/* Enable controller */
+	regmap_update_bits(priv->regmap, TIM_CR1, TIM_CR1_CEN, TIM_CR1_CEN);
+
+	return 0;
+}
+
+static void stm32_timer_stop(struct stm32_timer_trigger *priv)
+{
+	u32 ccer, cr1;
+
+	regmap_read(priv->regmap, TIM_CCER, &ccer);
+	if (ccer & TIM_CCER_CCXE)
+		return;
+
+	regmap_read(priv->regmap, TIM_CR1, &cr1);
+	if (cr1 & TIM_CR1_CEN)
+		clk_disable(priv->clk);
+
+	/* Stop timer */
+	regmap_update_bits(priv->regmap, TIM_CR1, TIM_CR1_CEN, 0);
+	regmap_write(priv->regmap, TIM_PSC, 0);
+	regmap_write(priv->regmap, TIM_ARR, 0);
+}
+
+static ssize_t stm32_tt_store_frequency(struct device *dev,
+					struct device_attribute *attr,
+					const char *buf, size_t len)
+{
+	struct iio_trigger *trig = to_iio_trigger(dev);
+	struct stm32_timer_trigger *priv = iio_trigger_get_drvdata(trig);
+	unsigned int freq;
+	int ret;
+
+	ret = kstrtouint(buf, 10, &freq);
+	if (ret)
+		return ret;
+
+	if (freq == 0) {
+		stm32_timer_stop(priv);
+	} else {
+		ret = stm32_timer_start(priv, freq);
+		if (ret)
+			return ret;
+	}
+
+	return len;
+}
+
+static ssize_t stm32_tt_read_frequency(struct device *dev,
+				       struct device_attribute *attr, char *buf)
+{
+	struct iio_trigger *trig = to_iio_trigger(dev);
+	struct stm32_timer_trigger *priv = iio_trigger_get_drvdata(trig);
+	u32 psc, arr, cr1;
+	unsigned long long freq = 0;
+
+	regmap_read(priv->regmap, TIM_CR1, &cr1);
+	regmap_read(priv->regmap, TIM_PSC, &psc);
+	regmap_read(priv->regmap, TIM_ARR, &arr);
+
+	if (psc && arr && (cr1 & TIM_CR1_CEN)) {
+		freq = (unsigned long long)clk_get_rate(priv->clk);
+		do_div(freq, psc);
+		do_div(freq, arr);
+	}
+
+	return sprintf(buf, "%d\n", (unsigned int)freq);
+}
+
+static IIO_DEV_ATTR_SAMP_FREQ(0660,
+			      stm32_tt_read_frequency,
+			      stm32_tt_store_frequency);
+
+static struct attribute *stm32_trigger_attrs[] = {
+	&iio_dev_attr_sampling_frequency.dev_attr.attr,
+	NULL,
+};
+
+static const struct attribute_group stm32_trigger_attr_group = {
+	.attrs = stm32_trigger_attrs,
+};
+
+static const struct attribute_group *stm32_trigger_attr_groups[] = {
+	&stm32_trigger_attr_group,
+	NULL,
+};
+
+static char *master_mode_table[] = {
+	"reset",
+	"enable",
+	"update",
+	"compare_pulse",
+	"OC1REF",
+	"OC2REF",
+	"OC3REF",
+	"OC4REF"
+};
+
+static ssize_t stm32_tt_show_master_mode(struct device *dev,
+					 struct device_attribute *attr,
+					 char *buf)
+{
+	struct iio_dev *indio_dev = dev_to_iio_dev(dev);
+	struct stm32_timer_trigger *priv = iio_priv(indio_dev);
+	u32 cr2;
+
+	regmap_read(priv->regmap, TIM_CR2, &cr2);
+	cr2 = (cr2 & TIM_CR2_MMS) >> TIM_CR2_MMS_SHIFT;
+
+	return snprintf(buf, PAGE_SIZE, "%s\n", master_mode_table[cr2]);
+}
+
+static ssize_t stm32_tt_store_master_mode(struct device *dev,
+					  struct device_attribute *attr,
+					  const char *buf, size_t len)
+{
+	struct iio_dev *indio_dev = dev_to_iio_dev(dev);
+	struct stm32_timer_trigger *priv = iio_priv(indio_dev);
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(master_mode_table); i++) {
+		if (!strncmp(master_mode_table[i], buf,
+			     strlen(master_mode_table[i]))) {
+			regmap_update_bits(priv->regmap, TIM_CR2,
+					   TIM_CR2_MMS, i << TIM_CR2_MMS_SHIFT);
+			return len;
+		}
+	}
+
+	return -EINVAL;
+}
+
+static IIO_CONST_ATTR(master_mode_available,
+	"reset enable update compare_pulse OC1REF OC2REF OC3REF OC4REF");
+
+static IIO_DEVICE_ATTR(master_mode, 0660,
+		       stm32_tt_show_master_mode,
+		       stm32_tt_store_master_mode,
+		       0);
+
+static char *slave_mode_table[] = {
+	"disabled",
+	"encoder_1",
+	"encoder_2",
+	"encoder_3",
+	"reset",
+	"gated",
+	"trigger",
+	"external_clock",
+};
+
+static ssize_t stm32_tt_show_slave_mode(struct device *dev,
+					struct device_attribute *attr,
+					char *buf)
+{
+	struct iio_dev *indio_dev = dev_to_iio_dev(dev);
+	struct stm32_timer_trigger *priv = iio_priv(indio_dev);
+	u32 smcr;
+
+	regmap_read(priv->regmap, TIM_SMCR, &smcr);
+	smcr &= TIM_SMCR_SMS;
+
+	return snprintf(buf, PAGE_SIZE, "%s\n", slave_mode_table[smcr]);
+}
+
+static ssize_t stm32_tt_store_slave_mode(struct device *dev,
+					 struct device_attribute *attr,
+					 const char *buf, size_t len)
+{
+	struct iio_dev *indio_dev = dev_to_iio_dev(dev);
+	struct stm32_timer_trigger *priv = iio_priv(indio_dev);
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(slave_mode_table); i++) {
+		if (!strncmp(slave_mode_table[i], buf,
+			     strlen(slave_mode_table[i]))) {
+			regmap_update_bits(priv->regmap,
+					   TIM_SMCR, TIM_SMCR_SMS, i);
+			return len;
+		}
+	}
+
+	return -EINVAL;
+}
+
+static IIO_CONST_ATTR(slave_mode_available,
+"disabled encoder_1 encoder_2 encoder_3 reset gated trigger external_clock");
+
+static IIO_DEVICE_ATTR(slave_mode, 0660,
+		       stm32_tt_show_slave_mode,
+		       stm32_tt_store_slave_mode,
+		       0);
+
+static struct attribute *stm32_timer_attrs[] = {
+	&iio_dev_attr_master_mode.dev_attr.attr,
+	&iio_const_attr_master_mode_available.dev_attr.attr,
+	&iio_dev_attr_slave_mode.dev_attr.attr,
+	&iio_const_attr_slave_mode_available.dev_attr.attr,
+	NULL,
+};
+
+static const struct attribute_group stm32_timer_attr_group = {
+	.attrs = stm32_timer_attrs,
+};
+
+static const struct iio_trigger_ops timer_trigger_ops = {
+	.owner = THIS_MODULE,
+};
+
+static int stm32_setup_iio_triggers(struct stm32_timer_trigger *priv)
+{
+	int ret;
+	const char * const *cur = priv->triggers;
+
+	while (cur && *cur) {
+		struct iio_trigger *trig;
+
+		trig = devm_iio_trigger_alloc(priv->dev, "%s", *cur);
+		if  (!trig)
+			return -ENOMEM;
+
+		trig->dev.parent = priv->dev->parent;
+		trig->ops = &timer_trigger_ops;
+		trig->dev.groups = stm32_trigger_attr_groups;
+		iio_trigger_set_drvdata(trig, priv);
+
+		ret = devm_iio_trigger_register(priv->dev, trig);
+		if (ret)
+			return ret;
+		cur++;
+	}
+
+	return 0;
+}
+
+/**
+ * is_stm32_timer_trigger
+ * @trig: trigger to be checked
+ *
+ * return true if the trigger is a valid stm32 iio timer trigger
+ * either return false
+ */
+bool is_stm32_timer_trigger(struct iio_trigger *trig)
+{
+	return (trig->ops == &timer_trigger_ops);
+}
+EXPORT_SYMBOL(is_stm32_timer_trigger);
+
+static int stm32_validate_trigger(struct iio_dev *indio_dev,
+				  struct iio_trigger *trig)
+{
+	struct stm32_timer_trigger *priv = iio_priv(indio_dev);
+	const char * const *cur = priv->valids;
+	unsigned int i = 0;
+
+	if (!is_stm32_timer_trigger(trig))
+		return -EINVAL;
+
+	while (cur && *cur) {
+		if (!strncmp(trig->name, *cur, strlen(trig->name))) {
+			regmap_update_bits(priv->regmap,
+					   TIM_SMCR, TIM_SMCR_TS,
+					   i << TIM_SMCR_TS_SHIFT);
+			return 0;
+		}
+		cur++;
+		i++;
+	}
+
+	return -EINVAL;
+}
+
+static const struct iio_info stm32_trigger_info = {
+	.driver_module = THIS_MODULE,
+	.validate_trigger = stm32_validate_trigger,
+	.attrs = &stm32_timer_attr_group,
+};
+
+static struct stm32_timer_trigger *stm32_setup_iio_device(struct device *dev)
+{
+	struct iio_dev *indio_dev;
+	int ret;
+
+	indio_dev = devm_iio_device_alloc(dev,
+					  sizeof(struct stm32_timer_trigger));
+	if (!indio_dev)
+		return NULL;
+
+	indio_dev->name = dev_name(dev);
+	indio_dev->dev.parent = dev;
+	indio_dev->info = &stm32_trigger_info;
+	indio_dev->modes = INDIO_EVENT_TRIGGERED;
+	indio_dev->num_channels = 0;
+	indio_dev->dev.of_node = dev->of_node;
+
+	ret = devm_iio_device_register(dev, indio_dev);
+	if (ret)
+		return NULL;
+
+	return iio_priv(indio_dev);
+}
+
+static int stm32_timer_trigger_probe(struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct stm32_timer_trigger *priv;
+	struct stm32_timers *ddata = dev_get_drvdata(pdev->dev.parent);
+	unsigned int index;
+	int ret;
+
+	if (of_property_read_u32(dev->of_node, "reg", &index))
+		return -EINVAL;
+
+	if (index >= ARRAY_SIZE(triggers_table))
+		return -EINVAL;
+
+	/* Create an IIO device only if we have triggers to be validated */
+	if (*valids_table[index])
+		priv = stm32_setup_iio_device(dev);
+	else
+		priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
+
+	if (!priv)
+		return -ENOMEM;
+
+	priv->dev = dev;
+	priv->regmap = ddata->regmap;
+	priv->clk = ddata->clk;
+	priv->max_arr = ddata->max_arr;
+	priv->triggers = triggers_table[index];
+	priv->valids = valids_table[index];
+
+	ret = stm32_setup_iio_triggers(priv);
+	if (ret)
+		return ret;
+
+	platform_set_drvdata(pdev, priv);
+
+	return 0;
+}
+
+static const struct of_device_id stm32_trig_of_match[] = {
+	{ .compatible = "st,stm32-timer-trigger", },
+	{ /* end node */ },
+};
+MODULE_DEVICE_TABLE(of, stm32_trig_of_match);
+
+static struct platform_driver stm32_timer_trigger_driver = {
+	.probe = stm32_timer_trigger_probe,
+	.driver = {
+		.name = "stm32-timer-trigger",
+		.of_match_table = stm32_trig_of_match,
+	},
+};
+module_platform_driver(stm32_timer_trigger_driver);
+
+MODULE_ALIAS("platform: stm32-timer-trigger");
+MODULE_DESCRIPTION("STMicroelectronics STM32 Timer Trigger driver");
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/iio/trigger/Kconfig b/drivers/iio/trigger/Kconfig
index 809b2e7..f2af4fe 100644
--- a/drivers/iio/trigger/Kconfig
+++ b/drivers/iio/trigger/Kconfig
@@ -46,5 +46,4 @@ config IIO_SYSFS_TRIGGER
 
 	  To compile this driver as a module, choose M here: the
 	  module will be called iio-trig-sysfs.
-
 endmenu
diff --git a/include/linux/iio/timer/stm32-timer-trigger.h b/include/linux/iio/timer/stm32-timer-trigger.h
new file mode 100644
index 0000000..55535ae
--- /dev/null
+++ b/include/linux/iio/timer/stm32-timer-trigger.h
@@ -0,0 +1,62 @@
+/*
+ * Copyright (C) STMicroelectronics 2016
+ *
+ * Author: Benjamin Gaignard <benjamin.gaignard@st.com>
+ *
+ * License terms:  GNU General Public License (GPL), version 2
+ */
+
+#ifndef _STM32_TIMER_TRIGGER_H_
+#define _STM32_TIMER_TRIGGER_H_
+
+#define TIM1_TRGO	"tim1_trgo"
+#define TIM1_CH1	"tim1_ch1"
+#define TIM1_CH2	"tim1_ch2"
+#define TIM1_CH3	"tim1_ch3"
+#define TIM1_CH4	"tim1_ch4"
+
+#define TIM2_TRGO	"tim2_trgo"
+#define TIM2_CH1	"tim2_ch1"
+#define TIM2_CH2	"tim2_ch2"
+#define TIM2_CH3	"tim2_ch3"
+#define TIM2_CH4	"tim2_ch4"
+
+#define TIM3_TRGO	"tim3_trgo"
+#define TIM3_CH1	"tim3_ch1"
+#define TIM3_CH2	"tim3_ch2"
+#define TIM3_CH3	"tim3_ch3"
+#define TIM3_CH4	"tim3_ch4"
+
+#define TIM4_TRGO	"tim4_trgo"
+#define TIM4_CH1	"tim4_ch1"
+#define TIM4_CH2	"tim4_ch2"
+#define TIM4_CH3	"tim4_ch3"
+#define TIM4_CH4	"tim4_ch4"
+
+#define TIM5_TRGO	"tim5_trgo"
+#define TIM5_CH1	"tim5_ch1"
+#define TIM5_CH2	"tim5_ch2"
+#define TIM5_CH3	"tim5_ch3"
+#define TIM5_CH4	"tim5_ch4"
+
+#define TIM6_TRGO	"tim6_trgo"
+
+#define TIM7_TRGO	"tim7_trgo"
+
+#define TIM8_TRGO	"tim8_trgo"
+#define TIM8_CH1	"tim8_ch1"
+#define TIM8_CH2	"tim8_ch2"
+#define TIM8_CH3	"tim8_ch3"
+#define TIM8_CH4	"tim8_ch4"
+
+#define TIM9_TRGO	"tim9_trgo"
+#define TIM9_CH1	"tim9_ch1"
+#define TIM9_CH2	"tim9_ch2"
+
+#define TIM12_TRGO	"tim12_trgo"
+#define TIM12_CH1	"tim12_ch1"
+#define TIM12_CH2	"tim12_ch2"
+
+bool is_stm32_timer_trigger(struct iio_trigger *trig);
+
+#endif
-- 
1.9.1

^ permalink raw reply related

* [PATCH v6 7/8] ARM: dts: stm32: add Timers driver for stm32f429 MCU
From: Benjamin Gaignard @ 2016-12-09 14:15 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1481292919-26587-1-git-send-email-benjamin.gaignard@st.com>

Add Timers and it sub-nodes into DT for stm32f429 family.

version 6:
- split patch in two: one for SoC family and one for stm32f469
  discovery board.

version 5:
- rename gptimer node to timers
- re-order timers node par addresses

version 4:
- remove unwanted indexing in pwm@ and timer@ node name
- use "reg" instead of additional parameters to set timer
  configuration

version 3:
- use "st,stm32-timer-trigger" in DT

version 2:
- use parameters to describe hardware capabilities
- do not use references for pwm and iio timer subnodes

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@st.com>
---
 arch/arm/boot/dts/stm32f429.dtsi | 275 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 275 insertions(+)

diff --git a/arch/arm/boot/dts/stm32f429.dtsi b/arch/arm/boot/dts/stm32f429.dtsi
index bca491d..d0fb9cc 100644
--- a/arch/arm/boot/dts/stm32f429.dtsi
+++ b/arch/arm/boot/dts/stm32f429.dtsi
@@ -355,6 +355,21 @@
 					slew-rate = <2>;
 				};
 			};
+
+			pwm1_pins: pwm at 1 {
+				pins {
+					pinmux = <STM32F429_PA8_FUNC_TIM1_CH1>,
+						 <STM32F429_PB13_FUNC_TIM1_CH1N>,
+						 <STM32F429_PB12_FUNC_TIM1_BKIN>;
+				};
+			};
+
+			pwm3_pins: pwm at 3 {
+				pins {
+					pinmux = <STM32F429_PB4_FUNC_TIM3_CH1>,
+						 <STM32F429_PB5_FUNC_TIM3_CH2>;
+				};
+			};
 		};
 
 		rcc: rcc at 40023810 {
@@ -426,6 +441,266 @@
 			interrupts = <80>;
 			clocks = <&rcc 0 38>;
 		};
+
+		timers2: timers at 40000000 {
+			#address-cells = <1>;
+			#size-cells = <0>;
+			compatible = "st,stm32-timers";
+			reg = <0x40000000 0x400>;
+			clocks = <&rcc 0 128>;
+			clock-names = "clk_int";
+			status = "disabled";
+
+			pwm {
+				compatible = "st,stm32-pwm";
+				status = "disabled";
+			};
+
+			timer {
+				compatible = "st,stm32-timer-trigger";
+				reg = <1>;
+				status = "disabled";
+			};
+		};
+
+		timers3: timers at 40000400 {
+			#address-cells = <1>;
+			#size-cells = <0>;
+			compatible = "st,stm32-timers";
+			reg = <0x40000400 0x400>;
+			clocks = <&rcc 0 129>;
+			clock-names = "clk_int";
+			status = "disabled";
+
+			pwm {
+				compatible = "st,stm32-pwm";
+				status = "disabled";
+			};
+
+			timer {
+				compatible = "st,stm32-timer-trigger";
+				reg = <2>;
+				status = "disabled";
+			};
+		};
+
+		timers4: timers at 40000800 {
+			#address-cells = <1>;
+			#size-cells = <0>;
+			compatible = "st,stm32-timers";
+			reg = <0x40000800 0x400>;
+			clocks = <&rcc 0 130>;
+			clock-names = "clk_int";
+			status = "disabled";
+
+			pwm {
+				compatible = "st,stm32-pwm";
+				status = "disabled";
+			};
+
+			timer {
+				compatible = "st,stm32-timer-trigger";
+				reg = <3>;
+				status = "disabled";
+			};
+		};
+
+		timers5: timers at 40000C00 {
+			#address-cells = <1>;
+			#size-cells = <0>;
+			compatible = "st,stm32-timers";
+			reg = <0x40000C00 0x400>;
+			clocks = <&rcc 0 131>;
+			clock-names = "clk_int";
+			status = "disabled";
+
+			pwm {
+				compatible = "st,stm32-pwm";
+				status = "disabled";
+			};
+
+			timer {
+				compatible = "st,stm32-timer-trigger";
+				reg = <4>;
+				status = "disabled";
+			};
+		};
+
+		timers6: timers at 40001000 {
+			#address-cells = <1>;
+			#size-cells = <0>;
+			compatible = "st,stm32-timers";
+			reg = <0x40001000 0x400>;
+			clocks = <&rcc 0 132>;
+			clock-names = "clk_int";
+			status = "disabled";
+
+			timer {
+				compatible = "st,stm32-timer-trigger";
+				reg = <5>;
+				status = "disabled";
+			};
+		};
+
+		timers7: timers at 40001400 {
+			#address-cells = <1>;
+			#size-cells = <0>;
+			compatible = "st,stm32-timers";
+			reg = <0x40001400 0x400>;
+			clocks = <&rcc 0 133>;
+			clock-names = "clk_int";
+			status = "disabled";
+
+			timer {
+				compatible = "st,stm32-timer-trigger";
+				reg = <6>;
+				status = "disabled";
+			};
+		};
+
+		timers12: timers at 40001800 {
+			#address-cells = <1>;
+			#size-cells = <0>;
+			compatible = "st,stm32-timers";
+			reg = <0x40001800 0x400>;
+			clocks = <&rcc 0 134>;
+			clock-names = "clk_int";
+			status = "disabled";
+
+			pwm {
+				compatible = "st,stm32-pwm";
+				status = "disabled";
+			};
+
+			timer {
+				compatible = "st,stm32-timer-trigger";
+				reg = <9>;
+				status = "disabled";
+			};
+		};
+
+		timers13: timers at 40001C00 {
+			#address-cells = <1>;
+			#size-cells = <0>;
+			compatible = "st,stm32-timers";
+			reg = <0x40001C00 0x400>;
+			clocks = <&rcc 0 135>;
+			clock-names = "clk_int";
+			status = "disabled";
+
+			pwm {
+				compatible = "st,stm32-pwm";
+				status = "disabled";
+			};
+		};
+
+		timers14: timers at 40002000 {
+			#address-cells = <1>;
+			#size-cells = <0>;
+			compatible = "st,stm32-timers";
+			reg = <0x40002000 0x400>;
+			clocks = <&rcc 0 136>;
+			clock-names = "clk_int";
+			status = "disabled";
+
+			pwm {
+				compatible = "st,stm32-pwm";
+				status = "disabled";
+			};
+		};
+
+		timers1: timers at 40010000 {
+			#address-cells = <1>;
+			#size-cells = <0>;
+			compatible = "st,stm32-timers";
+			reg = <0x40010000 0x400>;
+			clocks = <&rcc 0 160>;
+			clock-names = "clk_int";
+			status = "disabled";
+
+			pwm {
+				compatible = "st,stm32-pwm";
+				status = "disabled";
+			};
+
+			timer {
+				compatible = "st,stm32-timer-trigger";
+				reg = <0>;
+				status = "disabled";
+			};
+		};
+
+		timers8: timers at 40010400 {
+			#address-cells = <1>;
+			#size-cells = <0>;
+			compatible = "st,stm32-timers";
+			reg = <0x40010400 0x400>;
+			clocks = <&rcc 0 161>;
+			clock-names = "clk_int";
+			status = "disabled";
+
+			pwm {
+				compatible = "st,stm32-pwm";
+				status = "disabled";
+			};
+
+			timer {
+				compatible = "st,stm32-timer-trigger";
+				reg = <7>;
+				status = "disabled";
+			};
+		};
+
+		timers9: timers at 40014000 {
+			#address-cells = <1>;
+			#size-cells = <0>;
+			compatible = "st,stm32-timers";
+			reg = <0x40014000 0x400>;
+			clocks = <&rcc 0 176>;
+			clock-names = "clk_int";
+			status = "disabled";
+
+			pwm {
+				compatible = "st,stm32-pwm";
+				status = "disabled";
+			};
+
+			timer {
+				compatible = "st,stm32-timer-trigger";
+				reg = <8>;
+				status = "disabled";
+			};
+		};
+
+		timers10: timers at 40014400 {
+			#address-cells = <1>;
+			#size-cells = <0>;
+			compatible = "st,stm32-timers";
+			reg = <0x40014400 0x400>;
+			clocks = <&rcc 0 177>;
+			clock-names = "clk_int";
+			status = "disabled";
+
+			pwm {
+				compatible = "st,stm32-pwm";
+				status = "disabled";
+			};
+		};
+
+		timers11: timers at 40014800 {
+			#address-cells = <1>;
+			#size-cells = <0>;
+			compatible = "st,stm32-timers";
+			reg = <0x40014800 0x400>;
+			clocks = <&rcc 0 178>;
+			clock-names = "clk_int";
+			status = "disabled";
+
+			pwm {
+				compatible = "st,stm32-pwm";
+				status = "disabled";
+			};
+		};
 	};
 };
 
-- 
1.9.1

^ permalink raw reply related

* [PATCH v6 8/8] ARM: dts: stm32: Enable pw1 and pwm3 for stm32f469-disco
From: Benjamin Gaignard @ 2016-12-09 14:15 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1481292919-26587-1-git-send-email-benjamin.gaignard@st.com>

Define and enable pwm1 and pwm3 for stm32f469 discovery board

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@st.com>
---
 arch/arm/boot/dts/stm32f469-disco.dts | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/arch/arm/boot/dts/stm32f469-disco.dts b/arch/arm/boot/dts/stm32f469-disco.dts
index 8a163d7..0750a71 100644
--- a/arch/arm/boot/dts/stm32f469-disco.dts
+++ b/arch/arm/boot/dts/stm32f469-disco.dts
@@ -81,3 +81,31 @@
 &usart3 {
 	status = "okay";
 };
+
+&timers1 {
+	status = "okay";
+
+	pwm {
+		pinctrl-0 = <&pwm1_pins>;
+		pinctrl-names = "default";
+		status = "okay";
+	};
+
+	timer {
+		status = "okay";
+	};
+};
+
+&timers3 {
+	status = "okay";
+
+	pwm {
+		pinctrl-0 = <&pwm3_pins>;
+		pinctrl-names = "default";
+		status = "okay";
+	};
+
+	timer {
+		status = "okay";
+	};
+};
-- 
1.9.1

^ permalink raw reply related

* [RESEND-PATCH] ARM: EXYNOS: remove smp hook from machine descriptor
From: Krzysztof Kozlowski @ 2016-12-09 14:24 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <fbc36736-ce7c-3eed-2833-c2e45ea02ce5@samsung.com>

On Fri, Dec 09, 2016 at 02:42:36PM +0530, pankaj.dubey wrote:
> Hi Krzysztof,
> 
> On Thursday 08 December 2016 11:03 PM, Krzysztof Kozlowski wrote:
> > On Thu, Dec 08, 2016 at 08:32:15AM +0530, Pankaj Dubey wrote:
> >> Use CPU_METHOD_OF_DECLARE() for smp_ops instead of using it
> >> via machine descriptor.
> >>
> >> Signed-off-by: Pankaj Dubey <pankaj.dubey@samsung.com>
> >> ---
> >>
> >> Resending as I missed to include samsung mailing list.
> >>
> >>  arch/arm/boot/dts/exynos3250.dtsi      | 1 +
> >>  arch/arm/boot/dts/exynos4210.dtsi      | 1 +
> >>  arch/arm/boot/dts/exynos4212.dtsi      | 1 +
> >>  arch/arm/boot/dts/exynos4412.dtsi      | 1 +
> >>  arch/arm/boot/dts/exynos5250.dtsi      | 1 +
> >>  arch/arm/boot/dts/exynos5260.dtsi      | 1 +
> >>  arch/arm/boot/dts/exynos5410.dtsi      | 1 +
> >>  arch/arm/boot/dts/exynos5420-cpus.dtsi | 1 +
> >>  arch/arm/boot/dts/exynos5422-cpus.dtsi | 1 +
> >>  arch/arm/boot/dts/exynos5440.dtsi      | 1 +
> >>  arch/arm/mach-exynos/common.h          | 2 --
> >>  arch/arm/mach-exynos/exynos.c          | 1 -
> >>  arch/arm/mach-exynos/platsmp.c         | 2 ++
> >>  13 files changed, 12 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/arch/arm/boot/dts/exynos3250.dtsi b/arch/arm/boot/dts/exynos3250.dtsi
> >> index ba17ee1..f28f669 100644
> >> --- a/arch/arm/boot/dts/exynos3250.dtsi
> >> +++ b/arch/arm/boot/dts/exynos3250.dtsi
> >> @@ -53,6 +53,7 @@
> >>  	cpus {
> >>  		#address-cells = <1>;
> >>  		#size-cells = <0>;
> >> +		enable-method = "samsung,exynos-smp";
> >>  
> >>  		cpu0: cpu at 0 {
> >>  			device_type = "cpu";
> >> diff --git a/arch/arm/boot/dts/exynos4210.dtsi b/arch/arm/boot/dts/exynos4210.dtsi
> >> index 7f3a18c..6dfd98d 100644
> >> --- a/arch/arm/boot/dts/exynos4210.dtsi
> >> +++ b/arch/arm/boot/dts/exynos4210.dtsi
> >> @@ -35,6 +35,7 @@
> >>  	cpus {
> >>  		#address-cells = <1>;
> >>  		#size-cells = <0>;
> >> +		enable-method = "samsung,exynos-smp";
> >>  
> >>  		cpu0: cpu at 900 {
> >>  			device_type = "cpu";
> >> diff --git a/arch/arm/boot/dts/exynos4212.dtsi b/arch/arm/boot/dts/exynos4212.dtsi
> >> index 5389011..3e8982e 100644
> >> --- a/arch/arm/boot/dts/exynos4212.dtsi
> >> +++ b/arch/arm/boot/dts/exynos4212.dtsi
> >> @@ -25,6 +25,7 @@
> >>  	cpus {
> >>  		#address-cells = <1>;
> >>  		#size-cells = <0>;
> >> +		enable-method = "samsung,exynos-smp";
> >>  
> >>  		cpu0: cpu at A00 {
> >>  			device_type = "cpu";
> >> diff --git a/arch/arm/boot/dts/exynos4412.dtsi b/arch/arm/boot/dts/exynos4412.dtsi
> >> index 40beede..faf2fb8 100644
> >> --- a/arch/arm/boot/dts/exynos4412.dtsi
> >> +++ b/arch/arm/boot/dts/exynos4412.dtsi
> >> @@ -25,6 +25,7 @@
> >>  	cpus {
> >>  		#address-cells = <1>;
> >>  		#size-cells = <0>;
> >> +		enable-method = "samsung,exynos-smp";
> >>  
> >>  		cpu0: cpu at A00 {
> >>  			device_type = "cpu";
> >> diff --git a/arch/arm/boot/dts/exynos5250.dtsi b/arch/arm/boot/dts/exynos5250.dtsi
> >> index b6d7444..580897c 100644
> >> --- a/arch/arm/boot/dts/exynos5250.dtsi
> >> +++ b/arch/arm/boot/dts/exynos5250.dtsi
> >> @@ -52,6 +52,7 @@
> >>  	cpus {
> >>  		#address-cells = <1>;
> >>  		#size-cells = <0>;
> >> +		enable-method = "samsung,exynos-smp";
> >>  
> >>  		cpu0: cpu at 0 {
> >>  			device_type = "cpu";
> >> diff --git a/arch/arm/boot/dts/exynos5260.dtsi b/arch/arm/boot/dts/exynos5260.dtsi
> >> index 5818718..1af6e76 100644
> >> --- a/arch/arm/boot/dts/exynos5260.dtsi
> >> +++ b/arch/arm/boot/dts/exynos5260.dtsi
> >> @@ -32,6 +32,7 @@
> >>  	cpus {
> >>  		#address-cells = <1>;
> >>  		#size-cells = <0>;
> >> +		enable-method = "samsung,exynos-smp";
> >>  
> >>  		cpu at 0 {
> >>  			device_type = "cpu";
> >> diff --git a/arch/arm/boot/dts/exynos5410.dtsi b/arch/arm/boot/dts/exynos5410.dtsi
> >> index 2b6adaf..b092cdc 100644
> >> --- a/arch/arm/boot/dts/exynos5410.dtsi
> >> +++ b/arch/arm/boot/dts/exynos5410.dtsi
> >> @@ -33,6 +33,7 @@
> >>  	cpus {
> >>  		#address-cells = <1>;
> >>  		#size-cells = <0>;
> >> +		enable-method = "samsung,exynos-smp";
> >>  
> >>  		cpu0: cpu at 0 {
> >>  			device_type = "cpu";
> >> diff --git a/arch/arm/boot/dts/exynos5420-cpus.dtsi b/arch/arm/boot/dts/exynos5420-cpus.dtsi
> >> index 5c052d7..a587704 100644
> >> --- a/arch/arm/boot/dts/exynos5420-cpus.dtsi
> >> +++ b/arch/arm/boot/dts/exynos5420-cpus.dtsi
> >> @@ -24,6 +24,7 @@
> >>  	cpus {
> >>  		#address-cells = <1>;
> >>  		#size-cells = <0>;
> >> +		enable-method = "samsung,exynos-smp";
> >>  
> >>  		cpu0: cpu at 0 {
> >>  			device_type = "cpu";
> >> diff --git a/arch/arm/boot/dts/exynos5422-cpus.dtsi b/arch/arm/boot/dts/exynos5422-cpus.dtsi
> >> index bf3c6f1..7fcdfd0 100644
> >> --- a/arch/arm/boot/dts/exynos5422-cpus.dtsi
> >> +++ b/arch/arm/boot/dts/exynos5422-cpus.dtsi
> >> @@ -23,6 +23,7 @@
> >>  	cpus {
> >>  		#address-cells = <1>;
> >>  		#size-cells = <0>;
> >> +		enable-method = "samsung,exynos-smp";
> >>  
> >>  		cpu0: cpu at 100 {
> >>  			device_type = "cpu";
> >> diff --git a/arch/arm/boot/dts/exynos5440.dtsi b/arch/arm/boot/dts/exynos5440.dtsi
> >> index 2a2e570..0a958e8 100644
> >> --- a/arch/arm/boot/dts/exynos5440.dtsi
> >> +++ b/arch/arm/boot/dts/exynos5440.dtsi
> >> @@ -50,6 +50,7 @@
> >>  	cpus {
> >>  		#address-cells = <1>;
> >>  		#size-cells = <0>;
> >> +		enable-method = "samsung,exynos-smp";
> >>  
> >>  		cpu at 0 {
> >>  			device_type = "cpu";
> >> diff --git a/arch/arm/mach-exynos/common.h b/arch/arm/mach-exynos/common.h
> >> index fb12d11..051e1ab 100644
> >> --- a/arch/arm/mach-exynos/common.h
> >> +++ b/arch/arm/mach-exynos/common.h
> >> @@ -143,8 +143,6 @@ static inline void exynos_pm_init(void) {}
> >>  extern void exynos_cpu_resume(void);
> >>  extern void exynos_cpu_resume_ns(void);
> >>  
> >> -extern const struct smp_operations exynos_smp_ops;
> >> -
> >>  extern void exynos_cpu_power_down(int cpu);
> >>  extern void exynos_cpu_power_up(int cpu);
> >>  extern int  exynos_cpu_power_state(int cpu);
> >> diff --git a/arch/arm/mach-exynos/exynos.c b/arch/arm/mach-exynos/exynos.c
> >> index fa08ef9..f0a766e 100644
> >> --- a/arch/arm/mach-exynos/exynos.c
> >> +++ b/arch/arm/mach-exynos/exynos.c
> >> @@ -211,7 +211,6 @@ DT_MACHINE_START(EXYNOS_DT, "SAMSUNG EXYNOS (Flattened Device Tree)")
> >>  	/* Maintainer: Kukjin Kim <kgene.kim@samsung.com> */
> >>  	.l2c_aux_val	= 0x3c400001,
> >>  	.l2c_aux_mask	= 0xc20fffff,
> >> -	.smp		= smp_ops(exynos_smp_ops),
> >>  	.map_io		= exynos_init_io,
> >>  	.init_early	= exynos_firmware_init,
> >>  	.init_irq	= exynos_init_irq,
> >> diff --git a/arch/arm/mach-exynos/platsmp.c b/arch/arm/mach-exynos/platsmp.c
> >> index 94405c7..43eec10 100644
> >> --- a/arch/arm/mach-exynos/platsmp.c
> >> +++ b/arch/arm/mach-exynos/platsmp.c
> >> @@ -474,3 +474,5 @@ const struct smp_operations exynos_smp_ops __initconst = {
> >>  	.cpu_die		= exynos_cpu_die,
> >>  #endif
> >>  };
> >> +
> >> +CPU_METHOD_OF_DECLARE(exynos_smp, "samsung,exynos-smp", &exynos_smp_ops);
> > 
> > Three issues:
> > 1. This has to be documented. Checkpatch did not complain about it?
> 
> No it didn't.
> 
> > 2. I think this breaks the ABI with existing DTBs because now the
> >    enable-method of cpus becomes mandatory. But the
> >    Documentation/devicetree/bindings/arm/cpus.txt is saying that:
> >    "On ARM 32-bit systems this property is optional and can be one of"
> > 
> 
> I am not very sure that this is an ABI break, as other platforms (e.g
> hisilicon,hip01-smp) also adopted this as some later stage and they also
> removed smp hook support from their machine files when they adopted to
> this enable-method in DTS files.

So they broke the ABI. :)

> 
> If we want to keep older DTBs keep working with new Kernel image, then I
> need to drop patch from mach-exynos and keep smp_ops hook in machine
> descriptor as it is to keep supporting older DTBs. I can see some
> platforms have adopted this way as well.

Please, go ahead with this solution. The bindings documentation clearly
states that this is an optional field so changing it to "required" is an
ABI break.

> 
> Surely I will add new bindings details in
> Documentation/devicetree/bindings/arm/cpus.txt file. I am not sure why
> checkpatch did not complain about this?
> 
> > 3. Please split DTS changes to separate patches. This is, by the way,
> >    connected with #2 above: if there was no ABI break, then the DTS
> >    could go to separate branch easily.
> > 
> 
> Since I am not sure if this will considered as ABI break or not, I just
> looked how this was handled in other platforms, I can see some platforms
> have clubbed DTS change along with mach files, and some have done in
> separate patch as well. So I will look for suggestion from you for this
> how we can go for exynos platform?

Please split it and make safe for legacy DTBs without enable-method
property.

Best regards,
Krzysztof

^ permalink raw reply

* [PATCH v2 0/3] crypto: arm64/ARM: NEON accelerated ChaCha20 *skcipher*
From: Ard Biesheuvel @ 2016-12-09 14:33 UTC (permalink / raw)
  To: linux-arm-kernel

Another port of existing x86 SSE code to NEON, again both for arm64 and ARM.

ChaCha20 is a stream cipher described in RFC 7539, and is intended to be
an efficient software implementable 'standby cipher', in case AES cannot
be used.

This NEON implementation is almost 2x as fast as the generic C code
(measured on Cortex-A57 using the arm64 version)

Changes in v2:
- add patch to convert the generic and x86 to skciphers first
- tweaked the arm64 version for some additional performance
- use chunksize == 4x blocksize for optimal speed

Ard Biesheuvel (3):
  crypto: chacha20 - convert generic and x86 versions to skcipher
  crypto: arm64/chacha20 - implement NEON version based on SSE3 code
  crypto: arm/chacha20 - implement NEON version based on SSE3 code

 arch/arm/crypto/Kconfig                |   6 +
 arch/arm/crypto/Makefile               |   2 +
 arch/arm/crypto/chacha20-neon-core.S   | 524 ++++++++++++++++++++
 arch/arm/crypto/chacha20-neon-glue.c   | 127 +++++
 arch/arm64/crypto/Kconfig              |   6 +
 arch/arm64/crypto/Makefile             |   3 +
 arch/arm64/crypto/chacha20-neon-core.S | 450 +++++++++++++++++
 arch/arm64/crypto/chacha20-neon-glue.c | 122 +++++
 arch/x86/crypto/chacha20_glue.c        |  69 ++-
 crypto/chacha20_generic.c              |  73 ++-
 include/crypto/chacha20.h              |   6 +-
 11 files changed, 1304 insertions(+), 84 deletions(-)
 create mode 100644 arch/arm/crypto/chacha20-neon-core.S
 create mode 100644 arch/arm/crypto/chacha20-neon-glue.c
 create mode 100644 arch/arm64/crypto/chacha20-neon-core.S
 create mode 100644 arch/arm64/crypto/chacha20-neon-glue.c

-- 
2.7.4

^ permalink raw reply

* [PATCH v2 1/3] crypto: chacha20 - convert generic and x86 versions to skcipher
From: Ard Biesheuvel @ 2016-12-09 14:33 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1481294033-23508-1-git-send-email-ard.biesheuvel@linaro.org>

This converts the ChaCha20 code from a blkcipher to a skcipher, which
is now the preferred way to implement symmetric block and stream ciphers.

This ports the generic and x86 versions at the same time because the
latter reuses routines of the former.

Note that the skcipher_walk() API guarantees that all presented blocks
except the final one are a multiple of the chunk size, so we can simplify
the encrypt() routine somewhat.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/x86/crypto/chacha20_glue.c | 69 +++++++++---------
 crypto/chacha20_generic.c       | 73 ++++++++------------
 include/crypto/chacha20.h       |  6 +-
 3 files changed, 64 insertions(+), 84 deletions(-)

diff --git a/arch/x86/crypto/chacha20_glue.c b/arch/x86/crypto/chacha20_glue.c
index f910d1d449f0..78f75b07dc25 100644
--- a/arch/x86/crypto/chacha20_glue.c
+++ b/arch/x86/crypto/chacha20_glue.c
@@ -11,7 +11,7 @@
 
 #include <crypto/algapi.h>
 #include <crypto/chacha20.h>
-#include <linux/crypto.h>
+#include <crypto/internal/skcipher.h>
 #include <linux/kernel.h>
 #include <linux/module.h>
 #include <asm/fpu/api.h>
@@ -63,36 +63,34 @@ static void chacha20_dosimd(u32 *state, u8 *dst, const u8 *src,
 	}
 }
 
-static int chacha20_simd(struct blkcipher_desc *desc, struct scatterlist *dst,
-			 struct scatterlist *src, unsigned int nbytes)
+static int chacha20_simd(struct skcipher_request *req)
 {
-	u32 *state, state_buf[16 + (CHACHA20_STATE_ALIGN / sizeof(u32)) - 1];
-	struct blkcipher_walk walk;
+	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+	struct chacha20_ctx *ctx = crypto_skcipher_ctx(tfm);
+	u32 state[16] __aligned(CHACHA20_STATE_ALIGN);
+	struct skcipher_walk walk;
 	int err;
 
-	if (nbytes <= CHACHA20_BLOCK_SIZE || !may_use_simd())
-		return crypto_chacha20_crypt(desc, dst, src, nbytes);
+	if (req->cryptlen <= CHACHA20_BLOCK_SIZE || !may_use_simd())
+		return crypto_chacha20_crypt(req);
 
-	state = (u32 *)roundup((uintptr_t)state_buf, CHACHA20_STATE_ALIGN);
+	err = skcipher_walk_virt(&walk, req, true);
 
-	blkcipher_walk_init(&walk, dst, src, nbytes);
-	err = blkcipher_walk_virt_block(desc, &walk, CHACHA20_BLOCK_SIZE);
-
-	crypto_chacha20_init(state, crypto_blkcipher_ctx(desc->tfm), walk.iv);
+	crypto_chacha20_init(state, ctx, walk.iv);
 
 	kernel_fpu_begin();
 
 	while (walk.nbytes >= CHACHA20_BLOCK_SIZE) {
 		chacha20_dosimd(state, walk.dst.virt.addr, walk.src.virt.addr,
 				rounddown(walk.nbytes, CHACHA20_BLOCK_SIZE));
-		err = blkcipher_walk_done(desc, &walk,
-					  walk.nbytes % CHACHA20_BLOCK_SIZE);
+		err = skcipher_walk_done(&walk,
+					 walk.nbytes % CHACHA20_BLOCK_SIZE);
 	}
 
 	if (walk.nbytes) {
 		chacha20_dosimd(state, walk.dst.virt.addr, walk.src.virt.addr,
 				walk.nbytes);
-		err = blkcipher_walk_done(desc, &walk, 0);
+		err = skcipher_walk_done(&walk, 0);
 	}
 
 	kernel_fpu_end();
@@ -100,27 +98,22 @@ static int chacha20_simd(struct blkcipher_desc *desc, struct scatterlist *dst,
 	return err;
 }
 
-static struct crypto_alg alg = {
-	.cra_name		= "chacha20",
-	.cra_driver_name	= "chacha20-simd",
-	.cra_priority		= 300,
-	.cra_flags		= CRYPTO_ALG_TYPE_BLKCIPHER,
-	.cra_blocksize		= 1,
-	.cra_type		= &crypto_blkcipher_type,
-	.cra_ctxsize		= sizeof(struct chacha20_ctx),
-	.cra_alignmask		= sizeof(u32) - 1,
-	.cra_module		= THIS_MODULE,
-	.cra_u			= {
-		.blkcipher = {
-			.min_keysize	= CHACHA20_KEY_SIZE,
-			.max_keysize	= CHACHA20_KEY_SIZE,
-			.ivsize		= CHACHA20_IV_SIZE,
-			.geniv		= "seqiv",
-			.setkey		= crypto_chacha20_setkey,
-			.encrypt	= chacha20_simd,
-			.decrypt	= chacha20_simd,
-		},
-	},
+static struct skcipher_alg alg = {
+	.base.cra_name		= "chacha20",
+	.base.cra_driver_name	= "chacha20-simd",
+	.base.cra_priority	= 300,
+	.base.cra_blocksize	= 1,
+	.base.cra_ctxsize	= sizeof(struct chacha20_ctx),
+	.base.cra_alignmask	= sizeof(u32) - 1,
+	.base.cra_module	= THIS_MODULE,
+
+	.min_keysize		= CHACHA20_KEY_SIZE,
+	.max_keysize		= CHACHA20_KEY_SIZE,
+	.ivsize			= CHACHA20_IV_SIZE,
+	.chunksize		= CHACHA20_BLOCK_SIZE,
+	.setkey			= crypto_chacha20_setkey,
+	.encrypt		= chacha20_simd,
+	.decrypt		= chacha20_simd,
 };
 
 static int __init chacha20_simd_mod_init(void)
@@ -133,12 +126,12 @@ static int __init chacha20_simd_mod_init(void)
 			    boot_cpu_has(X86_FEATURE_AVX2) &&
 			    cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL);
 #endif
-	return crypto_register_alg(&alg);
+	return crypto_register_skcipher(&alg);
 }
 
 static void __exit chacha20_simd_mod_fini(void)
 {
-	crypto_unregister_alg(&alg);
+	crypto_unregister_skcipher(&alg);
 }
 
 module_init(chacha20_simd_mod_init);
diff --git a/crypto/chacha20_generic.c b/crypto/chacha20_generic.c
index 1cab83146e33..8b3c04d625c3 100644
--- a/crypto/chacha20_generic.c
+++ b/crypto/chacha20_generic.c
@@ -10,10 +10,9 @@
  */
 
 #include <crypto/algapi.h>
-#include <linux/crypto.h>
-#include <linux/kernel.h>
-#include <linux/module.h>
 #include <crypto/chacha20.h>
+#include <crypto/internal/skcipher.h>
+#include <linux/module.h>
 
 static inline u32 le32_to_cpuvp(const void *p)
 {
@@ -63,10 +62,10 @@ void crypto_chacha20_init(u32 *state, struct chacha20_ctx *ctx, u8 *iv)
 }
 EXPORT_SYMBOL_GPL(crypto_chacha20_init);
 
-int crypto_chacha20_setkey(struct crypto_tfm *tfm, const u8 *key,
+int crypto_chacha20_setkey(struct crypto_skcipher *tfm, const u8 *key,
 			   unsigned int keysize)
 {
-	struct chacha20_ctx *ctx = crypto_tfm_ctx(tfm);
+	struct chacha20_ctx *ctx = crypto_skcipher_ctx(tfm);
 	int i;
 
 	if (keysize != CHACHA20_KEY_SIZE)
@@ -79,66 +78,54 @@ int crypto_chacha20_setkey(struct crypto_tfm *tfm, const u8 *key,
 }
 EXPORT_SYMBOL_GPL(crypto_chacha20_setkey);
 
-int crypto_chacha20_crypt(struct blkcipher_desc *desc, struct scatterlist *dst,
-			  struct scatterlist *src, unsigned int nbytes)
+int crypto_chacha20_crypt(struct skcipher_request *req)
 {
-	struct blkcipher_walk walk;
+	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+	struct chacha20_ctx *ctx = crypto_skcipher_ctx(tfm);
+	struct skcipher_walk walk;
 	u32 state[16];
 	int err;
 
-	blkcipher_walk_init(&walk, dst, src, nbytes);
-	err = blkcipher_walk_virt_block(desc, &walk, CHACHA20_BLOCK_SIZE);
-
-	crypto_chacha20_init(state, crypto_blkcipher_ctx(desc->tfm), walk.iv);
+	err = skcipher_walk_virt(&walk, req, true);
 
-	while (walk.nbytes >= CHACHA20_BLOCK_SIZE) {
-		chacha20_docrypt(state, walk.dst.virt.addr, walk.src.virt.addr,
-				 rounddown(walk.nbytes, CHACHA20_BLOCK_SIZE));
-		err = blkcipher_walk_done(desc, &walk,
-					  walk.nbytes % CHACHA20_BLOCK_SIZE);
-	}
+	crypto_chacha20_init(state, ctx, walk.iv);
 
-	if (walk.nbytes) {
+	while (walk.nbytes > 0) {
 		chacha20_docrypt(state, walk.dst.virt.addr, walk.src.virt.addr,
 				 walk.nbytes);
-		err = blkcipher_walk_done(desc, &walk, 0);
+		err = skcipher_walk_done(&walk, 0);
 	}
 
 	return err;
 }
 EXPORT_SYMBOL_GPL(crypto_chacha20_crypt);
 
-static struct crypto_alg alg = {
-	.cra_name		= "chacha20",
-	.cra_driver_name	= "chacha20-generic",
-	.cra_priority		= 100,
-	.cra_flags		= CRYPTO_ALG_TYPE_BLKCIPHER,
-	.cra_blocksize		= 1,
-	.cra_type		= &crypto_blkcipher_type,
-	.cra_ctxsize		= sizeof(struct chacha20_ctx),
-	.cra_alignmask		= sizeof(u32) - 1,
-	.cra_module		= THIS_MODULE,
-	.cra_u			= {
-		.blkcipher = {
-			.min_keysize	= CHACHA20_KEY_SIZE,
-			.max_keysize	= CHACHA20_KEY_SIZE,
-			.ivsize		= CHACHA20_IV_SIZE,
-			.geniv		= "seqiv",
-			.setkey		= crypto_chacha20_setkey,
-			.encrypt	= crypto_chacha20_crypt,
-			.decrypt	= crypto_chacha20_crypt,
-		},
-	},
+static struct skcipher_alg alg = {
+	.base.cra_name		= "chacha20",
+	.base.cra_driver_name	= "chacha20-generic",
+	.base.cra_priority	= 100,
+	.base.cra_blocksize	= 1,
+	.base.cra_ctxsize	= sizeof(struct chacha20_ctx),
+	.base.cra_alignmask	= sizeof(u32) - 1,
+	.base.cra_module	= THIS_MODULE,
+
+	.min_keysize		= CHACHA20_KEY_SIZE,
+	.max_keysize		= CHACHA20_KEY_SIZE,
+	.ivsize			= CHACHA20_IV_SIZE,
+	.chunksize		= CHACHA20_BLOCK_SIZE,
+	.setkey			= crypto_chacha20_setkey,
+	.encrypt		= crypto_chacha20_crypt,
+	.decrypt		= crypto_chacha20_crypt,
 };
 
 static int __init chacha20_generic_mod_init(void)
 {
-	return crypto_register_alg(&alg);
+	return crypto_register_skcipher(&alg);
 }
 
 static void __exit chacha20_generic_mod_fini(void)
 {
-	crypto_unregister_alg(&alg);
+	crypto_unregister_skcipher(&alg);
 }
 
 module_init(chacha20_generic_mod_init);
diff --git a/include/crypto/chacha20.h b/include/crypto/chacha20.h
index 20d20f681a72..445fc45f4b5b 100644
--- a/include/crypto/chacha20.h
+++ b/include/crypto/chacha20.h
@@ -5,6 +5,7 @@
 #ifndef _CRYPTO_CHACHA20_H
 #define _CRYPTO_CHACHA20_H
 
+#include <crypto/skcipher.h>
 #include <linux/types.h>
 #include <linux/crypto.h>
 
@@ -18,9 +19,8 @@ struct chacha20_ctx {
 
 void chacha20_block(u32 *state, void *stream);
 void crypto_chacha20_init(u32 *state, struct chacha20_ctx *ctx, u8 *iv);
-int crypto_chacha20_setkey(struct crypto_tfm *tfm, const u8 *key,
+int crypto_chacha20_setkey(struct crypto_skcipher *tfm, const u8 *key,
 			   unsigned int keysize);
-int crypto_chacha20_crypt(struct blkcipher_desc *desc, struct scatterlist *dst,
-			  struct scatterlist *src, unsigned int nbytes);
+int crypto_chacha20_crypt(struct skcipher_request *req);
 
 #endif
-- 
2.7.4

^ permalink raw reply related

* [PATCH v2 2/3] crypto: arm64/chacha20 - implement NEON version based on SSE3 code
From: Ard Biesheuvel @ 2016-12-09 14:33 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1481294033-23508-1-git-send-email-ard.biesheuvel@linaro.org>

This is a straight port to arm64/NEON of the x86 SSE3 implementation
of the ChaCha20 stream cipher.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/crypto/Kconfig              |   6 +
 arch/arm64/crypto/Makefile             |   3 +
 arch/arm64/crypto/chacha20-neon-core.S | 450 ++++++++++++++++++++
 arch/arm64/crypto/chacha20-neon-glue.c | 122 ++++++
 4 files changed, 581 insertions(+)

diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig
index 450a85df041a..0bf0f531f539 100644
--- a/arch/arm64/crypto/Kconfig
+++ b/arch/arm64/crypto/Kconfig
@@ -72,4 +72,10 @@ config CRYPTO_CRC32_ARM64
 	depends on ARM64
 	select CRYPTO_HASH
 
+config CRYPTO_CHACHA20_NEON
+	tristate "NEON accelerated ChaCha20 symmetric cipher"
+	depends on KERNEL_MODE_NEON
+	select CRYPTO_BLKCIPHER
+	select CRYPTO_CHACHA20
+
 endif
diff --git a/arch/arm64/crypto/Makefile b/arch/arm64/crypto/Makefile
index aa8888d7b744..9d2826c5fccf 100644
--- a/arch/arm64/crypto/Makefile
+++ b/arch/arm64/crypto/Makefile
@@ -41,6 +41,9 @@ sha256-arm64-y := sha256-glue.o sha256-core.o
 obj-$(CONFIG_CRYPTO_SHA512_ARM64) += sha512-arm64.o
 sha512-arm64-y := sha512-glue.o sha512-core.o
 
+obj-$(CONFIG_CRYPTO_CHACHA20_NEON) += chacha20-neon.o
+chacha20-neon-y := chacha20-neon-core.o chacha20-neon-glue.o
+
 AFLAGS_aes-ce.o		:= -DINTERLEAVE=4
 AFLAGS_aes-neon.o	:= -DINTERLEAVE=4
 
diff --git a/arch/arm64/crypto/chacha20-neon-core.S b/arch/arm64/crypto/chacha20-neon-core.S
new file mode 100644
index 000000000000..3cbbf23dc4d2
--- /dev/null
+++ b/arch/arm64/crypto/chacha20-neon-core.S
@@ -0,0 +1,450 @@
+/*
+ * ChaCha20 256-bit cipher algorithm, RFC7539, arm64 NEON functions
+ *
+ * Copyright (C) 2016 Linaro, Ltd. <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * Based on:
+ * ChaCha20 256-bit cipher algorithm, RFC7539, x64 SSSE3 functions
+ *
+ * Copyright (C) 2015 Martin Willi
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <linux/linkage.h>
+
+	.text
+	.align		6
+
+ENTRY(chacha20_block_xor_neon)
+	// x0: Input state matrix, s
+	// x1: 1 data block output, o
+	// x2: 1 data block input, i
+
+	//
+	// This function encrypts one ChaCha20 block by loading the state matrix
+	// in four NEON registers. It performs matrix operation on four words in
+	// parallel, but requires shuffling to rearrange the words after each
+	// round.
+	//
+
+	// x0..3 = s0..3
+	adr		x3, ROT8
+	ld1		{v0.4s-v3.4s}, [x0]
+	ld1		{v8.4s-v11.4s}, [x0]
+	ld1		{v12.4s}, [x3]
+
+	mov		x3, #10
+
+.Ldoubleround:
+	// x0 += x1, x3 = rotl32(x3 ^ x0, 16)
+	add		v0.4s, v0.4s, v1.4s
+	eor		v3.16b, v3.16b, v0.16b
+	rev32		v3.8h, v3.8h
+
+	// x2 += x3, x1 = rotl32(x1 ^ x2, 12)
+	add		v2.4s, v2.4s, v3.4s
+	eor		v4.16b, v1.16b, v2.16b
+	shl		v1.4s, v4.4s, #12
+	sri		v1.4s, v4.4s, #20
+
+	// x0 += x1, x3 = rotl32(x3 ^ x0, 8)
+	add		v0.4s, v0.4s, v1.4s
+	eor		v3.16b, v3.16b, v0.16b
+	tbl		v3.16b, {v3.16b}, v12.16b
+
+	// x2 += x3, x1 = rotl32(x1 ^ x2, 7)
+	add		v2.4s, v2.4s, v3.4s
+	eor		v4.16b, v1.16b, v2.16b
+	shl		v1.4s, v4.4s, #7
+	sri		v1.4s, v4.4s, #25
+
+	// x1 = shuffle32(x1, MASK(0, 3, 2, 1))
+	ext		v1.16b, v1.16b, v1.16b, #4
+	// x2 = shuffle32(x2, MASK(1, 0, 3, 2))
+	ext		v2.16b, v2.16b, v2.16b, #8
+	// x3 = shuffle32(x3, MASK(2, 1, 0, 3))
+	ext		v3.16b, v3.16b, v3.16b, #12
+
+	// x0 += x1, x3 = rotl32(x3 ^ x0, 16)
+	add		v0.4s, v0.4s, v1.4s
+	eor		v3.16b, v3.16b, v0.16b
+	rev32		v3.8h, v3.8h
+
+	// x2 += x3, x1 = rotl32(x1 ^ x2, 12)
+	add		v2.4s, v2.4s, v3.4s
+	eor		v4.16b, v1.16b, v2.16b
+	shl		v1.4s, v4.4s, #12
+	sri		v1.4s, v4.4s, #20
+
+	// x0 += x1, x3 = rotl32(x3 ^ x0, 8)
+	add		v0.4s, v0.4s, v1.4s
+	eor		v3.16b, v3.16b, v0.16b
+	tbl		v3.16b, {v3.16b}, v12.16b
+
+	// x2 += x3, x1 = rotl32(x1 ^ x2, 7)
+	add		v2.4s, v2.4s, v3.4s
+	eor		v4.16b, v1.16b, v2.16b
+	shl		v1.4s, v4.4s, #7
+	sri		v1.4s, v4.4s, #25
+
+	// x1 = shuffle32(x1, MASK(2, 1, 0, 3))
+	ext		v1.16b, v1.16b, v1.16b, #12
+	// x2 = shuffle32(x2, MASK(1, 0, 3, 2))
+	ext		v2.16b, v2.16b, v2.16b, #8
+	// x3 = shuffle32(x3, MASK(0, 3, 2, 1))
+	ext		v3.16b, v3.16b, v3.16b, #4
+
+	subs		x3, x3, #1
+	b.ne		.Ldoubleround
+
+	ld1		{v4.16b-v7.16b}, [x2]
+
+	// o0 = i0 ^ (x0 + s0)
+	add		v0.4s, v0.4s, v8.4s
+	eor		v0.16b, v0.16b, v4.16b
+
+	// o1 = i1 ^ (x1 + s1)
+	add		v1.4s, v1.4s, v9.4s
+	eor		v1.16b, v1.16b, v5.16b
+
+	// o2 = i2 ^ (x2 + s2)
+	add		v2.4s, v2.4s, v10.4s
+	eor		v2.16b, v2.16b, v6.16b
+
+	// o3 = i3 ^ (x3 + s3)
+	add		v3.4s, v3.4s, v11.4s
+	eor		v3.16b, v3.16b, v7.16b
+
+	st1		{v0.16b-v3.16b}, [x1]
+
+	ret
+ENDPROC(chacha20_block_xor_neon)
+
+	.align		6
+ENTRY(chacha20_4block_xor_neon)
+	// x0: Input state matrix, s
+	// x1: 4 data blocks output, o
+	// x2: 4 data blocks input, i
+
+	//
+	// This function encrypts four consecutive ChaCha20 blocks by loading
+	// the state matrix in NEON registers four times. The algorithm performs
+	// each operation on the corresponding word of each state matrix, hence
+	// requires no word shuffling. For final XORing step we transpose the
+	// matrix by interleaving 32- and then 64-bit words, which allows us to
+	// do XOR in NEON registers.
+	//
+	adr		x3, CTRINC		// ... and ROT8
+	ld1		{v30.4s-v31.4s}, [x3]
+
+	// x0..15[0-3] = s0..3[0..3]
+	mov		x4, x0
+	ld4r		{ v0.4s- v3.4s}, [x4], #16
+	ld4r		{ v4.4s- v7.4s}, [x4], #16
+	ld4r		{ v8.4s-v11.4s}, [x4], #16
+	ld4r		{v12.4s-v15.4s}, [x4]
+
+	// x12 += counter values 0-3
+	add		v12.4s, v12.4s, v30.4s
+
+	mov		x3, #10
+
+.Ldoubleround4:
+	// x0 += x4, x12 = rotl32(x12 ^ x0, 16)
+	// x1 += x5, x13 = rotl32(x13 ^ x1, 16)
+	// x2 += x6, x14 = rotl32(x14 ^ x2, 16)
+	// x3 += x7, x15 = rotl32(x15 ^ x3, 16)
+	add		v0.4s, v0.4s, v4.4s
+	add		v1.4s, v1.4s, v5.4s
+	add		v2.4s, v2.4s, v6.4s
+	add		v3.4s, v3.4s, v7.4s
+
+	eor		v12.16b, v12.16b, v0.16b
+	eor		v13.16b, v13.16b, v1.16b
+	eor		v14.16b, v14.16b, v2.16b
+	eor		v15.16b, v15.16b, v3.16b
+
+	rev32		v12.8h, v12.8h
+	rev32		v13.8h, v13.8h
+	rev32		v14.8h, v14.8h
+	rev32		v15.8h, v15.8h
+
+	// x8 += x12, x4 = rotl32(x4 ^ x8, 12)
+	// x9 += x13, x5 = rotl32(x5 ^ x9, 12)
+	// x10 += x14, x6 = rotl32(x6 ^ x10, 12)
+	// x11 += x15, x7 = rotl32(x7 ^ x11, 12)
+	add		v8.4s, v8.4s, v12.4s
+	add		v9.4s, v9.4s, v13.4s
+	add		v10.4s, v10.4s, v14.4s
+	add		v11.4s, v11.4s, v15.4s
+
+	eor		v18.16b, v4.16b, v8.16b
+	eor		v19.16b, v5.16b, v9.16b
+	eor		v20.16b, v6.16b, v10.16b
+	eor		v21.16b, v7.16b, v11.16b
+
+	shl		v4.4s, v18.4s, #12
+	shl		v5.4s, v19.4s, #12
+	shl		v6.4s, v20.4s, #12
+	shl		v7.4s, v21.4s, #12
+
+	sri		v4.4s, v18.4s, #20
+	sri		v5.4s, v19.4s, #20
+	sri		v6.4s, v20.4s, #20
+	sri		v7.4s, v21.4s, #20
+
+	// x0 += x4, x12 = rotl32(x12 ^ x0, 8)
+	// x1 += x5, x13 = rotl32(x13 ^ x1, 8)
+	// x2 += x6, x14 = rotl32(x14 ^ x2, 8)
+	// x3 += x7, x15 = rotl32(x15 ^ x3, 8)
+	add		v0.4s, v0.4s, v4.4s
+	add		v1.4s, v1.4s, v5.4s
+	add		v2.4s, v2.4s, v6.4s
+	add		v3.4s, v3.4s, v7.4s
+
+	eor		v12.16b, v12.16b, v0.16b
+	eor		v13.16b, v13.16b, v1.16b
+	eor		v14.16b, v14.16b, v2.16b
+	eor		v15.16b, v15.16b, v3.16b
+
+	tbl		v12.16b, {v12.16b}, v31.16b
+	tbl		v13.16b, {v13.16b}, v31.16b
+	tbl		v14.16b, {v14.16b}, v31.16b
+	tbl		v15.16b, {v15.16b}, v31.16b
+
+	// x8 += x12, x4 = rotl32(x4 ^ x8, 7)
+	// x9 += x13, x5 = rotl32(x5 ^ x9, 7)
+	// x10 += x14, x6 = rotl32(x6 ^ x10, 7)
+	// x11 += x15, x7 = rotl32(x7 ^ x11, 7)
+	add		v8.4s, v8.4s, v12.4s
+	add		v9.4s, v9.4s, v13.4s
+	add		v10.4s, v10.4s, v14.4s
+	add		v11.4s, v11.4s, v15.4s
+
+	eor		v18.16b, v4.16b, v8.16b
+	eor		v19.16b, v5.16b, v9.16b
+	eor		v20.16b, v6.16b, v10.16b
+	eor		v21.16b, v7.16b, v11.16b
+
+	shl		v4.4s, v18.4s, #7
+	shl		v5.4s, v19.4s, #7
+	shl		v6.4s, v20.4s, #7
+	shl		v7.4s, v21.4s, #7
+
+	sri		v4.4s, v18.4s, #25
+	sri		v5.4s, v19.4s, #25
+	sri		v6.4s, v20.4s, #25
+	sri		v7.4s, v21.4s, #25
+
+	// x0 += x5, x15 = rotl32(x15 ^ x0, 16)
+	// x1 += x6, x12 = rotl32(x12 ^ x1, 16)
+	// x2 += x7, x13 = rotl32(x13 ^ x2, 16)
+	// x3 += x4, x14 = rotl32(x14 ^ x3, 16)
+	add		v0.4s, v0.4s, v5.4s
+	add		v1.4s, v1.4s, v6.4s
+	add		v2.4s, v2.4s, v7.4s
+	add		v3.4s, v3.4s, v4.4s
+
+	eor		v15.16b, v15.16b, v0.16b
+	eor		v12.16b, v12.16b, v1.16b
+	eor		v13.16b, v13.16b, v2.16b
+	eor		v14.16b, v14.16b, v3.16b
+
+	rev32		v15.8h, v15.8h
+	rev32		v12.8h, v12.8h
+	rev32		v13.8h, v13.8h
+	rev32		v14.8h, v14.8h
+
+	// x10 += x15, x5 = rotl32(x5 ^ x10, 12)
+	// x11 += x12, x6 = rotl32(x6 ^ x11, 12)
+	// x8 += x13, x7 = rotl32(x7 ^ x8, 12)
+	// x9 += x14, x4 = rotl32(x4 ^ x9, 12)
+	add		v10.4s, v10.4s, v15.4s
+	add		v11.4s, v11.4s, v12.4s
+	add		v8.4s, v8.4s, v13.4s
+	add		v9.4s, v9.4s, v14.4s
+
+	eor		v18.16b, v5.16b, v10.16b
+	eor		v19.16b, v6.16b, v11.16b
+	eor		v20.16b, v7.16b, v8.16b
+	eor		v21.16b, v4.16b, v9.16b
+
+	shl		v5.4s, v18.4s, #12
+	shl		v6.4s, v19.4s, #12
+	shl		v7.4s, v20.4s, #12
+	shl		v4.4s, v21.4s, #12
+
+	sri		v5.4s, v18.4s, #20
+	sri		v6.4s, v19.4s, #20
+	sri		v7.4s, v20.4s, #20
+	sri		v4.4s, v21.4s, #20
+
+	// x0 += x5, x15 = rotl32(x15 ^ x0, 8)
+	// x1 += x6, x12 = rotl32(x12 ^ x1, 8)
+	// x2 += x7, x13 = rotl32(x13 ^ x2, 8)
+	// x3 += x4, x14 = rotl32(x14 ^ x3, 8)
+	add		v0.4s, v0.4s, v5.4s
+	add		v1.4s, v1.4s, v6.4s
+	add		v2.4s, v2.4s, v7.4s
+	add		v3.4s, v3.4s, v4.4s
+
+	eor		v15.16b, v15.16b, v0.16b
+	eor		v12.16b, v12.16b, v1.16b
+	eor		v13.16b, v13.16b, v2.16b
+	eor		v14.16b, v14.16b, v3.16b
+
+	tbl		v15.16b, {v15.16b}, v31.16b
+	tbl		v12.16b, {v12.16b}, v31.16b
+	tbl		v13.16b, {v13.16b}, v31.16b
+	tbl		v14.16b, {v14.16b}, v31.16b
+
+	// x10 += x15, x5 = rotl32(x5 ^ x10, 7)
+	// x11 += x12, x6 = rotl32(x6 ^ x11, 7)
+	// x8 += x13, x7 = rotl32(x7 ^ x8, 7)
+	// x9 += x14, x4 = rotl32(x4 ^ x9, 7)
+	add		v10.4s, v10.4s, v15.4s
+	add		v11.4s, v11.4s, v12.4s
+	add		v8.4s, v8.4s, v13.4s
+	add		v9.4s, v9.4s, v14.4s
+
+	eor		v18.16b, v5.16b, v10.16b
+	eor		v19.16b, v6.16b, v11.16b
+	eor		v20.16b, v7.16b, v8.16b
+	eor		v21.16b, v4.16b, v9.16b
+
+	shl		v5.4s, v18.4s, #7
+	shl		v6.4s, v19.4s, #7
+	shl		v7.4s, v20.4s, #7
+	shl		v4.4s, v21.4s, #7
+
+	sri		v5.4s, v18.4s, #25
+	sri		v6.4s, v19.4s, #25
+	sri		v7.4s, v20.4s, #25
+	sri		v4.4s, v21.4s, #25
+
+	subs		x3, x3, #1
+	b.ne		.Ldoubleround4
+
+	ld4r		{v16.4s-v19.4s}, [x0], #16
+	ld4r		{v20.4s-v23.4s}, [x0], #16
+
+	// x12 += counter values 0-3
+	add		v12.4s, v12.4s, v30.4s
+
+	// x0[0-3] += s0[0]
+	// x1[0-3] += s0[1]
+	// x2[0-3] += s0[2]
+	// x3[0-3] += s0[3]
+	add		v0.4s, v0.4s, v16.4s
+	add		v1.4s, v1.4s, v17.4s
+	add		v2.4s, v2.4s, v18.4s
+	add		v3.4s, v3.4s, v19.4s
+
+	ld4r		{v24.4s-v27.4s}, [x0], #16
+	ld4r		{v28.4s-v31.4s}, [x0]
+
+	// x4[0-3] += s1[0]
+	// x5[0-3] += s1[1]
+	// x6[0-3] += s1[2]
+	// x7[0-3] += s1[3]
+	add		v4.4s, v4.4s, v20.4s
+	add		v5.4s, v5.4s, v21.4s
+	add		v6.4s, v6.4s, v22.4s
+	add		v7.4s, v7.4s, v23.4s
+
+	// x8[0-3] += s2[0]
+	// x9[0-3] += s2[1]
+	// x10[0-3] += s2[2]
+	// x11[0-3] += s2[3]
+	add		v8.4s, v8.4s, v24.4s
+	add		v9.4s, v9.4s, v25.4s
+	add		v10.4s, v10.4s, v26.4s
+	add		v11.4s, v11.4s, v27.4s
+
+	// x12[0-3] += s3[0]
+	// x13[0-3] += s3[1]
+	// x14[0-3] += s3[2]
+	// x15[0-3] += s3[3]
+	add		v12.4s, v12.4s, v28.4s
+	add		v13.4s, v13.4s, v29.4s
+	add		v14.4s, v14.4s, v30.4s
+	add		v15.4s, v15.4s, v31.4s
+
+	// interleave 32-bit words in state n, n+1
+	zip1		v16.4s, v0.4s, v1.4s
+	zip2		v17.4s, v0.4s, v1.4s
+	zip1		v18.4s, v2.4s, v3.4s
+	zip2		v19.4s, v2.4s, v3.4s
+	zip1		v20.4s, v4.4s, v5.4s
+	zip2		v21.4s, v4.4s, v5.4s
+	zip1		v22.4s, v6.4s, v7.4s
+	zip2		v23.4s, v6.4s, v7.4s
+	zip1		v24.4s, v8.4s, v9.4s
+	zip2		v25.4s, v8.4s, v9.4s
+	zip1		v26.4s, v10.4s, v11.4s
+	zip2		v27.4s, v10.4s, v11.4s
+	zip1		v28.4s, v12.4s, v13.4s
+	zip2		v29.4s, v12.4s, v13.4s
+	zip1		v30.4s, v14.4s, v15.4s
+	zip2		v31.4s, v14.4s, v15.4s
+
+	// interleave 64-bit words in state n, n+2
+	zip1		v0.2d, v16.2d, v18.2d
+	zip2		v4.2d, v16.2d, v18.2d
+	zip1		v8.2d, v17.2d, v19.2d
+	zip2		v12.2d, v17.2d, v19.2d
+	ld1		{v16.16b-v19.16b}, [x2], #64
+
+	zip1		v1.2d, v20.2d, v22.2d
+	zip2		v5.2d, v20.2d, v22.2d
+	zip1		v9.2d, v21.2d, v23.2d
+	zip2		v13.2d, v21.2d, v23.2d
+	ld1		{v20.16b-v23.16b}, [x2], #64
+
+	zip1		v2.2d, v24.2d, v26.2d
+	zip2		v6.2d, v24.2d, v26.2d
+	zip1		v10.2d, v25.2d, v27.2d
+	zip2		v14.2d, v25.2d, v27.2d
+	ld1		{v24.16b-v27.16b}, [x2], #64
+
+	zip1		v3.2d, v28.2d, v30.2d
+	zip2		v7.2d, v28.2d, v30.2d
+	zip1		v11.2d, v29.2d, v31.2d
+	zip2		v15.2d, v29.2d, v31.2d
+	ld1		{v28.16b-v31.16b}, [x2]
+
+	// xor with corresponding input, write to output
+	eor		v16.16b, v16.16b, v0.16b
+	eor		v17.16b, v17.16b, v1.16b
+	eor		v18.16b, v18.16b, v2.16b
+	eor		v19.16b, v19.16b, v3.16b
+	eor		v20.16b, v20.16b, v4.16b
+	eor		v21.16b, v21.16b, v5.16b
+	st1		{v16.16b-v19.16b}, [x1], #64
+	eor		v22.16b, v22.16b, v6.16b
+	eor		v23.16b, v23.16b, v7.16b
+	eor		v24.16b, v24.16b, v8.16b
+	eor		v25.16b, v25.16b, v9.16b
+	st1		{v20.16b-v23.16b}, [x1], #64
+	eor		v26.16b, v26.16b, v10.16b
+	eor		v27.16b, v27.16b, v11.16b
+	eor		v28.16b, v28.16b, v12.16b
+	st1		{v24.16b-v27.16b}, [x1], #64
+	eor		v29.16b, v29.16b, v13.16b
+	eor		v30.16b, v30.16b, v14.16b
+	eor		v31.16b, v31.16b, v15.16b
+	st1		{v28.16b-v31.16b}, [x1]
+
+	ret
+ENDPROC(chacha20_4block_xor_neon)
+
+CTRINC:	.word		0, 1, 2, 3
+ROT8:	.word		0x02010003, 0x06050407, 0x0a09080b, 0x0e0d0c0f
diff --git a/arch/arm64/crypto/chacha20-neon-glue.c b/arch/arm64/crypto/chacha20-neon-glue.c
new file mode 100644
index 000000000000..06fdb8468ac6
--- /dev/null
+++ b/arch/arm64/crypto/chacha20-neon-glue.c
@@ -0,0 +1,122 @@
+/*
+ * ChaCha20 256-bit cipher algorithm, RFC7539, arm64 NEON functions
+ *
+ * Copyright (C) 2016 Linaro, Ltd. <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * Based on:
+ * ChaCha20 256-bit cipher algorithm, RFC7539, SIMD glue code
+ *
+ * Copyright (C) 2015 Martin Willi
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <crypto/algapi.h>
+#include <crypto/chacha20.h>
+#include <crypto/internal/skcipher.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+
+#include <asm/neon.h>
+
+asmlinkage void chacha20_block_xor_neon(u32 *state, u8 *dst, const u8 *src);
+asmlinkage void chacha20_4block_xor_neon(u32 *state, u8 *dst, const u8 *src);
+
+static void chacha20_doneon(u32 *state, u8 *dst, const u8 *src,
+			    unsigned int bytes)
+{
+	u8 buf[CHACHA20_BLOCK_SIZE];
+
+	while (bytes >= CHACHA20_BLOCK_SIZE * 4) {
+		chacha20_4block_xor_neon(state, dst, src);
+		bytes -= CHACHA20_BLOCK_SIZE * 4;
+		src += CHACHA20_BLOCK_SIZE * 4;
+		dst += CHACHA20_BLOCK_SIZE * 4;
+		state[12] += 4;
+	}
+	while (bytes >= CHACHA20_BLOCK_SIZE) {
+		chacha20_block_xor_neon(state, dst, src);
+		bytes -= CHACHA20_BLOCK_SIZE;
+		src += CHACHA20_BLOCK_SIZE;
+		dst += CHACHA20_BLOCK_SIZE;
+		state[12]++;
+	}
+	if (bytes) {
+		memcpy(buf, src, bytes);
+		chacha20_block_xor_neon(state, buf, buf);
+		memcpy(dst, buf, bytes);
+	}
+}
+
+static int chacha20_neon(struct skcipher_request *req)
+{
+	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+	struct chacha20_ctx *ctx = crypto_skcipher_ctx(tfm);
+	struct skcipher_walk walk;
+	u32 state[16];
+	int err;
+
+	if (req->cryptlen <= CHACHA20_BLOCK_SIZE)
+		return crypto_chacha20_crypt(req);
+
+	err = skcipher_walk_virt(&walk, req, true);
+
+	crypto_chacha20_init(state, ctx, walk.iv);
+
+	kernel_neon_begin();
+	while (walk.nbytes > 0) {
+		unsigned int nbytes = walk.nbytes;
+
+		if (nbytes < walk.total)
+			nbytes = round_down(nbytes, walk.chunksize);
+
+		chacha20_doneon(state, walk.dst.virt.addr, walk.src.virt.addr,
+				nbytes);
+		err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
+	}
+	kernel_neon_end();
+
+	return err;
+}
+
+static struct skcipher_alg alg = {
+	.base.cra_name		= "chacha20",
+	.base.cra_driver_name	= "chacha20-neon",
+	.base.cra_priority	= 300,
+	.base.cra_blocksize	= 1,
+	.base.cra_ctxsize	= sizeof(struct chacha20_ctx),
+	.base.cra_alignmask	= 1,
+	.base.cra_module	= THIS_MODULE,
+
+	.min_keysize		= CHACHA20_KEY_SIZE,
+	.max_keysize		= CHACHA20_KEY_SIZE,
+	.ivsize			= CHACHA20_IV_SIZE,
+	.chunksize		= 4 * CHACHA20_BLOCK_SIZE,
+	.setkey			= crypto_chacha20_setkey,
+	.encrypt		= chacha20_neon,
+	.decrypt		= chacha20_neon,
+};
+
+static int __init chacha20_simd_mod_init(void)
+{
+	return crypto_register_skcipher(&alg);
+}
+
+static void __exit chacha20_simd_mod_fini(void)
+{
+	crypto_unregister_skcipher(&alg);
+}
+
+module_init(chacha20_simd_mod_init);
+module_exit(chacha20_simd_mod_fini);
+
+MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
+MODULE_LICENSE("GPL v2");
+MODULE_ALIAS_CRYPTO("chacha20");
-- 
2.7.4

^ permalink raw reply related

* [PATCH v2 3/3] crypto: arm/chacha20 - implement NEON version based on SSE3 code
From: Ard Biesheuvel @ 2016-12-09 14:33 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1481294033-23508-1-git-send-email-ard.biesheuvel@linaro.org>

This is a straight port to ARM/NEON of the x86 SSE3 implementation
of the ChaCha20 stream cipher.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm/crypto/Kconfig              |   6 +
 arch/arm/crypto/Makefile             |   2 +
 arch/arm/crypto/chacha20-neon-core.S | 524 ++++++++++++++++++++
 arch/arm/crypto/chacha20-neon-glue.c | 127 +++++
 4 files changed, 659 insertions(+)

diff --git a/arch/arm/crypto/Kconfig b/arch/arm/crypto/Kconfig
index 13f1b4c289d4..2f3339f015d3 100644
--- a/arch/arm/crypto/Kconfig
+++ b/arch/arm/crypto/Kconfig
@@ -130,4 +130,10 @@ config CRYPTO_CRC32_ARM_CE
 	depends on KERNEL_MODE_NEON && CRC32
 	select CRYPTO_HASH
 
+config CRYPTO_CHACHA20_NEON
+	tristate "NEON accelerated ChaCha20 symmetric cipher"
+	depends on KERNEL_MODE_NEON
+	select CRYPTO_BLKCIPHER
+	select CRYPTO_CHACHA20
+
 endif
diff --git a/arch/arm/crypto/Makefile b/arch/arm/crypto/Makefile
index b578a1820ab1..8d74e55eacd4 100644
--- a/arch/arm/crypto/Makefile
+++ b/arch/arm/crypto/Makefile
@@ -8,6 +8,7 @@ obj-$(CONFIG_CRYPTO_SHA1_ARM) += sha1-arm.o
 obj-$(CONFIG_CRYPTO_SHA1_ARM_NEON) += sha1-arm-neon.o
 obj-$(CONFIG_CRYPTO_SHA256_ARM) += sha256-arm.o
 obj-$(CONFIG_CRYPTO_SHA512_ARM) += sha512-arm.o
+obj-$(CONFIG_CRYPTO_CHACHA20_NEON) += chacha20-neon.o
 
 ce-obj-$(CONFIG_CRYPTO_AES_ARM_CE) += aes-arm-ce.o
 ce-obj-$(CONFIG_CRYPTO_SHA1_ARM_CE) += sha1-arm-ce.o
@@ -40,6 +41,7 @@ aes-arm-ce-y	:= aes-ce-core.o aes-ce-glue.o
 ghash-arm-ce-y	:= ghash-ce-core.o ghash-ce-glue.o
 crct10dif-arm-ce-y	:= crct10dif-ce-core.o crct10dif-ce-glue.o
 crc32-arm-ce-y:= crc32-ce-core.o crc32-ce-glue.o
+chacha20-neon-y := chacha20-neon-core.o chacha20-neon-glue.o
 
 quiet_cmd_perl = PERL    $@
       cmd_perl = $(PERL) $(<) > $(@)
diff --git a/arch/arm/crypto/chacha20-neon-core.S b/arch/arm/crypto/chacha20-neon-core.S
new file mode 100644
index 000000000000..ff1d337bdb4a
--- /dev/null
+++ b/arch/arm/crypto/chacha20-neon-core.S
@@ -0,0 +1,524 @@
+/*
+ * ChaCha20 256-bit cipher algorithm, RFC7539, ARM NEON functions
+ *
+ * Copyright (C) 2016 Linaro, Ltd. <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * Based on:
+ * ChaCha20 256-bit cipher algorithm, RFC7539, x64 SSE3 functions
+ *
+ * Copyright (C) 2015 Martin Willi
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <linux/linkage.h>
+
+	.text
+	.fpu		neon
+	.align		5
+
+ENTRY(chacha20_block_xor_neon)
+	// r0: Input state matrix, s
+	// r1: 1 data block output, o
+	// r2: 1 data block input, i
+
+	//
+	// This function encrypts one ChaCha20 block by loading the state matrix
+	// in four NEON registers. It performs matrix operation on four words in
+	// parallel, but requireds shuffling to rearrange the words after each
+	// round.
+	//
+
+	// x0..3 = s0..3
+	add		ip, r0, #0x20
+	vld1.32		{q0-q1}, [r0]
+	vld1.32		{q2-q3}, [ip]
+
+	vmov		q8, q0
+	vmov		q9, q1
+	vmov		q10, q2
+	vmov		q11, q3
+
+	mov		r3, #10
+
+.Ldoubleround:
+	// x0 += x1, x3 = rotl32(x3 ^ x0, 16)
+	vadd.i32	q0, q0, q1
+	veor		q4, q3, q0
+	vshl.u32	q3, q4, #16
+	vsri.u32	q3, q4, #16
+
+	// x2 += x3, x1 = rotl32(x1 ^ x2, 12)
+	vadd.i32	q2, q2, q3
+	veor		q4, q1, q2
+	vshl.u32	q1, q4, #12
+	vsri.u32	q1, q4, #20
+
+	// x0 += x1, x3 = rotl32(x3 ^ x0, 8)
+	vadd.i32	q0, q0, q1
+	veor		q4, q3, q0
+	vshl.u32	q3, q4, #8
+	vsri.u32	q3, q4, #24
+
+	// x2 += x3, x1 = rotl32(x1 ^ x2, 7)
+	vadd.i32	q2, q2, q3
+	veor		q4, q1, q2
+	vshl.u32	q1, q4, #7
+	vsri.u32	q1, q4, #25
+
+	// x1 = shuffle32(x1, MASK(0, 3, 2, 1))
+	vext.8		q1, q1, q1, #4
+	// x2 = shuffle32(x2, MASK(1, 0, 3, 2))
+	vext.8		q2, q2, q2, #8
+	// x3 = shuffle32(x3, MASK(2, 1, 0, 3))
+	vext.8		q3, q3, q3, #12
+
+	// x0 += x1, x3 = rotl32(x3 ^ x0, 16)
+	vadd.i32	q0, q0, q1
+	veor		q4, q3, q0
+	vshl.u32	q3, q4, #16
+	vsri.u32	q3, q4, #16
+
+	// x2 += x3, x1 = rotl32(x1 ^ x2, 12)
+	vadd.i32	q2, q2, q3
+	veor		q4, q1, q2
+	vshl.u32	q1, q4, #12
+	vsri.u32	q1, q4, #20
+
+	// x0 += x1, x3 = rotl32(x3 ^ x0, 8)
+	vadd.i32	q0, q0, q1
+	veor		q4, q3, q0
+	vshl.u32	q3, q4, #8
+	vsri.u32	q3, q4, #24
+
+	// x2 += x3, x1 = rotl32(x1 ^ x2, 7)
+	vadd.i32	q2, q2, q3
+	veor		q4, q1, q2
+	vshl.u32	q1, q4, #7
+	vsri.u32	q1, q4, #25
+
+	// x1 = shuffle32(x1, MASK(2, 1, 0, 3))
+	vext.8		q1, q1, q1, #12
+	// x2 = shuffle32(x2, MASK(1, 0, 3, 2))
+	vext.8		q2, q2, q2, #8
+	// x3 = shuffle32(x3, MASK(0, 3, 2, 1))
+	vext.8		q3, q3, q3, #4
+
+	subs		r3, r3, #1
+	bne		.Ldoubleround
+
+	add		ip, r2, #0x20
+	vld1.8		{q4-q5}, [r2]
+	vld1.8		{q6-q7}, [ip]
+
+	// o0 = i0 ^ (x0 + s0)
+	vadd.i32	q0, q0, q8
+	veor		q0, q0, q4
+
+	// o1 = i1 ^ (x1 + s1)
+	vadd.i32	q1, q1, q9
+	veor		q1, q1, q5
+
+	// o2 = i2 ^ (x2 + s2)
+	vadd.i32	q2, q2, q10
+	veor		q2, q2, q6
+
+	// o3 = i3 ^ (x3 + s3)
+	vadd.i32	q3, q3, q11
+	veor		q3, q3, q7
+
+	add		ip, r1, #0x20
+	vst1.8		{q0-q1}, [r1]
+	vst1.8		{q2-q3}, [ip]
+
+	bx		lr
+ENDPROC(chacha20_block_xor_neon)
+
+	.align		5
+ENTRY(chacha20_4block_xor_neon)
+	push		{r4-r6, lr}
+	mov		ip, sp			// preserve the stack pointer
+	sub		r3, sp, #0x20		// allocate a 32 byte buffer
+	bic		r3, r3, #0x1f		// aligned to 32 bytes
+	mov		sp, r3
+
+	// r0: Input state matrix, s
+	// r1: 4 data blocks output, o
+	// r2: 4 data blocks input, i
+
+	//
+	// This function encrypts four consecutive ChaCha20 blocks by loading
+	// the state matrix in NEON registers four times. The algorithm performs
+	// each operation on the corresponding word of each state matrix, hence
+	// requires no word shuffling. For final XORing step we transpose the
+	// matrix by interleaving 32- and then 64-bit words, which allows us to
+	// do XOR in NEON registers.
+	//
+
+	// x0..15[0-3] = s0..3[0..3]
+	add		r3, r0, #0x20
+	vld1.32		{q0-q1}, [r0]
+	vld1.32		{q2-q3}, [r3]
+
+	adr		r3, CTRINC
+	vdup.32		q15, d7[1]
+	vdup.32		q14, d7[0]
+	vld1.32		{q11}, [r3, :128]
+	vdup.32		q13, d6[1]
+	vdup.32		q12, d6[0]
+	vadd.i32	q12, q12, q11		// x12 += counter values 0-3
+	vdup.32		q11, d5[1]
+	vdup.32		q10, d5[0]
+	vdup.32		q9, d4[1]
+	vdup.32		q8, d4[0]
+	vdup.32		q7, d3[1]
+	vdup.32		q6, d3[0]
+	vdup.32		q5, d2[1]
+	vdup.32		q4, d2[0]
+	vdup.32		q3, d1[1]
+	vdup.32		q2, d1[0]
+	vdup.32		q1, d0[1]
+	vdup.32		q0, d0[0]
+
+	mov		r3, #10
+
+.Ldoubleround4:
+	// x0 += x4, x12 = rotl32(x12 ^ x0, 16)
+	// x1 += x5, x13 = rotl32(x13 ^ x1, 16)
+	// x2 += x6, x14 = rotl32(x14 ^ x2, 16)
+	// x3 += x7, x15 = rotl32(x15 ^ x3, 16)
+	vadd.i32	q0, q0, q4
+	vadd.i32	q1, q1, q5
+	vadd.i32	q2, q2, q6
+	vadd.i32	q3, q3, q7
+
+	veor		q12, q12, q0
+	veor		q13, q13, q1
+	veor		q14, q14, q2
+	veor		q15, q15, q3
+
+	vrev32.16	q12, q12
+	vrev32.16	q13, q13
+	vrev32.16	q14, q14
+	vrev32.16	q15, q15
+
+	// x8 += x12, x4 = rotl32(x4 ^ x8, 12)
+	// x9 += x13, x5 = rotl32(x5 ^ x9, 12)
+	// x10 += x14, x6 = rotl32(x6 ^ x10, 12)
+	// x11 += x15, x7 = rotl32(x7 ^ x11, 12)
+	vadd.i32	q8, q8, q12
+	vadd.i32	q9, q9, q13
+	vadd.i32	q10, q10, q14
+	vadd.i32	q11, q11, q15
+
+	vst1.32		{q8-q9}, [sp, :256]
+
+	veor		q8, q4, q8
+	veor		q9, q5, q9
+	vshl.u32	q4, q8, #12
+	vshl.u32	q5, q9, #12
+	vsri.u32	q4, q8, #20
+	vsri.u32	q5, q9, #20
+
+	veor		q8, q6, q10
+	veor		q9, q7, q11
+	vshl.u32	q6, q8, #12
+	vshl.u32	q7, q9, #12
+	vsri.u32	q6, q8, #20
+	vsri.u32	q7, q9, #20
+
+	// x0 += x4, x12 = rotl32(x12 ^ x0, 8)
+	// x1 += x5, x13 = rotl32(x13 ^ x1, 8)
+	// x2 += x6, x14 = rotl32(x14 ^ x2, 8)
+	// x3 += x7, x15 = rotl32(x15 ^ x3, 8)
+	vadd.i32	q0, q0, q4
+	vadd.i32	q1, q1, q5
+	vadd.i32	q2, q2, q6
+	vadd.i32	q3, q3, q7
+
+	veor		q8, q12, q0
+	veor		q9, q13, q1
+	vshl.u32	q12, q8, #8
+	vshl.u32	q13, q9, #8
+	vsri.u32	q12, q8, #24
+	vsri.u32	q13, q9, #24
+
+	veor		q8, q14, q2
+	veor		q9, q15, q3
+	vshl.u32	q14, q8, #8
+	vshl.u32	q15, q9, #8
+	vsri.u32	q14, q8, #24
+	vsri.u32	q15, q9, #24
+
+	vld1.32		{q8-q9}, [sp, :256]
+
+	// x8 += x12, x4 = rotl32(x4 ^ x8, 7)
+	// x9 += x13, x5 = rotl32(x5 ^ x9, 7)
+	// x10 += x14, x6 = rotl32(x6 ^ x10, 7)
+	// x11 += x15, x7 = rotl32(x7 ^ x11, 7)
+	vadd.i32	q8, q8, q12
+	vadd.i32	q9, q9, q13
+	vadd.i32	q10, q10, q14
+	vadd.i32	q11, q11, q15
+
+	vst1.32		{q8-q9}, [sp, :256]
+
+	veor		q8, q4, q8
+	veor		q9, q5, q9
+	vshl.u32	q4, q8, #7
+	vshl.u32	q5, q9, #7
+	vsri.u32	q4, q8, #25
+	vsri.u32	q5, q9, #25
+
+	veor		q8, q6, q10
+	veor		q9, q7, q11
+	vshl.u32	q6, q8, #7
+	vshl.u32	q7, q9, #7
+	vsri.u32	q6, q8, #25
+	vsri.u32	q7, q9, #25
+
+	vld1.32		{q8-q9}, [sp, :256]
+
+	// x0 += x5, x15 = rotl32(x15 ^ x0, 16)
+	// x1 += x6, x12 = rotl32(x12 ^ x1, 16)
+	// x2 += x7, x13 = rotl32(x13 ^ x2, 16)
+	// x3 += x4, x14 = rotl32(x14 ^ x3, 16)
+	vadd.i32	q0, q0, q5
+	vadd.i32	q1, q1, q6
+	vadd.i32	q2, q2, q7
+	vadd.i32	q3, q3, q4
+
+	veor		q15, q15, q0
+	veor		q12, q12, q1
+	veor		q13, q13, q2
+	veor		q14, q14, q3
+
+	vrev32.16	q15, q15
+	vrev32.16	q12, q12
+	vrev32.16	q13, q13
+	vrev32.16	q14, q14
+
+	// x10 += x15, x5 = rotl32(x5 ^ x10, 12)
+	// x11 += x12, x6 = rotl32(x6 ^ x11, 12)
+	// x8 += x13, x7 = rotl32(x7 ^ x8, 12)
+	// x9 += x14, x4 = rotl32(x4 ^ x9, 12)
+	vadd.i32	q10, q10, q15
+	vadd.i32	q11, q11, q12
+	vadd.i32	q8, q8, q13
+	vadd.i32	q9, q9, q14
+
+	vst1.32		{q8-q9}, [sp, :256]
+
+	veor		q8, q7, q8
+	veor		q9, q4, q9
+	vshl.u32	q7, q8, #12
+	vshl.u32	q4, q9, #12
+	vsri.u32	q7, q8, #20
+	vsri.u32	q4, q9, #20
+
+	veor		q8, q5, q10
+	veor		q9, q6, q11
+	vshl.u32	q5, q8, #12
+	vshl.u32	q6, q9, #12
+	vsri.u32	q5, q8, #20
+	vsri.u32	q6, q9, #20
+
+	// x0 += x5, x15 = rotl32(x15 ^ x0, 8)
+	// x1 += x6, x12 = rotl32(x12 ^ x1, 8)
+	// x2 += x7, x13 = rotl32(x13 ^ x2, 8)
+	// x3 += x4, x14 = rotl32(x14 ^ x3, 8)
+	vadd.i32	q0, q0, q5
+	vadd.i32	q1, q1, q6
+	vadd.i32	q2, q2, q7
+	vadd.i32	q3, q3, q4
+
+	veor		q8, q15, q0
+	veor		q9, q12, q1
+	vshl.u32	q15, q8, #8
+	vshl.u32	q12, q9, #8
+	vsri.u32	q15, q8, #24
+	vsri.u32	q12, q9, #24
+
+	veor		q8, q13, q2
+	veor		q9, q14, q3
+	vshl.u32	q13, q8, #8
+	vshl.u32	q14, q9, #8
+	vsri.u32	q13, q8, #24
+	vsri.u32	q14, q9, #24
+
+	vld1.32		{q8-q9}, [sp, :256]
+
+	// x10 += x15, x5 = rotl32(x5 ^ x10, 7)
+	// x11 += x12, x6 = rotl32(x6 ^ x11, 7)
+	// x8 += x13, x7 = rotl32(x7 ^ x8, 7)
+	// x9 += x14, x4 = rotl32(x4 ^ x9, 7)
+	vadd.i32	q10, q10, q15
+	vadd.i32	q11, q11, q12
+	vadd.i32	q8, q8, q13
+	vadd.i32	q9, q9, q14
+
+	vst1.32		{q8-q9}, [sp, :256]
+
+	veor		q8, q7, q8
+	veor		q9, q4, q9
+	vshl.u32	q7, q8, #7
+	vshl.u32	q4, q9, #7
+	vsri.u32	q7, q8, #25
+	vsri.u32	q4, q9, #25
+
+	veor		q8, q5, q10
+	veor		q9, q6, q11
+	vshl.u32	q5, q8, #7
+	vshl.u32	q6, q9, #7
+	vsri.u32	q5, q8, #25
+	vsri.u32	q6, q9, #25
+
+	subs		r3, r3, #1
+	beq		0f
+
+	vld1.32		{q8-q9}, [sp, :256]
+	b		.Ldoubleround4
+
+	// x0[0-3] += s0[0]
+	// x1[0-3] += s0[1]
+	// x2[0-3] += s0[2]
+	// x3[0-3] += s0[3]
+0:	ldmia		r0!, {r3-r6}
+	vdup.32		q8, r3
+	vdup.32		q9, r4
+	vadd.i32	q0, q0, q8
+	vadd.i32	q1, q1, q9
+	vdup.32		q8, r5
+	vdup.32		q9, r6
+	vadd.i32	q2, q2, q8
+	vadd.i32	q3, q3, q9
+
+	// x4[0-3] += s1[0]
+	// x5[0-3] += s1[1]
+	// x6[0-3] += s1[2]
+	// x7[0-3] += s1[3]
+	ldmia		r0!, {r3-r6}
+	vdup.32		q8, r3
+	vdup.32		q9, r4
+	vadd.i32	q4, q4, q8
+	vadd.i32	q5, q5, q9
+	vdup.32		q8, r5
+	vdup.32		q9, r6
+	vadd.i32	q6, q6, q8
+	vadd.i32	q7, q7, q9
+
+	// interleave 32-bit words in state n, n+1
+	vzip.32		q0, q1
+	vzip.32		q2, q3
+	vzip.32		q4, q5
+	vzip.32		q6, q7
+
+	// interleave 64-bit words in state n, n+2
+	vswp		d1, d4
+	vswp		d3, d6
+	vswp		d9, d12
+	vswp		d11, d14
+
+	// xor with corresponding input, write to output
+	vld1.8		{q8-q9}, [r2]!
+	veor		q8, q8, q0
+	veor		q9, q9, q4
+	vst1.8		{q8-q9}, [r1]!
+
+	vld1.32		{q8-q9}, [sp, :256]
+
+	// x8[0-3] += s2[0]
+	// x9[0-3] += s2[1]
+	// x10[0-3] += s2[2]
+	// x11[0-3] += s2[3]
+	ldmia		r0!, {r3-r6}
+	vdup.32		q0, r3
+	vdup.32		q4, r4
+	vadd.i32	q8, q8, q0
+	vadd.i32	q9, q9, q4
+	vdup.32		q0, r5
+	vdup.32		q4, r6
+	vadd.i32	q10, q10, q0
+	vadd.i32	q11, q11, q4
+
+	// x12[0-3] += s3[0]
+	// x13[0-3] += s3[1]
+	// x14[0-3] += s3[2]
+	// x15[0-3] += s3[3]
+	ldmia		r0!, {r3-r6}
+	vdup.32		q0, r3
+	vdup.32		q4, r4
+	adr		r3, CTRINC
+	vadd.i32	q12, q12, q0
+	vld1.32		{q0}, [r3, :128]
+	vadd.i32	q13, q13, q4
+	vadd.i32	q12, q12, q0		// x12 += counter values 0-3
+
+	vdup.32		q0, r5
+	vdup.32		q4, r6
+	vadd.i32	q14, q14, q0
+	vadd.i32	q15, q15, q4
+
+	// interleave 32-bit words in state n, n+1
+	vzip.32		q8, q9
+	vzip.32		q10, q11
+	vzip.32		q12, q13
+	vzip.32		q14, q15
+
+	// interleave 64-bit words in state n, n+2
+	vswp		d17, d20
+	vswp		d19, d22
+	vswp		d25, d28
+	vswp		d27, d30
+
+	vmov		q4, q1
+
+	vld1.8		{q0-q1}, [r2]!
+	veor		q0, q0, q8
+	veor		q1, q1, q12
+	vst1.8		{q0-q1}, [r1]!
+
+	vld1.8		{q0-q1}, [r2]!
+	veor		q0, q0, q2
+	veor		q1, q1, q6
+	vst1.8		{q0-q1}, [r1]!
+
+	vld1.8		{q0-q1}, [r2]!
+	veor		q0, q0, q10
+	veor		q1, q1, q14
+	vst1.8		{q0-q1}, [r1]!
+
+	vld1.8		{q0-q1}, [r2]!
+	veor		q0, q0, q4
+	veor		q1, q1, q5
+	vst1.8		{q0-q1}, [r1]!
+
+	vld1.8		{q0-q1}, [r2]!
+	veor		q0, q0, q9
+	veor		q1, q1, q13
+	vst1.8		{q0-q1}, [r1]!
+
+	vld1.8		{q0-q1}, [r2]!
+	veor		q0, q0, q3
+	veor		q1, q1, q7
+	vst1.8		{q0-q1}, [r1]!
+
+	vld1.8		{q0-q1}, [r2]
+	veor		q0, q0, q11
+	veor		q1, q1, q15
+	vst1.8		{q0-q1}, [r1]
+
+	mov		sp, ip
+	pop		{r4-r6, pc}
+ENDPROC(chacha20_4block_xor_neon)
+
+	.align		4
+CTRINC:	.word		0, 1, 2, 3
+
diff --git a/arch/arm/crypto/chacha20-neon-glue.c b/arch/arm/crypto/chacha20-neon-glue.c
new file mode 100644
index 000000000000..20623237e2bc
--- /dev/null
+++ b/arch/arm/crypto/chacha20-neon-glue.c
@@ -0,0 +1,127 @@
+/*
+ * ChaCha20 256-bit cipher algorithm, RFC7539, ARM NEON functions
+ *
+ * Copyright (C) 2016 Linaro, Ltd. <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * Based on:
+ * ChaCha20 256-bit cipher algorithm, RFC7539, SIMD glue code
+ *
+ * Copyright (C) 2015 Martin Willi
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <crypto/algapi.h>
+#include <crypto/chacha20.h>
+#include <crypto/internal/skcipher.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+
+#include <asm/hwcap.h>
+#include <asm/neon.h>
+#include <asm/simd.h>
+
+asmlinkage void chacha20_block_xor_neon(u32 *state, u8 *dst, const u8 *src);
+asmlinkage void chacha20_4block_xor_neon(u32 *state, u8 *dst, const u8 *src);
+
+static void chacha20_doneon(u32 *state, u8 *dst, const u8 *src,
+			    unsigned int bytes)
+{
+	u8 buf[CHACHA20_BLOCK_SIZE];
+
+	while (bytes >= CHACHA20_BLOCK_SIZE * 4) {
+		chacha20_4block_xor_neon(state, dst, src);
+		bytes -= CHACHA20_BLOCK_SIZE * 4;
+		src += CHACHA20_BLOCK_SIZE * 4;
+		dst += CHACHA20_BLOCK_SIZE * 4;
+		state[12] += 4;
+	}
+	while (bytes >= CHACHA20_BLOCK_SIZE) {
+		chacha20_block_xor_neon(state, dst, src);
+		bytes -= CHACHA20_BLOCK_SIZE;
+		src += CHACHA20_BLOCK_SIZE;
+		dst += CHACHA20_BLOCK_SIZE;
+		state[12]++;
+	}
+	if (bytes) {
+		memcpy(buf, src, bytes);
+		chacha20_block_xor_neon(state, buf, buf);
+		memcpy(dst, buf, bytes);
+	}
+}
+
+static int chacha20_neon(struct skcipher_request *req)
+{
+	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+	struct chacha20_ctx *ctx = crypto_skcipher_ctx(tfm);
+	struct skcipher_walk walk;
+	u32 state[16];
+	int err;
+
+	if (req->cryptlen <= CHACHA20_BLOCK_SIZE || !may_use_simd())
+		return crypto_chacha20_crypt(req);
+
+	err = skcipher_walk_virt(&walk, req, true);
+
+	crypto_chacha20_init(state, ctx, walk.iv);
+
+	kernel_neon_begin();
+	while (walk.nbytes > 0) {
+		unsigned int nbytes = walk.nbytes;
+
+		if (nbytes < walk.total)
+			nbytes = round_down(nbytes, walk.chunksize);
+
+		chacha20_doneon(state, walk.dst.virt.addr, walk.src.virt.addr,
+				nbytes);
+		err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
+	}
+	kernel_neon_end();
+
+	return err;
+}
+
+static struct skcipher_alg alg = {
+	.base.cra_name		= "chacha20",
+	.base.cra_driver_name	= "chacha20-neon",
+	.base.cra_priority	= 300,
+	.base.cra_blocksize	= 1,
+	.base.cra_ctxsize	= sizeof(struct chacha20_ctx),
+	.base.cra_alignmask	= 1,
+	.base.cra_module	= THIS_MODULE,
+
+	.min_keysize		= CHACHA20_KEY_SIZE,
+	.max_keysize		= CHACHA20_KEY_SIZE,
+	.ivsize			= CHACHA20_IV_SIZE,
+	.chunksize		= 4 * CHACHA20_BLOCK_SIZE,
+	.setkey			= crypto_chacha20_setkey,
+	.encrypt		= chacha20_neon,
+	.decrypt		= chacha20_neon,
+};
+
+static int __init chacha20_simd_mod_init(void)
+{
+	if (!(elf_hwcap & HWCAP_NEON))
+		return -ENODEV;
+
+	return crypto_register_skcipher(&alg);
+}
+
+static void __exit chacha20_simd_mod_fini(void)
+{
+	crypto_unregister_skcipher(&alg);
+}
+
+module_init(chacha20_simd_mod_init);
+module_exit(chacha20_simd_mod_fini);
+
+MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
+MODULE_LICENSE("GPL v2");
+MODULE_ALIAS_CRYPTO("chacha20");
-- 
2.7.4

^ permalink raw reply related

* [PATCH v8 03/16] irq: move IRQ routing into irq.c
From: Marc Zyngier @ 2016-12-09 14:41 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161104173203.21168-4-andre.przywara@arm.com>

On 04/11/16 17:31, Andre Przywara wrote:
> The current IRQ routing code in x86/irq.c is mostly implementing a
> generic KVM interface which other architectures may use too.
> Move the code to set up an MSI route into the generic irq.c file and
> guard it with the KVM_CAP_IRQ_ROUTING capability to return an error
> if the kernel does not support interrupt routing.
> This also removes the dummy implementations for all other
> architectures and only leaves the x86 specific code in x86/irq.c.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>

Acked-by: Marc Zyngier <marc.zyngier@arm.com>

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply

* [PATCH v2] arm64: mm: Fix memmap to be initialized for the entire section
From: Robert Richter @ 2016-12-09 14:51 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <bc099ca6-d927-3eb6-438c-291a66ecf5f9@huawei.com>

On 09.12.16 20:14:24, Yisheng Xie wrote:
> We have merged your patch to 4.9.0-rc8, however we still meet the similar problem
> on our D05 board:

I assume you can reliable trigger the bug. Can you add some debug
messages that show the pfn number, node id, memory range. Also, add to
kernel parameters:

 memblock=debug efi=debug loglevel=8 mminit_loglevel=4

We need the mem layout detected by efi and the memblock regions. This
should be in the full boot log then.

Thanks,

-Robert

> -----------------
> [    5.081971] ------------[ cut here ]------------
> [    5.086668] kernel BUG at mm/page_alloc.c:1863!
> [    5.091281] Internal error: Oops - BUG: 0 [#1] SMP
> [    5.096159] Modules linked in:
> [    5.099265] CPU: 61 PID: 1 Comm: swapper/0 Not tainted 4.9.0-rc8+ #58
> [    5.105822] Hardware name: Huawei Taishan 2280 /D05, BIOS Hisilicon D05 UEFI 16.08 RC1 12/08/2016
> [    5.114860] task: fffffe13f23d7400 task.stack: fffffe13f66f8000
> [    5.120903] PC is at move_freepages+0x170/0x180
> [    5.125514] LR is at move_freepages_block+0xa8/0xb8
> [    5.130479] pc : [<fffffc00081e9588>] lr : [<fffffc00081e9640>] pstate: 200000c5
> [    5.138008] sp : fffffe13f66fb910
> [    5.141375] x29: fffffe13f66fb910 x28: 00000000ffffff80
> [    5.146784] x27: fffffdff800b0000 x26: fffffe13fbf62b90
> [    5.152193] x25: 000000000000000a x24: fffffdff800b0020
> [    5.157598] x23: 0000000000000000 x22: fffffdff800b0000
> [    5.163006] x21: fffffdff800fffc0 x20: fffffe13fbf62680
> [    5.168412] x19: fffffdff80080000 x18: 000000004c4d6d26
> [    5.173820] x17: 0000000000000000 x16: 0000000000000000
> [    5.179226] x15: 000000005c589429 x14: 0000000000000000
> [    5.184634] x13: 0000000000000000 x12: 00000000fe2ce6e0
> [    5.190039] x11: 00000000bb5b525b x10: 00000000bb48417b
> [    5.195446] x9 : 0000000000000068 x8 : 0000000000000000
> [    5.200854] x7 : fffffe13fbff2680 x6 : 0000000000000001
> [    5.206260] x5 : fffffc0008e12328 x4 : 0000000000000000
> [    5.211667] x3 : fffffe13fbf62680 x2 : 0000000000000000
> [    5.217073] x1 : fffffe13fbff2680 x0 : fffffe13fbf62680
> [...]
> [    5.768991] [<fffffc00081e9588>] move_freepages+0x170/0x180
> [    5.774664] [<fffffc00081e9640>] move_freepages_block+0xa8/0xb8
> [    5.780691] [<fffffc00081e9bbc>] __rmqueue+0x494/0x5f0
> [    5.785921] [<fffffc00081eb10c>] get_page_from_freelist+0x5ec/0xb58
> [    5.792302] [<fffffc00081ebc4c>] __alloc_pages_nodemask+0x144/0xd08
> [    5.798687] [<fffffc0008240514>] alloc_page_interleave+0x64/0xc0
> [    5.804801] [<fffffc0008240b20>] alloc_pages_current+0x108/0x168
> [    5.810920] [<fffffc0008c75410>] atomic_pool_init+0x78/0x1cc
> [    5.816680] [<fffffc0008c755a0>] arm64_dma_init+0x3c/0x44
> [    5.822180] [<fffffc0008082d94>] do_one_initcall+0x44/0x138
> [    5.827853] [<fffffc0008c70d54>] kernel_init_freeable+0x1ec/0x28c
> [    5.834060] [<fffffc00088a79f0>] kernel_init+0x18/0x110
> [    5.839378] [<fffffc0008082b30>] ret_from_fork+0x10/0x20
> [    5.844785] Code: 9108a021 9400afc5 d4210000 d503201f (d4210000)
> [    5.851024] ---[ end trace 3382df1ae82057db ]---

^ permalink raw reply

* [PATCH v2] arm64: mm: Fix memmap to be initialized for the entire section
From: Robert Richter @ 2016-12-09 14:52 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <ffaa77c7-8982-524f-f624-e4463a463daf@huawei.com>

On 09.12.16 21:15:12, Yisheng Xie wrote:
> For invalid pages, their zone and node information is not initialized, and it
> do have risk to trigger the BUG_ON, so I have a silly question,
> why not just change the BUG_ON:

We need to get the page handling correct. Modifying the BUG_ON() just
hides that something is wrong.

-Robert

> -----------
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 6de9440..af199b8 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1860,12 +1860,13 @@ int move_freepages(struct zone *zone,
>          * Remove at a later date when no bug reports exist related to
>          * grouping pages by mobility
>          */
> -       VM_BUG_ON(page_zone(start_page) != page_zone(end_page));
> +       VM_BUG_ON(early_pfn_valid(start_page) && early_pfn_valid(end_page) &&
> +                       page_zone(start_page) != page_zone(end_page));
>  #endif
> 
>         for (page = start_page; page <= end_page;) {
>                 /* Make sure we are not inadvertently changing nodes */
> -               VM_BUG_ON_PAGE(page_to_nid(page) != zone_to_nid(zone), page);
> +               VM_BUG_ON_PAGE(early_pfn_valid(page) && (page_to_nid(page) != zone_to_nid(zone)), page);
> 
>                 if (!pfn_valid_within(page_to_pfn(page))) {
>                         page++;

^ permalink raw reply

* [PATCH 1/3] arm: hisi: add ARCH_MULTI_V5 support
From: Marty Plummer @ 2016-12-09 15:07 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <2fcf5a5f-8935-c4f0-a034-c8a72dc2be0f@hisilicon.com>

On 12/04/2016 08:03 PM, Jiancheng Xue wrote:
> Hi Arnd,
> 
> On 2016/10/17 21:48, Arnd Bergmann wrote:
>> On Monday, October 17, 2016 8:07:03 PM CEST Pan Wen wrote:
>>> Add support for some HiSilicon SoCs which depend on ARCH_MULTI_V5.
>>>
>>> Signed-off-by: Pan Wen <wenpan@hisilicon.com>
>>>
>>
>> Looks ok. I've added Marty Plummer to Cc, he was recently proposing
>> patches for Hi3520, which I think is closely related to this one.
>> Please try to work together so the patches don't conflict. It should
>> be fairly straightforward since you are basically doing the same
>> change here.
>>
> Marty hasn't give any replies about this thread until now. I reviewed
> the patch for Hi3520. And I think this patch won't conflict with Hi3520.
> Could you help us to ack this patch?
> 
> Thanks,
> Jiancheng
> 
> 
Hello all

Sorry for my lack of activity, I've just been very busy lately with real
world considerations (well, real world but related to this; I have
another board based on hi3521a I've been tinkering with, trying to get
the manuf. to release gpl source via the sfconfservancy). I've not given
up on the project, however, since devices like this really need updates
in light of the recent botnets targeting devices of this sort as
manpower.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: OpenPGP digital signature
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20161209/978fa980/attachment.sig>

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox