Linux-ARM-Kernel Archive on lore.kernel.org
 help / color / mirror / Atom feed
* mmc: core: complete/wait_for_completion performance
From: Jörg Krause @ 2016-12-15 13:50 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <962480933.163666.1481741832090@email.1und1.de>

Hi Stefan,

On Wed, 2016-12-14 at 19:57 +0100, Stefan Wahren wrote:
> Hi J?rg,
> 

[snip]

> > > 
> > > did you try cyclictest [1]?
> > 
> > Not yet. Not sure what to measure and which values to compare here.
> 
> i tought you have the vendor kernel and the mainline kernel available
> for your platform.
> 
> So you could compare the both kernels.

Yes, that's right. I will have a look at this tool.

> > 
> > > 
> > > Beside the time for a request the amount of requests for the
> > > complete
> > > iperf test
> > > would we interesting. Maybe there are retries.
> > > 
> > > I'm still interested in your PIO mode patches for mxs-mmc even
> > > without clean up.
> > 
> > Actually, the patch does not implement a PIO mode, but drops DMA
> > and
> > uses polling instead. I've attached the patch.
> 
> Thanks. I applied it, but unfortunately this breaks SD card support
> for my Duckbill and the kernel isn't able to mount the rootfs:
> 
> [????2.267073] mxs-mmc 80010000.ssp: initialized
> [????2.272624] mxs-mmc 80010000.ssp: AC command error 0xffffff92

Sorry, I messed up the branches. I attached the correct patch which is
working for me on Linux v4.9.

J?rg
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0200-mmc-mxs-mmc-use-PIO-mode.patch
Type: text/x-patch
Size: 15450 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20161215/b5c33807/attachment-0001.bin>

^ permalink raw reply

* [PATCH v2] ARM: dts: Add missing CPU frequencies for Exynos5422/5800
From: Markus Reichl @ 2016-12-15 13:55 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <10512254.nyUcL0zgTP@amdc3058>

Hi Bartlomiej,

Am 15.12.2016 um 12:55 schrieb Bartlomiej Zolnierkiewicz:
> Add missing 2000MHz & 1900MHz OPPs (for A15 cores) and 1400MHz OPP
> (for A7 cores).  Also update common Odroid-XU3 Lite/XU3/XU4 thermal
> cooling maps to account for new OPPs.
> 
> Since new OPPs are not available on all Exynos5422/5800 boards modify
> dts files for Odroid-XU3 Lite (limited to 1.8 GHz / 1.3 GHz) & Peach
> Pi (limited to 2.0 GHz / 1.3 GHz) accordingly.
> 
> Tested on Odroid-XU3 and XU3 Lite.
> 
> Cc: Doug Anderson <dianders@chromium.org>
> Cc: Javier Martinez Canillas <javier@osg.samsung.com>
> Cc: Andreas Faerber <afaerber@suse.de>
> Cc: Thomas Abraham <thomas.ab@samsung.com>
> Cc: Ben Gamari <ben@smart-cactus.org>
> Cc: Arjun K V <arjun.kv@samsung.com>
> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
> ---
> v2:
> - added comments about limitations of SoC revisions used by Odroid-XU3 Lite and
>   Peach Pi boards (suggested by Javier)
> - removed redundant opp_a7_14 label
> - added Arjun to Cc:
> 
> Javier, could you test it on Peach Pi board?
> 
>  arch/arm/boot/dts/exynos5422-odroidxu3-common.dtsi |   14 ++++++-------
>  arch/arm/boot/dts/exynos5422-odroidxu3-lite.dts    |   22 +++++++++++++++++++++
>  arch/arm/boot/dts/exynos5800-peach-pi.dts          |    9 ++++++++
>  arch/arm/boot/dts/exynos5800.dtsi                  |   15 ++++++++++++++
>  4 files changed, 53 insertions(+), 7 deletions(-)
> 
> Index: b/arch/arm/boot/dts/exynos5422-odroidxu3-common.dtsi
> ===================================================================
> --- a/arch/arm/boot/dts/exynos5422-odroidxu3-common.dtsi	2016-12-15 12:43:54.365955950 +0100
> +++ b/arch/arm/boot/dts/exynos5422-odroidxu3-common.dtsi	2016-12-15 12:43:54.361955949 +0100
> @@ -118,7 +118,7 @@
>  				/*
>  				 * When reaching cpu_alert3, reduce CPU
>  				 * by 2 steps. On Exynos5422/5800 that would
> -				 * be: 1600 MHz and 1100 MHz.
> +				 * (usually) be: 1800 MHz and 1200 MHz.
>  				 */
>  				map3 {
>  					trip = <&cpu_alert3>;
> @@ -131,16 +131,16 @@
>  
>  				/*
>  				 * When reaching cpu_alert4, reduce CPU
> -				 * further, down to 600 MHz (11 steps for big,
> -				 * 7 steps for LITTLE).
> +				 * further, down to 600 MHz (13 steps for big,
> +				 * 8 steps for LITTLE).
>  				 */
> -				map5 {
> +				cooling_map5: map5 {
>  					trip = <&cpu_alert4>;
> -					cooling-device = <&cpu0 3 7>;
> +					cooling-device = <&cpu0 3 8>;
>  				};
> -				map6 {
> +				cooling_map6: map6 {
>  					trip = <&cpu_alert4>;
> -					cooling-device = <&cpu4 3 11>;
> +					cooling-device = <&cpu4 3 13>;
>  				};
>  			};
>  		};
> Index: b/arch/arm/boot/dts/exynos5422-odroidxu3-lite.dts
> ===================================================================
> --- a/arch/arm/boot/dts/exynos5422-odroidxu3-lite.dts	2016-12-15 12:43:54.365955950 +0100
> +++ b/arch/arm/boot/dts/exynos5422-odroidxu3-lite.dts	2016-12-15 12:43:54.361955949 +0100
> @@ -21,6 +21,28 @@
>  	compatible = "hardkernel,odroid-xu3-lite", "samsung,exynos5800", "samsung,exynos5";
>  };
>  
> +/*
> + * Odroid XU3-Lite board uses SoC revision with lower maximum frequencies
> + * than Odroid XU3/XU4 boards: 1.8 GHz for A15 cores & 1.3 GHz for A7 cores.
> + * Therefore we need to update OPPs tables and thermal maps accordingly.
> + */
> +&cluster_a15_opp_table {
> +	/delete-node/opp at 2000000000;
> +	/delete-node/opp at 1900000000;
> +};
> +
> +&cluster_a7_opp_table {
> +	/delete-node/opp at 1400000000;
> +};
> +
> +&cooling_map5 {
> +	cooling-device = <&cpu0 3 7>;
> +};
> +
> +&cooling_map6 {
> +	cooling-device = <&cpu4 3 11>;
> +};
> +
>  &pwm {
>  	/*
>  	 * PWM 0 -- fan
> Index: b/arch/arm/boot/dts/exynos5800-peach-pi.dts
> ===================================================================
> --- a/arch/arm/boot/dts/exynos5800-peach-pi.dts	2016-12-15 12:43:54.365955950 +0100
> +++ b/arch/arm/boot/dts/exynos5800-peach-pi.dts	2016-12-15 12:43:54.361955949 +0100
> @@ -146,6 +146,15 @@
>  	vdd-supply = <&ldo9_reg>;
>  };
>  
> +/*
> + * Peach Pi board uses SoC revision with lower maximum frequency for A7 cores
> + * (1.3 GHz instead of 1.4 GHz) than Odroid XU3/XU4 boards.  Thus we need to
> + * update A7 OPPs table accordingly.
> + */
> +&cluster_a7_opp_table {
> +	/delete-property/opp at 1400000000;
> +};
> +
>  &cpu0 {
>  	cpu-supply = <&buck2_reg>;
>  };
> Index: b/arch/arm/boot/dts/exynos5800.dtsi
> ===================================================================
> --- a/arch/arm/boot/dts/exynos5800.dtsi	2016-12-15 12:43:54.365955950 +0100
> +++ b/arch/arm/boot/dts/exynos5800.dtsi	2016-12-15 12:43:54.361955949 +0100
> @@ -24,6 +24,16 @@
>  };
>  
>  &cluster_a15_opp_table {
> +	opp at 2000000000 {
> +		opp-hz = /bits/ 64 <2000000000>;
> +		opp-microvolt = <1250000>;
> +		clock-latency-ns = <140000>;
> +	};
> +	opp at 1900000000 {
> +		opp-hz = /bits/ 64 <1900000000>;
> +		opp-microvolt = <1250000>;
> +		clock-latency-ns = <140000>;
> +	};
>  	opp at 1700000000 {
>  		opp-microvolt = <1250000>;
>  	};
> @@ -85,6 +95,11 @@
>  };
>  
>  &cluster_a7_opp_table {
> +	opp at 1400000000 {
> +		opp-hz = /bits/ 64 <1400000000>;
> +		opp-microvolt = <1250000>;
> +		clock-latency-ns = <140000>;
> +	};
>  	opp at 1300000000 {
>  		opp-microvolt = <1250000>;
>  	};
> 

On Odroid XU4, XU3 and XU3-lite:

Tested-by: Markus Reichl <m.reichl@fivetechno.de>

Thanks,
--
Markus Reichl

^ permalink raw reply

* [PATCH 2/2] xilinx_dma: Add reset support
From: Laurent Pinchart @ 2016-12-15 13:56 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <380d1b11-814d-0164-3d49-e976f2deb3f9@synopsys.com>

Hi Ramiro,

(CC'ing Philipp Zabel)

On Thursday 15 Dec 2016 11:26:54 Ramiro Oliveira wrote:
> On 12/14/2016 8:16 PM, Laurent Pinchart wrote:
> > Hi Ramiro,
> > 
> > Thank you for the patch.
> > 
> > On Wednesday 14 Dec 2016 17:18:24 Ramiro Oliveira wrote:
> >> Add a DT property to control an optional external reset line
> >> 
> >> Signed-off-by: Ramiro Oliveira <roliveir@synopsys.com>
> >> ---
> >> 
> >>  drivers/dma/xilinx/xilinx_dma.c | 23 +++++++++++++++++++++++
> >>  1 file changed, 23 insertions(+)
> >> 
> >> diff --git a/drivers/dma/xilinx/xilinx_dma.c
> >> b/drivers/dma/xilinx/xilinx_dma.c index 5c9f11b..b845224 100644
> >> --- a/drivers/dma/xilinx/xilinx_dma.c
> >> +++ b/drivers/dma/xilinx/xilinx_dma.c
> >> @@ -46,6 +46,7 @@
> >>  #include <linux/slab.h>
> >>  #include <linux/clk.h>
> >>  #include <linux/io-64-nonatomic-lo-hi.h>
> >> +#include <linux/reset.h>
> > 
> > I had neatly sorted the header alphabetically until someone added clk.h
> > and io-64-nonatomic-lo-hi.h :-( Could you please move reset.h just before
> > slab.h ?
>
> Sure. Actually I was tempted to reorder it, but I decided not to do it. I'll
> do it now

Yeah, I'll sleep better tonight :-D

> >>  #include "../dmaengine.h"
> >> 
> >> @@ -409,6 +410,7 @@ struct xilinx_dma_device {
> >>  	struct clk *rxs_clk;
> >>  	u32 nr_channels;
> >>  	u32 chan_id;
> >> +	struct reset_control *rst;
> >>  };
> >>  
> >>  /* Macros */
> >> @@ -2543,6 +2545,27 @@ static int xilinx_dma_probe(struct platform_device
> >> *pdev) if (IS_ERR(xdev->regs))
> >>  		return PTR_ERR(xdev->regs);
> >> 
> >> +	xdev->rst = devm_reset_control_get_optional(&pdev->dev, "reset");
> > 
> > devm_reset_control_get_optional() is deprecated as explained in
> > linux/reset.h, you should use devm_reset_control_get_optional_exclusive()
> > or devm_reset_control_get_optional_shared() instead, as applicable.
> > 
> > This being said, I'm wondering why the optional versions of those
> > functions exist, as they do exactly the same as the non-optional versions.
> > The API feels wrong, it should have been modelled like the GPIO API. Feel
> > free to fix it if you want :-) But that's out of scope for this patch.
> 
> I missed the comment stating that devm_reset_control_get_optional() was
> deprecated.
> 
> I could fix it. Your sugestion is modelling these functions like the GPIO
> API?

I think it would be better for driver if the _get_optional functions would 
return an ERR_PTR() for errors and NULL when reset control is not available, 
and then have the rest of the reset controller API accept NULL as a no-op. 
Your implementation would then be

	xdev->rst = devm_reset_control_get_optional_exclusive(&pdev->dev,
							      "reset");
	if (IS_ERR(xdev->rst)) {
		err = PTR_ERR(xdev->rst);
		if (err != -EPROBE_DEFER)
			dev_err(xdev->dev, "error getting reset %d\n", err);
		return err;
	}

	err = reset_control_deassert(xdev->rst);
	if (err) {
		dev_err(xdev->dev, "failed to deassert reset: %d\n", err);
		return err;
	}

That requires modifying the reset controller API, so it's a bit out-of-scope, 
but if you could give it a go, it would be great.

-- 
Regards,

Laurent Pinchart

^ permalink raw reply

* [PATCH 1/2] media: s5p-cec: Remove unneeded linux/miscdevice.h include
From: Corentin Labbe @ 2016-12-15 14:03 UTC (permalink / raw)
  To: linux-arm-kernel

s5p-cec: does not use any miscdevice so this patch remove this
unnecessary inclusion.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
---
 drivers/staging/media/s5p-cec/exynos_hdmi_cec.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/staging/media/s5p-cec/exynos_hdmi_cec.h b/drivers/staging/media/s5p-cec/exynos_hdmi_cec.h
index 3e4fc7b..7d94535 100644
--- a/drivers/staging/media/s5p-cec/exynos_hdmi_cec.h
+++ b/drivers/staging/media/s5p-cec/exynos_hdmi_cec.h
@@ -14,7 +14,6 @@
 #define _EXYNOS_HDMI_CEC_H_ __FILE__
 
 #include <linux/regmap.h>
-#include <linux/miscdevice.h>
 #include "s5p_cec.h"
 
 void s5p_cec_set_divider(struct s5p_cec_dev *cec);
-- 
2.10.2

^ permalink raw reply related

* [PATCH 2/2] media: s5p-cec: Remove references to non-existent PLAT_S5P symbol
From: Corentin Labbe @ 2016-12-15 14:03 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161215140324.28986-1-clabbe.montjoie@gmail.com>

Commit d78c16ccde96 ("ARM: SAMSUNG: Remove remaining legacy code")
removed the Kconfig symbol PLAT_S5P.
This patch remove the last occurrence of this symbol.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
---
 drivers/staging/media/s5p-cec/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/media/s5p-cec/Kconfig b/drivers/staging/media/s5p-cec/Kconfig
index 0315fd7..cba4f8a 100644
--- a/drivers/staging/media/s5p-cec/Kconfig
+++ b/drivers/staging/media/s5p-cec/Kconfig
@@ -1,6 +1,6 @@
 config VIDEO_SAMSUNG_S5P_CEC
        tristate "Samsung S5P CEC driver"
-       depends on VIDEO_DEV && MEDIA_CEC && (PLAT_S5P || ARCH_EXYNOS || COMPILE_TEST)
+       depends on VIDEO_DEV && MEDIA_CEC && (ARCH_EXYNOS || COMPILE_TEST)
        ---help---
          This is a driver for Samsung S5P HDMI CEC interface. It uses the
          generic CEC framework interface.
-- 
2.10.2

^ permalink raw reply related

* [PATCH 1/3] dma: zx: rename zx296702_dma.c to zx_dma.c
From: Shawn Guo @ 2016-12-15 14:03 UTC (permalink / raw)
  To: linux-arm-kernel

From: Shawn Guo <shawn.guo@linaro.org>

ZTE ZX dma driver is not ZX296702 specific.  It works for not only
ZX296702 but also other ZTE ZX family platforms like ZX296718.  Let's
rename the file to reflect that.

Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
---
 drivers/dma/Kconfig                      | 4 ++--
 drivers/dma/Makefile                     | 2 +-
 drivers/dma/{zx296702_dma.c => zx_dma.c} | 0
 3 files changed, 3 insertions(+), 3 deletions(-)
 rename drivers/dma/{zx296702_dma.c => zx_dma.c} (100%)

diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
index 263495d0adbd..eab7e12ee4a6 100644
--- a/drivers/dma/Kconfig
+++ b/drivers/dma/Kconfig
@@ -571,12 +571,12 @@ config XILINX_ZYNQMP_DMA
 	  Enable support for Xilinx ZynqMP DMA controller.
 
 config ZX_DMA
-	tristate "ZTE ZX296702 DMA support"
+	tristate "ZTE ZX DMA support"
 	depends on ARCH_ZX || COMPILE_TEST
 	select DMA_ENGINE
 	select DMA_VIRTUAL_CHANNELS
 	help
-	  Support the DMA engine for ZTE ZX296702 platform devices.
+	  Support the DMA engine for ZTE ZX family platform devices.
 
 
 # driver files
diff --git a/drivers/dma/Makefile b/drivers/dma/Makefile
index a4fa3360e609..0b723e94d9e6 100644
--- a/drivers/dma/Makefile
+++ b/drivers/dma/Makefile
@@ -66,7 +66,7 @@ obj-$(CONFIG_TI_CPPI41) += cppi41.o
 obj-$(CONFIG_TI_DMA_CROSSBAR) += ti-dma-crossbar.o
 obj-$(CONFIG_TI_EDMA) += edma.o
 obj-$(CONFIG_XGENE_DMA) += xgene-dma.o
-obj-$(CONFIG_ZX_DMA) += zx296702_dma.o
+obj-$(CONFIG_ZX_DMA) += zx_dma.o
 obj-$(CONFIG_ST_FDMA) += st_fdma.o
 
 obj-y += qcom/
diff --git a/drivers/dma/zx296702_dma.c b/drivers/dma/zx_dma.c
similarity index 100%
rename from drivers/dma/zx296702_dma.c
rename to drivers/dma/zx_dma.c
-- 
1.9.1

^ permalink raw reply related

* [PATCH 2/3] dma: zx: set DMA_CYCLIC cap_mask bit
From: Shawn Guo @ 2016-12-15 14:03 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1481810617-7650-1-git-send-email-shawnguo@kernel.org>

From: Shawn Guo <shawn.guo@linaro.org>

The zx_dma driver supports cyclic transfer mode.  Let's set DMA_CYCLIC
cap_mask bit to make that clear, and avoid unnecessary failure when
clients request channel via dma_request_chan_by_mask() with DMA_CYCLIC
bit set in mask.

Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
---
 drivers/dma/zx_dma.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/dma/zx_dma.c b/drivers/dma/zx_dma.c
index 380276d078b2..33155c6816cc 100644
--- a/drivers/dma/zx_dma.c
+++ b/drivers/dma/zx_dma.c
@@ -812,6 +812,7 @@ static int zx_dma_probe(struct platform_device *op)
 	INIT_LIST_HEAD(&d->slave.channels);
 	dma_cap_set(DMA_SLAVE, d->slave.cap_mask);
 	dma_cap_set(DMA_MEMCPY, d->slave.cap_mask);
+	dma_cap_set(DMA_CYCLIC, d->slave.cap_mask);
 	dma_cap_set(DMA_PRIVATE, d->slave.cap_mask);
 	d->slave.dev = &op->dev;
 	d->slave.device_free_chan_resources = zx_dma_free_chan_resources;
-- 
1.9.1

^ permalink raw reply related

* [PATCH 3/3] dma: zx: fix residue calculation
From: Shawn Guo @ 2016-12-15 14:03 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1481810617-7650-1-git-send-email-shawnguo@kernel.org>

From: Shawn Guo <shawn.guo@linaro.org>

The dma residue is defined as the free space to end of transfer buffer,
which could be multiple segments chained together.  So the residue
calculation in zx_dma_tx_status() works for both slave_sg and cyclic
case.  But unfortunately, the 'index' is wrong.  It should plus one,
because the current segment is already occupied and shouldn't be counted
into free space.

This fixes the HDMI audio noise issue we see on ZX296718 with SPDIF
interface.

Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
---
 drivers/dma/zx_dma.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/dma/zx_dma.c b/drivers/dma/zx_dma.c
index 33155c6816cc..42ff3e66c1e1 100644
--- a/drivers/dma/zx_dma.c
+++ b/drivers/dma/zx_dma.c
@@ -365,7 +365,8 @@ static enum dma_status zx_dma_tx_status(struct dma_chan *chan,
 
 		bytes = 0;
 		clli = zx_dma_get_curr_lli(p);
-		index = (clli - ds->desc_hw_lli) / sizeof(struct zx_desc_hw);
+		index = (clli - ds->desc_hw_lli) /
+				sizeof(struct zx_desc_hw) + 1;
 		for (; index < ds->desc_num; index++) {
 			bytes += ds->desc_hw[index].src_x;
 			/* end of lli */
-- 
1.9.1

^ permalink raw reply related

* [PATCH V6 03/10] efi: parse ARM processor error
From: James Morse @ 2016-12-15 14:08 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1481147303-7979-4-git-send-email-tbaicar@codeaurora.org>

Hi Tyler,

On 07/12/16 21:48, Tyler Baicar wrote:
> Add support for ARM Common Platform Error Record (CPER).
> UEFI 2.6 specification adds support for ARM specific
> processor error information to be reported as part of the
> CPER records. This provides more detail on for processor error logs.

Looks good to me, a few minor comments below.

Reviewed-by: James Morse <james.morse@arm.com>


> diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
> index 8fa4e23..1ac2572 100644
> --- a/drivers/firmware/efi/cper.c
> +++ b/drivers/firmware/efi/cper.c
> @@ -184,6 +199,110 @@ static void cper_print_proc_generic(const char *pfx,
>  		printk("%s""IP: 0x%016llx\n", pfx, proc->ip);
>  }
>  
> +static void cper_print_proc_arm(const char *pfx,
> +				const struct cper_sec_proc_arm *proc)
> +{
> +	int i, len, max_ctx_type;
> +	struct cper_arm_err_info *err_info;
> +	struct cper_arm_ctx_info *ctx_info;
> +	char newpfx[64];
> +
> +	printk("%ssection length: %d\n", pfx, proc->section_length);

Compared to the rest of the file, this:
>	printk("%s""section length: %d\n", pfx, proc->section_length);

would be more in keeping. I guess its done this way to avoid some spurious
warning about %ssection not being recognised by printk().


> +	printk("%sMIDR: 0x%016llx\n", pfx, proc->midr);
> +
> +	len = proc->section_length - (sizeof(*proc) +
> +		proc->err_info_num * (sizeof(*err_info)));
> +	if (len < 0) {
> +		printk("%ssection length is too small\n", pfx);

This calculation is all based on values in the 'struct cper_sec_proc_arm', is it
worth making more noise about how the firmware-generated record is incorrectly
formatted? If we see this message its not the kernel's fault!


> +		printk("%sERR_INFO_NUM is %d\n", pfx, proc->err_info_num);
> +		return;
> +	}
> +
> +	if (proc->validation_bits & CPER_ARM_VALID_MPIDR)
> +		printk("%sMPIDR: 0x%016llx\n", pfx, proc->mpidr);
> +	if (proc->validation_bits & CPER_ARM_VALID_AFFINITY_LEVEL)
> +		printk("%serror affinity level: %d\n", pfx,
> +			proc->affinity_level);
> +	if (proc->validation_bits & CPER_ARM_VALID_RUNNING_STATE) {

> +		printk("%srunning state: %d\n", pfx, proc->running_state);

This field is described as a bit field in table 260, can we print it as 0x%lx in
case additional bits are set?


> +		printk("%sPSCI state: %d\n", pfx, proc->psci_state);
> +	}
> +
> +	snprintf(newpfx, sizeof(newpfx), "%s%s", pfx, INDENT_SP);
> +
> +	err_info = (struct cper_arm_err_info *)(proc + 1);
> +	for (i = 0; i < proc->err_info_num; i++) {
> +		printk("%sError info structure %d:\n", pfx, i);
> +		printk("%sversion:%d\n", newpfx, err_info->version);
> +		printk("%slength:%d\n", newpfx, err_info->length);
> +		if (err_info->validation_bits &
> +		    CPER_ARM_INFO_VALID_MULTI_ERR) {
> +			if (err_info->multiple_error == 0)
> +				printk("%ssingle error\n", newpfx);
> +			else if (err_info->multiple_error == 1)
> +				printk("%smultiple errors\n", newpfx);
> +			else

> +				printk("%smultiple errors count:%d\n",
> +				newpfx, err_info->multiple_error);

This is described as unsigned in table 261.


> +		}
> +		if (err_info->validation_bits & CPER_ARM_INFO_VALID_FLAGS) {
> +			if (err_info->flags & CPER_ARM_INFO_FLAGS_FIRST)
> +				printk("%sfirst error captured\n", newpfx);
> +			if (err_info->flags & CPER_ARM_INFO_FLAGS_LAST)
> +				printk("%slast error captured\n", newpfx);
> +			if (err_info->flags & CPER_ARM_INFO_FLAGS_PROPAGATED)
> +				printk("%spropagated error captured\n",
> +				       newpfx);

Table 261 also has an 'overflow' bit in flags. It may be worth printing a
warning if this is set:
> Note: Overflow bit indicates that firmware/hardware error
> buffers had experience an overflow, and it is possible that
> some error information has been lost.


> +		}
> +		printk("%serror_type: %d, %s\n", newpfx, err_info->type,
> +			err_info->type < ARRAY_SIZE(proc_error_type_strs) ?
> +			proc_error_type_strs[err_info->type] : "unknown");
> +		if (err_info->validation_bits & CPER_ARM_INFO_VALID_ERR_INFO)
> +			printk("%serror_info: 0x%016llx\n", newpfx,
> +			       err_info->error_info);
> +		if (err_info->validation_bits & CPER_ARM_INFO_VALID_VIRT_ADDR)
> +			printk("%svirtual fault address: 0x%016llx\n",
> +				newpfx, err_info->virt_fault_addr);
> +		if (err_info->validation_bits &
> +		    CPER_ARM_INFO_VALID_PHYSICAL_ADDR)
> +			printk("%sphysical fault address: 0x%016llx\n",
> +				newpfx, err_info->physical_fault_addr);
> +		err_info += 1;
> +	}
> +	ctx_info = (struct cper_arm_ctx_info *)err_info;

> +	max_ctx_type = (sizeof(arm_reg_ctx_strs) /
> +			sizeof(arm_reg_ctx_strs[0]) - 1);

ARRAY_SIZE() - 1?


> +	for (i = 0; i < proc->context_info_num; i++) {
> +		int size = sizeof(*ctx_info) + ctx_info->size;
> +
> +		printk("%sContext info structure %d:\n", pfx, i);
> +		if (len < size) {
> +			printk("%ssection length is too small\n", newpfx);
> +			return;
> +		}
> +		if (ctx_info->type > max_ctx_type) {
> +			printk("%sInvalid context type: %d\n",	newpfx,
> +						ctx_info->type);
> +			printk("%sMax context type: %d\n", newpfx,
> +						max_ctx_type);
> +			return;
> +		}
> +		printk("%sregister context type %d: %s\n", newpfx,
> +			ctx_info->type, arm_reg_ctx_strs[ctx_info->type]);
> +		print_hex_dump(newpfx, "", DUMP_PREFIX_OFFSET, 16, 4,
> +				(ctx_info + 1), ctx_info->size, 0);
> +		len -= size;
> +		ctx_info = (struct cper_arm_ctx_info *)((long)ctx_info + size);
> +	}
> +
> +	if (len > 0) {
> +		printk("%sVendor specific error info has %d bytes:\n", pfx,
> +		       len);

%u - just in case it is surprisingly large!


> +		print_hex_dump(newpfx, "", DUMP_PREFIX_OFFSET, 16, 4, ctx_info,
> +				len, 0);
> +	}
> +}
> +
>  static const char * const mem_err_type_strs[] = {
>  	"unknown",
>  	"no error",
> @@ -458,6 +577,15 @@ static void cper_estatus_print_section(
>  			cper_print_pcie(newpfx, pcie, gdata);
>  		else
>  			goto err_section_too_small;
> +	} else if (!uuid_le_cmp(*sec_type, CPER_SEC_PROC_ARM)) {
> +		struct cper_sec_proc_arm *arm_err;
> +
> +		arm_err = acpi_hest_generic_data_payload(gdata);
> +		printk("%ssection_type: ARM processor error\n", newpfx);
> +		if (gdata->error_data_length >= sizeof(*arm_err))
> +			cper_print_proc_arm(newpfx, arm_err);
> +		else
> +			goto err_section_too_small;
>  	} else
>  		printk("%s""section type: unknown, %pUl\n", newpfx, sec_type);
>  

This is the only processor-specific entry in this function,
CPER_SEC_PROC_{IA,IPF} don't appear anywhere else in the tree.

Is it worth adding an (IS_ENABLED(CONFIG_ARM64) || IS_ENABLED(CONFIG_ARM)) in
the if()? This would let the compiler remove cper_print_proc_arm(() on x86/ia64
systems which won't ever see a record of this type.


Thanks!

James

^ permalink raw reply

* [PATCH v2] i2c: designware: add reset interface
From: Phil Reid @ 2016-12-15 14:11 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1481805227.9552.15.camel@linux.intel.com>

On 15/12/2016 20:33, Andy Shevchenko wrote:
> On Thu, 2016-12-15 at 16:59 +0800, Zhangfei Gao wrote:
>> Some platforms like hi3660 need do reset first to allow accessing
>> registers
>
> Patch itself looks good, but would be nice to have it tested.
>
> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
>
>>
>> Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
>> ---
>>  drivers/i2c/busses/i2c-designware-core.h    |  1 +
>>  drivers/i2c/busses/i2c-designware-platdrv.c | 28
>> ++++++++++++++++++++++++----
>>  2 files changed, 25 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/i2c/busses/i2c-designware-core.h
>> b/drivers/i2c/busses/i2c-designware-core.h
>> index 0d44d2a..94b14fa 100644
>> --- a/drivers/i2c/busses/i2c-designware-core.h
>> +++ b/drivers/i2c/busses/i2c-designware-core.h
>> @@ -80,6 +80,7 @@ struct dw_i2c_dev {
>>  	void __iomem		*base;
>>  	struct completion	cmd_complete;
>>  	struct clk		*clk;
>> +	struct reset_control	*rst;
>>  	u32			(*get_clk_rate_khz) (struct
>> dw_i2c_dev *dev);
>>  	struct dw_pci_controller *controller;
>>  	int			cmd_err;
>> diff --git a/drivers/i2c/busses/i2c-designware-platdrv.c
>> b/drivers/i2c/busses/i2c-designware-platdrv.c
>> index 0b42a12..e9016ae 100644
>> --- a/drivers/i2c/busses/i2c-designware-platdrv.c
>> +++ b/drivers/i2c/busses/i2c-designware-platdrv.c
>> @@ -38,6 +38,7 @@
>>  #include <linux/pm_runtime.h>
>>  #include <linux/property.h>
>>  #include <linux/io.h>
>> +#include <linux/reset.h>
>>  #include <linux/slab.h>
>>  #include <linux/acpi.h>
>>  #include <linux/platform_data/i2c-designware.h>
>> @@ -176,6 +177,14 @@ static int dw_i2c_plat_probe(struct
>> platform_device *pdev)
>>  	dev->irq = irq;
>>  	platform_set_drvdata(pdev, dev);
>>
>> +	dev->rst = devm_reset_control_get_optional(&pdev->dev, NULL);
>> +	if (IS_ERR(dev->rst)) {
>> +		if (PTR_ERR(dev->rst) == -EPROBE_DEFER)
>> +			return -EPROBE_DEFER;
>> +	} else {
>> +		reset_control_deassert(dev->rst);
>> +	}
>> +
More for my education. But some drivers seem to handle the error codes a little more explicitly.
Whats the best approach?

eg: From usb/dwc2 driver it continues only if ENOENT / ENOTSUPP errors are return
and ENOMEM / EINVAL etc is a fatal error.

hsotg->reset = devm_reset_control_get_optional(hsotg->dev, "dwc2");
if (IS_ERR(hsotg->reset)) {
         ret = PTR_ERR(hsotg->reset);
         switch (ret) {
         case -ENOENT:
         case -ENOTSUPP:
                 hsotg->reset = NULL;
                 break;
         default:
                 dev_err(hsotg->dev, "error getting reset control %d\n",
                         ret);
                 return ret;
         }
}

if (hsotg->reset)
         reset_control_deassert(hsotg->reset);

Regards
Phil

^ permalink raw reply

* [PATCH] dmaengine: xilinx_dma: Add support for multiple buffers
From: Jose Abreu @ 2016-12-15 14:34 UTC (permalink / raw)
  To: linux-arm-kernel

Xilinx VDMA supports multiple framebuffers. This patch
adds correct handling for the scenario where multiple
framebuffers are available in the HW and parking mode is
not set.

We corrected these situations:
	1) Do not start VDMA until all the framebuffers
	have been programmed with a valid address.
	2) Restart variables when VDMA halts/resets.
	3) Halt channel when all the framebuffers have
	finished and there is not anymore segments
	pending.
	4) Do not try to overlap framebuffers until they
	are completed.

All these situations, without this patch, can lead to data
corruption and even system memory corruption. If, for example,
user has a VDMA with 3 framebuffers, with direction
DMA_DEV_TO_MEM and user only submits one segment, VDMA will
write first to the segment the user submitted BUT if the user
doesn't submit another segment in a timelly manner then VDMA
will write to position 0 of system mem in the following VSYNC.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: Carlos Palminha <palminha@synopsys.com>
Cc: Vinod Koul <vinod.koul@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Kedareswara rao Appana <appana.durga.rao@xilinx.com>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: dmaengine at vger.kernel.org
Cc: linux-arm-kernel at lists.infradead.org
Cc: linux-kernel at vger.kernel.org
---
 drivers/dma/xilinx/xilinx_dma.c | 80 ++++++++++++++++++++++++++++++++++-------
 1 file changed, 68 insertions(+), 12 deletions(-)

diff --git a/drivers/dma/xilinx/xilinx_dma.c b/drivers/dma/xilinx/xilinx_dma.c
index 8288fe4..30eec5a 100644
--- a/drivers/dma/xilinx/xilinx_dma.c
+++ b/drivers/dma/xilinx/xilinx_dma.c
@@ -351,10 +351,12 @@ struct xilinx_dma_chan {
 	bool cyclic;
 	bool genlock;
 	bool err;
+	bool running;
 	struct tasklet_struct tasklet;
 	struct xilinx_vdma_config config;
 	bool flush_on_fsync;
 	u32 desc_pendingcount;
+	u32 seg_pendingcount;
 	bool ext_addr;
 	u32 desc_submitcount;
 	u32 residue;
@@ -946,6 +948,17 @@ static bool xilinx_dma_is_idle(struct xilinx_dma_chan *chan)
 }
 
 /**
+ * xilinx_vdma_is_multi_buffer - Check if VDMA has multiple framebuffers
+ * @chan: Driver specific DMA channel
+ *
+ * Return: '1' if is multi buffer, '0' if not.
+ */
+static bool xilinx_vdma_is_multi_buffer(struct xilinx_dma_chan *chan)
+{
+	return chan->num_frms > 1;
+}
+
+/**
  * xilinx_dma_halt - Halt DMA channel
  * @chan: Driver specific DMA channel
  */
@@ -966,6 +979,10 @@ static void xilinx_dma_halt(struct xilinx_dma_chan *chan)
 			chan, dma_ctrl_read(chan, XILINX_DMA_REG_DMASR));
 		chan->err = true;
 	}
+
+	chan->seg_pendingcount = 0;
+	chan->desc_submitcount = 0;
+	chan->running = false;
 }
 
 /**
@@ -1002,14 +1019,35 @@ static void xilinx_vdma_start_transfer(struct xilinx_dma_chan *chan)
 	struct xilinx_dma_tx_descriptor *desc, *tail_desc;
 	u32 reg;
 	struct xilinx_vdma_tx_segment *tail_segment;
+	bool mbf = xilinx_vdma_is_multi_buffer(chan) && !config->park;
 
 	/* This function was invoked with lock held */
 	if (chan->err)
 		return;
 
-	if (list_empty(&chan->pending_list))
+	/*
+	 * Can't continue if we have already consumed all the available
+	 * framebuffers and they are not done yet.
+	 */
+	if (mbf && (chan->seg_pendingcount >= chan->num_frms))
 		return;
 
+	if (list_empty(&chan->pending_list)) {
+		/*
+		 * Can't keep running if there are no pending segments. Halt
+		 * the channel as security measure. Notice that this will not
+		 * corrupt current transactions because this function is
+		 * called after the pendingcount is decreased and after the
+		 * current transaction has finished.
+		 */
+		if (mbf && chan->running && !chan->seg_pendingcount) {
+			dev_dbg(chan->dev, "pending list empty: halting\n");
+			xilinx_dma_halt(chan);
+		}
+
+		return;
+	}
+
 	desc = list_first_entry(&chan->pending_list,
 				struct xilinx_dma_tx_descriptor, node);
 	tail_desc = list_last_entry(&chan->pending_list,
@@ -1079,6 +1117,8 @@ static void xilinx_vdma_start_transfer(struct xilinx_dma_chan *chan)
 	if (chan->has_sg) {
 		dma_ctrl_write(chan, XILINX_DMA_REG_TAILDESC,
 				tail_segment->phys);
+		list_splice_tail_init(&chan->pending_list, &chan->active_list);
+		chan->desc_pendingcount = 0;
 	} else {
 		struct xilinx_vdma_tx_segment *segment, *last = NULL;
 		int i = 0;
@@ -1097,29 +1137,34 @@ static void xilinx_vdma_start_transfer(struct xilinx_dma_chan *chan)
 					XILINX_VDMA_REG_START_ADDRESS(i++),
 					segment->hw.buf_addr);
 
+			chan->seg_pendingcount++;
 			last = segment;
 		}
 
 		if (!last)
 			return;
 
-		/* HW expects these parameters to be same for one transaction */
-		vdma_desc_write(chan, XILINX_DMA_REG_HSIZE, last->hw.hsize);
-		vdma_desc_write(chan, XILINX_DMA_REG_FRMDLY_STRIDE,
-				last->hw.stride);
-		vdma_desc_write(chan, XILINX_DMA_REG_VSIZE, last->hw.vsize);
-	}
-
-	if (!chan->has_sg) {
 		list_del(&desc->node);
 		list_add_tail(&desc->node, &chan->active_list);
 		chan->desc_submitcount++;
 		chan->desc_pendingcount--;
 		if (chan->desc_submitcount == chan->num_frms)
 			chan->desc_submitcount = 0;
-	} else {
-		list_splice_tail_init(&chan->pending_list, &chan->active_list);
-		chan->desc_pendingcount = 0;
+
+		/*
+		 * Can't start until all the framebuffers have been programmed
+		 * or else corruption can occur.
+		 */
+		if (mbf && !chan->running &&
+		   (chan->seg_pendingcount < chan->num_frms))
+			return;
+
+		/* HW expects these parameters to be same for one transaction */
+		vdma_desc_write(chan, XILINX_DMA_REG_HSIZE, last->hw.hsize);
+		vdma_desc_write(chan, XILINX_DMA_REG_FRMDLY_STRIDE,
+				last->hw.stride);
+		vdma_desc_write(chan, XILINX_DMA_REG_VSIZE, last->hw.vsize);
+		chan->running = true;
 	}
 }
 
@@ -1327,12 +1372,20 @@ static void xilinx_dma_issue_pending(struct dma_chan *dchan)
 static void xilinx_dma_complete_descriptor(struct xilinx_dma_chan *chan)
 {
 	struct xilinx_dma_tx_descriptor *desc, *next;
+	struct xilinx_vdma_tx_segment *segment;
 
 	/* This function was invoked with lock held */
 	if (list_empty(&chan->active_list))
 		return;
 
 	list_for_each_entry_safe(desc, next, &chan->active_list, node) {
+		if (chan->xdev->dma_config->dmatype == XDMA_TYPE_VDMA) {
+			list_for_each_entry(segment, &desc->segments, node) {
+				if (chan->seg_pendingcount > 0)
+					chan->seg_pendingcount--;
+			}
+		}
+
 		list_del(&desc->node);
 		if (!desc->cyclic)
 			dma_cookie_complete(&desc->async_tx);
@@ -1366,6 +1419,9 @@ static int xilinx_dma_reset(struct xilinx_dma_chan *chan)
 	}
 
 	chan->err = false;
+	chan->seg_pendingcount = 0;
+	chan->desc_submitcount = 0;
+	chan->running = false;
 
 	return err;
 }
-- 
1.9.1

^ permalink raw reply related

* [PATCH v8] arm64: fpsimd: improve stacking logic in non-interruptible context
From: Ard Biesheuvel @ 2016-12-15 14:46 UTC (permalink / raw)
  To: linux-arm-kernel

Currently, we allow kernel mode NEON in softirq or hardirq context by
stacking and unstacking a slice of the NEON register file for each call
to kernel_neon_begin() and kernel_neon_end(), respectively.

Given that
a) a CPU typically spends most of its time in userland, during which time
   no kernel mode NEON in process context is in progress,
b) a CPU spends most of its time in the kernel doing other things than
   kernel mode NEON when it gets interrupted to perform kernel mode NEON
   in softirq context

the stacking and subsequent unstacking is only necessary if we are
interrupting a thread while it is performing kernel mode NEON in process
context, which means that in all other cases, we can simply preserve the
userland FP/SIMD state once, and only restore it upon return to userland,
even if we are being invoked from softirq or hardirq context.

However, with support being added to the arm64 kernel for Scalable Vector
Extensions (SVE), which shares the bottom 128 bits of each FP/SIMD register,
but could scale up to 2048 bits per register, the nested stacking and
unstacking that occurs in interrupt context is no longer sufficient, given
that the register contents will be truncated to 128 bits upon restore, unless
we add support for stacking/unstacking the entire SVE state, which does not
sound that appealing.

This means that the FP/SIMD save state operation that encounters the
userland state first *has* to be able to run to completion (since any
interruption could truncate the contents of the registers, which would
result in corrupted state to be restored once the interrupted context is
allowed to resume preserving the state)

Since executing all code involving the FP/SIMD state with interrupts
disabled is undesirable, let's ban kernel mode NEON in hardirq context
when running on SVE capable hardware. This is a small price to pay, given
that the primary usecase of kernel mode NEON, crypto, can deal with this
quite easily (and simply falls back to generic scalar algorithms whose
worse performance should not matter in hardirq context anyway)

With hardirq context removed from the equation, we can modify the FP/SIMD
state manipulation code to execute with softirqs disabled. This allows the
critical sections to complete without the risk of having the register
contents getting corrupted half way through.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
v8:
- disallow kernel mode NEON in hardirq context only on SVE capable hardware,
  otherwise we can allow it as long we ensure that interruptions of
  fpsimd_save_state() don't modify the FP/SIMD state as it is being preserved

  Existing code will need to be updated to take may_use_simd() into account:
  https://git.kernel.org/cgit/linux/kernel/git/ardb/linux.git/log/?h=arm64-sve-crypto

v7:
- ban kernel mode NEON in hardirq context, and execute all FP/SIMD state
  manipulations with softirqs disabled

v6:
- use a spinlock instead of disabling interrupts

v5:
- perform the test-and-set and the fpsimd_save_state with interrupts disabled,
  to prevent nested kernel_neon_begin()/_end() pairs to clobber the state
  while it is being preserved

v4:
- use this_cpu_inc/dec, which give sufficient guarantees regarding
  concurrency, but do not imply SMP barriers, which are not needed here

v3:
- avoid corruption by concurrent invocations of kernel_neon_begin()/_end()

v2:
- BUG() on unexpected values of the nesting level
- relax the BUG() on num_regs>32 to a WARN, given that nothing actually
  breaks in that case

 arch/arm64/include/asm/Kbuild |  1 -
 arch/arm64/include/asm/simd.h | 24 ++++++
 arch/arm64/kernel/fpsimd.c    | 82 +++++++++++++++-----
 3 files changed, 86 insertions(+), 21 deletions(-)

diff --git a/arch/arm64/include/asm/Kbuild b/arch/arm64/include/asm/Kbuild
index 44e1d7f10add..39ca0409e157 100644
--- a/arch/arm64/include/asm/Kbuild
+++ b/arch/arm64/include/asm/Kbuild
@@ -33,7 +33,6 @@ generic-y += segment.h
 generic-y += sembuf.h
 generic-y += serial.h
 generic-y += shmbuf.h
-generic-y += simd.h
 generic-y += sizes.h
 generic-y += socket.h
 generic-y += sockios.h
diff --git a/arch/arm64/include/asm/simd.h b/arch/arm64/include/asm/simd.h
new file mode 100644
index 000000000000..40a6a177faf2
--- /dev/null
+++ b/arch/arm64/include/asm/simd.h
@@ -0,0 +1,24 @@
+
+#include <linux/hardirq.h>
+#include <asm/hwcap.h>
+
+/*
+ * may_use_simd - whether it is allowable at this time to issue SIMD
+ *                instructions or access the SIMD register file
+ *
+ * On arm64, we allow kernel mode NEON in hardirq context but only when
+ * support for SVE is disabled, or when running on non-SVE capable hardware.
+ */
+static __must_check inline bool may_use_simd(void)
+{
+	if (!IS_ENABLED(CONFIG_ARM64_SVE))
+		return true;
+
+	return !(elf_hwcap & HWCAP_SVE) || !in_irq();
+}
+
+/*
+ * In some cases, it is useful to know at compile time if may_use_simd()
+ * could ever return false.
+ */
+#define need_nonsimd_fallback()		(IS_ENABLED(CONFIG_ARM64_SVE))
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 394c61db5566..94bd9f611a72 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -127,6 +127,8 @@ void do_fpsimd_exc(unsigned int esr, struct pt_regs *regs)
 
 void fpsimd_thread_switch(struct task_struct *next)
 {
+	BUG_ON(!irqs_disabled());
+
 	/*
 	 * Save the current FPSIMD state to memory, but only if whatever is in
 	 * the registers is in fact the most recent userland FPSIMD state of
@@ -169,8 +171,10 @@ void fpsimd_flush_thread(void)
 void fpsimd_preserve_current_state(void)
 {
 	preempt_disable();
+	local_bh_disable();
 	if (!test_thread_flag(TIF_FOREIGN_FPSTATE))
 		fpsimd_save_state(&current->thread.fpsimd_state);
+	local_bh_enable();
 	preempt_enable();
 }
 
@@ -182,6 +186,7 @@ void fpsimd_preserve_current_state(void)
 void fpsimd_restore_current_state(void)
 {
 	preempt_disable();
+	local_bh_disable();
 	if (test_and_clear_thread_flag(TIF_FOREIGN_FPSTATE)) {
 		struct fpsimd_state *st = &current->thread.fpsimd_state;
 
@@ -189,6 +194,7 @@ void fpsimd_restore_current_state(void)
 		this_cpu_write(fpsimd_last_state, st);
 		st->cpu = smp_processor_id();
 	}
+	local_bh_enable();
 	preempt_enable();
 }
 
@@ -200,6 +206,7 @@ void fpsimd_restore_current_state(void)
 void fpsimd_update_current_state(struct fpsimd_state *state)
 {
 	preempt_disable();
+	local_bh_disable();
 	fpsimd_load_state(state);
 	if (test_and_clear_thread_flag(TIF_FOREIGN_FPSTATE)) {
 		struct fpsimd_state *st = &current->thread.fpsimd_state;
@@ -207,6 +214,7 @@ void fpsimd_update_current_state(struct fpsimd_state *state)
 		this_cpu_write(fpsimd_last_state, st);
 		st->cpu = smp_processor_id();
 	}
+	local_bh_enable();
 	preempt_enable();
 }
 
@@ -220,45 +228,75 @@ void fpsimd_flush_task_state(struct task_struct *t)
 
 #ifdef CONFIG_KERNEL_MODE_NEON
 
-static DEFINE_PER_CPU(struct fpsimd_partial_state, hardirq_fpsimdstate);
-static DEFINE_PER_CPU(struct fpsimd_partial_state, softirq_fpsimdstate);
+static DEFINE_PER_CPU(struct fpsimd_partial_state, nested_fpsimdstate[2]);
+static DEFINE_PER_CPU(int, kernel_neon_nesting_level);
 
 /*
  * Kernel-side NEON support functions
  */
 void kernel_neon_begin_partial(u32 num_regs)
 {
-	if (in_interrupt()) {
-		struct fpsimd_partial_state *s = this_cpu_ptr(
-			in_irq() ? &hardirq_fpsimdstate : &softirq_fpsimdstate);
+	struct fpsimd_partial_state *s;
+	int level;
 
-		BUG_ON(num_regs > 32);
-		fpsimd_save_partial_state(s, roundup(num_regs, 2));
-	} else {
+	/*
+	 * On SVE capable hardware, we don't allow kernel mode NEON in hard IRQ
+	 * context. This is necessary because allowing that would force us to
+	 * either preserve/restore the entire SVE state (which could be huge) in
+	 * fpsimd_[save|load]_partial_state(), or perform all manipulations
+	 * involving the preserved FP/SIMD state with interrupts disabled.
+	 * Otherwise, a call to fpsimd_save_sate() could be interrupted by a
+	 * kernel_neon_begin()/kernel_neon_end() sequence, after which the top
+	 * SVE end of the shared SVE/NEON register contents will be gone.
+	 */
+	if (IS_ENABLED(CONFIG_ARM64_SVE))
+		BUG_ON((elf_hwcap & HWCAP_SVE) && in_irq());
+
+	preempt_disable();
+
+	level = this_cpu_inc_return(kernel_neon_nesting_level);
+	BUG_ON(level > 3);
+
+	if (current->mm && !test_thread_flag(TIF_FOREIGN_FPSTATE)) {
 		/*
 		 * Save the userland FPSIMD state if we have one and if we
 		 * haven't done so already. Clear fpsimd_last_state to indicate
 		 * that there is no longer userland FPSIMD state in the
 		 * registers.
 		 */
-		preempt_disable();
-		if (current->mm &&
-		    !test_and_set_thread_flag(TIF_FOREIGN_FPSTATE))
+		local_bh_disable();
+		if (!test_and_set_thread_flag(TIF_FOREIGN_FPSTATE))
 			fpsimd_save_state(&current->thread.fpsimd_state);
-		this_cpu_write(fpsimd_last_state, NULL);
+		local_bh_enable();
+	}
+	this_cpu_write(fpsimd_last_state, NULL);
+
+	if (level > 1) {
+		s = this_cpu_ptr(nested_fpsimdstate);
+
+		WARN_ON_ONCE(num_regs > 32);
+		num_regs = max(roundup(num_regs, 2), 32U);
+
+		fpsimd_save_partial_state(&s[level - 2], num_regs);
 	}
 }
 EXPORT_SYMBOL(kernel_neon_begin_partial);
 
 void kernel_neon_end(void)
 {
-	if (in_interrupt()) {
-		struct fpsimd_partial_state *s = this_cpu_ptr(
-			in_irq() ? &hardirq_fpsimdstate : &softirq_fpsimdstate);
-		fpsimd_load_partial_state(s);
-	} else {
-		preempt_enable();
+	struct fpsimd_partial_state *s;
+	int level;
+
+	level = this_cpu_read(kernel_neon_nesting_level);
+	BUG_ON(level < 1);
+
+	if (level > 1) {
+		s = this_cpu_ptr(nested_fpsimdstate);
+		fpsimd_load_partial_state(&s[level - 2]);
 	}
+
+	this_cpu_dec(kernel_neon_nesting_level);
+	preempt_enable();
 }
 EXPORT_SYMBOL(kernel_neon_end);
 
@@ -270,8 +308,12 @@ static int fpsimd_cpu_pm_notifier(struct notifier_block *self,
 {
 	switch (cmd) {
 	case CPU_PM_ENTER:
-		if (current->mm && !test_thread_flag(TIF_FOREIGN_FPSTATE))
-			fpsimd_save_state(&current->thread.fpsimd_state);
+		if (current->mm) {
+			local_bh_disable();
+			if (!test_thread_flag(TIF_FOREIGN_FPSTATE))
+				fpsimd_save_state(&current->thread.fpsimd_state);
+			local_bh_enable();
+		}
 		this_cpu_write(fpsimd_last_state, NULL);
 		break;
 	case CPU_PM_EXIT:
-- 
2.7.4

^ permalink raw reply related

* Need some information about IOMMU support for ARM64 in 4.6
From: Robin Murphy @ 2016-12-15 14:59 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <E6F4274F7BEC774AAA4E7A98618BCFAE132076@XAP-PVEXMBX01.xlnx.xilinx.com>

Hi Barghav,

I assume you're talking about the ARM SMMU(v2)?

On 15/12/16 06:50, Bhargav Shah wrote:
> Hi,
> I am using kernel version 4.6.
> of_iommu_configure() gets IOMMU ops from of_iommu_list and call arch_setup_dma_ops.
> Presently, __iommu_setup_dma_ops() it expects iommu ops to be present. 
> Here, of_iommu_configure() get NULL iommu ops and it calls arch_setup_dma_ops with it.
> 
> So, in my current flow iommu ops is not getting set for device.

Yes. The SMMUv2 driver in 4.6 does not call of_iommu_set_ops(). That's
because it doesn't support of_xlate() either, so in general can't be
used for DMA ops. The default domain support which went into 4.6 was
just some early groundwork which, in hindsight, caused more problems
than it solved at that time.

> Am I missing something  which is causing this ?
> What is proper way to set iommu ops to device ? 

The proper way would be to update to 4.9, where full generic
configuration and DMA ops support finally landed ;)

Robin.

> 
> Thanks
> Bhargav 
> 
> 

^ permalink raw reply

* [PATCH 0/3] dmaengine: xilinx_dma: Bug fixes
From: Kedareswara rao Appana @ 2016-12-15 15:11 UTC (permalink / raw)
  To: linux-arm-kernel

This patch series fixes below bugs in DMA and VDMA IP's
---> Do not start VDMA until frame buffer is processed by the h/w
---> Fix bug in Multi frame sotres handling in VDMA
---> Fix issues w.r.to multi frame descriptors submit with AXI DMA S2MM(recv) Side.


Kedareswara rao Appana (3):
  dmaengine: xilinx_dma: Check for channel idle state before submitting
    dma descriptor
  dmaeninge: xilinx_dma: Fix bug in multiple frame stores scenario in
    vdma
  dmaengine: xilinx_dma: Fix race condition in the driver for multiple
    descriptor scenario

 drivers/dma/xilinx/xilinx_dma.c | 185 +++++++++++++++++++++++++---------------
 1 file changed, 118 insertions(+), 67 deletions(-)

-- 
2.1.2

^ permalink raw reply

* [PATCH 1/3] dmaengine: xilinx_dma: Check for channel idle state before submitting dma descriptor
From: Kedareswara rao Appana @ 2016-12-15 15:11 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1481814682-31780-1-git-send-email-appanad@xilinx.com>

Add channel idle state to ensure that dma descriptor is not
submitted when VDMA engine is in progress.

Signed-off-by: Kedareswara rao Appana <appanad@xilinx.com>
---
 drivers/dma/xilinx/xilinx_dma.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/dma/xilinx/xilinx_dma.c b/drivers/dma/xilinx/xilinx_dma.c
index 8288fe4..736c2a3 100644
--- a/drivers/dma/xilinx/xilinx_dma.c
+++ b/drivers/dma/xilinx/xilinx_dma.c
@@ -321,6 +321,7 @@ struct xilinx_dma_tx_descriptor {
  * @cyclic: Check for cyclic transfers.
  * @genlock: Support genlock mode
  * @err: Channel has errors
+ * @idle: Check for channel idle
  * @tasklet: Cleanup work after irq
  * @config: Device configuration info
  * @flush_on_fsync: Flush on Frame sync
@@ -351,6 +352,7 @@ struct xilinx_dma_chan {
 	bool cyclic;
 	bool genlock;
 	bool err;
+	bool idle;
 	struct tasklet_struct tasklet;
 	struct xilinx_vdma_config config;
 	bool flush_on_fsync;
@@ -966,6 +968,7 @@ static void xilinx_dma_halt(struct xilinx_dma_chan *chan)
 			chan, dma_ctrl_read(chan, XILINX_DMA_REG_DMASR));
 		chan->err = true;
 	}
+	chan->idle = true;
 }
 
 /**
@@ -1007,6 +1010,9 @@ static void xilinx_vdma_start_transfer(struct xilinx_dma_chan *chan)
 	if (chan->err)
 		return;
 
+	if (!chan->idle)
+		return;
+
 	if (list_empty(&chan->pending_list))
 		return;
 
@@ -1110,6 +1116,7 @@ static void xilinx_vdma_start_transfer(struct xilinx_dma_chan *chan)
 		vdma_desc_write(chan, XILINX_DMA_REG_VSIZE, last->hw.vsize);
 	}
 
+	chan->idle = false;
 	if (!chan->has_sg) {
 		list_del(&desc->node);
 		list_add_tail(&desc->node, &chan->active_list);
@@ -1447,6 +1454,7 @@ static irqreturn_t xilinx_dma_irq_handler(int irq, void *data)
 	if (status & XILINX_DMA_DMASR_FRM_CNT_IRQ) {
 		spin_lock(&chan->lock);
 		xilinx_dma_complete_descriptor(chan);
+		chan->idle = true;
 		chan->start_transfer(chan);
 		spin_unlock(&chan->lock);
 	}
@@ -2327,6 +2335,7 @@ static int xilinx_dma_chan_probe(struct xilinx_dma_device *xdev,
 	chan->has_sg = xdev->has_sg;
 	chan->desc_pendingcount = 0x0;
 	chan->ext_addr = xdev->ext_addr;
+	chan->idle = true;
 
 	spin_lock_init(&chan->lock);
 	INIT_LIST_HEAD(&chan->pending_list);
-- 
2.1.2

^ permalink raw reply related

* [PATCH 2/3] dmaeninge: xilinx_dma: Fix bug in multiple frame stores scenario in vdma
From: Kedareswara rao Appana @ 2016-12-15 15:11 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1481814682-31780-1-git-send-email-appanad@xilinx.com>

When VDMA is configured for more than one frame in the h/w
for example h/w is configured for n number of frames and user
Submits n number of frames and triggered the DMA using issue_pending API.
In the current driver flow we are submitting one frame at a time
but we should submit all the n number of frames at one time as the h/w
Is configured for n number of frames.

This patch fixes this issue.

Signed-off-by: Kedareswara rao Appana <appanad@xilinx.com>
---
 drivers/dma/xilinx/xilinx_dma.c | 43 +++++++++++++++++++++++++----------------
 1 file changed, 26 insertions(+), 17 deletions(-)

diff --git a/drivers/dma/xilinx/xilinx_dma.c b/drivers/dma/xilinx/xilinx_dma.c
index 736c2a3..4f3fa94 100644
--- a/drivers/dma/xilinx/xilinx_dma.c
+++ b/drivers/dma/xilinx/xilinx_dma.c
@@ -1087,23 +1087,33 @@ static void xilinx_vdma_start_transfer(struct xilinx_dma_chan *chan)
 				tail_segment->phys);
 	} else {
 		struct xilinx_vdma_tx_segment *segment, *last = NULL;
-		int i = 0;
+		int i = 0, j = 0;
 
 		if (chan->desc_submitcount < chan->num_frms)
 			i = chan->desc_submitcount;
 
-		list_for_each_entry(segment, &desc->segments, node) {
-			if (chan->ext_addr)
-				vdma_desc_write_64(chan,
-					XILINX_VDMA_REG_START_ADDRESS_64(i++),
-					segment->hw.buf_addr,
-					segment->hw.buf_addr_msb);
-			else
-				vdma_desc_write(chan,
-					XILINX_VDMA_REG_START_ADDRESS(i++),
-					segment->hw.buf_addr);
-
-			last = segment;
+		for (j = 0; j < chan->num_frms; ) {
+			list_for_each_entry(segment, &desc->segments, node) {
+				if (chan->ext_addr)
+					vdma_desc_write_64(chan,
+					  XILINX_VDMA_REG_START_ADDRESS_64(i++),
+					  segment->hw.buf_addr,
+					  segment->hw.buf_addr_msb);
+				else
+					vdma_desc_write(chan,
+					    XILINX_VDMA_REG_START_ADDRESS(i++),
+					    segment->hw.buf_addr);
+
+				last = segment;
+			}
+			list_del(&desc->node);
+			list_add_tail(&desc->node, &chan->active_list);
+			j++;
+			if (list_empty(&chan->pending_list))
+				break;
+			desc = list_first_entry(&chan->pending_list,
+						struct xilinx_dma_tx_descriptor,
+						node);
 		}
 
 		if (!last)
@@ -1114,14 +1124,13 @@ static void xilinx_vdma_start_transfer(struct xilinx_dma_chan *chan)
 		vdma_desc_write(chan, XILINX_DMA_REG_FRMDLY_STRIDE,
 				last->hw.stride);
 		vdma_desc_write(chan, XILINX_DMA_REG_VSIZE, last->hw.vsize);
+
+		chan->desc_submitcount += j;
+		chan->desc_pendingcount -= j;
 	}
 
 	chan->idle = false;
 	if (!chan->has_sg) {
-		list_del(&desc->node);
-		list_add_tail(&desc->node, &chan->active_list);
-		chan->desc_submitcount++;
-		chan->desc_pendingcount--;
 		if (chan->desc_submitcount == chan->num_frms)
 			chan->desc_submitcount = 0;
 	} else {
-- 
2.1.2

^ permalink raw reply related

* [PATCH 3/3] dmaengine: xilinx_dma: Fix race condition in the driver for multiple descriptor scenario
From: Kedareswara rao Appana @ 2016-12-15 15:11 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1481814682-31780-1-git-send-email-appanad@xilinx.com>

When driver is handling AXI DMA SoftIP
When user submits multiple descriptors back to back on the S2MM(recv)
side with the current driver flow the last buffer descriptor next bd
points to a invalid location resulting the invalid data or errors in the
DMA engine.

This patch fixes this issue by creating a BD Chain during
channel allocation itself and use those BD's.

Signed-off-by: Kedareswara rao Appana <appanad@xilinx.com>
---
 drivers/dma/xilinx/xilinx_dma.c | 133 +++++++++++++++++++++++++---------------
 1 file changed, 83 insertions(+), 50 deletions(-)

diff --git a/drivers/dma/xilinx/xilinx_dma.c b/drivers/dma/xilinx/xilinx_dma.c
index 4f3fa94..8a4cb56 100644
--- a/drivers/dma/xilinx/xilinx_dma.c
+++ b/drivers/dma/xilinx/xilinx_dma.c
@@ -163,6 +163,7 @@
 #define XILINX_DMA_BD_SOP		BIT(27)
 #define XILINX_DMA_BD_EOP		BIT(26)
 #define XILINX_DMA_COALESCE_MAX		255
+#define XILINX_DMA_NUM_DESCS		255
 #define XILINX_DMA_NUM_APP_WORDS	5
 
 /* Multi-Channel DMA Descriptor offsets*/
@@ -310,6 +311,7 @@ struct xilinx_dma_tx_descriptor {
  * @pending_list: Descriptors waiting
  * @active_list: Descriptors ready to submit
  * @done_list: Complete descriptors
+ * @free_seg_list: Free descriptors
  * @common: DMA common channel
  * @desc_pool: Descriptors pool
  * @dev: The dma device
@@ -330,7 +332,9 @@ struct xilinx_dma_tx_descriptor {
  * @desc_submitcount: Descriptor h/w submitted count
  * @residue: Residue for AXI DMA
  * @seg_v: Statically allocated segments base
+ * @seg_p: Physical allocated segments base
  * @cyclic_seg_v: Statically allocated segment base for cyclic transfers
+ * @cyclic_seg_p: Physical allocated segments base for cyclic dma
  * @start_transfer: Differentiate b/w DMA IP's transfer
  */
 struct xilinx_dma_chan {
@@ -341,6 +345,7 @@ struct xilinx_dma_chan {
 	struct list_head pending_list;
 	struct list_head active_list;
 	struct list_head done_list;
+	struct list_head free_seg_list;
 	struct dma_chan common;
 	struct dma_pool *desc_pool;
 	struct device *dev;
@@ -361,7 +366,9 @@ struct xilinx_dma_chan {
 	u32 desc_submitcount;
 	u32 residue;
 	struct xilinx_axidma_tx_segment *seg_v;
+	dma_addr_t seg_p;
 	struct xilinx_axidma_tx_segment *cyclic_seg_v;
+	dma_addr_t cyclic_seg_p;
 	void (*start_transfer)(struct xilinx_dma_chan *chan);
 	u16 tdest;
 };
@@ -567,17 +574,31 @@ static struct xilinx_axidma_tx_segment *
 xilinx_axidma_alloc_tx_segment(struct xilinx_dma_chan *chan)
 {
 	struct xilinx_axidma_tx_segment *segment;
-	dma_addr_t phys;
-
-	segment = dma_pool_zalloc(chan->desc_pool, GFP_ATOMIC, &phys);
-	if (!segment)
-		return NULL;
+	unsigned long flags;
 
-	segment->phys = phys;
+	spin_lock_irqsave(&chan->lock, flags);
+	if (!list_empty(&chan->free_seg_list)) {
+		segment = list_first_entry(&chan->free_seg_list,
+					   struct xilinx_axidma_tx_segment,
+					   node);
+		list_del(&segment->node);
+	}
+	spin_unlock_irqrestore(&chan->lock, flags);
 
 	return segment;
 }
 
+static void xilinx_dma_clean_hw_desc(struct xilinx_axidma_desc_hw *hw)
+{
+	u32 next_desc = hw->next_desc;
+	u32 next_desc_msb = hw->next_desc_msb;
+
+	memset(hw, 0, sizeof(struct xilinx_axidma_desc_hw));
+
+	hw->next_desc = next_desc;
+	hw->next_desc_msb = next_desc_msb;
+}
+
 /**
  * xilinx_dma_free_tx_segment - Free transaction segment
  * @chan: Driver specific DMA channel
@@ -586,7 +607,9 @@ xilinx_axidma_alloc_tx_segment(struct xilinx_dma_chan *chan)
 static void xilinx_dma_free_tx_segment(struct xilinx_dma_chan *chan,
 				struct xilinx_axidma_tx_segment *segment)
 {
-	dma_pool_free(chan->desc_pool, segment, segment->phys);
+	xilinx_dma_clean_hw_desc(&segment->hw);
+
+	list_add_tail(&segment->node, &chan->free_seg_list);
 }
 
 /**
@@ -711,16 +734,26 @@ static void xilinx_dma_free_descriptors(struct xilinx_dma_chan *chan)
 static void xilinx_dma_free_chan_resources(struct dma_chan *dchan)
 {
 	struct xilinx_dma_chan *chan = to_xilinx_chan(dchan);
+	unsigned long flags;
 
 	dev_dbg(chan->dev, "Free all channel resources.\n");
 
 	xilinx_dma_free_descriptors(chan);
+
 	if (chan->xdev->dma_config->dmatype == XDMA_TYPE_AXIDMA) {
-		xilinx_dma_free_tx_segment(chan, chan->cyclic_seg_v);
-		xilinx_dma_free_tx_segment(chan, chan->seg_v);
+		spin_lock_irqsave(&chan->lock, flags);
+		INIT_LIST_HEAD(&chan->free_seg_list);
+		spin_unlock_irqrestore(&chan->lock, flags);
+
+		/* Free Memory that is allocated for cyclic DMA Mode */
+		dma_free_coherent(chan->dev, sizeof(*chan->cyclic_seg_v),
+				  chan->cyclic_seg_v, chan->cyclic_seg_p);
+	}
+
+	if (chan->xdev->dma_config->dmatype != XDMA_TYPE_AXIDMA) {
+		dma_pool_destroy(chan->desc_pool);
+		chan->desc_pool = NULL;
 	}
-	dma_pool_destroy(chan->desc_pool);
-	chan->desc_pool = NULL;
 }
 
 /**
@@ -803,6 +836,7 @@ static void xilinx_dma_do_tasklet(unsigned long data)
 static int xilinx_dma_alloc_chan_resources(struct dma_chan *dchan)
 {
 	struct xilinx_dma_chan *chan = to_xilinx_chan(dchan);
+	int i;
 
 	/* Has this channel already been allocated? */
 	if (chan->desc_pool)
@@ -813,11 +847,30 @@ static int xilinx_dma_alloc_chan_resources(struct dma_chan *dchan)
 	 * for meeting Xilinx VDMA specification requirement.
 	 */
 	if (chan->xdev->dma_config->dmatype == XDMA_TYPE_AXIDMA) {
-		chan->desc_pool = dma_pool_create("xilinx_dma_desc_pool",
-				   chan->dev,
-				   sizeof(struct xilinx_axidma_tx_segment),
-				   __alignof__(struct xilinx_axidma_tx_segment),
-				   0);
+		/* Allocate the buffer descriptors. */
+		chan->seg_v = dma_zalloc_coherent(chan->dev,
+						  sizeof(*chan->seg_v) *
+						  XILINX_DMA_NUM_DESCS,
+						  &chan->seg_p, GFP_KERNEL);
+		if (!chan->seg_v) {
+			dev_err(chan->dev,
+				"unable to allocate channel %d descriptors\n",
+				chan->id);
+			return -ENOMEM;
+		}
+
+		for (i = 0; i < XILINX_DMA_NUM_DESCS; i++) {
+			chan->seg_v[i].hw.next_desc =
+			lower_32_bits(chan->seg_p + sizeof(*chan->seg_v) *
+				((i + 1) % XILINX_DMA_NUM_DESCS));
+			chan->seg_v[i].hw.next_desc_msb =
+			upper_32_bits(chan->seg_p + sizeof(*chan->seg_v) *
+				((i + 1) % XILINX_DMA_NUM_DESCS));
+			chan->seg_v[i].phys = chan->seg_p +
+				sizeof(*chan->seg_v) * i;
+			list_add_tail(&chan->seg_v[i].node,
+				      &chan->free_seg_list);
+		}
 	} else if (chan->xdev->dma_config->dmatype == XDMA_TYPE_CDMA) {
 		chan->desc_pool = dma_pool_create("xilinx_cdma_desc_pool",
 				   chan->dev,
@@ -832,7 +885,8 @@ static int xilinx_dma_alloc_chan_resources(struct dma_chan *dchan)
 				     0);
 	}
 
-	if (!chan->desc_pool) {
+	if (!chan->desc_pool &&
+	    (chan->xdev->dma_config->dmatype != XDMA_TYPE_AXIDMA)) {
 		dev_err(chan->dev,
 			"unable to allocate channel %d descriptor pool\n",
 			chan->id);
@@ -841,22 +895,20 @@ static int xilinx_dma_alloc_chan_resources(struct dma_chan *dchan)
 
 	if (chan->xdev->dma_config->dmatype == XDMA_TYPE_AXIDMA) {
 		/*
-		 * For AXI DMA case after submitting a pending_list, keep
-		 * an extra segment allocated so that the "next descriptor"
-		 * pointer on the tail descriptor always points to a
-		 * valid descriptor, even when paused after reaching taildesc.
-		 * This way, it is possible to issue additional
-		 * transfers without halting and restarting the channel.
-		 */
-		chan->seg_v = xilinx_axidma_alloc_tx_segment(chan);
-
-		/*
 		 * For cyclic DMA mode we need to program the tail Descriptor
 		 * register with a value which is not a part of the BD chain
 		 * so allocating a desc segment during channel allocation for
 		 * programming tail descriptor.
 		 */
-		chan->cyclic_seg_v = xilinx_axidma_alloc_tx_segment(chan);
+		chan->cyclic_seg_v = dma_zalloc_coherent(chan->dev,
+					sizeof(*chan->cyclic_seg_v),
+					&chan->cyclic_seg_p, GFP_KERNEL);
+		if (!chan->cyclic_seg_v) {
+			dev_err(chan->dev,
+				"unable to allocate desc segment for cyclic DMA\n");
+			return -ENOMEM;
+		}
+		chan->cyclic_seg_v->phys = chan->cyclic_seg_p;
 	}
 
 	dma_cookie_init(dchan);
@@ -1206,7 +1258,7 @@ static void xilinx_cdma_start_transfer(struct xilinx_dma_chan *chan)
 static void xilinx_dma_start_transfer(struct xilinx_dma_chan *chan)
 {
 	struct xilinx_dma_tx_descriptor *head_desc, *tail_desc;
-	struct xilinx_axidma_tx_segment *tail_segment, *old_head, *new_head;
+	struct xilinx_axidma_tx_segment *tail_segment;
 	u32 reg;
 
 	if (chan->err)
@@ -1229,21 +1281,6 @@ static void xilinx_dma_start_transfer(struct xilinx_dma_chan *chan)
 	tail_segment = list_last_entry(&tail_desc->segments,
 				       struct xilinx_axidma_tx_segment, node);
 
-	if (chan->has_sg && !chan->xdev->mcdma) {
-		old_head = list_first_entry(&head_desc->segments,
-					struct xilinx_axidma_tx_segment, node);
-		new_head = chan->seg_v;
-		/* Copy Buffer Descriptor fields. */
-		new_head->hw = old_head->hw;
-
-		/* Swap and save new reserve */
-		list_replace_init(&old_head->node, &new_head->node);
-		chan->seg_v = old_head;
-
-		tail_segment->hw.next_desc = chan->seg_v->phys;
-		head_desc->async_tx.phys = new_head->phys;
-	}
-
 	reg = dma_ctrl_read(chan, XILINX_DMA_REG_DMACR);
 
 	if (chan->desc_pendingcount <= XILINX_DMA_COALESCE_MAX) {
@@ -1738,7 +1775,7 @@ static struct dma_async_tx_descriptor *xilinx_dma_prep_slave_sg(
 {
 	struct xilinx_dma_chan *chan = to_xilinx_chan(dchan);
 	struct xilinx_dma_tx_descriptor *desc;
-	struct xilinx_axidma_tx_segment *segment = NULL, *prev = NULL;
+	struct xilinx_axidma_tx_segment *segment = NULL;
 	u32 *app_w = (u32 *)context;
 	struct scatterlist *sg;
 	size_t copy;
@@ -1789,10 +1826,6 @@ static struct dma_async_tx_descriptor *xilinx_dma_prep_slave_sg(
 					       XILINX_DMA_NUM_APP_WORDS);
 			}
 
-			if (prev)
-				prev->hw.next_desc = segment->phys;
-
-			prev = segment;
 			sg_used += copy;
 
 			/*
@@ -1806,7 +1839,6 @@ static struct dma_async_tx_descriptor *xilinx_dma_prep_slave_sg(
 	segment = list_first_entry(&desc->segments,
 				   struct xilinx_axidma_tx_segment, node);
 	desc->async_tx.phys = segment->phys;
-	prev->hw.next_desc = segment->phys;
 
 	/* For the last DMA_MEM_TO_DEV transfer, set EOP */
 	if (chan->direction == DMA_MEM_TO_DEV) {
@@ -2350,6 +2382,7 @@ static int xilinx_dma_chan_probe(struct xilinx_dma_device *xdev,
 	INIT_LIST_HEAD(&chan->pending_list);
 	INIT_LIST_HEAD(&chan->done_list);
 	INIT_LIST_HEAD(&chan->active_list);
+	INIT_LIST_HEAD(&chan->free_seg_list);
 
 	/* Retrieve the channel properties from the device tree */
 	has_dre = of_property_read_bool(node, "xlnx,include-dre");
-- 
2.1.2

^ permalink raw reply related

* [PATCH] dmaengine: xilinx_dma: Add support for multiple buffers
From: Appana Durga Kedareswara Rao @ 2016-12-15 15:11 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <40ba7c24927eeffdcc66425f8ca1203589276c08.1481812291.git.joabreu@synopsys.com>

Hi Jose Abreu,

	Thanks for the patch...

	I have just posted different patch series for fixing these issues just now...
Please take a look into it...

Regards,
Kedar.

> Subject: [PATCH] dmaengine: xilinx_dma: Add support for multiple buffers
> 
> Xilinx VDMA supports multiple framebuffers. This patch adds correct handling for
> the scenario where multiple framebuffers are available in the HW and parking
> mode is not set.
> 
> We corrected these situations:
> 	1) Do not start VDMA until all the framebuffers
> 	have been programmed with a valid address.
> 	2) Restart variables when VDMA halts/resets.
> 	3) Halt channel when all the framebuffers have
> 	finished and there is not anymore segments
> 	pending.
> 	4) Do not try to overlap framebuffers until they
> 	are completed.
> 
> All these situations, without this patch, can lead to data corruption and even
> system memory corruption. If, for example, user has a VDMA with 3
> framebuffers, with direction DMA_DEV_TO_MEM and user only submits one
> segment, VDMA will write first to the segment the user submitted BUT if the
> user doesn't submit another segment in a timelly manner then VDMA will write
> to position 0 of system mem in the following VSYNC.
> 
> Signed-off-by: Jose Abreu <joabreu@synopsys.com>
> Cc: Carlos Palminha <palminha@synopsys.com>
> Cc: Vinod Koul <vinod.koul@intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Kedareswara rao Appana <appana.durga.rao@xilinx.com>
> Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> Cc: dmaengine at vger.kernel.org
> Cc: linux-arm-kernel at lists.infradead.org
> Cc: linux-kernel at vger.kernel.org
> ---
>  drivers/dma/xilinx/xilinx_dma.c | 80 ++++++++++++++++++++++++++++++++++-
> ------
>  1 file changed, 68 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/dma/xilinx/xilinx_dma.c b/drivers/dma/xilinx/xilinx_dma.c
> index 8288fe4..30eec5a 100644
> --- a/drivers/dma/xilinx/xilinx_dma.c
> +++ b/drivers/dma/xilinx/xilinx_dma.c
> @@ -351,10 +351,12 @@ struct xilinx_dma_chan {
>  	bool cyclic;
>  	bool genlock;
>  	bool err;
> +	bool running;
>  	struct tasklet_struct tasklet;
>  	struct xilinx_vdma_config config;
>  	bool flush_on_fsync;
>  	u32 desc_pendingcount;
> +	u32 seg_pendingcount;
>  	bool ext_addr;
>  	u32 desc_submitcount;
>  	u32 residue;
> @@ -946,6 +948,17 @@ static bool xilinx_dma_is_idle(struct xilinx_dma_chan
> *chan)  }
> 
>  /**
> + * xilinx_vdma_is_multi_buffer - Check if VDMA has multiple
> +framebuffers
> + * @chan: Driver specific DMA channel
> + *
> + * Return: '1' if is multi buffer, '0' if not.
> + */
> +static bool xilinx_vdma_is_multi_buffer(struct xilinx_dma_chan *chan) {
> +	return chan->num_frms > 1;
> +}
> +
> +/**
>   * xilinx_dma_halt - Halt DMA channel
>   * @chan: Driver specific DMA channel
>   */
> @@ -966,6 +979,10 @@ static void xilinx_dma_halt(struct xilinx_dma_chan
> *chan)
>  			chan, dma_ctrl_read(chan,
> XILINX_DMA_REG_DMASR));
>  		chan->err = true;
>  	}
> +
> +	chan->seg_pendingcount = 0;
> +	chan->desc_submitcount = 0;
> +	chan->running = false;
>  }
> 
>  /**
> @@ -1002,14 +1019,35 @@ static void xilinx_vdma_start_transfer(struct
> xilinx_dma_chan *chan)
>  	struct xilinx_dma_tx_descriptor *desc, *tail_desc;
>  	u32 reg;
>  	struct xilinx_vdma_tx_segment *tail_segment;
> +	bool mbf = xilinx_vdma_is_multi_buffer(chan) && !config->park;
> 
>  	/* This function was invoked with lock held */
>  	if (chan->err)
>  		return;
> 
> -	if (list_empty(&chan->pending_list))
> +	/*
> +	 * Can't continue if we have already consumed all the available
> +	 * framebuffers and they are not done yet.
> +	 */
> +	if (mbf && (chan->seg_pendingcount >= chan->num_frms))
>  		return;
> 
> +	if (list_empty(&chan->pending_list)) {
> +		/*
> +		 * Can't keep running if there are no pending segments. Halt
> +		 * the channel as security measure. Notice that this will not
> +		 * corrupt current transactions because this function is
> +		 * called after the pendingcount is decreased and after the
> +		 * current transaction has finished.
> +		 */
> +		if (mbf && chan->running && !chan->seg_pendingcount) {
> +			dev_dbg(chan->dev, "pending list empty: halting\n");
> +			xilinx_dma_halt(chan);
> +		}
> +
> +		return;
> +	}
> +
>  	desc = list_first_entry(&chan->pending_list,
>  				struct xilinx_dma_tx_descriptor, node);
>  	tail_desc = list_last_entry(&chan->pending_list,
> @@ -1079,6 +1117,8 @@ static void xilinx_vdma_start_transfer(struct
> xilinx_dma_chan *chan)
>  	if (chan->has_sg) {
>  		dma_ctrl_write(chan, XILINX_DMA_REG_TAILDESC,
>  				tail_segment->phys);
> +		list_splice_tail_init(&chan->pending_list, &chan->active_list);
> +		chan->desc_pendingcount = 0;
>  	} else {
>  		struct xilinx_vdma_tx_segment *segment, *last = NULL;
>  		int i = 0;
> @@ -1097,29 +1137,34 @@ static void xilinx_vdma_start_transfer(struct
> xilinx_dma_chan *chan)
> 
> 	XILINX_VDMA_REG_START_ADDRESS(i++),
>  					segment->hw.buf_addr);
> 
> +			chan->seg_pendingcount++;
>  			last = segment;
>  		}
> 
>  		if (!last)
>  			return;
> 
> -		/* HW expects these parameters to be same for one
> transaction */
> -		vdma_desc_write(chan, XILINX_DMA_REG_HSIZE, last-
> >hw.hsize);
> -		vdma_desc_write(chan, XILINX_DMA_REG_FRMDLY_STRIDE,
> -				last->hw.stride);
> -		vdma_desc_write(chan, XILINX_DMA_REG_VSIZE, last-
> >hw.vsize);
> -	}
> -
> -	if (!chan->has_sg) {
>  		list_del(&desc->node);
>  		list_add_tail(&desc->node, &chan->active_list);
>  		chan->desc_submitcount++;
>  		chan->desc_pendingcount--;
>  		if (chan->desc_submitcount == chan->num_frms)
>  			chan->desc_submitcount = 0;
> -	} else {
> -		list_splice_tail_init(&chan->pending_list, &chan->active_list);
> -		chan->desc_pendingcount = 0;
> +
> +		/*
> +		 * Can't start until all the framebuffers have been programmed
> +		 * or else corruption can occur.
> +		 */
> +		if (mbf && !chan->running &&
> +		   (chan->seg_pendingcount < chan->num_frms))
> +			return;
> +
> +		/* HW expects these parameters to be same for one
> transaction */
> +		vdma_desc_write(chan, XILINX_DMA_REG_HSIZE, last-
> >hw.hsize);
> +		vdma_desc_write(chan, XILINX_DMA_REG_FRMDLY_STRIDE,
> +				last->hw.stride);
> +		vdma_desc_write(chan, XILINX_DMA_REG_VSIZE, last-
> >hw.vsize);
> +		chan->running = true;
>  	}
>  }
> 
> @@ -1327,12 +1372,20 @@ static void xilinx_dma_issue_pending(struct
> dma_chan *dchan)  static void xilinx_dma_complete_descriptor(struct
> xilinx_dma_chan *chan)  {
>  	struct xilinx_dma_tx_descriptor *desc, *next;
> +	struct xilinx_vdma_tx_segment *segment;
> 
>  	/* This function was invoked with lock held */
>  	if (list_empty(&chan->active_list))
>  		return;
> 
>  	list_for_each_entry_safe(desc, next, &chan->active_list, node) {
> +		if (chan->xdev->dma_config->dmatype == XDMA_TYPE_VDMA)
> {
> +			list_for_each_entry(segment, &desc->segments, node)
> {
> +				if (chan->seg_pendingcount > 0)
> +					chan->seg_pendingcount--;
> +			}
> +		}
> +
>  		list_del(&desc->node);
>  		if (!desc->cyclic)
>  			dma_cookie_complete(&desc->async_tx);
> @@ -1366,6 +1419,9 @@ static int xilinx_dma_reset(struct xilinx_dma_chan
> *chan)
>  	}
> 
>  	chan->err = false;
> +	chan->seg_pendingcount = 0;
> +	chan->desc_submitcount = 0;
> +	chan->running = false;
> 
>  	return err;
>  }
> --
> 1.9.1
> 

^ permalink raw reply

* [PATCH 1/2] ARM: hyp-stub: improve ABI
From: Russell King - ARM Linux @ 2016-12-15 15:15 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <06fca797-da5d-f7f2-eecb-9b1b33b7e83f@arm.com>

On Thu, Dec 15, 2016 at 11:46:41AM +0000, Marc Zyngier wrote:
> On 15/12/16 11:35, Russell King - ARM Linux wrote:
> > On Thu, Dec 15, 2016 at 11:18:48AM +0000, Marc Zyngier wrote:
> >> On 14/12/16 10:46, Russell King wrote:
> >>> @@ -231,10 +244,14 @@ ENDPROC(__hyp_stub_do_trap)
> >>>   * initialisation entry point.
> >>>   */
> >>>  ENTRY(__hyp_get_vectors)
> >>> -	mov	r0, #-1
> >>> +	mov	r0, #HVC_GET_VECTORS
> >>
> >> This breaks the KVM implementation of __hyp_get_vectors, easily fixed
> >> with the following patchlet:
> > 
> > Right, so what Mark said is wrong:
> > 
> > "The hyp-stub is part of the kernel image, and the API is private to
> >  that particular image, so we can change things -- there's no ABI to
> >  worry about."
> 
> I think Mark is right. The API *is* private to the kernel, and KVM being
> the only in-kernel hypervisor on ARM, this is not an ABI.

Again, that's wrong.

We have two hypervisors in the kernel.  One is KVM, the other is the
stub.  Sure, the stub isn't a full implementation of a hypervisor, but
it is nevertheless, for the purposes of _this_ discussion, a hypervisor
of sorts.

The reason that both are included is because they both appear to share
a common interface (although that's totally not documented anywhere.)

> > So no, I'm going with my original patch (which TI has tested) which is
> > the minimal change, and if we _then_ want to rework the HYP mode
> > interfaces, that's the time to do the other changes when more people
> > (such as KVM folk) are paying attention and we can come to a cross-
> > hypervisor agreement on what the interface should be.
> 
> Given that there is a single in-kernel hypervisor, I can't really see
> who we're going to agree anything with...

As far as I can see, the hyp-stub falls under ARM arch maintanence.
KVM falls under KVM people.  Two different groups, we need agreement
between them what a sane API for both "hypervisors" should be.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply

* Linux fails to start secondary cores when system resumes from Suspend-to-RAM
From: Mason @ 2016-12-15 15:18 UTC (permalink / raw)
  To: linux-arm-kernel

Hello,

I'm playing with suspend-to-RAM on the tango platform:

  http://lxr.free-electrons.com/source/arch/arm/mach-tango/platsmp.c

When the system is suspended, the CPU is completely powered down
(receives no power whatsoever). When the system receives a wake-up
event, the CPU is powered up, and starts up exactly the same way
as for a cold boot (I think).

However, while Linux successfully starts the secondary cores when
the system first boots, it fails when the system resumes from "S3".

I added printascii() calls inside secondary_start_kernel() and I can
see that the following instruction are "properly" run:

	cpu_switch_mm(mm->pgd, mm);
	local_flush_bp_all();
	enter_lazy_tlb(mm, current);

but it seems local_flush_tlb_all(); never returns... :-(

  http://lxr.free-electrons.com/source/arch/arm/include/asm/tlbflush.h#L332


Looking more closely at that function, it seems to be failing in:

	tlb_op(TLB_V7_UIS_FULL, "c8, c7, 0", zero);

(meaning: I get a log before, but not after)

On my system, tlb_op(TLB_V7_UIS_FULL, "c8, c7, 0", zero);
resolves to:

c010ce18:       e3170602        tst     r7, #2097152    ; 0x200000
c010ce1c:       1e086f17        mcrne   15, 0, r6, cr8, cr7, {0}

What could be happening?
Can a core "hang" on this instruction?
Can a core "crash" on this instruction (meaning, an exception
is raised, and the core loops inside the exception code without
Linux noticing... that seems unlikely)

I'm stumped. Could someone throw me a clue?

Mark Rutland offered a guess about IPIs not working correctly.
Could this explain the behavior I'm seeing?

Regards.

^ permalink raw reply

* [PATCH] dmaengine: xilinx_dma: Add support for multiple buffers
From: Appana Durga Kedareswara Rao @ 2016-12-15 15:19 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <40ba7c24927eeffdcc66425f8ca1203589276c08.1481812291.git.joabreu@synopsys.com>

Hi Jose Abreu,

	Thanks for the patch...

> 
> Xilinx VDMA supports multiple framebuffers. This patch adds correct handling for
> the scenario where multiple framebuffers are available in the HW and parking
> mode is not set.
> 
> We corrected these situations:
> 	1) Do not start VDMA until all the framebuffers
> 	have been programmed with a valid address.
> 	2) Restart variables when VDMA halts/resets.
> 	3) Halt channel when all the framebuffers have
> 	finished and there is not anymore segments
> 	pending.
> 	4) Do not try to overlap framebuffers until they
> 	are completed.
> 
> All these situations, without this patch, can lead to data corruption and even
> system memory corruption. If, for example, user has a VDMA with 3
> framebuffers, with direction DMA_DEV_TO_MEM and user only submits one
> segment, VDMA will write first to the segment the user submitted BUT if the
> user doesn't submit another segment in a timelly manner then VDMA will write
> to position 0 of system mem in the following VSYNC.

	I have posted different patch series for fixing these issues just now...
Please take a look into it...

Regards,
Kedar.

> 
> Signed-off-by: Jose Abreu <joabreu@synopsys.com>
> Cc: Carlos Palminha <palminha@synopsys.com>
> Cc: Vinod Koul <vinod.koul@intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Kedareswara rao Appana <appana.durga.rao@xilinx.com>
> Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> Cc: dmaengine at vger.kernel.org
> Cc: linux-arm-kernel at lists.infradead.org
> Cc: linux-kernel at vger.kernel.org
> ---
>  drivers/dma/xilinx/xilinx_dma.c | 80 ++++++++++++++++++++++++++++++++++-
> ------
>  1 file changed, 68 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/dma/xilinx/xilinx_dma.c b/drivers/dma/xilinx/xilinx_dma.c
> index 8288fe4..30eec5a 100644
> --- a/drivers/dma/xilinx/xilinx_dma.c
> +++ b/drivers/dma/xilinx/xilinx_dma.c
> @@ -351,10 +351,12 @@ struct xilinx_dma_chan {
>  	bool cyclic;
>  	bool genlock;
>  	bool err;
> +	bool running;
>  	struct tasklet_struct tasklet;
>  	struct xilinx_vdma_config config;
>  	bool flush_on_fsync;
>  	u32 desc_pendingcount;
> +	u32 seg_pendingcount;
>  	bool ext_addr;
>  	u32 desc_submitcount;
>  	u32 residue;
> @@ -946,6 +948,17 @@ static bool xilinx_dma_is_idle(struct xilinx_dma_chan
> *chan)  }
> 
>  /**
> + * xilinx_vdma_is_multi_buffer - Check if VDMA has multiple
> +framebuffers
> + * @chan: Driver specific DMA channel
> + *
> + * Return: '1' if is multi buffer, '0' if not.
> + */
> +static bool xilinx_vdma_is_multi_buffer(struct xilinx_dma_chan *chan) {
> +	return chan->num_frms > 1;
> +}
> +
> +/**
>   * xilinx_dma_halt - Halt DMA channel
>   * @chan: Driver specific DMA channel
>   */
> @@ -966,6 +979,10 @@ static void xilinx_dma_halt(struct xilinx_dma_chan
> *chan)
>  			chan, dma_ctrl_read(chan,
> XILINX_DMA_REG_DMASR));
>  		chan->err = true;
>  	}
> +
> +	chan->seg_pendingcount = 0;
> +	chan->desc_submitcount = 0;
> +	chan->running = false;
>  }
> 
>  /**
> @@ -1002,14 +1019,35 @@ static void xilinx_vdma_start_transfer(struct
> xilinx_dma_chan *chan)
>  	struct xilinx_dma_tx_descriptor *desc, *tail_desc;
>  	u32 reg;
>  	struct xilinx_vdma_tx_segment *tail_segment;
> +	bool mbf = xilinx_vdma_is_multi_buffer(chan) && !config->park;
> 
>  	/* This function was invoked with lock held */
>  	if (chan->err)
>  		return;
> 
> -	if (list_empty(&chan->pending_list))
> +	/*
> +	 * Can't continue if we have already consumed all the available
> +	 * framebuffers and they are not done yet.
> +	 */
> +	if (mbf && (chan->seg_pendingcount >= chan->num_frms))
>  		return;
> 
> +	if (list_empty(&chan->pending_list)) {
> +		/*
> +		 * Can't keep running if there are no pending segments. Halt
> +		 * the channel as security measure. Notice that this will not
> +		 * corrupt current transactions because this function is
> +		 * called after the pendingcount is decreased and after the
> +		 * current transaction has finished.
> +		 */
> +		if (mbf && chan->running && !chan->seg_pendingcount) {
> +			dev_dbg(chan->dev, "pending list empty: halting\n");
> +			xilinx_dma_halt(chan);
> +		}
> +
> +		return;
> +	}
> +
>  	desc = list_first_entry(&chan->pending_list,
>  				struct xilinx_dma_tx_descriptor, node);
>  	tail_desc = list_last_entry(&chan->pending_list,
> @@ -1079,6 +1117,8 @@ static void xilinx_vdma_start_transfer(struct
> xilinx_dma_chan *chan)
>  	if (chan->has_sg) {
>  		dma_ctrl_write(chan, XILINX_DMA_REG_TAILDESC,
>  				tail_segment->phys);
> +		list_splice_tail_init(&chan->pending_list, &chan->active_list);
> +		chan->desc_pendingcount = 0;
>  	} else {
>  		struct xilinx_vdma_tx_segment *segment, *last = NULL;
>  		int i = 0;
> @@ -1097,29 +1137,34 @@ static void xilinx_vdma_start_transfer(struct
> xilinx_dma_chan *chan)
> 
> 	XILINX_VDMA_REG_START_ADDRESS(i++),
>  					segment->hw.buf_addr);
> 
> +			chan->seg_pendingcount++;
>  			last = segment;
>  		}
> 
>  		if (!last)
>  			return;
> 
> -		/* HW expects these parameters to be same for one
> transaction */
> -		vdma_desc_write(chan, XILINX_DMA_REG_HSIZE, last-
> >hw.hsize);
> -		vdma_desc_write(chan, XILINX_DMA_REG_FRMDLY_STRIDE,
> -				last->hw.stride);
> -		vdma_desc_write(chan, XILINX_DMA_REG_VSIZE, last-
> >hw.vsize);
> -	}
> -
> -	if (!chan->has_sg) {
>  		list_del(&desc->node);
>  		list_add_tail(&desc->node, &chan->active_list);
>  		chan->desc_submitcount++;
>  		chan->desc_pendingcount--;
>  		if (chan->desc_submitcount == chan->num_frms)
>  			chan->desc_submitcount = 0;
> -	} else {
> -		list_splice_tail_init(&chan->pending_list, &chan->active_list);
> -		chan->desc_pendingcount = 0;
> +
> +		/*
> +		 * Can't start until all the framebuffers have been programmed
> +		 * or else corruption can occur.
> +		 */
> +		if (mbf && !chan->running &&
> +		   (chan->seg_pendingcount < chan->num_frms))
> +			return;
> +
> +		/* HW expects these parameters to be same for one
> transaction */
> +		vdma_desc_write(chan, XILINX_DMA_REG_HSIZE, last-
> >hw.hsize);
> +		vdma_desc_write(chan, XILINX_DMA_REG_FRMDLY_STRIDE,
> +				last->hw.stride);
> +		vdma_desc_write(chan, XILINX_DMA_REG_VSIZE, last-
> >hw.vsize);
> +		chan->running = true;
>  	}
>  }
> 
> @@ -1327,12 +1372,20 @@ static void xilinx_dma_issue_pending(struct
> dma_chan *dchan)  static void xilinx_dma_complete_descriptor(struct
> xilinx_dma_chan *chan)  {
>  	struct xilinx_dma_tx_descriptor *desc, *next;
> +	struct xilinx_vdma_tx_segment *segment;
> 
>  	/* This function was invoked with lock held */
>  	if (list_empty(&chan->active_list))
>  		return;
> 
>  	list_for_each_entry_safe(desc, next, &chan->active_list, node) {
> +		if (chan->xdev->dma_config->dmatype == XDMA_TYPE_VDMA)
> {
> +			list_for_each_entry(segment, &desc->segments, node)
> {
> +				if (chan->seg_pendingcount > 0)
> +					chan->seg_pendingcount--;
> +			}
> +		}
> +
>  		list_del(&desc->node);
>  		if (!desc->cyclic)
>  			dma_cookie_complete(&desc->async_tx);
> @@ -1366,6 +1419,9 @@ static int xilinx_dma_reset(struct xilinx_dma_chan
> *chan)
>  	}
> 
>  	chan->err = false;
> +	chan->seg_pendingcount = 0;
> +	chan->desc_submitcount = 0;
> +	chan->running = false;
> 
>  	return err;
>  }
> --
> 1.9.1
> 

^ permalink raw reply

* [PATCH v6 2/5] i2c: Add STM32F4 I2C driver
From: M'boumba Cedric Madianga @ 2016-12-15 15:26 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161213092031.d2ax2pnpzzicriel@pengutronix.de>

Hello,

Thanks for this review, it will help.

2016-12-13 10:20 GMT+01:00 Uwe Kleine-K?nig <u.kleine-koenig@pengutronix.de>:
> Hello,
>
> On Mon, Dec 12, 2016 at 05:15:39PM +0100, M'boumba Cedric Madianga wrote:
>> This patch adds support for the STM32F4 I2C controller.
>>
>> Signed-off-by: M'boumba Cedric Madianga <cedric.madianga@gmail.com>
>> ---
>>  drivers/i2c/busses/Kconfig       |  10 +
>>  drivers/i2c/busses/Makefile      |   1 +
>>  drivers/i2c/busses/i2c-stm32f4.c | 849 +++++++++++++++++++++++++++++++++++++++
>>  3 files changed, 860 insertions(+)
>>  create mode 100644 drivers/i2c/busses/i2c-stm32f4.c
>>
>> diff --git a/drivers/i2c/busses/Kconfig b/drivers/i2c/busses/Kconfig
>> index 0cdc844..2719208 100644
>> --- a/drivers/i2c/busses/Kconfig
>> +++ b/drivers/i2c/busses/Kconfig
>> @@ -886,6 +886,16 @@ config I2C_ST
>>         This driver can also be built as module. If so, the module
>>         will be called i2c-st.
>>
>> +config I2C_STM32F4
>> +     tristate "STMicroelectronics STM32F4 I2C support"
>> +     depends on ARCH_STM32 || COMPILE_TEST
>> +     help
>> +       Enable this option to add support for STM32 I2C controller embedded
>> +       in STM32F4 SoCs.
>> +
>> +       This driver can also be built as module. If so, the module
>> +       will be called i2c-stm32f4.
>> +
>>  config I2C_STU300
>>       tristate "ST Microelectronics DDC I2C interface"
>>       depends on MACH_U300
>> diff --git a/drivers/i2c/busses/Makefile b/drivers/i2c/busses/Makefile
>> index 1c1bac8..a2c6ff5 100644
>> --- a/drivers/i2c/busses/Makefile
>> +++ b/drivers/i2c/busses/Makefile
>> @@ -85,6 +85,7 @@ obj-$(CONFIG_I2C_SH_MOBILE) += i2c-sh_mobile.o
>>  obj-$(CONFIG_I2C_SIMTEC)     += i2c-simtec.o
>>  obj-$(CONFIG_I2C_SIRF)               += i2c-sirf.o
>>  obj-$(CONFIG_I2C_ST)         += i2c-st.o
>> +obj-$(CONFIG_I2C_STM32F4)    += i2c-stm32f4.o
>>  obj-$(CONFIG_I2C_STU300)     += i2c-stu300.o
>>  obj-$(CONFIG_I2C_SUN6I_P2WI) += i2c-sun6i-p2wi.o
>>  obj-$(CONFIG_I2C_TEGRA)              += i2c-tegra.o
>> diff --git a/drivers/i2c/busses/i2c-stm32f4.c b/drivers/i2c/busses/i2c-stm32f4.c
>> new file mode 100644
>> index 0000000..89ad579
>> --- /dev/null
>> +++ b/drivers/i2c/busses/i2c-stm32f4.c
>> @@ -0,0 +1,849 @@
>> +/*
>> + * Driver for STMicroelectronics STM32 I2C controller
>> + *
>> + * Copyright (C) M'boumba Cedric Madianga 2015
>> + * Author: M'boumba Cedric Madianga <cedric.madianga@gmail.com>
>> + *
>> + * This driver is based on i2c-st.c
>> + *
>> + * License terms:  GNU General Public License (GPL), version 2
>> + */
>
> If there is a public description available for the device, a link here
> would be great.
The device is described in the reference manual of the STM32F429/439 SoC.
As this reference manual is public, I will add it at the beginning of
the driver as requested.

>
>> +
>> +#include <linux/clk.h>
>> +#include <linux/delay.h>
>> +#include <linux/err.h>
>> +#include <linux/i2c.h>
>> +#include <linux/interrupt.h>
>> +#include <linux/io.h>
>> +#include <linux/iopoll.h>
>> +#include <linux/module.h>
>> +#include <linux/of_address.h>
>> +#include <linux/of_irq.h>
>> +#include <linux/of.h>
>> +#include <linux/platform_device.h>
>> +#include <linux/reset.h>
>> +
>> +/* STM32F4 I2C offset registers */
>> +#define STM32F4_I2C_CR1                      0x00
>> +#define STM32F4_I2C_CR2                      0x04
>> +#define STM32F4_I2C_DR                       0x10
>> +#define STM32F4_I2C_SR1                      0x14
>> +#define STM32F4_I2C_SR2                      0x18
>> +#define STM32F4_I2C_CCR                      0x1C
>> +#define STM32F4_I2C_TRISE            0x20
>> +#define STM32F4_I2C_FLTR             0x24
>> +
>> +/* STM32F4 I2C control 1*/
>> +#define STM32F4_I2C_CR1_SWRST                BIT(15)
>> +#define STM32F4_I2C_CR1_POS          BIT(11)
>> +#define STM32F4_I2C_CR1_ACK          BIT(10)
>> +#define STM32F4_I2C_CR1_STOP         BIT(9)
>> +#define STM32F4_I2C_CR1_START                BIT(8)
>> +#define STM32F4_I2C_CR1_PE           BIT(0)
>> +
>> +/* STM32F4 I2C control 2 */
>> +#define STM32F4_I2C_CR2_FREQ_MASK    GENMASK(5, 0)
>> +#define STM32F4_I2C_CR2_FREQ(n)              ((n & STM32F4_I2C_CR2_FREQ_MASK))
>
> This should better be ((n) & STM32F4_I2C_CR2_FREQ_MASK). There a few
> more constants that need the same fix.
OK I will fix it in the V7. Thanks.

>
>> +#define STM32F4_I2C_CR2_ITBUFEN              BIT(10)
>> +#define STM32F4_I2C_CR2_ITEVTEN              BIT(9)
>> +#define STM32F4_I2C_CR2_ITERREN              BIT(8)
>> +#define STM32F4_I2C_CR2_IRQ_MASK     (STM32F4_I2C_CR2_ITBUFEN \
>> +                                     | STM32F4_I2C_CR2_ITEVTEN \
>> +                                     | STM32F4_I2C_CR2_ITERREN)
>
> I'd layout this like:
>
>         #define STM32F4_I2C_CR2_IRQ_MASK        (STM32F4_I2C_CR2_ITBUFEN | \
>                                                  STM32F4_I2C_CR2_ITEVTEN | \
>                                                  STM32F4_I2C_CR2_ITERREN)
>
> which is more usual I think.
OK I will fix it in the V7. Thanks.

>
>> +/* STM32F4 I2C Status 1 */
>> +#define STM32F4_I2C_SR1_AF           BIT(10)
>> +#define STM32F4_I2C_SR1_ARLO         BIT(9)
>> +#define STM32F4_I2C_SR1_BERR         BIT(8)
>> +#define STM32F4_I2C_SR1_TXE          BIT(7)
>> +#define STM32F4_I2C_SR1_RXNE         BIT(6)
>> +#define STM32F4_I2C_SR1_BTF          BIT(2)
>> +#define STM32F4_I2C_SR1_ADDR         BIT(1)
>> +#define STM32F4_I2C_SR1_SB           BIT(0)
>> +#define STM32F4_I2C_SR1_ITEVTEN_MASK (STM32F4_I2C_SR1_BTF \
>> +                                     | STM32F4_I2C_SR1_ADDR \
>> +                                     | STM32F4_I2C_SR1_SB)
>> +#define STM32F4_I2C_SR1_ITBUFEN_MASK (STM32F4_I2C_SR1_TXE \
>> +                                     | STM32F4_I2C_SR1_RXNE)
>> +#define STM32F4_I2C_SR1_ITERREN_MASK (STM32F4_I2C_SR1_AF \
>> +                                     | STM32F4_I2C_SR1_ARLO \
>> +                                     | STM32F4_I2C_SR1_BERR)
>> +
>> +/* STM32F4 I2C Status 2 */
>> +#define STM32F4_I2C_SR2_BUSY         BIT(1)
>> +
>> +/* STM32F4 I2C Control Clock */
>> +#define STM32F4_I2C_CCR_CCR_MASK     GENMASK(11, 0)
>> +#define STM32F4_I2C_CCR_CCR(n)               ((n & STM32F4_I2C_CCR_CCR_MASK))
>> +#define STM32F4_I2C_CCR_FS           BIT(15)
>> +#define STM32F4_I2C_CCR_DUTY         BIT(14)
>> +
>> +/* STM32F4 I2C Trise */
>> +#define STM32F4_I2C_TRISE_VALUE_MASK GENMASK(5, 0)
>> +#define STM32F4_I2C_TRISE_VALUE(n)   ((n & STM32F4_I2C_TRISE_VALUE_MASK))
>> +
>> +/* STM32F4 I2C Filter */
>> +#define STM32F4_I2C_FLTR_DNF_MASK    GENMASK(3, 0)
>> +#define STM32F4_I2C_FLTR_DNF(n)              ((n & STM32F4_I2C_FLTR_DNF_MASK))
>> +#define STM32F4_I2C_FLTR_ANOFF               BIT(4)
>> +
>> +#define STM32F4_I2C_MIN_FREQ         2U
>> +#define STM32F4_I2C_MAX_FREQ         42U
>> +#define FAST_MODE_MAX_RISE_TIME              1000
>> +#define STD_MODE_MAX_RISE_TIME               300
>
> Are these supposed to be the values "rise time of both SDA and SCL
> signals" from the i2c specification? If so, you got it wrong, fast mode
> has the smaller value.
Yes you are right. My mistake. Thanks

> Maybe these constants could get a home in a more central place?
Yes we probably could add these constants in a include file like i2c.h
or someting like that.
Wolfram, what is your opinion regarding this proposal ?

> Also I'd add /* ns */ to the definition.
Ok

>
>> +#define MHZ_TO_HZ                    1000000
>> +
>> +enum stm32f4_i2c_speed {
>> +     STM32F4_I2C_SPEED_STANDARD, /* 100 kHz */
>> +     STM32F4_I2C_SPEED_FAST, /* 400 kHz */
>> +     STM32F4_I2C_SPEED_END,
>> +};
>> +
>> +/**
>> + * struct stm32f4_i2c_timings - per-Mode tuning parameters
>> + * @duty: Fast mode duty cycle
>> + * @mul_ccr: Value to be multiplied to CCR to reach 100Khz/400Khz SCL frequency
>> + * @min_ccr: Minimum clock ctrl reg value to reach 100Khz/400Khz SCL frequency
>> + */
>> +struct stm32f4_i2c_timings {
>> +     u32 rate;
>
> rate is undocumented and unused.
Good point. I will remove it. Thanks.

>
>> +     u32 duty;
>> +     u32 mul_ccr;
>> +     u32 min_ccr;
>> +};
>> +
>> +/**
>> + * struct stm32f4_i2c_msg - client specific data
>> + * @addr: 8-bit slave addr, including r/w bit
>> + * @count: number of bytes to be transferred
>> + * @buf: data buffer
>> + * @result: result of the transfer
>> + * @stop: last I2C msg to be sent, i.e. STOP to be generated
>> + */
>> +struct stm32f4_i2c_msg {
>> +     u8      addr;
>> +     u32     count;
>> +     u8      *buf;
>> +     int     result;
>> +     bool    stop;
>> +};
>> +
>> +/**
>> + * struct stm32f4_i2c_dev - private data of the controller
>> + * @adap: I2C adapter for this controller
>> + * @dev: device for this controller
>> + * @base: virtual memory area
>> + * @complete: completion of I2C message
>> + * @irq_event: interrupt event line for the controller
>> + * @irq_error: interrupt error line for the controller
>> + * @clk: hw i2c clock
>> + * speed: I2C clock frequency of the controller. Standard or Fast only supported
>> + * @msg: I2C transfer information
>> + */
>> +struct stm32f4_i2c_dev {
>> +     struct i2c_adapter              adap;
>> +     struct device                   *dev;
>> +     void __iomem                    *base;
>> +     struct completion               complete;
>> +     int                             irq_event;
>> +     int                             irq_error;
>
> You only use irq_event in the probe function. So there is no need to
> remember this one and you could use a local variable instead.
Ok, I will fix it in the V7. Thanks

>
>> +     struct clk                      *clk;
>> +     int                             speed;
>> +     struct stm32f4_i2c_msg          msg;
>> +};
>> +
>> +static struct stm32f4_i2c_timings i2c_timings[] = {
>> +     [STM32F4_I2C_SPEED_STANDARD] = {
>> +             .mul_ccr                = 1,
>> +             .min_ccr                = 4,
>> +             .duty                   = 0,
>> +     },
>> +     [STM32F4_I2C_SPEED_FAST] = {
>> +             .mul_ccr                = 16,
>> +             .min_ccr                = 1,
>> +             .duty                   = 1,
>> +     },
>
> Are these values from the datasheet?
Yes, these values come from I2C IP datasheet.

>
>> +};
>> +
>> +static inline void stm32f4_i2c_set_bits(void __iomem *reg, u32 mask)
>> +{
>> +     writel_relaxed(readl_relaxed(reg) | mask, reg);
>> +}
>> +
>> +static inline void stm32f4_i2c_clr_bits(void __iomem *reg, u32 mask)
>> +{
>> +     writel_relaxed(readl_relaxed(reg) & ~mask, reg);
>> +}
>> +
>> +static void stm32f4_i2c_soft_reset(struct stm32f4_i2c_dev *i2c_dev)
>> +{
>> +     void __iomem *reg = i2c_dev->base + STM32F4_I2C_CR1;
>> +
>> +     stm32f4_i2c_set_bits(reg, STM32F4_I2C_CR1_SWRST);
>> +     stm32f4_i2c_clr_bits(reg, STM32F4_I2C_CR1_SWRST);
>
> Not very critical, but you're doing an unneeded register access here
> because the register is read twice.
>
> Also I think readability would improve if you dropped
> stm32f4_i2c_{set,clr}_bits and do their logic explicitly in the callers.
>
>         stm32f4_i2c_set_bits(reg, STM32F4_I2C_CR1_SWRST);
>         stm32f4_i2c_clr_bits(reg, STM32F4_I2C_CR1_SWRST);
>
> vs
>
>         val = readl_relaxed(reg);
>         writel_relaxed(val | STM32F4_I2C_CR1_SWRST, reg);
>         writel_relaxed(val, reg);
>
If it is just for this function, I don't have any objection to use
your proposal.
But if your request, it is to drop all calls of
stm32f4_i2c_{set,clr}_bits, I am wondering if it is something really
critical ?
Indeed, this is a big impact in this driver and I would prefer to avoid it.

>> +}
>> +
>> +static void stm32f4_i2c_disable_it(struct stm32f4_i2c_dev *i2c_dev)
>
> What is "it"? If it stands for "interrupt" the more usual abbrev is
> "irq".
Yes it stands for interrupt.
So, I will replace it by irq in the next version.

>
>> +{
>> +     void __iomem *reg = i2c_dev->base + STM32F4_I2C_CR2;
>> +
>> +     stm32f4_i2c_clr_bits(reg, STM32F4_I2C_CR2_IRQ_MASK);
>> +}
>> +
>> +static void stm32f4_i2c_set_periph_clk_freq(struct stm32f4_i2c_dev *i2c_dev)
>> +{
>> +     u32 clk_rate, cr2, freq;
>> +
>> +     cr2 = readl_relaxed(i2c_dev->base + STM32F4_I2C_CR2);
>> +     cr2 &= ~STM32F4_I2C_CR2_FREQ_MASK;
>> +     clk_rate = clk_get_rate(i2c_dev->clk);
>> +     freq = clk_rate / MHZ_TO_HZ;
>> +     freq = clamp(freq, STM32F4_I2C_MIN_FREQ, STM32F4_I2C_MAX_FREQ);
>> +     cr2 |= STM32F4_I2C_CR2_FREQ(freq);
>> +     writel_relaxed(cr2, i2c_dev->base + STM32F4_I2C_CR2);
>
> Can you quote the data sheet enough in a comment here to make it obvious
> that your calculation is right?
Ok I will add it.

>
> Would it be more sensible to error out if clk_rate / MHZ_TO_HZ isn't in
> the interval [STM32F4_I2C_MIN_FREQ, STM32F4_I2C_MAX_FREQ]?
Yes, input clock must be between 2 MHz and 42 Mhz to achieve standard
or fast mode mode I?C frequencies.
If it is not the case, the I2C signal integrity is not guarantee and
could lead to communication issue between devices on the I2C bus.

>
> Usually I would expect that you need to use
> DIV_ROUND_UP(clk_rate, MHZ_TO_HZ) instead of a plain division.
Ok, I will use this macro in the next version

>
>> +}
>> +
>> +static void stm32f4_i2c_set_rise_time(struct stm32f4_i2c_dev *i2c_dev)
>> +{
>> +     u32 trise, freq, cr2, val;
>> +
>> +     cr2 = readl_relaxed(i2c_dev->base + STM32F4_I2C_CR2);
>> +     freq = cr2 & STM32F4_I2C_CR2_FREQ_MASK;
>> +
>> +     trise = readl_relaxed(i2c_dev->base + STM32F4_I2C_TRISE);
>> +     trise &= ~STM32F4_I2C_TRISE_VALUE_MASK;
>
> Are you required to use rmw for STM32F4_I2C_TRISE? I'd prefer
>
>         writel_relaxed(STM32F4_I2C_TRISE_VALUE(..), i2c_dev->base + STM32F4_I2C_TRISE);
>
> unless the datasheet requires rmw.
>
>> +     /* Maximum rise time computation */
>> +     if (i2c_dev->speed == STM32F4_I2C_SPEED_STANDARD) {
>> +             trise |= STM32F4_I2C_TRISE_VALUE((freq + 1));
>
> A single pair of parenthesis is enough when you fix
> STM32F4_I2C_TRISE_VALUE as suggested above.
The datasheet does not required to use rmw so I will use your proposal
in the next version. Thanks.

>
>> +     } else {
>> +             val = freq * FAST_MODE_MAX_RISE_TIME / STD_MODE_MAX_RISE_TIME;
>> +             trise |= STM32F4_I2C_TRISE_VALUE((val + 1));
>
> val could be local to this branch.
>
> Or make it shorter using:
>
>         freq = cr2 & STM32F4_I2C_CR2_FREQ_MASK;
>         if (i2c_dev->speed == STM32F4_I2C_SPEED_FAST)
>                 freq = freq * FAST_MODE_MAX_RISE_TIME / STD_MODE_MAX_RISE_TIME;
>
>         writel_relaxed(STM32F4_I2C_TRISE_VALUE(freq + 1), ...);
>
> A quote from the data sheet about the algorithm would be good here, too.
Ok, I will use a shorter way to compute freq and add a quote from the datasheet

>
>> +     }
>> +
>> +     writel_relaxed(trise, i2c_dev->base + STM32F4_I2C_TRISE);
>> +}
>> +
>> +static void stm32f4_i2c_set_speed_mode(struct stm32f4_i2c_dev *i2c_dev)
>> +{
>> +     struct stm32f4_i2c_timings *t = &i2c_timings[i2c_dev->speed];
>> +     u32 ccr, clk_rate;
>> +     int val;
>> +
>> +     ccr = readl_relaxed(i2c_dev->base + STM32F4_I2C_CCR);
>> +     ccr &= ~(STM32F4_I2C_CCR_FS | STM32F4_I2C_CCR_DUTY |
>> +              STM32F4_I2C_CCR_CCR_MASK);
>> +
>> +     clk_rate = clk_get_rate(i2c_dev->clk);
>> +     val = clk_rate / MHZ_TO_HZ * t->mul_ccr;
>
> Is the rounding done right? Again please describe the hardware in a
> comment.
Ok I will use DIV_ROUND_UP here and add a comment from datasheet

>
>> +     if (val < t->min_ccr)
>> +             val = t->min_ccr;
>> +     ccr |= STM32F4_I2C_CCR_CCR(val);
>> +
>> +     if (t->duty)
>> +             ccr |= STM32F4_I2C_CCR_FS | STM32F4_I2C_CCR_DUTY;
>> +
>> +     writel_relaxed(ccr, i2c_dev->base + STM32F4_I2C_CCR);
>> +}
>> +[...]
>> +
>> +static int stm32f4_i2c_wait_free_bus(struct stm32f4_i2c_dev *i2c_dev)
>> +{
>> +     u32 status;
>> +     int ret;
>> +
>> +     ret = readl_relaxed_poll_timeout(i2c_dev->base + STM32F4_I2C_SR2,
>> +                                      status,
>> +                                      !(status & STM32F4_I2C_SR2_BUSY),
>> +                                      10, 1000);
>> +     if (ret) {
>> +             dev_err(i2c_dev->dev, "bus not free\n");
>> +             ret = -EBUSY;
>
> I'm not sure if "bus not free" deserves an error message. Wolfram?
>
>> +     }
>> +
>> +     return ret;
>> +}
>> +
>> +[...]
>> +static void stm32f4_i2c_read_msg(struct stm32f4_i2c_dev *i2c_dev)
>> +{
>> +     struct stm32f4_i2c_msg *msg = &i2c_dev->msg;
>> +     u32 rbuf;
>> +
>> +     rbuf = readl_relaxed(i2c_dev->base + STM32F4_I2C_DR);
>> +     *msg->buf++ = (u8)rbuf & 0xff;
>
> unneeded cast (or unneeded & 0xff).
Ok thanks

>
>> +     msg->count--;
>> +}
>> +
>> +[...]
>> +/**
>> + * stm32f4_i2c_handle_read() - Handle FIFO empty interrupt in case of read
>> + * @i2c_dev: Controller's private data
>> + */
>> +static void stm32f4_i2c_handle_read(struct stm32f4_i2c_dev *i2c_dev)
>> +{
>> +     struct stm32f4_i2c_msg *msg = &i2c_dev->msg;
>> +     void __iomem *reg = i2c_dev->base + STM32F4_I2C_CR2;
>> +
>> +     switch (msg->count) {
>> +     case 1:
>> +             stm32f4_i2c_disable_it(i2c_dev);
>> +             stm32f4_i2c_read_msg(i2c_dev);
>> +             complete(&i2c_dev->complete);
>> +             break;
>> +     case 2:
>> +     case 3:
>> +             stm32f4_i2c_clr_bits(reg, STM32F4_I2C_CR2_ITBUFEN);
>> +             break;
>> +     default:
>> +             stm32f4_i2c_read_msg(i2c_dev);
>> +     }
>
> It looks wrong that you don't call stm32f4_i2c_read_msg if msg->count is
> 2 or 3. I guess that's because these cases are handled in
> stm32f4_i2c_handle_rx_btf? Maybe you can simplify the logic a bit?
stm32f4_i2c_handle_read is called when RXNE bit is set due to new data
present in DR register.
stm32f4_i2c_handle_rx_btf is called when BTF bit is set.
This bit is set when there is new data in shift register whereas data
in DR register has not been read yet.
The datasheet requires to wait for BTF bit for 2 byte reception and
for the 3 last bytes to be read for N bytes reception (with N > 2)
That's why, in these cases (2 and 3), I clear the ITBUF interrupt in
order to not be preempted again for RXNE event as I know that I have
to wait for BTF event.
So, this is not a wrong case but I will add a comment to explain that.

>
>> +}
>> +
>> +/**
>> + * stm32f4_i2c_handle_rx_btf() - Handle byte transfer finished interrupt
>> + * in case of read
>> + * @i2c_dev: Controller's private data
>> + */
>> +static void stm32f4_i2c_handle_rx_btf(struct stm32f4_i2c_dev *i2c_dev)
>> +{
>> +     struct stm32f4_i2c_msg *msg = &i2c_dev->msg;
>> +     void __iomem *reg;
>> +     u32 mask;
>> +     int i;
>> +
>> +     switch (msg->count) {
>
> I don't understand why the handling depends on the number of messages.
Please see my above comment

>
>> +     case 2:
>> +             reg = i2c_dev->base + STM32F4_I2C_CR1;
>> +             /* Generate STOP or REPSTART */
>
> I stumbled about "REPSTART" and would spell it out as "repeated Start".
Ok

>
>> +             if (msg->stop)
>> +                     stm32f4_i2c_set_bits(reg, STM32F4_I2C_CR1_STOP);
>> +             else
>> +                     stm32f4_i2c_set_bits(reg, STM32F4_I2C_CR1_START);
>> +
>> +             /* Read two last data bytes */
>> +             for (i = 2; i > 0; i--)
>> +                     stm32f4_i2c_read_msg(i2c_dev);
>> +
>> +             /* Disable EVT and ERR interrupt */
>> +             reg = i2c_dev->base + STM32F4_I2C_CR2;
>> +             mask = STM32F4_I2C_CR2_ITEVTEN | STM32F4_I2C_CR2_ITERREN;
>> +             stm32f4_i2c_clr_bits(reg, mask);
>> +
>> +             complete(&i2c_dev->complete);
>> +             break;
>> +     case 3:
>> +             /* Enable ACK and read data */
>> +             reg = i2c_dev->base + STM32F4_I2C_CR1;
>> +             stm32f4_i2c_clr_bits(reg, STM32F4_I2C_CR1_ACK);
>> +             stm32f4_i2c_read_msg(i2c_dev);
>> +             break;
>> +     default:
>> +             stm32f4_i2c_read_msg(i2c_dev);
>> +     }
>> +}
>> +
>> +/**
>> + * stm32f4_i2c_handle_rx_addr() - Handle address matched interrupt in case of
>> + * master receiver
>> + * @i2c_dev: Controller's private data
>> + */
>> +static void stm32f4_i2c_handle_rx_addr(struct stm32f4_i2c_dev *i2c_dev)
>> +{
>> +     struct stm32f4_i2c_msg *msg = &i2c_dev->msg;
>> +     void __iomem *reg;
>> +
>> +     switch (msg->count) {
>> +     case 0:
>> +             stm32f4_i2c_terminate_xfer(i2c_dev);
>> +             /* Clear ADDR flag */
>> +             readl_relaxed(i2c_dev->base + STM32F4_I2C_SR2);
>> +             break;
>> +     case 1:
>> +             /*
>> +              * Single byte reception:
>> +              * Enable NACK, clear ADDR flag and generate STOP or RepSTART
>> +              */
>> +             reg = i2c_dev->base + STM32F4_I2C_CR1;
>> +             stm32f4_i2c_clr_bits(reg, STM32F4_I2C_CR1_ACK);
>> +             if (msg->stop)
>> +                     stm32f4_i2c_set_bits(reg, STM32F4_I2C_CR1_STOP);
>> +             else
>> +                     stm32f4_i2c_set_bits(reg, STM32F4_I2C_CR1_START);
>> +             break;
>> +     case 2:
>> +             /*
>> +              * 2-byte reception:
>> +              * Enable NACK and PEC Position Ack and clear ADDR flag
>
> What is PEC?
PEC stands for Packet Error Checking.
This feature is used is SMbus mode in order to guarantee data
integrity during an I2C transaction thanks to packet error code
comparaison.
The POS bit is used in reception to indicate that the next byte
received in the shift register will be ACK by hardware.
So, this is a wrong comment I would say POS ACK and I will fix it in
the next version.

>
>> +              */
>> +             reg = i2c_dev->base + STM32F4_I2C_CR1;
>> +             stm32f4_i2c_clr_bits(reg, STM32F4_I2C_CR1_ACK);
>> +             stm32f4_i2c_set_bits(reg, STM32F4_I2C_CR1_POS);
>> +             readl_relaxed(i2c_dev->base + STM32F4_I2C_SR2);
>> +             break;
>> +
>> +     default:
>> +             /* N-byte reception: Enable ACK and clear ADDR flag */
>> +             reg = i2c_dev->base + STM32F4_I2C_CR1;
>> +             stm32f4_i2c_set_bits(reg, STM32F4_I2C_CR1_ACK);
>> +             readl_relaxed(i2c_dev->base + STM32F4_I2C_SR2);
>> +             break;
>> +     }
>> +}
>> +
>> +/**
>> + * stm32f4_i2c_isr_event() - Interrupt routine for I2C bus event
>> + * @irq: interrupt number
>> + * @data: Controller's private data
>> + */
>> +static irqreturn_t stm32f4_i2c_isr_event(int irq, void *data)
>> +{
>> +[...]
>> +     real_status = readl_relaxed(i2c_dev->base + STM32F4_I2C_SR1);
>
> s/real_status/status/ ?
Ok. Thanks

>
>> +
>> +     if (!(real_status & possible_status)) {
>> +             dev_dbg(i2c_dev->dev,
>> +                     "spurious evt it (status=0x%08x, ien=0x%08x)\n",
>> +                     real_status, ien);
>
> s/it/irq/
Ok. Thanks

>
>> +             return IRQ_NONE;
>> +     }
>> +
>> +     /* Use __fls() to check error bits first */
>> +     flag = __fls(real_status & possible_status);
>
> If you get several events reported you only handle a single one. Is this
> effective?
You are right, if several events occur, I will execute this irq
routines for each event.
I will rework this in the next version. Thanks


>
>> +     switch (1 << flag) {
>> +     case STM32F4_I2C_SR1_SB:
>> +             stm32f4_i2c_write_byte(i2c_dev, msg->addr);
>> +             break;
>> +
>> +     case STM32F4_I2C_SR1_ADDR:
>> +             if (msg->addr & I2C_M_RD)
>> +                     stm32f4_i2c_handle_rx_addr(i2c_dev);
>> +             else
>> +                     readl_relaxed(i2c_dev->base + STM32F4_I2C_SR2);
>> +
>> +             /* Enable ITBUF interrupts */
>
> What is ITBUF?
ITBUF is an interrupt generated when RxNE or TxE flag is set

>
>> +             reg = i2c_dev->base + STM32F4_I2C_CR2;
>> +             stm32f4_i2c_set_bits(reg, STM32F4_I2C_CR2_ITBUFEN);
>> +             break;
>> +
>> +     case STM32F4_I2C_SR1_BTF:
>> +             if (msg->addr & I2C_M_RD)
>> +                     stm32f4_i2c_handle_rx_btf(i2c_dev);
>> +             else
>> +                     stm32f4_i2c_handle_write(i2c_dev);
>> +             break;
>> +
>> +     case STM32F4_I2C_SR1_TXE:
>> +             stm32f4_i2c_handle_write(i2c_dev);
>> +             break;
>> +
>> +     case STM32F4_I2C_SR1_RXNE:
>> +             stm32f4_i2c_handle_read(i2c_dev);
>> +             break;
>> +
>> +     default:
>> +             dev_err(i2c_dev->dev,
>> +                     "evt it unhandled: status=0x%08x)\n", real_status);
>
> s/it/irq/
ok

>
>> +             return IRQ_NONE;
>> +     }
>> +
>> +     return IRQ_HANDLED;
>> +}
>> +
>> +[...]
>> +static int stm32f4_i2c_xfer_msg(struct stm32f4_i2c_dev *i2c_dev,
>> +                             struct i2c_msg *msg, bool is_first,
>> +                             bool is_last)
>> +{
>> +[...]
>> +     /* Enable ITEVT and ITERR interrupts */
>
> This comment isn't helpful. Mentioning their meaning would be great
> instead.
ok I will add it (ITEVT = event interrupt and ITERR = error interrupt)

>
>> +[...]
>> +static int stm32f4_i2c_xfer(struct i2c_adapter *i2c_adap, struct i2c_msg msgs[],
>> +                         int num)
>> +{
>> +     struct stm32f4_i2c_dev *i2c_dev = i2c_get_adapdata(i2c_adap);
>> +     int ret, i;
>> +
>> +     ret = clk_enable(i2c_dev->clk);
>> +     if (ret) {
>> +             dev_err(i2c_dev->dev, "Failed to enable clock\n");
>> +             return ret;
>> +     }
>> +
>> +     stm32f4_i2c_hw_config(i2c_dev);
>> +
>> +     for (i = 0; i < num && !ret; i++)
>> +             ret = stm32f4_i2c_xfer_msg(i2c_dev, &msgs[i], i == 0,
>> +                                        i == num - 1);
>> +
>> +     clk_disable(i2c_dev->clk);
>> +
>> +     return (ret < 0) ? ret : i;
>
> using num instead of i would be a bit more obvious.
ok

>
>> +static int stm32f4_i2c_probe(struct platform_device *pdev)
>> +{
>> +[...]
>> +     i2c_dev->speed = STM32F4_I2C_SPEED_STANDARD;
>> +     ret = of_property_read_u32(np, "clock-frequency", &clk_rate);
>> +     if ((!ret) && (clk_rate == 400000))
>> +             i2c_dev->speed = STM32F4_I2C_SPEED_FAST;
>
> I'd use
>
>         if (!ret && clk_rate >= 400000)
>                 i2c_dev->speed = STM32F4_I2C_SPEED_FAST;
>
> . That's less parenthesis and a more robust selection of the bus
> frequency.
ok

>
>> +
>> +     i2c_dev->dev = &pdev->dev;
>> +
>> +     ret = devm_request_threaded_irq(&pdev->dev, i2c_dev->irq_event,
>> +                                     NULL, stm32f4_i2c_isr_event,
>> +                                     IRQF_ONESHOT, pdev->name, i2c_dev);
>> +     if (ret) {
>> +             dev_err(&pdev->dev, "Failed to request irq %i\n",
>> +                     i2c_dev->irq_error);
>
> That's wrong. Requesting irq_event failed.
Ok Thanks.

>
>> +             goto clk_free;
>> +     }
>> +
>> +     ret = devm_request_threaded_irq(&pdev->dev, i2c_dev->irq_error,
>> +                                     NULL, stm32f4_i2c_isr_error,
>> +                                     IRQF_ONESHOT, pdev->name, i2c_dev);
>> +     if (ret) {
>> +             dev_err(&pdev->dev, "Failed to request irq %i\n",
>> +                     i2c_dev->irq_error);
>> +             goto clk_free;
>
> It would also be nice to know for which type of irq this failed. I.e.
> please point out if this is the error irq or the event irq in the
> message. Ditto for checking the return type of irq_of_parse_and_map.
Ok I will fix it in the next version.

>
>> +     }
>> +
>> +     adap = &i2c_dev->adap;
>> +     i2c_set_adapdata(adap, i2c_dev);
>> +     snprintf(adap->name, sizeof(adap->name), "STM32 I2C(%pa)", &res->start);
>> +     adap->owner = THIS_MODULE;
>> +     adap->timeout = 2 * HZ;
>> +     adap->retries = 0;
>> +     adap->algo = &stm32f4_i2c_algo;
>> +     adap->dev.parent = &pdev->dev;
>> +     adap->dev.of_node = pdev->dev.of_node;
>> +
>> +     init_completion(&i2c_dev->complete);
>> +
>> +     ret = i2c_add_adapter(adap);
>> +     if (ret)
>> +             goto clk_free;
>> +
>> +     platform_set_drvdata(pdev, i2c_dev);
>> +
>> +     dev_info(i2c_dev->dev, "STM32F4 I2C driver initialized\n");
>
> This is wrong. The driver is bound now to a device, not initialized.
Ok I will replaced initialized by registered. Thanks.

>
>> +static const struct of_device_id stm32f4_i2c_match[] = {
>> +     { .compatible = "st,stm32f4-i2c", },
>> +     {},
>> +};
>> +MODULE_DEVICE_TABLE(of, stm32f4_i2c_match);
>> +
>> +static struct platform_driver stm32f4_i2c_driver = {
>> +     .driver = {
>> +             .name = "stm32f4-i2c",
>> +             .of_match_table = stm32f4_i2c_match,
>
> Is this needed?
Without of_match_table, I could not match an I2C device instance from
DT with this driver.
So maybe, there is a misunderstanding.
Could you please clarify your question ?

>
>> +     },
>> +     .probe = stm32f4_i2c_probe,
>> +     .remove = stm32f4_i2c_remove,
>> +};
>
> Best regards
> Uwe
>
> --
> Pengutronix e.K.                           | Uwe Kleine-K?nig            |
> Industrial Linux Solutions                 | http://www.pengutronix.de/  |

Thanks again
Best regards,

Cedric

^ permalink raw reply

* [PATCH v2] i2c: designware: add reset interface
From: Ramiro Oliveira @ 2016-12-15 15:30 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1481805227.9552.15.camel@linux.intel.com>

Hi Andy and Zhangfei

On 12/15/2016 12:33 PM, Andy Shevchenko wrote:
> On Thu, 2016-12-15 at 16:59 +0800, Zhangfei Gao wrote:
>> Some platforms like hi3660 need do reset first to allow accessing
>> registers
> 
> Patch itself looks good, but would be nice to have it tested.
> 
> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> 

I tested the patch and it's working for the ARC architecture.

>>
>> Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
>> ---
>>  drivers/i2c/busses/i2c-designware-core.h    |  1 +
>>  drivers/i2c/busses/i2c-designware-platdrv.c | 28
>> ++++++++++++++++++++++++----
>>  2 files changed, 25 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/i2c/busses/i2c-designware-core.h
>> b/drivers/i2c/busses/i2c-designware-core.h
>> index 0d44d2a..94b14fa 100644
>> --- a/drivers/i2c/busses/i2c-designware-core.h
>> +++ b/drivers/i2c/busses/i2c-designware-core.h
>> @@ -80,6 +80,7 @@ struct dw_i2c_dev {
>>  	void __iomem		*base;
>>  	struct completion	cmd_complete;
>>  	struct clk		*clk;
>> +	struct reset_control	*rst;
>>  	u32			(*get_clk_rate_khz) (struct
>> dw_i2c_dev *dev);
>>  	struct dw_pci_controller *controller;
>>  	int			cmd_err;
>> diff --git a/drivers/i2c/busses/i2c-designware-platdrv.c
>> b/drivers/i2c/busses/i2c-designware-platdrv.c
>> index 0b42a12..e9016ae 100644
>> --- a/drivers/i2c/busses/i2c-designware-platdrv.c
>> +++ b/drivers/i2c/busses/i2c-designware-platdrv.c
>> @@ -38,6 +38,7 @@
>>  #include <linux/pm_runtime.h>
>>  #include <linux/property.h>
>>  #include <linux/io.h>
>> +#include <linux/reset.h>
>>  #include <linux/slab.h>
>>  #include <linux/acpi.h>
>>  #include <linux/platform_data/i2c-designware.h>
>> @@ -176,6 +177,14 @@ static int dw_i2c_plat_probe(struct
>> platform_device *pdev)
>>  	dev->irq = irq;
>>  	platform_set_drvdata(pdev, dev);
>>  
>> +	dev->rst = devm_reset_control_get_optional(&pdev->dev, NULL);

devm_reset_control_get_optional() is deprecated as explained in linux/reset.h,
you should use devm_reset_control_get_optional_exclusive() or
devm_reset_control_get_optional_shared() instead, as applicable.

I submitted a similar patch earlier today and I made the same mistake.

>> +	if (IS_ERR(dev->rst)) {
>> +		if (PTR_ERR(dev->rst) == -EPROBE_DEFER)
>> +			return -EPROBE_DEFER;
>> +	} else {
>> +		reset_control_deassert(dev->rst);
>> +	}
>> +
>>  	/* fast mode by default because of legacy reasons */
>>  	dev->clk_freq = 400000;
>>  
>> @@ -207,12 +216,13 @@ static int dw_i2c_plat_probe(struct
>> platform_device *pdev)
>>  	    && dev->clk_freq != 1000000 && dev->clk_freq != 3400000)
>> {
>>  		dev_err(&pdev->dev,
>>  			"Only 100kHz, 400kHz, 1MHz and 3.4MHz
>> supported");
>> -		return -EINVAL;
>> +		r = -EINVAL;
>> +		goto exit_reset;
>>  	}
>>  
>>  	r = i2c_dw_eval_lock_support(dev);
>>  	if (r)
>> -		return r;
>> +		goto exit_reset;
>>  
>>  	dev->functionality =
>>  		I2C_FUNC_I2C |
>> @@ -270,10 +280,18 @@ static int dw_i2c_plat_probe(struct
>> platform_device *pdev)
>>  	}
>>  
>>  	r = i2c_dw_probe(dev);
>> -	if (r && !dev->pm_runtime_disabled)
>> -		pm_runtime_disable(&pdev->dev);
>> +	if (r)
>> +		goto exit_probe;
>>  
>>  	return r;
>> +
>> +exit_probe:
>> +	if (!dev->pm_runtime_disabled)
>> +		pm_runtime_disable(&pdev->dev);
>> +exit_reset:
>> +	if (!IS_ERR_OR_NULL(dev->rst))
>> +		reset_control_assert(dev->rst);
>> +	return r;
>>  }
>>  
>>  static int dw_i2c_plat_remove(struct platform_device *pdev)
>> @@ -290,6 +308,8 @@ static int dw_i2c_plat_remove(struct
>> platform_device *pdev)
>>  	pm_runtime_put_sync(&pdev->dev);
>>  	if (!dev->pm_runtime_disabled)
>>  		pm_runtime_disable(&pdev->dev);
>> +	if (!IS_ERR_OR_NULL(dev->rst))
>> +		reset_control_assert(dev->rst);
>>  
>>  	return 0;
>>  }
> 
Tested-by: Ramiro Oliveira <ramiro.oliveira@synopsys.com>

^ permalink raw reply

* [PATCH] dmaengine: xilinx_dma: Add support for multiple buffers
From: Jose Abreu @ 2016-12-15 15:31 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <C246CAC1457055469EF09E3A7AC4E11A4A65C899@XAP-PVEXMBX01.xlnx.xilinx.com>

Hi Kedar,


On 15-12-2016 15:19, Appana Durga Kedareswara Rao wrote:
> Hi Jose Abreu,
>
> 	Thanks for the patch...
>
>> Xilinx VDMA supports multiple framebuffers. This patch adds correct handling for
>> the scenario where multiple framebuffers are available in the HW and parking
>> mode is not set.
>>
>> We corrected these situations:
>> 	1) Do not start VDMA until all the framebuffers
>> 	have been programmed with a valid address.
>> 	2) Restart variables when VDMA halts/resets.
>> 	3) Halt channel when all the framebuffers have
>> 	finished and there is not anymore segments
>> 	pending.
>> 	4) Do not try to overlap framebuffers until they
>> 	are completed.
>>
>> All these situations, without this patch, can lead to data corruption and even
>> system memory corruption. If, for example, user has a VDMA with 3
>> framebuffers, with direction DMA_DEV_TO_MEM and user only submits one
>> segment, VDMA will write first to the segment the user submitted BUT if the
>> user doesn't submit another segment in a timelly manner then VDMA will write
>> to position 0 of system mem in the following VSYNC.
> 	I have posted different patch series for fixing these issues just now...
> Please take a look into it...
>
> Regards,
> Kedar.

Thanks, I will review them.

Best regards,
Jose Miguel Abreu

>
>> Signed-off-by: Jose Abreu <joabreu@synopsys.com>
>> Cc: Carlos Palminha <palminha@synopsys.com>
>> Cc: Vinod Koul <vinod.koul@intel.com>
>> Cc: Dan Williams <dan.j.williams@intel.com>
>> Cc: Kedareswara rao Appana <appana.durga.rao@xilinx.com>
>> Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
>> Cc: dmaengine at vger.kernel.org
>> Cc: linux-arm-kernel at lists.infradead.org
>> Cc: linux-kernel at vger.kernel.org
>> ---
>>  drivers/dma/xilinx/xilinx_dma.c | 80 ++++++++++++++++++++++++++++++++++-
>> ------
>>  1 file changed, 68 insertions(+), 12 deletions(-)
>>
>> diff --git a/drivers/dma/xilinx/xilinx_dma.c b/drivers/dma/xilinx/xilinx_dma.c
>> index 8288fe4..30eec5a 100644
>> --- a/drivers/dma/xilinx/xilinx_dma.c
>> +++ b/drivers/dma/xilinx/xilinx_dma.c
>> @@ -351,10 +351,12 @@ struct xilinx_dma_chan {
>>  	bool cyclic;
>>  	bool genlock;
>>  	bool err;
>> +	bool running;
>>  	struct tasklet_struct tasklet;
>>  	struct xilinx_vdma_config config;
>>  	bool flush_on_fsync;
>>  	u32 desc_pendingcount;
>> +	u32 seg_pendingcount;
>>  	bool ext_addr;
>>  	u32 desc_submitcount;
>>  	u32 residue;
>> @@ -946,6 +948,17 @@ static bool xilinx_dma_is_idle(struct xilinx_dma_chan
>> *chan)  }
>>
>>  /**
>> + * xilinx_vdma_is_multi_buffer - Check if VDMA has multiple
>> +framebuffers
>> + * @chan: Driver specific DMA channel
>> + *
>> + * Return: '1' if is multi buffer, '0' if not.
>> + */
>> +static bool xilinx_vdma_is_multi_buffer(struct xilinx_dma_chan *chan) {
>> +	return chan->num_frms > 1;
>> +}
>> +
>> +/**
>>   * xilinx_dma_halt - Halt DMA channel
>>   * @chan: Driver specific DMA channel
>>   */
>> @@ -966,6 +979,10 @@ static void xilinx_dma_halt(struct xilinx_dma_chan
>> *chan)
>>  			chan, dma_ctrl_read(chan,
>> XILINX_DMA_REG_DMASR));
>>  		chan->err = true;
>>  	}
>> +
>> +	chan->seg_pendingcount = 0;
>> +	chan->desc_submitcount = 0;
>> +	chan->running = false;
>>  }
>>
>>  /**
>> @@ -1002,14 +1019,35 @@ static void xilinx_vdma_start_transfer(struct
>> xilinx_dma_chan *chan)
>>  	struct xilinx_dma_tx_descriptor *desc, *tail_desc;
>>  	u32 reg;
>>  	struct xilinx_vdma_tx_segment *tail_segment;
>> +	bool mbf = xilinx_vdma_is_multi_buffer(chan) && !config->park;
>>
>>  	/* This function was invoked with lock held */
>>  	if (chan->err)
>>  		return;
>>
>> -	if (list_empty(&chan->pending_list))
>> +	/*
>> +	 * Can't continue if we have already consumed all the available
>> +	 * framebuffers and they are not done yet.
>> +	 */
>> +	if (mbf && (chan->seg_pendingcount >= chan->num_frms))
>>  		return;
>>
>> +	if (list_empty(&chan->pending_list)) {
>> +		/*
>> +		 * Can't keep running if there are no pending segments. Halt
>> +		 * the channel as security measure. Notice that this will not
>> +		 * corrupt current transactions because this function is
>> +		 * called after the pendingcount is decreased and after the
>> +		 * current transaction has finished.
>> +		 */
>> +		if (mbf && chan->running && !chan->seg_pendingcount) {
>> +			dev_dbg(chan->dev, "pending list empty: halting\n");
>> +			xilinx_dma_halt(chan);
>> +		}
>> +
>> +		return;
>> +	}
>> +
>>  	desc = list_first_entry(&chan->pending_list,
>>  				struct xilinx_dma_tx_descriptor, node);
>>  	tail_desc = list_last_entry(&chan->pending_list,
>> @@ -1079,6 +1117,8 @@ static void xilinx_vdma_start_transfer(struct
>> xilinx_dma_chan *chan)
>>  	if (chan->has_sg) {
>>  		dma_ctrl_write(chan, XILINX_DMA_REG_TAILDESC,
>>  				tail_segment->phys);
>> +		list_splice_tail_init(&chan->pending_list, &chan->active_list);
>> +		chan->desc_pendingcount = 0;
>>  	} else {
>>  		struct xilinx_vdma_tx_segment *segment, *last = NULL;
>>  		int i = 0;
>> @@ -1097,29 +1137,34 @@ static void xilinx_vdma_start_transfer(struct
>> xilinx_dma_chan *chan)
>>
>> 	XILINX_VDMA_REG_START_ADDRESS(i++),
>>  					segment->hw.buf_addr);
>>
>> +			chan->seg_pendingcount++;
>>  			last = segment;
>>  		}
>>
>>  		if (!last)
>>  			return;
>>
>> -		/* HW expects these parameters to be same for one
>> transaction */
>> -		vdma_desc_write(chan, XILINX_DMA_REG_HSIZE, last-
>>> hw.hsize);
>> -		vdma_desc_write(chan, XILINX_DMA_REG_FRMDLY_STRIDE,
>> -				last->hw.stride);
>> -		vdma_desc_write(chan, XILINX_DMA_REG_VSIZE, last-
>>> hw.vsize);
>> -	}
>> -
>> -	if (!chan->has_sg) {
>>  		list_del(&desc->node);
>>  		list_add_tail(&desc->node, &chan->active_list);
>>  		chan->desc_submitcount++;
>>  		chan->desc_pendingcount--;
>>  		if (chan->desc_submitcount == chan->num_frms)
>>  			chan->desc_submitcount = 0;
>> -	} else {
>> -		list_splice_tail_init(&chan->pending_list, &chan->active_list);
>> -		chan->desc_pendingcount = 0;
>> +
>> +		/*
>> +		 * Can't start until all the framebuffers have been programmed
>> +		 * or else corruption can occur.
>> +		 */
>> +		if (mbf && !chan->running &&
>> +		   (chan->seg_pendingcount < chan->num_frms))
>> +			return;
>> +
>> +		/* HW expects these parameters to be same for one
>> transaction */
>> +		vdma_desc_write(chan, XILINX_DMA_REG_HSIZE, last-
>>> hw.hsize);
>> +		vdma_desc_write(chan, XILINX_DMA_REG_FRMDLY_STRIDE,
>> +				last->hw.stride);
>> +		vdma_desc_write(chan, XILINX_DMA_REG_VSIZE, last-
>>> hw.vsize);
>> +		chan->running = true;
>>  	}
>>  }
>>
>> @@ -1327,12 +1372,20 @@ static void xilinx_dma_issue_pending(struct
>> dma_chan *dchan)  static void xilinx_dma_complete_descriptor(struct
>> xilinx_dma_chan *chan)  {
>>  	struct xilinx_dma_tx_descriptor *desc, *next;
>> +	struct xilinx_vdma_tx_segment *segment;
>>
>>  	/* This function was invoked with lock held */
>>  	if (list_empty(&chan->active_list))
>>  		return;
>>
>>  	list_for_each_entry_safe(desc, next, &chan->active_list, node) {
>> +		if (chan->xdev->dma_config->dmatype == XDMA_TYPE_VDMA)
>> {
>> +			list_for_each_entry(segment, &desc->segments, node)
>> {
>> +				if (chan->seg_pendingcount > 0)
>> +					chan->seg_pendingcount--;
>> +			}
>> +		}
>> +
>>  		list_del(&desc->node);
>>  		if (!desc->cyclic)
>>  			dma_cookie_complete(&desc->async_tx);
>> @@ -1366,6 +1419,9 @@ static int xilinx_dma_reset(struct xilinx_dma_chan
>> *chan)
>>  	}
>>
>>  	chan->err = false;
>> +	chan->seg_pendingcount = 0;
>> +	chan->desc_submitcount = 0;
>> +	chan->running = false;
>>
>>  	return err;
>>  }
>> --
>> 1.9.1
>>

^ permalink raw reply

* [PATCH resend] arm64: dts: r8a7796: salvator-x: Update memory node to 4 GiB map
From: Geert Uytterhoeven @ 2016-12-15 15:31 UTC (permalink / raw)
  To: linux-arm-kernel

From: Takeshi Kihara <takeshi.kihara.df@renesas.com>

This patch updates memory region:

  - After changes, the new map of the Salvator-X board on R8A7796 SoC
    Bank0: 2GiB RAM : 0x000048000000 -> 0x000bfffffff
    Bank1: 2GiB RAM : 0x000600000000 -> 0x0067fffffff

  - Before changes, the old map looked like this:
    Bank0: 2GiB RAM : 0x000048000000 -> 0x000bfffffff

Signed-off-by: Takeshi Kihara <takeshi.kihara.df@renesas.com>
[geert: Correct size of old map]
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
---
Simon, please apply!

U-Boot already adds the second memory region to the "reg" property of
the first memory node in DT, so this patch is actually a no-op.
IMHO it doesn't make much sense to keep on pretending we don't have
enabled memory outside the 32-bit address space.
---
 arch/arm64/boot/dts/renesas/r8a7796-salvator-x.dts | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/arm64/boot/dts/renesas/r8a7796-salvator-x.dts b/arch/arm64/boot/dts/renesas/r8a7796-salvator-x.dts
index f35e96ca7d6050b7..38bde9de3250ecbb 100644
--- a/arch/arm64/boot/dts/renesas/r8a7796-salvator-x.dts
+++ b/arch/arm64/boot/dts/renesas/r8a7796-salvator-x.dts
@@ -31,6 +31,11 @@
 		reg = <0x0 0x48000000 0x0 0x78000000>;
 	};
 
+	memory at 600000000 {
+		device_type = "memory";
+		reg = <0x6 0x00000000 0x0 0x80000000>;
+	};
+
 	reg_1p8v: regulator0 {
 		compatible = "regulator-fixed";
 		regulator-name = "fixed-1.8V";
-- 
1.9.1

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox