Linux-ARM-Kernel Archive on lore.kernel.org


* Summary of LPC guest MSI discussion in Santa Fe
From: Will Deacon @ 2016-11-09 22:25 UTC
  To: linux-arm-kernel
In-Reply-To: <20161109151709.74927f83@t450s.home>

On Wed, Nov 09, 2016 at 03:17:09PM -0700, Alex Williamson wrote:
> On Wed, 9 Nov 2016 20:31:45 +0000
> Will Deacon <will.deacon@arm.com> wrote:
> > On Wed, Nov 09, 2016 at 08:23:03PM +0100, Christoffer Dall wrote:
> > > 
> > > (I suppose it's technically possible to get around this issue by letting
> > > QEMU place RAM wherever it wants but tell the guest to never use a
> > > particular subset of its RAM for DMA, because that would conflict with
> > > the doorbell IOVA or be seen as p2p transactions.  But I think we all
> > > probably agree that it's a disgusting idea.)  
> > 
> > Disgusting, yes, but Ben's idea of hotplugging on the host controller with
> > firmware tables describing the reserved regions is something that we could
> > do in the distant future. In the meantime, I don't think that VFIO should
> > explicitly reject overlapping mappings if userspace asks for them.
> 
> I'm confused by the last sentence here: rejecting user mappings that
> overlap reserved ranges, such as MSI doorbell pages, is exactly how
> we'd reject hot-adding a device when we meet such a conflict.  If we
> don't reject such a mapping, we're knowingly creating a situation that
> potentially leads to data loss.  Minimally, QEMU would need to know
> about the reserved region, map around it through VFIO, and take
> responsibility (somehow) for making sure that region is never used for
> DMA.  Thanks,

Yes, but my point is that it should be up to QEMU to abort the hotplug, not
the host kernel, since there may be ways in which a guest can tolerate the
overlapping region (e.g. by avoiding that range of memory for DMA).

Will
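
A minimal sketch of the policy split argued for above, under stated
assumptions: every name here (struct iova_range, decide_hotplug, the
guest_can_avoid_range flag) is hypothetical and is not VFIO or QEMU API.
The overlap test only reports the conflict; the caller, playing the role
of QEMU, decides whether to abort the hotplug.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical half-open IOVA range [start, start + len). */
struct iova_range {
	uint64_t start;
	uint64_t len;
};

/* True if the two ranges share at least one address. */
static bool iova_ranges_overlap(struct iova_range a, struct iova_range b)
{
	uint64_t a_last, b_last;

	if (a.len == 0 || b.len == 0)
		return false;

	/* Compare last addresses so a range ending at 2^64 does not wrap. */
	a_last = a.start + (a.len - 1);
	b_last = b.start + (b.len - 1);
	return a.start <= b_last && b.start <= a_last;
}

enum hotplug_decision {
	HOTPLUG_ACCEPT,             /* no conflict with the reserved range */
	HOTPLUG_ACCEPT_AVOID_RANGE, /* conflict, guest promises not to DMA there */
	HOTPLUG_ABORT,              /* conflict the guest cannot tolerate */
};

/* Userspace policy: the mapping layer itself never rejects the overlap. */
static enum hotplug_decision decide_hotplug(struct iova_range guest_ram,
					    struct iova_range reserved,
					    bool guest_can_avoid_range)
{
	if (!iova_ranges_overlap(guest_ram, reserved))
		return HOTPLUG_ACCEPT;
	return guest_can_avoid_range ? HOTPLUG_ACCEPT_AVOID_RANGE
				     : HOTPLUG_ABORT;
}
```

Keeping the decision in one userspace function is the point of the
sketch: the same overlap can lead to an abort or to a tolerated
configuration, depending on what the guest is able to do.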


* ARM: AMx3xx/DRA7: crypto IP support data
From: Tony Lindgren @ 2016-11-09 22:42 UTC
  To: linux-arm-kernel
In-Reply-To: <1476777327-700-1-git-send-email-t-kristo@ti.com>

* Tero Kristo <t-kristo@ti.com> [161018 00:56]:
> Hi,
> 
> This series finalizes the crypto support for amx3xx / dra7 socs,
> adding the hwmod data and fixing one issue with l4sec clockdomain
> on dra7.

Applying all into omap-for-v4.10/soc thanks.

Tony


* [PATCH] ARM: dts: am335x-baltos-ir5221: use both musb channels in host mode
From: Tony Lindgren @ 2016-11-09 22:43 UTC
  To: linux-arm-kernel
In-Reply-To: <1476860943-30994-1-git-send-email-yegorslists@googlemail.com>

* yegorslists at googlemail.com <yegorslists@googlemail.com> [161019 00:10]:
> From: Yegor Yefremov <yegorslists@googlemail.com>
> 
> Signed-off-by: Yegor Yefremov <yegorslists@googlemail.com>
> ---
>  arch/arm/boot/dts/am335x-baltos-ir5221.dts | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/arm/boot/dts/am335x-baltos-ir5221.dts b/arch/arm/boot/dts/am335x-baltos-ir5221.dts
> index d0faa7b..f599350 100644
> --- a/arch/arm/boot/dts/am335x-baltos-ir5221.dts
> +++ b/arch/arm/boot/dts/am335x-baltos-ir5221.dts
> @@ -114,7 +114,7 @@
>  
>  &usb1 {
>  	status = "okay";
> -	dr_mode = "otg";
> +	dr_mode = "host";
>  };
>  
>  &cpsw_emac0 {

Applying into omap-for-v4.10/dt thanks.

Tony


* [PATCH] ARM: dts: am335x-baltos: don't reset gpio3 block
From: Tony Lindgren @ 2016-11-09 22:48 UTC
  To: linux-arm-kernel
In-Reply-To: <1478077696-21003-1-git-send-email-yegorslists@googlemail.com>

* yegorslists at googlemail.com <yegorslists@googlemail.com> [161102 02:08]:
> From: Yegor Yefremov <yegorslists@googlemail.com>
> 
> This change is needed in order to enable some hardware components
> from bootloader.
> 
> Signed-off-by: Yegor Yefremov <yegorslists@googlemail.com>
> ---
>  arch/arm/boot/dts/am335x-baltos.dtsi | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/arch/arm/boot/dts/am335x-baltos.dtsi b/arch/arm/boot/dts/am335x-baltos.dtsi
> index dd45d17..09b9541 100644
> --- a/arch/arm/boot/dts/am335x-baltos.dtsi
> +++ b/arch/arm/boot/dts/am335x-baltos.dtsi
> @@ -406,3 +406,7 @@
>  &gpio0 {
>  	ti,no-reset-on-init;
>  };
> +
> +&gpio3 {
> +	ti,no-reset-on-init;
> +};

Applying into omap-for-v4.10/dt thanks.

Tony


* [PATCHv2] PCI: QDF2432 32 bit config space accessors
From: Bjorn Helgaas @ 2016-11-09 22:49 UTC
  To: linux-arm-kernel
In-Reply-To: <CAKv+Gu82+5oHMeSvcOL1VG4vw-kVnhq4LjCRtLMw4Nmn0rK2CA@mail.gmail.com>

On Wed, Nov 09, 2016 at 08:29:23PM +0000, Ard Biesheuvel wrote:
> Hi Bjorn,
> 
> On 9 November 2016 at 20:06, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Wed, Nov 09, 2016 at 02:25:56PM -0500, Christopher Covington wrote:
> >> Hi Bjorn,
> >>
> [...]
> >>
> >> We're working to add the PNP0C02 resource to future firmware, but it's
> >> not in the current firmware. Are dmesg and /proc/iomem from the
> >> current firmware interesting or should we wait for the update to file?
> >
> > Note that the ECAM space is not the only thing that should be
> > described via these PNP0C02 devices.  *All* non-enumerable resources
> > should be described by the _CRS method of some ACPI device.  Here's a
> > sample from my laptop:
> >
> >   PCI: MMCONFIG for domain 0000 [bus 00-3f] at [mem 0xf8000000-0xfbffffff] (base 0xf8000000)
> >   system 00:01: [io  0x1800-0x189f] could not be reserved
> >   system 00:01: [io  0x0800-0x087f] has been reserved
> >   system 00:01: [io  0x0880-0x08ff] has been reserved
> >   system 00:01: [io  0x0900-0x097f] has been reserved
> >   system 00:01: [io  0x0980-0x09ff] has been reserved
> >   system 00:01: [io  0x0a00-0x0a7f] has been reserved
> >   system 00:01: [io  0x0a80-0x0aff] has been reserved
> >   system 00:01: [io  0x0b00-0x0b7f] has been reserved
> >   system 00:01: [io  0x0b80-0x0bff] has been reserved
> >   system 00:01: [io  0x15e0-0x15ef] has been reserved
> >   system 00:01: [io  0x1600-0x167f] has been reserved
> >   system 00:01: [io  0x1640-0x165f] has been reserved
> >   system 00:01: [mem 0xf8000000-0xfbffffff] could not be reserved
> >   system 00:01: [mem 0xfed10000-0xfed13fff] has been reserved
> >   system 00:01: [mem 0xfed18000-0xfed18fff] has been reserved
> >   system 00:01: [mem 0xfed19000-0xfed19fff] has been reserved
> >   system 00:01: [mem 0xfeb00000-0xfebfffff] has been reserved
> >   system 00:01: [mem 0xfed20000-0xfed3ffff] has been reserved
> >   system 00:01: [mem 0xfed90000-0xfed93fff] has been reserved
> >   system 00:01: [mem 0xf7fe0000-0xf7ffffff] has been reserved
> >   system 00:01: Plug and Play ACPI device, IDs PNP0c02 (active)
> >
> > Do you have firmware in the field that may not get updated?  If so,
> > I'd like to see the whole solution for that firmware, including the
> > MCFG quirk (which tells the PCI core where the ECAM region is) and
> > whatever PNP0C02 quirk you figure out to actually reserve the region.
> >
> > I proposed a PNP0C02 quirk to Duc along the lines of the below.  I
> > don't actually know if it's feasible, but it didn't look as bad as I
> > expected, so I'd kind of like somebody to try it out.  I think you
> > would have to call this via a DMI hook (do you have DMI on arm64?),
> > maybe from pnp_init() or similar.
> 
> We do have SMBIOS/DMI on arm64, but so far we have managed not to
> rely on it for quirks, and we'd very much like to keep it that way.
> 
> Since this ACPI _CRS method has nothing to do with SMBIOS/DMI, surely
> there is a better way to wire up the reservation code to the
> information exposed by ACPI?

I'm open to other ways, feel free to propose one :)

If you do a quirk, you need some way to identify the machine/firmware
combination, because you don't want to apply the quirk on every
machine.  You're trying to work around a firmware issue, so you
probably want something tied to the firmware version.  On x86, that's
typically done with DMI.

Bjorn
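
A minimal sketch of the identification step described above, assuming a
quirk table keyed on strings that firmware might report (for example via
SMBIOS/DMI). All names and table entries here are hypothetical
(struct fw_id, find_quirk, "ExampleVendor"); this is not the kernel's
DMI API, only an illustration of tying a quirk to a machine/firmware
combination so it does not apply everywhere.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical identification strings reported by firmware. */
struct fw_id {
	const char *vendor;
	const char *product;
	const char *fw_version;
};

/* One quirk entry; a NULL match field acts as a wildcard. */
struct fw_quirk {
	struct fw_id match;
	const char *name;
};

/* Illustrative table: the first quirk is tied to one firmware version. */
static const struct fw_quirk example_quirks[] = {
	{ { "ExampleVendor", "ExampleBoard", "1.0" }, "reserve-ecam" },
	{ { "ExampleVendor", "OtherBoard", NULL }, "other-workaround" },
};

static int field_matches(const char *want, const char *have)
{
	return want == NULL || (have != NULL && strcmp(want, have) == 0);
}

/* Return the first quirk whose fields all match, or NULL if none does. */
static const struct fw_quirk *find_quirk(const struct fw_quirk *table,
					 size_t n, const struct fw_id *id)
{
	size_t i;

	for (i = 0; i < n; i++) {
		const struct fw_id *m = &table[i].match;

		if (field_matches(m->vendor, id->vendor) &&
		    field_matches(m->product, id->product) &&
		    field_matches(m->fw_version, id->fw_version))
			return &table[i];
	}
	return NULL;
}
```

With a version string in the match, updated firmware that describes the
region properly simply stops matching, and the quirk is no longer
applied.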


* [PATCH] mfd: qcom-pm8xxx: Clean up PM8XXX namespace
From: Arnd Bergmann @ 2016-11-09 22:53 UTC
  To: linux-arm-kernel
In-Reply-To: <CACRpkdZe0VO=ABfbven8mU0FGb=sJNY_tjk2_GpAP30wsiDkvQ@mail.gmail.com>

On Wednesday, November 9, 2016 11:19:34 PM CET Linus Walleij wrote:
> On Wed, Nov 9, 2016 at 4:47 PM, Lee Jones <lee.jones@linaro.org> wrote:
> 
> > How many more Acks do we need?
> 
> Jacek and one of the ARM SoC people ideally...
> 
> Jacek? Arnd/Olof?
> 

Acked-by: Arnd Bergmann <arnd@arndb.de>


* [PATCH fpga 0/9] Zynq FPGA Manager Improvements
From: Jason Gunthorpe @ 2016-11-09 22:58 UTC
  To: linux-arm-kernel

This series is all the changes I made while reviewing the Zynq driver.

The first five patches are independent, straightforward changes.

The next four patches rework the FPGA manager core code and all drivers to
only use scatter lists for bitfile storage. This is an essential change
looking forward as the high order physical and vmalloc allocations currently
used are simply unreliable and unusable in many cases.

This does not fully fix the request_firmware path, as that needs changes to
the firmware core, but does all the low level work needed in this
subsystem. Zynq sees a significant improvement as it no longer needs large
amounts of physically contiguous DMA coherent memory.

Other users who can use the new sg interface have no limitations, and could
zero-copy DMA directly out of the page cache, for instance.

I do not have socfpga hardware, so those straightforward changes are untested,
but I have tested the Zynq driver extensively now.

Jason Gunthorpe (9):
  fpga zynq: Add missing \n to messages
  fpga zynq: Check the bitstream for validity
  fpga zynq: Fix incorrect ISR state on bootup
  fpga zynq: Check for errors after completing DMA
  fpga zynq: Remove priv->dev
  fpga: Add scatterlist based write ops to the driver ops
  fpga zynq: Use the scatterlist interface
  fpga socfpga: Use the scatterlist interface
  fpga: Remove support for non-sg drivers

 drivers/fpga/fpga-mgr.c       |  87 +++++++++++--
 drivers/fpga/socfpga.c        |  56 ++++++---
 drivers/fpga/zynq-fpga.c      | 278 ++++++++++++++++++++++++++++++------------
 include/linux/fpga/fpga-mgr.h |  14 ++-
 4 files changed, 327 insertions(+), 108 deletions(-)

-- 
2.1.4


* [PATCH fpga 1/9] fpga zynq: Add missing \n to messages
From: Jason Gunthorpe @ 2016-11-09 22:58 UTC
  To: linux-arm-kernel
In-Reply-To: <1478732303-13718-1-git-send-email-jgunthorpe@obsidianresearch.com>

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
---
 drivers/fpga/zynq-fpga.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/fpga/zynq-fpga.c b/drivers/fpga/zynq-fpga.c
index c2fb4120bd62..e72340ea7323 100644
--- a/drivers/fpga/zynq-fpga.c
+++ b/drivers/fpga/zynq-fpga.c
@@ -217,7 +217,7 @@ static int zynq_fpga_ops_write_init(struct fpga_manager *mgr, u32 flags,
 					     INIT_POLL_DELAY,
 					     INIT_POLL_TIMEOUT);
 		if (err) {
-			dev_err(priv->dev, "Timeout waiting for PCFG_INIT");
+			dev_err(priv->dev, "Timeout waiting for PCFG_INIT\n");
 			goto out_err;
 		}
 
@@ -231,7 +231,7 @@ static int zynq_fpga_ops_write_init(struct fpga_manager *mgr, u32 flags,
 					     INIT_POLL_DELAY,
 					     INIT_POLL_TIMEOUT);
 		if (err) {
-			dev_err(priv->dev, "Timeout waiting for !PCFG_INIT");
+			dev_err(priv->dev, "Timeout waiting for !PCFG_INIT\n");
 			goto out_err;
 		}
 
@@ -245,7 +245,7 @@ static int zynq_fpga_ops_write_init(struct fpga_manager *mgr, u32 flags,
 					     INIT_POLL_DELAY,
 					     INIT_POLL_TIMEOUT);
 		if (err) {
-			dev_err(priv->dev, "Timeout waiting for PCFG_INIT");
+			dev_err(priv->dev, "Timeout waiting for PCFG_INIT\n");
 			goto out_err;
 		}
 	}
@@ -262,7 +262,7 @@ static int zynq_fpga_ops_write_init(struct fpga_manager *mgr, u32 flags,
 	/* check that we have room in the command queue */
 	status = zynq_fpga_read(priv, STATUS_OFFSET);
 	if (status & STATUS_DMA_Q_F) {
-		dev_err(priv->dev, "DMA command queue full");
+		dev_err(priv->dev, "DMA command queue full\n");
 		err = -EBUSY;
 		goto out_err;
 	}
@@ -331,7 +331,7 @@ static int zynq_fpga_ops_write(struct fpga_manager *mgr,
 	zynq_fpga_write(priv, INT_STS_OFFSET, intr_status);
 
 	if (!((intr_status & IXR_D_P_DONE_MASK) == IXR_D_P_DONE_MASK)) {
-		dev_err(priv->dev, "Error configuring FPGA");
+		dev_err(priv->dev, "Error configuring FPGA\n");
 		err = -EFAULT;
 	}
 
@@ -426,7 +426,7 @@ static int zynq_fpga_probe(struct platform_device *pdev)
 	priv->slcr = syscon_regmap_lookup_by_phandle(dev->of_node,
 		"syscon");
 	if (IS_ERR(priv->slcr)) {
-		dev_err(dev, "unable to get zynq-slcr regmap");
+		dev_err(dev, "unable to get zynq-slcr regmap\n");
 		return PTR_ERR(priv->slcr);
 	}
 
@@ -434,26 +434,26 @@ static int zynq_fpga_probe(struct platform_device *pdev)
 
 	priv->irq = platform_get_irq(pdev, 0);
 	if (priv->irq < 0) {
-		dev_err(dev, "No IRQ available");
+		dev_err(dev, "No IRQ available\n");
 		return priv->irq;
 	}
 
 	err = devm_request_irq(dev, priv->irq, zynq_fpga_isr, 0,
 			       dev_name(dev), priv);
 	if (err) {
-		dev_err(dev, "unable to request IRQ");
+		dev_err(dev, "unable to request IRQ\n");
 		return err;
 	}
 
 	priv->clk = devm_clk_get(dev, "ref_clk");
 	if (IS_ERR(priv->clk)) {
-		dev_err(dev, "input clock not found");
+		dev_err(dev, "input clock not found\n");
 		return PTR_ERR(priv->clk);
 	}
 
 	err = clk_prepare_enable(priv->clk);
 	if (err) {
-		dev_err(dev, "unable to enable clock");
+		dev_err(dev, "unable to enable clock\n");
 		return err;
 	}
 
@@ -465,7 +465,7 @@ static int zynq_fpga_probe(struct platform_device *pdev)
 	err = fpga_mgr_register(dev, "Xilinx Zynq FPGA Manager",
 				&zynq_fpga_ops, priv);
 	if (err) {
-		dev_err(dev, "unable to register FPGA manager");
+		dev_err(dev, "unable to register FPGA manager\n");
 		clk_unprepare(priv->clk);
 		return err;
 	}
-- 
2.1.4


* [PATCH fpga 2/9] fpga zynq: Check the bitstream for validity
From: Jason Gunthorpe @ 2016-11-09 22:58 UTC
  To: linux-arm-kernel
In-Reply-To: <1478732303-13718-1-git-send-email-jgunthorpe@obsidianresearch.com>

There is no sense in sending a bitstream we know will not work, and
with the variety of options for bitstream generation in Xilinx tools
it is not terribly clear or very well documented what the correct
input should be, especially since auto-detection was removed from this
driver.

All Zynq full configuration bitstreams must start with the sync word in
the correct byte order.

Zynq is also only able to DMA dword quantities, so bitstreams must be
a multiple of 4 bytes. This also fixes a DMA-past-the-end bug.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
---
 drivers/fpga/zynq-fpga.c | 40 +++++++++++++++++++++++++++++++---------
 1 file changed, 31 insertions(+), 9 deletions(-)

diff --git a/drivers/fpga/zynq-fpga.c b/drivers/fpga/zynq-fpga.c
index e72340ea7323..86f4377e2b52 100644
--- a/drivers/fpga/zynq-fpga.c
+++ b/drivers/fpga/zynq-fpga.c
@@ -175,6 +175,19 @@ static irqreturn_t zynq_fpga_isr(int irq, void *data)
 	return IRQ_HANDLED;
 }
 
+/* Sanity check the proposed bitstream. It must start with the sync word in
+ * the correct byte order. The input is a Xilinx .bin file with every 32 bit
+ * quantity swapped.
+ */
+static bool zynq_fpga_has_sync(const char *buf, size_t count)
+{
+	for (; count > 4; buf += 4, count -= 4)
+		if (buf[0] == 0x66 && buf[1] == 0x55 && buf[2] == 0x99 &&
+		    buf[3] == 0xaa)
+			return true;
+	return false;
+}
+
 static int zynq_fpga_ops_write_init(struct fpga_manager *mgr, u32 flags,
 				    const char *buf, size_t count)
 {
@@ -184,12 +197,28 @@ static int zynq_fpga_ops_write_init(struct fpga_manager *mgr, u32 flags,
 
 	priv = mgr->priv;
 
+	/* The hardware can only DMA multiples of 4 bytes, and we need at
+	 * least the sync word and something else to do anything.
+	 */
+	if (count <= 4 || (count % 4) != 0) {
+		dev_err(priv->dev,
+			"Invalid bitstream size, must be multiples of 4 bytes\n");
+		return -EINVAL;
+	}
+
 	err = clk_enable(priv->clk);
 	if (err)
 		return err;
 
 	/* don't globally reset PL if we're doing partial reconfig */
 	if (!(flags & FPGA_MGR_PARTIAL_RECONFIG)) {
+		if (!zynq_fpga_has_sync(buf, count)) {
+			dev_err(priv->dev,
+				"Invalid bitstream, could not find a sync word. Bitstream must be a byte swaped .bin file\n");
+			err = -EINVAL;
+			goto out_err;
+		}
+
 		/* assert AXI interface resets */
 		regmap_write(priv->slcr, SLCR_FPGA_RST_CTRL_OFFSET,
 			     FPGA_RST_ALL_MASK);
@@ -287,12 +316,9 @@ static int zynq_fpga_ops_write(struct fpga_manager *mgr,
 	struct zynq_fpga_priv *priv;
 	int err;
 	char *kbuf;
-	size_t in_count;
 	dma_addr_t dma_addr;
-	u32 transfer_length;
 	u32 intr_status;
 
-	in_count = count;
 	priv = mgr->priv;
 
 	kbuf = dma_alloc_coherent(priv->dev, count, &dma_addr, GFP_KERNEL);
@@ -318,11 +344,7 @@ static int zynq_fpga_ops_write(struct fpga_manager *mgr,
 	 */
 	zynq_fpga_write(priv, DMA_SRC_ADDR_OFFSET, (u32)(dma_addr) + 1);
 	zynq_fpga_write(priv, DMA_DST_ADDR_OFFSET, (u32)DMA_INVALID_ADDRESS);
-
-	/* convert #bytes to #words */
-	transfer_length = (count + 3) / 4;
-
-	zynq_fpga_write(priv, DMA_SRC_LEN_OFFSET, transfer_length);
+	zynq_fpga_write(priv, DMA_SRC_LEN_OFFSET, count / 4);
 	zynq_fpga_write(priv, DMA_DEST_LEN_OFFSET, 0);
 
 	wait_for_completion(&priv->dma_done);
@@ -338,7 +360,7 @@ static int zynq_fpga_ops_write(struct fpga_manager *mgr,
 	clk_disable(priv->clk);
 
 out_free:
-	dma_free_coherent(priv->dev, in_count, kbuf, dma_addr);
+	dma_free_coherent(priv->dev, count, kbuf, dma_addr);
 
 	return err;
 }
-- 
2.1.4
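
The two checks this patch describes can be sketched in userspace as
follows. This is an illustration, not the driver code: the helper names
are hypothetical, the buffer is typed as uint8_t so the comparisons with
0x99 and 0xaa do not depend on the signedness of char, and the loop
bound is >= 4 so the final word is also examined. The last helper shows
the arithmetic behind the DMA-past-the-end bug: rounding the word count
up transfers more bytes than the buffer holds whenever the size is not
a multiple of 4.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Scan 32-bit words for the sync word in byte-swapped .bin order. */
static bool has_sync_word(const uint8_t *buf, size_t count)
{
	for (; count >= 4; buf += 4, count -= 4)
		if (buf[0] == 0x66 && buf[1] == 0x55 &&
		    buf[2] == 0x99 && buf[3] == 0xaa)
			return true;
	return false;
}

/* The hardware DMAs whole 32-bit words; require sync word plus payload. */
static bool bitstream_size_ok(size_t count)
{
	return count > 4 && (count % 4) == 0;
}

/* Bytes the DMA engine would read if the word count were rounded up. */
static size_t rounded_up_dma_bytes(size_t count)
{
	return ((count + 3) / 4) * 4;
}
```

For a 10-byte buffer the rounded-up length is 12 bytes, two past the
end, which is why rejecting such sizes up front allows the transfer
length to be a plain count / 4.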


* [PATCH fpga 3/9] fpga zynq: Fix incorrect ISR state on bootup
From: Jason Gunthorpe @ 2016-11-09 22:58 UTC
  To: linux-arm-kernel
In-Reply-To: <1478732303-13718-1-git-send-email-jgunthorpe@obsidianresearch.com>

It is best practice to clear and mask all interrupts before
associating the IRQ, and this should be done after the clock
is enabled.

This corrects a bad result from zynq_fpga_ops_state on bootup
where left over latched values in INT_STS_OFFSET caused it to
report an unconfigured FPGA as configured.

After this change the boot up operating state for an unconfigured
FPGA reports 'unknown'.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
---
 drivers/fpga/zynq-fpga.c | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/drivers/fpga/zynq-fpga.c b/drivers/fpga/zynq-fpga.c
index 86f4377e2b52..40cf0feaca7c 100644
--- a/drivers/fpga/zynq-fpga.c
+++ b/drivers/fpga/zynq-fpga.c
@@ -460,13 +460,6 @@ static int zynq_fpga_probe(struct platform_device *pdev)
 		return priv->irq;
 	}
 
-	err = devm_request_irq(dev, priv->irq, zynq_fpga_isr, 0,
-			       dev_name(dev), priv);
-	if (err) {
-		dev_err(dev, "unable to request IRQ\n");
-		return err;
-	}
-
 	priv->clk = devm_clk_get(dev, "ref_clk");
 	if (IS_ERR(priv->clk)) {
 		dev_err(dev, "input clock not found\n");
@@ -482,6 +475,16 @@ static int zynq_fpga_probe(struct platform_device *pdev)
 	/* unlock the device */
 	zynq_fpga_write(priv, UNLOCK_OFFSET, UNLOCK_MASK);
 
+	zynq_fpga_write(priv, INT_MASK_OFFSET, 0xFFFFFFFF);
+	zynq_fpga_write(priv, INT_STS_OFFSET, IXR_ALL_MASK);
+	err = devm_request_irq(dev, priv->irq, zynq_fpga_isr, 0, dev_name(dev),
+			       priv);
+	if (err) {
+		dev_err(dev, "unable to request IRQ\n");
+		clk_unprepare(priv->clk);
+		return err;
+	}
+
 	clk_disable(priv->clk);
 
 	err = fpga_mgr_register(dev, "Xilinx Zynq FPGA Manager",
-- 
2.1.4
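
A small simulation of the problem this patch describes, under two
assumptions that the patch text does not spell out: that the interrupt
status register is write-1-to-clear, and that the state callback
decides "configured" from the latched IXR_PCFG_DONE_MASK bit. The
struct and helper names are hypothetical; only the two mask values are
taken from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

#define IXR_PCFG_DONE_MASK	(1u << 2)	/* "FPGA programmed" bit */
#define IXR_ALL_MASK		0xF8F7F87Fu

/* Hypothetical stand-in for the two interrupt registers. */
struct fake_devcfg {
	uint32_t int_sts;	/* latched status bits */
	uint32_t int_mask;	/* 1 = interrupt masked */
};

/* Assumed write-1-to-clear behaviour of the status register. */
static void write_int_sts(struct fake_devcfg *d, uint32_t val)
{
	d->int_sts &= ~val;
}

static void write_int_mask(struct fake_devcfg *d, uint32_t val)
{
	d->int_mask = val;
}

/* Assumed shape of the state check: trust the latched done bit. */
static bool reports_configured(const struct fake_devcfg *d)
{
	return (d->int_sts & IXR_PCFG_DONE_MASK) != 0;
}

/* Mask everything, then clear stale status, before the IRQ is requested. */
static void quiesce_interrupts(struct fake_devcfg *d)
{
	write_int_mask(d, 0xFFFFFFFFu);
	write_int_sts(d, IXR_ALL_MASK);
}
```

Starting from a register that still holds a stale done bit, the state
check reports "configured" until quiesce_interrupts() runs, after which
nothing is latched and every source is masked.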


* [PATCH fpga 4/9] fpga zynq: Check for errors after completing DMA
From: Jason Gunthorpe @ 2016-11-09 22:58 UTC
  To: linux-arm-kernel
In-Reply-To: <1478732303-13718-1-git-send-email-jgunthorpe@obsidianresearch.com>

The completion did not check the interrupt status to see if any error
bits were asserted. Check the error bits and dump some registers if
things went wrong.

A few fixes are needed to make this work. IXR_ERROR_FLAGS_MASK was
wrong because it included the done bits, and that exposed a bug in
mask/unmask_irqs, which were using the wrong bits. Simplify all of this.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
---
 drivers/fpga/zynq-fpga.c | 55 +++++++++++++++++++++++++++---------------------
 1 file changed, 31 insertions(+), 24 deletions(-)

diff --git a/drivers/fpga/zynq-fpga.c b/drivers/fpga/zynq-fpga.c
index 40cf0feaca7c..3ffc5fcc3072 100644
--- a/drivers/fpga/zynq-fpga.c
+++ b/drivers/fpga/zynq-fpga.c
@@ -89,7 +89,7 @@
 #define IXR_D_P_DONE_MASK		BIT(12)
  /* FPGA programmed */
 #define IXR_PCFG_DONE_MASK		BIT(2)
-#define IXR_ERROR_FLAGS_MASK		0x00F0F860
+#define IXR_ERROR_FLAGS_MASK		0x00F0C860
 #define IXR_ALL_MASK			0xF8F7F87F
 
 /* Miscellaneous constant values */
@@ -144,23 +144,10 @@ static inline u32 zynq_fpga_read(const struct zynq_fpga_priv *priv,
 	readl_poll_timeout(priv->io_base + addr, val, cond, sleep_us, \
 			   timeout_us)
 
-static void zynq_fpga_mask_irqs(struct zynq_fpga_priv *priv)
+static inline void zynq_fpga_set_irq_mask(struct zynq_fpga_priv *priv,
+					  u32 enable)
 {
-	u32 intr_mask;
-
-	intr_mask = zynq_fpga_read(priv, INT_MASK_OFFSET);
-	zynq_fpga_write(priv, INT_MASK_OFFSET,
-			intr_mask | IXR_DMA_DONE_MASK | IXR_ERROR_FLAGS_MASK);
-}
-
-static void zynq_fpga_unmask_irqs(struct zynq_fpga_priv *priv)
-{
-	u32 intr_mask;
-
-	intr_mask = zynq_fpga_read(priv, INT_MASK_OFFSET);
-	zynq_fpga_write(priv, INT_MASK_OFFSET,
-			intr_mask
-			& ~(IXR_D_P_DONE_MASK | IXR_ERROR_FLAGS_MASK));
+	zynq_fpga_write(priv, INT_MASK_OFFSET, ~enable);
 }
 
 static irqreturn_t zynq_fpga_isr(int irq, void *data)
@@ -168,7 +155,7 @@ static irqreturn_t zynq_fpga_isr(int irq, void *data)
 	struct zynq_fpga_priv *priv = data;
 
 	/* disable DMA and error IRQs */
-	zynq_fpga_mask_irqs(priv);
+	zynq_fpga_set_irq_mask(priv, 0);
 
 	complete(&priv->dma_done);
 
@@ -314,6 +301,7 @@ static int zynq_fpga_ops_write(struct fpga_manager *mgr,
 			       const char *buf, size_t count)
 {
 	struct zynq_fpga_priv *priv;
+	const char *why;
 	int err;
 	char *kbuf;
 	dma_addr_t dma_addr;
@@ -337,7 +325,7 @@ static int zynq_fpga_ops_write(struct fpga_manager *mgr,
 	reinit_completion(&priv->dma_done);
 
 	/* enable DMA and error IRQs */
-	zynq_fpga_unmask_irqs(priv);
+	zynq_fpga_set_irq_mask(priv, IXR_D_P_DONE_MASK | IXR_ERROR_FLAGS_MASK);
 
 	/* the +1 in the src addr is used to hold off on DMA_DONE IRQ
 	 * until both AXI and PCAP are done ...
@@ -352,16 +340,35 @@ static int zynq_fpga_ops_write(struct fpga_manager *mgr,
 	intr_status = zynq_fpga_read(priv, INT_STS_OFFSET);
 	zynq_fpga_write(priv, INT_STS_OFFSET, intr_status);
 
+	if (intr_status & IXR_ERROR_FLAGS_MASK) {
+		why = "DMA reported error";
+		err = -EIO;
+		goto out_report;
+	}
+
 	if (!((intr_status & IXR_D_P_DONE_MASK) == IXR_D_P_DONE_MASK)) {
-		dev_err(priv->dev, "Error configuring FPGA\n");
-		err = -EFAULT;
+		why = "DMA did not complete";
+		err = -EIO;
+		goto out_report;
 	}
 
+	err = 0;
+	goto out_clk;
+
+out_report:
+	dev_err(priv->dev,
+		"%s: INT_STS:0x%x CTRL:0x%x LOCK:0x%x INT_MASK:0x%x STATUS:0x%x MCTRL:0x%x\n",
+		why,
+		intr_status,
+		zynq_fpga_read(priv, CTRL_OFFSET),
+		zynq_fpga_read(priv, LOCK_OFFSET),
+		zynq_fpga_read(priv, INT_MASK_OFFSET),
+		zynq_fpga_read(priv, STATUS_OFFSET),
+		zynq_fpga_read(priv, MCTRL_OFFSET));
+out_clk:
 	clk_disable(priv->clk);
-
 out_free:
 	dma_free_coherent(priv->dev, count, kbuf, dma_addr);
-
 	return err;
 }
 
@@ -475,7 +482,7 @@ static int zynq_fpga_probe(struct platform_device *pdev)
 	/* unlock the device */
 	zynq_fpga_write(priv, UNLOCK_OFFSET, UNLOCK_MASK);
 
-	zynq_fpga_write(priv, INT_MASK_OFFSET, 0xFFFFFFFF);
+	zynq_fpga_set_irq_mask(priv, 0);
 	zynq_fpga_write(priv, INT_STS_OFFSET, IXR_ALL_MASK);
 	err = devm_request_irq(dev, priv->irq, zynq_fpga_isr, 0, dev_name(dev),
 			       priv);
-- 
2.1.4
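
The mask change and the new completion check can be sketched in
userspace as below. The three mask values come from the patch; the enum
and function names are hypothetical. Comparing the old and new error
masks shows which bits were dropped (bits 12 and 13, which include
IXR_D_P_DONE_MASK), and the classifier mirrors the order of the new
checks: error bits first, then the done bit.

```c
#include <stdint.h>

#define IXR_D_P_DONE_MASK		(1u << 12)
#define IXR_ERROR_FLAGS_MASK_OLD	0x00F0F860u
#define IXR_ERROR_FLAGS_MASK		0x00F0C860u

enum dma_result {
	DMA_OK,
	DMA_ERROR,	/* "DMA reported error" */
	DMA_INCOMPLETE,	/* "DMA did not complete" */
};

/* Classify a latched interrupt status the way the patched code does. */
static enum dma_result classify_dma_status(uint32_t intr_status)
{
	if (intr_status & IXR_ERROR_FLAGS_MASK)
		return DMA_ERROR;
	if (!(intr_status & IXR_D_P_DONE_MASK))
		return DMA_INCOMPLETE;
	return DMA_OK;
}

/* The mask register holds 1 for masked sources, hence the inversion. */
static uint32_t irq_mask_register_value(uint32_t enable)
{
	return ~enable;
}
```

With the old mask, a status containing only the done bit would have
intersected the error mask, so a successful transfer could not be told
apart from a failed one.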


* [PATCH fpga 5/9] fpga zynq: Remove priv->dev
From: Jason Gunthorpe @ 2016-11-09 22:58 UTC
  To: linux-arm-kernel
In-Reply-To: <1478732303-13718-1-git-send-email-jgunthorpe@obsidianresearch.com>

socfpga uses mgr->dev for debug prints. For consistency, standardize
on that here. The only other use of priv->dev was for DMA, which can
use mgr->dev.parent instead.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
---
 drivers/fpga/zynq-fpga.c | 22 ++++++++++------------
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/drivers/fpga/zynq-fpga.c b/drivers/fpga/zynq-fpga.c
index 3ffc5fcc3072..ac2deae92dbd 100644
--- a/drivers/fpga/zynq-fpga.c
+++ b/drivers/fpga/zynq-fpga.c
@@ -118,7 +118,6 @@
 #define FPGA_RST_NONE_MASK		0x0
 
 struct zynq_fpga_priv {
-	struct device *dev;
 	int irq;
 	struct clk *clk;
 
@@ -188,7 +187,7 @@ static int zynq_fpga_ops_write_init(struct fpga_manager *mgr, u32 flags,
 	 * least the sync word and something else to do anything.
 	 */
 	if (count <= 4 || (count % 4) != 0) {
-		dev_err(priv->dev,
+		dev_err(&mgr->dev,
 			"Invalid bitstream size, must be multiples of 4 bytes\n");
 		return -EINVAL;
 	}
@@ -200,7 +199,7 @@ static int zynq_fpga_ops_write_init(struct fpga_manager *mgr, u32 flags,
 	/* don't globally reset PL if we're doing partial reconfig */
 	if (!(flags & FPGA_MGR_PARTIAL_RECONFIG)) {
 		if (!zynq_fpga_has_sync(buf, count)) {
-			dev_err(priv->dev,
+			dev_err(&mgr->dev,
 				"Invalid bitstream, could not find a sync word. Bitstream must be a byte swaped .bin file\n");
 			err = -EINVAL;
 			goto out_err;
@@ -233,7 +232,7 @@ static int zynq_fpga_ops_write_init(struct fpga_manager *mgr, u32 flags,
 					     INIT_POLL_DELAY,
 					     INIT_POLL_TIMEOUT);
 		if (err) {
-			dev_err(priv->dev, "Timeout waiting for PCFG_INIT\n");
+			dev_err(&mgr->dev, "Timeout waiting for PCFG_INIT\n");
 			goto out_err;
 		}
 
@@ -247,7 +246,7 @@ static int zynq_fpga_ops_write_init(struct fpga_manager *mgr, u32 flags,
 					     INIT_POLL_DELAY,
 					     INIT_POLL_TIMEOUT);
 		if (err) {
-			dev_err(priv->dev, "Timeout waiting for !PCFG_INIT\n");
+			dev_err(&mgr->dev, "Timeout waiting for !PCFG_INIT\n");
 			goto out_err;
 		}
 
@@ -261,7 +260,7 @@ static int zynq_fpga_ops_write_init(struct fpga_manager *mgr, u32 flags,
 					     INIT_POLL_DELAY,
 					     INIT_POLL_TIMEOUT);
 		if (err) {
-			dev_err(priv->dev, "Timeout waiting for PCFG_INIT\n");
+			dev_err(&mgr->dev, "Timeout waiting for PCFG_INIT\n");
 			goto out_err;
 		}
 	}
@@ -278,7 +277,7 @@ static int zynq_fpga_ops_write_init(struct fpga_manager *mgr, u32 flags,
 	/* check that we have room in the command queue */
 	status = zynq_fpga_read(priv, STATUS_OFFSET);
 	if (status & STATUS_DMA_Q_F) {
-		dev_err(priv->dev, "DMA command queue full\n");
+		dev_err(&mgr->dev, "DMA command queue full\n");
 		err = -EBUSY;
 		goto out_err;
 	}
@@ -309,7 +308,8 @@ static int zynq_fpga_ops_write(struct fpga_manager *mgr,
 
 	priv = mgr->priv;
 
-	kbuf = dma_alloc_coherent(priv->dev, count, &dma_addr, GFP_KERNEL);
+	kbuf =
+	    dma_alloc_coherent(mgr->dev.parent, count, &dma_addr, GFP_KERNEL);
 	if (!kbuf)
 		return -ENOMEM;
 
@@ -356,7 +356,7 @@ static int zynq_fpga_ops_write(struct fpga_manager *mgr,
 	goto out_clk;
 
 out_report:
-	dev_err(priv->dev,
+	dev_err(&mgr->dev,
 		"%s: INT_STS:0x%x CTRL:0x%x LOCK:0x%x INT_MASK:0x%x STATUS:0x%x MCTRL:0x%x\n",
 		why,
 		intr_status,
@@ -368,7 +368,7 @@ out_report:
 out_clk:
 	clk_disable(priv->clk);
 out_free:
-	dma_free_coherent(priv->dev, count, kbuf, dma_addr);
+	dma_free_coherent(mgr->dev.parent, count, kbuf, dma_addr);
 	return err;
 }
 
@@ -445,8 +445,6 @@ static int zynq_fpga_probe(struct platform_device *pdev)
 	if (!priv)
 		return -ENOMEM;
 
-	priv->dev = dev;
-
 	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
 	priv->io_base = devm_ioremap_resource(dev, res);
 	if (IS_ERR(priv->io_base))
-- 
2.1.4


* [PATCH fpga 6/9] fpga: Add scatterlist based write ops to the driver ops
From: Jason Gunthorpe @ 2016-11-09 22:58 UTC
  To: linux-arm-kernel
In-Reply-To: <1478732303-13718-1-git-send-email-jgunthorpe@obsidianresearch.com>

Requiring contiguous kernel memory is not a good idea: it is a limited
resource, and allocation can fail under normal workloads.

As a first step, allow drivers to provide a _sg write interface and
internally convert the existing contiguous mappings into a scatterlist.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
---
 drivers/fpga/fpga-mgr.c       | 127 +++++++++++++++++++++++++++++++++++++++---
 include/linux/fpga/fpga-mgr.h |   9 ++-
 2 files changed, 128 insertions(+), 8 deletions(-)
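
The address arithmetic used when converting a linear buffer into a page
list can be sketched in userspace as below. This is only an
illustration: the page size is fixed at 4096 for the example
(SKETCH_PAGE_SIZE is a hypothetical stand-in for PAGE_SIZE) and the
helper names are not kernel API. The page count is the rounded-up end
page minus the start page, and both the base of the first page and the
offset into it are derived from the buffer address itself.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* stand-in for PAGE_SIZE */

/* Number of pages touched by the byte range [addr, addr + count). */
static size_t pages_spanned(uintptr_t addr, size_t count)
{
	uintptr_t end = addr + count;

	return (end + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE -
	       addr / SKETCH_PAGE_SIZE;
}

/* Offset of the buffer within its first page. */
static size_t offset_in_first_page(uintptr_t addr)
{
	return addr % SKETCH_PAGE_SIZE;
}

/* Page-aligned address from which the page walk starts. */
static uintptr_t first_page_base(uintptr_t addr)
{
	return addr - offset_in_first_page(addr);
}
```

A 32-byte buffer starting 16 bytes before a page boundary spans two
pages, which is the case the rounding has to get right.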

diff --git a/drivers/fpga/fpga-mgr.c b/drivers/fpga/fpga-mgr.c
index 953dc9195937..c2491ffeabd3 100644
--- a/drivers/fpga/fpga-mgr.c
+++ b/drivers/fpga/fpga-mgr.c
@@ -25,26 +25,76 @@
 #include <linux/of.h>
 #include <linux/mutex.h>
 #include <linux/slab.h>
+#include <linux/scatterlist.h>
+#include <linux/highmem.h>
 
 static DEFINE_IDA(fpga_mgr_ida);
 static struct class *fpga_mgr_class;
 
 /**
- * fpga_mgr_buf_load - load fpga from image in buffer
+ * fpga_mgr_buf_load_sg - load fpga from image in buffer from a scatter list
  * @mgr:	fpga manager
  * @flags:	flags setting fpga confuration modes
- * @buf:	buffer contain fpga image
- * @count:	byte count of buf
+ * @sgt:	scatterlist table
  *
  * Step the low level fpga manager through the device-specific steps of getting
  * an FPGA ready to be configured, writing the image to it, then doing whatever
  * post-configuration steps necessary.  This code assumes the caller got the
  * mgr pointer from of_fpga_mgr_get() and checked that it is not an error code.
  *
+ * This is the preferred entry point for FPGA programming, it does not require
+ * any contiguous kernel memory.
+ *
  * Return: 0 on success, negative error code otherwise.
  */
-int fpga_mgr_buf_load(struct fpga_manager *mgr, u32 flags, const char *buf,
-		      size_t count)
+static int fpga_mgr_buf_load_sg(struct fpga_manager *mgr, u32 flags,
+				struct sg_table *sgt)
+{
+	struct device *dev = &mgr->dev;
+	int ret;
+
+	/*
+	 * Call the low level driver's write_init function.  This will do the
+	 * device-specific things to get the FPGA into the state where it is
+	 * ready to receive an FPGA image.
+	 */
+	mgr->state = FPGA_MGR_STATE_WRITE_INIT;
+	ret = mgr->mops->write_init_sg(mgr, flags, sgt);
+	if (ret) {
+		dev_err(dev, "Error preparing FPGA for writing\n");
+		mgr->state = FPGA_MGR_STATE_WRITE_INIT_ERR;
+		return ret;
+	}
+
+	/*
+	 * Write the FPGA image to the FPGA.
+	 */
+	mgr->state = FPGA_MGR_STATE_WRITE;
+	ret = mgr->mops->write_sg(mgr, sgt);
+	if (ret) {
+		dev_err(dev, "Error while writing image data to FPGA\n");
+		mgr->state = FPGA_MGR_STATE_WRITE_ERR;
+		return ret;
+	}
+
+	/*
+	 * After all the FPGA image has been written, do the device specific
+	 * steps to finish and set the FPGA into operating mode.
+	 */
+	mgr->state = FPGA_MGR_STATE_WRITE_COMPLETE;
+	ret = mgr->mops->write_complete(mgr, flags);
+	if (ret) {
+		dev_err(dev, "Error after writing image data to FPGA\n");
+		mgr->state = FPGA_MGR_STATE_WRITE_COMPLETE_ERR;
+		return ret;
+	}
+	mgr->state = FPGA_MGR_STATE_OPERATING;
+
+	return 0;
+}
+
+static int fpga_mgr_buf_load_mapped(struct fpga_manager *mgr, u32 flags,
+				    const char *buf, size_t count)
 {
 	struct device *dev = &mgr->dev;
 	int ret;
@@ -88,6 +138,68 @@ int fpga_mgr_buf_load(struct fpga_manager *mgr, u32 flags, const char *buf,
 
 	return 0;
 }
+
+/**
+ * fpga_mgr_buf_load - load fpga from image in buffer
+ * @mgr:	fpga manager
+ * @flags:	flags setting fpga configuration modes
+ * @buf:	buffer containing fpga image
+ * @count:	byte count of buf
+ *
+ * Step the low level fpga manager through the device-specific steps of getting
+ * an FPGA ready to be configured, writing the image to it, then doing whatever
+ * post-configuration steps necessary.  This code assumes the caller got the
+ * mgr pointer from of_fpga_mgr_get() and checked that it is not an error code.
+ *
+ * Return: 0 on success, negative error code otherwise.
+ */
+int fpga_mgr_buf_load(struct fpga_manager *mgr, u32 flags, const char *buf,
+		      size_t count)
+{
+	struct page **pages;
+	struct sg_table sgt;
+	const void *p;
+	int nr_pages;
+	int index;
+	int rc;
+
+	if (!mgr->mops->write_init_sg || !mgr->mops->write_sg)
+		return fpga_mgr_buf_load_mapped(mgr, flags, buf, count);
+
+	/*
+	 * Convert the linear kernel pointer into a sg_table of pages for use
+	 * by the driver.
+	 */
+	nr_pages = DIV_ROUND_UP((unsigned long)buf + count, PAGE_SIZE) -
+		   (unsigned long)buf / PAGE_SIZE;
+	pages = kmalloc_array(nr_pages, sizeof(struct page *), GFP_KERNEL);
+	if (!pages)
+		return -ENOMEM;
+
+	p = buf - offset_in_page(buf);
+	for (index = 0; index < nr_pages; index++) {
+		if (is_vmalloc_addr(p))
+			pages[index] = vmalloc_to_page(p);
+		else
+			pages[index] = kmap_to_page((void *)p);
+		p += PAGE_SIZE;
+	}
+
+	/*
+	 * The temporary pages list is used to code share the merging algorithm
+	 * in sg_alloc_table_from_pages
+	 */
+	rc = sg_alloc_table_from_pages(&sgt, pages, index, offset_in_page(buf),
+				       count, GFP_KERNEL);
+	kfree(pages);
+	if (rc)
+		return rc;
+
+	rc = fpga_mgr_buf_load_sg(mgr, flags, &sgt);
+	sg_free_table(&sgt);
+
+	return rc;
+}
 EXPORT_SYMBOL_GPL(fpga_mgr_buf_load);
 
 /**
@@ -256,8 +368,9 @@ int fpga_mgr_register(struct device *dev, const char *name,
 	struct fpga_manager *mgr;
 	int id, ret;
 
-	if (!mops || !mops->write_init || !mops->write ||
-	    !mops->write_complete || !mops->state) {
+	if (!mops || !mops->write_complete || !mops->state ||
+	    ((!mops->write_init || !mops->write) &&
+	     (!mops->write_init_sg || !mops->write_sg))) {
 		dev_err(dev, "Attempt to register without fpga_manager_ops\n");
 		return -EINVAL;
 	}
diff --git a/include/linux/fpga/fpga-mgr.h b/include/linux/fpga/fpga-mgr.h
index 0940bf45e2f2..371b30ea60eb 100644
--- a/include/linux/fpga/fpga-mgr.h
+++ b/include/linux/fpga/fpga-mgr.h
@@ -22,6 +22,7 @@
 #define _LINUX_FPGA_MGR_H
 
 struct fpga_manager;
+struct sg_table;
 
 /**
  * enum fpga_mgr_states - fpga framework states
@@ -71,8 +72,11 @@ enum fpga_mgr_states {
 /**
  * struct fpga_manager_ops - ops for low level fpga manager drivers
  * @state: returns an enum value of the FPGA's state
- * @write_init: prepare the FPGA to receive confuration data
+ * @write_init: prepare the FPGA to receive confuration data (linear memory)
  * @write: write count bytes of configuration data to the FPGA
+ * @write_init_sg: prepare the FPGA to receive confuration data (scatter list
+ *                 table)
+ * @write_sg: write count bytes of configuration data to the FPGA
  * @write_complete: set FPGA to operating state after writing is done
  * @fpga_remove: optional: Set FPGA into a specific state during driver remove
  *
@@ -85,6 +89,9 @@ struct fpga_manager_ops {
 	int (*write_init)(struct fpga_manager *mgr, u32 flags,
 			  const char *buf, size_t count);
 	int (*write)(struct fpga_manager *mgr, const char *buf, size_t count);
+	int (*write_init_sg)(struct fpga_manager *mgr, u32 flags,
+			     struct sg_table *sgt);
+	int (*write_sg)(struct fpga_manager *mgr, struct sg_table *sgt);
 	int (*write_complete)(struct fpga_manager *mgr, u32 flags);
 	void (*fpga_remove)(struct fpga_manager *mgr);
 };
-- 
2.1.4

^ permalink raw reply related

* [PATCH fpga 7/9] fpga zynq: Use the scatterlist interface
From: Jason Gunthorpe @ 2016-11-09 22:58 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1478732303-13718-1-git-send-email-jgunthorpe@obsidianresearch.com>

This allows the driver to avoid a high order coherent DMA allocation
and memory copy. With this patch it can DMA directly from the kernel
pages that the bitfile is stored in.

Since this is now a gather DMA operation the driver uses the ISR
to feed the chip's DMA queue with each entry from the SGL.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
---
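For reviewers, a minimal userspace sketch of the per-chunk rules the
gather path relies on. This is not driver code: struct chunk,
chunk_ok(), src_word() and queue_step() are hypothetical names, the
sketch checks the bus address where the driver checks sg->offset, and
the number of free queue slots is passed in rather than read from the
status register.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one DMA-mapped scatterlist entry. */
struct chunk {
	uint32_t addr;	/* bus address of the chunk */
	uint32_t len;	/* length in bytes */
};

#define LAST_TRANSFER 1u	/* mirrors DMA_SRC_LAST_TRANSFER */

/* A chunk is usable if it starts on a 64-bit boundary and holds whole words. */
static bool chunk_ok(const struct chunk *c)
{
	return (c->addr % 8) == 0 && (c->len % 4) == 0;
}

/*
 * Source address word for element idx out of nelms: only the final
 * element carries the low "last transfer" bit.
 */
static uint32_t src_word(const struct chunk *c, size_t idx, size_t nelms)
{
	uint32_t addr = c->addr;

	if (idx + 1 == nelms)
		addr |= LAST_TRANSFER;
	return addr;
}

/*
 * Queue pending elements while slots are free; returns the index of the
 * next element that still has to be queued (nelms when all are queued).
 */
static size_t queue_step(size_t next, size_t nelms, size_t free_slots)
{
	while (next < nelms && free_slots > 0) {
		next++;
		free_slots--;
	}
	return next;
}
```

Because the address is 8-byte aligned, bit 0 is free to carry the
last-transfer flag. Each call to queue_step() models one pass of
zynq_step_dma(), first from the write path and later from the ISR.
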
 drivers/fpga/zynq-fpga.c | 194 +++++++++++++++++++++++++++++++++++------------
 1 file changed, 146 insertions(+), 48 deletions(-)

diff --git a/drivers/fpga/zynq-fpga.c b/drivers/fpga/zynq-fpga.c
index ac2deae92dbd..559b4f2ab9f6 100644
--- a/drivers/fpga/zynq-fpga.c
+++ b/drivers/fpga/zynq-fpga.c
@@ -30,6 +30,7 @@
 #include <linux/pm.h>
 #include <linux/regmap.h>
 #include <linux/string.h>
+#include <linux/scatterlist.h>
 
 /* Offsets into SLCR regmap */
 
@@ -80,6 +81,7 @@
 
 /* FPGA init status */
 #define STATUS_DMA_Q_F			BIT(31)
+#define STATUS_DMA_Q_E			BIT(30)
 #define STATUS_PCFG_INIT_MASK		BIT(4)
 
 /* Interrupt Status/Mask Register Bit definitions */
@@ -98,12 +100,14 @@
 #define DMA_INVALID_ADDRESS		GENMASK(31, 0)
 /* Used to unlock the dev */
 #define UNLOCK_MASK			0x757bdf0d
-/* Timeout for DMA to complete */
-#define DMA_DONE_TIMEOUT		msecs_to_jiffies(1000)
 /* Timeout for polling reset bits */
 #define INIT_POLL_TIMEOUT		2500000
 /* Delay for polling reset bits */
 #define INIT_POLL_DELAY			20
+/* Signal this is the last DMA transfer, wait for the AXI and PCAP before
+ * interrupting
+ */
+#define DMA_SRC_LAST_TRANSFER		1
 
 /* Masks for controlling stuff in SLCR */
 /* Disable all Level shifters */
@@ -124,6 +128,11 @@ struct zynq_fpga_priv {
 	void __iomem *io_base;
 	struct regmap *slcr;
 
+	spinlock_t dma_lock;
+	unsigned int dma_elm;
+	unsigned int dma_nelms;
+	struct scatterlist *cur_sg;
+
 	struct completion dma_done;
 };
 
@@ -149,15 +158,81 @@ static inline void zynq_fpga_set_irq_mask(struct zynq_fpga_priv *priv,
 	zynq_fpga_write(priv, INT_MASK_OFFSET, ~enable);
 }
 
+/* Must be called with dma_lock held */
+static void zynq_step_dma(struct zynq_fpga_priv *priv)
+{
+	u32 addr;
+	u32 len;
+	bool first;
+
+	first = priv->dma_elm == 0;
+	while (priv->cur_sg) {
+		/* Feed the DMA queue until it is full. */
+		if (zynq_fpga_read(priv, STATUS_OFFSET) & STATUS_DMA_Q_F)
+			break;
+
+		addr = sg_dma_address(priv->cur_sg);
+		len = sg_dma_len(priv->cur_sg);
+		if (priv->dma_elm + 1 == priv->dma_nelms) {
+			/* The last transfer waits for the PCAP to finish too,
+			 * notice this also changes the irq_mask to ignore
+			 * IXR_DMA_DONE_MASK which ensures we do not trigger
+			 * the completion too early.
+			 */
+			addr |= DMA_SRC_LAST_TRANSFER;
+			priv->cur_sg = NULL;
+		} else {
+			priv->cur_sg = sg_next(priv->cur_sg);
+			priv->dma_elm++;
+		}
+
+		zynq_fpga_write(priv, DMA_SRC_ADDR_OFFSET, addr);
+		zynq_fpga_write(priv, DMA_DST_ADDR_OFFSET, DMA_INVALID_ADDRESS);
+		zynq_fpga_write(priv, DMA_SRC_LEN_OFFSET, len / 4);
+		zynq_fpga_write(priv, DMA_DEST_LEN_OFFSET, 0);
+	}
+
+	/* Once the first transfer is queued we can turn on the ISR, future
+	 * calls to zynq_step_dma will happen from the ISR context. The
+	 * dma_lock spinlock guarantees this handover is done coherently, the
+	 * ISR enable is put at the end to avoid another CPU spinning in the
+	 * ISR on this lock.
+	 */
+	if (first && priv->cur_sg) {
+		zynq_fpga_set_irq_mask(priv, IXR_DMA_DONE_MASK |
+						 IXR_ERROR_FLAGS_MASK);
+	} else if (!priv->cur_sg) {
+		/* The last transfer changes to DMA & PCAP mode since we do
+		 * not want to continue until everything has been flushed into
+		 * the PCAP.
+		 */
+		zynq_fpga_set_irq_mask(priv, IXR_D_P_DONE_MASK |
+						 IXR_ERROR_FLAGS_MASK);
+	}
+}
+
 static irqreturn_t zynq_fpga_isr(int irq, void *data)
 {
 	struct zynq_fpga_priv *priv = data;
+	u32 intr_status;
 
-	/* disable DMA and error IRQs */
-	zynq_fpga_set_irq_mask(priv, 0);
+	/* If anything other than DMA completion is reported stop and hand
+	 * control back to zynq_fpga_ops_write, something went wrong,
+	 * otherwise progress the DMA.
+	 */
+	spin_lock(&priv->dma_lock);
+	intr_status = zynq_fpga_read(priv, INT_STS_OFFSET);
+	if ((intr_status & IXR_ERROR_FLAGS_MASK) == 0 &&
+	    (intr_status & IXR_DMA_DONE_MASK) && priv->cur_sg) {
+		zynq_fpga_write(priv, INT_STS_OFFSET, IXR_DMA_DONE_MASK);
+		zynq_step_dma(priv);
+		spin_unlock(&priv->dma_lock);
+		return IRQ_HANDLED;
+	}
+	spin_unlock(&priv->dma_lock);
 
+	zynq_fpga_set_irq_mask(priv, 0);
 	complete(&priv->dma_done);
-
 	return IRQ_HANDLED;
 }
 
@@ -165,31 +240,47 @@ static irqreturn_t zynq_fpga_isr(int irq, void *data)
  * the correct byte order. The input is a Xilinx .bin file with every 32 bit
  * quantity swapped.
  */
-static bool zynq_fpga_has_sync(const char *buf, size_t count)
+static bool zynq_fpga_has_sync(struct sg_table *sgt)
 {
-	for (; count > 4; buf += 4, count -= 4)
-		if (buf[0] == 0x66 && buf[1] == 0x55 && buf[2] == 0x99 &&
-		    buf[3] == 0xaa)
-			return true;
+	struct sg_mapping_iter miter;
+	const u8 *buf, *end;
+
+	sg_miter_start(&miter, sgt->sgl, sgt->nents, SG_MITER_FROM_SG);
+
+	while (sg_miter_next(&miter)) {
+		end = miter.addr + miter.length;
+		for (buf = miter.addr; buf < end; buf += 4) {
+			if (buf[0] == 0x66 && buf[1] == 0x55 &&
+			    buf[2] == 0x99 && buf[3] == 0xaa) {
+				sg_miter_stop(&miter);
+				return true;
+			}
+		}
+	}
+
+	sg_miter_stop(&miter);
 	return false;
 }
 
 static int zynq_fpga_ops_write_init(struct fpga_manager *mgr, u32 flags,
-				    const char *buf, size_t count)
+				    struct sg_table *sgt)
 {
 	struct zynq_fpga_priv *priv;
+	struct scatterlist *sg;
 	u32 ctrl, status;
-	int err;
+	int err, i;
 
 	priv = mgr->priv;
 
-	/* The hardware can only DMA multiples of 4 bytes, and we need at
-	 * least the sync word and something else to do anything.
+	/* The hardware can only DMA multiples of 4 bytes, and it requires the
+	 * starting address to be aligned to 64 bits (UG585 pg 212).
 	 */
-	if (count <= 4 || (count % 4) != 0) {
-		dev_err(&mgr->dev,
-			"Invalid bitstream size, must be multiples of 4 bytes\n");
-		return -EINVAL;
+	for_each_sg(sgt->sgl, sg, sgt->nents, i) {
+		if ((sg->offset % 8) != 0 || (sg->length % 4) != 0) {
+			dev_err(&mgr->dev,
+			    "Invalid bitstream size, chunks must be aligned\n");
+			return -EINVAL;
+		}
 	}
 
 	err = clk_enable(priv->clk);
@@ -198,7 +289,7 @@ static int zynq_fpga_ops_write_init(struct fpga_manager *mgr, u32 flags,
 
 	/* don't globally reset PL if we're doing partial reconfig */
 	if (!(flags & FPGA_MGR_PARTIAL_RECONFIG)) {
-		if (!zynq_fpga_has_sync(buf, count)) {
+		if (!zynq_fpga_has_sync(sgt)) {
 			dev_err(&mgr->dev,
 				"Invalid bitstream, could not find a sync word. Bitstream must be a byte swaped .bin file\n");
 			err = -EINVAL;
@@ -274,10 +365,11 @@ static int zynq_fpga_ops_write_init(struct fpga_manager *mgr, u32 flags,
 	zynq_fpga_write(priv, CTRL_OFFSET,
 			(CTRL_PCAP_PR_MASK | CTRL_PCAP_MODE_MASK | ctrl));
 
-	/* check that we have room in the command queue */
+	/* We expect that the command queue is empty right now. */
 	status = zynq_fpga_read(priv, STATUS_OFFSET);
-	if (status & STATUS_DMA_Q_F) {
-		dev_err(&mgr->dev, "DMA command queue full\n");
+	if ((status & STATUS_DMA_Q_F) ||
+	    (status & STATUS_DMA_Q_E) != STATUS_DMA_Q_E) {
+		dev_err(&mgr->dev, "DMA command queue not right\n");
 		err = -EBUSY;
 		goto out_err;
 	}
@@ -296,49 +388,50 @@ out_err:
 	return err;
 }
 
-static int zynq_fpga_ops_write(struct fpga_manager *mgr,
-			       const char *buf, size_t count)
+static int zynq_fpga_ops_write(struct fpga_manager *mgr, struct sg_table *sgt)
 {
 	struct zynq_fpga_priv *priv;
 	const char *why;
 	int err;
-	char *kbuf;
-	dma_addr_t dma_addr;
 	u32 intr_status;
+	unsigned long timeout;
+	unsigned long flags;
 
 	priv = mgr->priv;
 
-	kbuf =
-	    dma_alloc_coherent(mgr->dev.parent, count, &dma_addr, GFP_KERNEL);
-	if (!kbuf)
+	priv->dma_nelms =
+	    dma_map_sg(mgr->dev.parent, sgt->sgl, sgt->nents, DMA_TO_DEVICE);
+	if (priv->dma_nelms == 0)
 		return -ENOMEM;
 
-	memcpy(kbuf, buf, count);
-
 	/* enable clock */
 	err = clk_enable(priv->clk);
 	if (err)
 		goto out_free;
 
 	zynq_fpga_write(priv, INT_STS_OFFSET, IXR_ALL_MASK);
-
 	reinit_completion(&priv->dma_done);
 
-	/* enable DMA and error IRQs */
-	zynq_fpga_set_irq_mask(priv, IXR_D_P_DONE_MASK | IXR_ERROR_FLAGS_MASK);
+	/* zynq_step_dma will turn on interrupts */
+	spin_lock_irqsave(&priv->dma_lock, flags);
+	priv->dma_elm = 0;
+	priv->cur_sg = sgt->sgl;
+	zynq_step_dma(priv);
+	spin_unlock_irqrestore(&priv->dma_lock, flags);
 
-	/* the +1 in the src addr is used to hold off on DMA_DONE IRQ
-	 * until both AXI and PCAP are done ...
-	 */
-	zynq_fpga_write(priv, DMA_SRC_ADDR_OFFSET, (u32)(dma_addr) + 1);
-	zynq_fpga_write(priv, DMA_DST_ADDR_OFFSET, (u32)DMA_INVALID_ADDRESS);
-	zynq_fpga_write(priv, DMA_SRC_LEN_OFFSET, count / 4);
-	zynq_fpga_write(priv, DMA_DEST_LEN_OFFSET, 0);
+	timeout = wait_for_completion_timeout(&priv->dma_done,
+					      msecs_to_jiffies(5 * 1000));
 
-	wait_for_completion(&priv->dma_done);
+	zynq_fpga_set_irq_mask(priv, 0);
 
 	intr_status = zynq_fpga_read(priv, INT_STS_OFFSET);
-	zynq_fpga_write(priv, INT_STS_OFFSET, intr_status);
+	zynq_fpga_write(priv, INT_STS_OFFSET, IXR_ALL_MASK);
+
+	/* There doesn't seem to be a way to force cancel any DMA, so if
+	 * something went wrong we are relying on the hardware to have halted
+	 * the DMA before we get here, if there was we could use
+	 * wait_for_completion_interruptible too.
+	 */
 
 	if (intr_status & IXR_ERROR_FLAGS_MASK) {
 		why = "DMA reported error";
@@ -346,8 +439,12 @@ static int zynq_fpga_ops_write(struct fpga_manager *mgr,
 		goto out_report;
 	}
 
-	if (!((intr_status & IXR_D_P_DONE_MASK) == IXR_D_P_DONE_MASK)) {
-		why = "DMA did not complete";
+	if (priv->cur_sg ||
+	    !((intr_status & IXR_D_P_DONE_MASK) == IXR_D_P_DONE_MASK)) {
+		if (timeout == 0)
+			why = "DMA timed out";
+		else
+			why = "DMA did not complete";
 		err = -EIO;
 		goto out_report;
 	}
@@ -368,7 +465,7 @@ out_report:
 out_clk:
 	clk_disable(priv->clk);
 out_free:
-	dma_free_coherent(mgr->dev.parent, count, kbuf, dma_addr);
+	dma_unmap_sg(mgr->dev.parent, sgt->sgl, sgt->nents, DMA_TO_DEVICE);
 	return err;
 }
 
@@ -429,8 +526,8 @@ static enum fpga_mgr_states zynq_fpga_ops_state(struct fpga_manager *mgr)
 
 static const struct fpga_manager_ops zynq_fpga_ops = {
 	.state = zynq_fpga_ops_state,
-	.write_init = zynq_fpga_ops_write_init,
-	.write = zynq_fpga_ops_write,
+	.write_init_sg = zynq_fpga_ops_write_init,
+	.write_sg = zynq_fpga_ops_write,
 	.write_complete = zynq_fpga_ops_write_complete,
 };
 
@@ -444,6 +541,7 @@ static int zynq_fpga_probe(struct platform_device *pdev)
 	priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
 	if (!priv)
 		return -ENOMEM;
+	spin_lock_init(&priv->dma_lock);
 
 	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
 	priv->io_base = devm_ioremap_resource(dev, res);
-- 
2.1.4

^ permalink raw reply related

* [PATCH fpga 8/9] fpga socfpga: Use the scatterlist interface
From: Jason Gunthorpe @ 2016-11-09 22:58 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1478732303-13718-1-git-send-email-jgunthorpe@obsidianresearch.com>

socfpga just uses the CPU to memory copy the bitstream, so there is
no reason it needs contiguous kernel memory. Switch to use the sg
interface.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
---
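A minimal userspace sketch of the per-chunk write loop, for
illustration only. struct sink and sink_writel() are hypothetical
stand-ins for the FPGA data register and socfpga_fpga_data_writel().
Unlike the driver, which reads a full word and masks it, the sketch
copies only the trailing bytes into a zeroed word; on a little-endian
CPU that should give the same value without reading past the chunk.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_WORDS 16

/* Hypothetical stand-in for the data register: records each word written. */
struct sink {
	uint32_t words[MAX_WORDS];
	size_t n;
};

static void sink_writel(struct sink *s, uint32_t v)
{
	if (s->n < MAX_WORDS)
		s->words[s->n++] = v;
}

/* Write one chunk: whole 32-bit words first, then one partial word. */
static void write_chunk(struct sink *s, const unsigned char *buf, size_t count)
{
	uint32_t w;

	while (count >= sizeof(w)) {
		memcpy(&w, buf, sizeof(w));
		sink_writel(s, w);
		buf += sizeof(w);
		count -= sizeof(w);
	}

	/* 1 to 3 trailing bytes: pad the rest of the word with zeroes. */
	if (count) {
		w = 0;
		memcpy(&w, buf, count);
		sink_writel(s, w);
	}
}
```

Calling write_chunk() once per mapped chunk mirrors the sg_miter loop
in socfpga_fpga_ops_configure_write().
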
 drivers/fpga/socfpga.c | 56 +++++++++++++++++++++++++++++++++-----------------
 1 file changed, 37 insertions(+), 19 deletions(-)

diff --git a/drivers/fpga/socfpga.c b/drivers/fpga/socfpga.c
index 27d2ff28132c..f3f390b2eecf 100644
--- a/drivers/fpga/socfpga.c
+++ b/drivers/fpga/socfpga.c
@@ -24,6 +24,7 @@
 #include <linux/of_address.h>
 #include <linux/of_irq.h>
 #include <linux/pm.h>
+#include <linux/scatterlist.h>
 
 /* Register offsets */
 #define SOCFPGA_FPGMGR_STAT_OFST				0x0
@@ -408,10 +409,22 @@ static int socfpga_fpga_reset(struct fpga_manager *mgr)
  * Prepare the FPGA to receive the configuration data.
  */
 static int socfpga_fpga_ops_configure_init(struct fpga_manager *mgr, u32 flags,
-					   const char *buf, size_t count)
+					   struct sg_table *sgt)
 {
 	struct socfpga_fpga_priv *priv = mgr->priv;
-	int ret;
+	struct scatterlist *sg;
+	int ret, i;
+
+	/* We use the CPU to read the bitstream 32 bits at a time, and thus
+	 * require alignment.
+	 */
+	for_each_sg(sgt->sgl, sg, sgt->nents, i) {
+		if ((sg->offset % 4) != 0) {
+			dev_err(&mgr->dev,
+				"Invalid bitstream, chunks must be aligned\n");
+			return -EINVAL;
+		}
+	}
 
 	if (flags & FPGA_MGR_PARTIAL_RECONFIG) {
 		dev_err(&mgr->dev, "Partial reconfiguration not supported.\n");
@@ -440,40 +453,45 @@ static int socfpga_fpga_ops_configure_init(struct fpga_manager *mgr, u32 flags,
 /*
  * Step 9: write data to the FPGA data register
  */
-static int socfpga_fpga_ops_configure_write(struct fpga_manager *mgr,
-					    const char *buf, size_t count)
+static void socfpga_write_buf(struct socfpga_fpga_priv *priv, const u32 *buf,
+			      size_t count)
 {
-	struct socfpga_fpga_priv *priv = mgr->priv;
-	u32 *buffer_32 = (u32 *)buf;
 	size_t i = 0;
 
-	if (count <= 0)
-		return -EINVAL;
-
 	/* Write out the complete 32-bit chunks. */
 	while (count >= sizeof(u32)) {
-		socfpga_fpga_data_writel(priv, buffer_32[i++]);
+		socfpga_fpga_data_writel(priv, buf[i++]);
 		count -= sizeof(u32);
 	}
 
 	/* Write out remaining non 32-bit chunks. */
 	switch (count) {
 	case 3:
-		socfpga_fpga_data_writel(priv, buffer_32[i++] & 0x00ffffff);
+		socfpga_fpga_data_writel(priv, buf[i++] & 0x00ffffff);
 		break;
 	case 2:
-		socfpga_fpga_data_writel(priv, buffer_32[i++] & 0x0000ffff);
+		socfpga_fpga_data_writel(priv, buf[i++] & 0x0000ffff);
 		break;
 	case 1:
-		socfpga_fpga_data_writel(priv, buffer_32[i++] & 0x000000ff);
-		break;
-	case 0:
+		socfpga_fpga_data_writel(priv, buf[i++] & 0x000000ff);
 		break;
 	default:
-		/* This will never happen. */
-		return -EFAULT;
+		break;
 	}
+}
+
+static int socfpga_fpga_ops_configure_write(struct fpga_manager *mgr,
+					    struct sg_table *sgt)
+{
+	struct socfpga_fpga_priv *priv = mgr->priv;
+	struct sg_mapping_iter miter;
+
+	sg_miter_start(&miter, sgt->sgl, sgt->nents, SG_MITER_FROM_SG);
+
+	while (sg_miter_next(&miter))
+		socfpga_write_buf(priv, miter.addr, miter.length);
 
+	sg_miter_stop(&miter);
 	return 0;
 }
 
@@ -545,8 +563,8 @@ static enum fpga_mgr_states socfpga_fpga_ops_state(struct fpga_manager *mgr)
 
 static const struct fpga_manager_ops socfpga_fpga_ops = {
 	.state = socfpga_fpga_ops_state,
-	.write_init = socfpga_fpga_ops_configure_init,
-	.write = socfpga_fpga_ops_configure_write,
+	.write_init_sg = socfpga_fpga_ops_configure_init,
+	.write_sg = socfpga_fpga_ops_configure_write,
 	.write_complete = socfpga_fpga_ops_configure_complete,
 };
 
-- 
2.1.4

^ permalink raw reply related

* [PATCH fpga 9/9] fpga: Remove support for non-sg drivers
From: Jason Gunthorpe @ 2016-11-09 22:58 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1478732303-13718-1-git-send-email-jgunthorpe@obsidianresearch.com>

All drivers now use the sg interface so there is no reason to keep
the contiguous interface any more.

Now that all drivers support this interface we can also export it.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
---
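With only the sg path left, fpga_mgr_buf_load() always turns the
linear buffer into a page list first. A minimal userspace sketch of
that page-count arithmetic follows. SKETCH_PAGE_SIZE,
pages_spanned() and offset_in_first_page() are hypothetical names, and
a 4 KiB page is an assumption of the sketch, not of the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/* Number of pages touched by the byte range [addr, addr + count). */
static size_t pages_spanned(uintptr_t addr, size_t count)
{
	uintptr_t first = addr / SKETCH_PAGE_SIZE;
	uintptr_t end = (addr + count + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;

	return (size_t)(end - first);
}

/* Offset of the buffer start within its first page. */
static size_t offset_in_first_page(uintptr_t addr)
{
	return (size_t)(addr % SKETCH_PAGE_SIZE);
}
```

For example, a 32-byte buffer starting 16 bytes before a page boundary
spans two pages, which is why the page count cannot be derived from
count alone.
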
 drivers/fpga/fpga-mgr.c       | 62 +++++++------------------------------------
 include/linux/fpga/fpga-mgr.h |  7 ++---
 2 files changed, 11 insertions(+), 58 deletions(-)

diff --git a/drivers/fpga/fpga-mgr.c b/drivers/fpga/fpga-mgr.c
index c2491ffeabd3..4ba22925d9d5 100644
--- a/drivers/fpga/fpga-mgr.c
+++ b/drivers/fpga/fpga-mgr.c
@@ -47,8 +47,8 @@ static struct class *fpga_mgr_class;
  *
  * Return: 0 on success, negative error code otherwise.
  */
-static int fpga_mgr_buf_load_sg(struct fpga_manager *mgr, u32 flags,
-				struct sg_table *sgt)
+int fpga_mgr_buf_load_sg(struct fpga_manager *mgr, u32 flags,
+			 struct sg_table *sgt)
 {
 	struct device *dev = &mgr->dev;
 	int ret;
@@ -92,52 +92,7 @@ static int fpga_mgr_buf_load_sg(struct fpga_manager *mgr, u32 flags,
 
 	return 0;
 }
-
-static int fpga_mgr_buf_load_mapped(struct fpga_manager *mgr, u32 flags,
-				    const char *buf, size_t count)
-{
-	struct device *dev = &mgr->dev;
-	int ret;
-
-	/*
-	 * Call the low level driver's write_init function.  This will do the
-	 * device-specific things to get the FPGA into the state where it is
-	 * ready to receive an FPGA image.
-	 */
-	mgr->state = FPGA_MGR_STATE_WRITE_INIT;
-	ret = mgr->mops->write_init(mgr, flags, buf, count);
-	if (ret) {
-		dev_err(dev, "Error preparing FPGA for writing\n");
-		mgr->state = FPGA_MGR_STATE_WRITE_INIT_ERR;
-		return ret;
-	}
-
-	/*
-	 * Write the FPGA image to the FPGA.
-	 */
-	mgr->state = FPGA_MGR_STATE_WRITE;
-	ret = mgr->mops->write(mgr, buf, count);
-	if (ret) {
-		dev_err(dev, "Error while writing image data to FPGA\n");
-		mgr->state = FPGA_MGR_STATE_WRITE_ERR;
-		return ret;
-	}
-
-	/*
-	 * After all the FPGA image has been written, do the device specific
-	 * steps to finish and set the FPGA into operating mode.
-	 */
-	mgr->state = FPGA_MGR_STATE_WRITE_COMPLETE;
-	ret = mgr->mops->write_complete(mgr, flags);
-	if (ret) {
-		dev_err(dev, "Error after writing image data to FPGA\n");
-		mgr->state = FPGA_MGR_STATE_WRITE_COMPLETE_ERR;
-		return ret;
-	}
-	mgr->state = FPGA_MGR_STATE_OPERATING;
-
-	return 0;
-}
+EXPORT_SYMBOL_GPL(fpga_mgr_buf_load_sg);
 
 /**
  * fpga_mgr_buf_load - load fpga from image in buffer
@@ -163,9 +118,6 @@ int fpga_mgr_buf_load(struct fpga_manager *mgr, u32 flags, const char *buf,
 	int index;
 	int rc;
 
-	if (!mgr->mops->write_init_sg || !mgr->mops->write_sg)
-		return fpga_mgr_buf_load_mapped(mgr, flags, buf, count);
-
 	/*
 	 * Convert the linear kernel pointer into a sg_table of pages for use
 	 * by the driver.
@@ -226,6 +178,11 @@ int fpga_mgr_firmware_load(struct fpga_manager *mgr, u32 flags,
 
 	mgr->state = FPGA_MGR_STATE_FIRMWARE_REQ;
 
+	/*
+	 * FIXME: We do not need a vmap, just a page list, but
+	 * request_firmware has no way to give us that, so this needlessly
+	 * consumes vmalloc space.
+	 */
 	ret = request_firmware(&fw, image_name, dev);
 	if (ret) {
 		mgr->state = FPGA_MGR_STATE_FIRMWARE_REQ_ERR;
@@ -369,8 +326,7 @@ int fpga_mgr_register(struct device *dev, const char *name,
 	int id, ret;
 
 	if (!mops || !mops->write_complete || !mops->state ||
-	    ((!mops->write_init || !mops->write) &&
-	     (!mops->write_init_sg || !mops->write_sg))) {
+	    !mops->write_init_sg || !mops->write_sg) {
 		dev_err(dev, "Attempt to register without fpga_manager_ops\n");
 		return -EINVAL;
 	}
diff --git a/include/linux/fpga/fpga-mgr.h b/include/linux/fpga/fpga-mgr.h
index 371b30ea60eb..5c698c8fe71b 100644
--- a/include/linux/fpga/fpga-mgr.h
+++ b/include/linux/fpga/fpga-mgr.h
@@ -72,8 +72,6 @@ enum fpga_mgr_states {
 /**
  * struct fpga_manager_ops - ops for low level fpga manager drivers
  * @state: returns an enum value of the FPGA's state
- * @write_init: prepare the FPGA to receive confuration data (linear memory)
- * @write: write count bytes of configuration data to the FPGA
  * @write_init_sg: prepare the FPGA to receive confuration data (scatter list
  *                 table)
  * @write_sg: write count bytes of configuration data to the FPGA
@@ -86,9 +84,6 @@ enum fpga_mgr_states {
  */
 struct fpga_manager_ops {
 	enum fpga_mgr_states (*state)(struct fpga_manager *mgr);
-	int (*write_init)(struct fpga_manager *mgr, u32 flags,
-			  const char *buf, size_t count);
-	int (*write)(struct fpga_manager *mgr, const char *buf, size_t count);
 	int (*write_init_sg)(struct fpga_manager *mgr, u32 flags,
 			     struct sg_table *sgt);
 	int (*write_sg)(struct fpga_manager *mgr, struct sg_table *sgt);
@@ -118,6 +113,8 @@ struct fpga_manager {
 
 int fpga_mgr_buf_load(struct fpga_manager *mgr, u32 flags,
 		      const char *buf, size_t count);
+int fpga_mgr_buf_load_sg(struct fpga_manager *mgr, u32 flags,
+			 struct sg_table *sgt);
 
 int fpga_mgr_firmware_load(struct fpga_manager *mgr, u32 flags,
 			   const char *image_name);
-- 
2.1.4

^ permalink raw reply related

* [PATCH v2 0/9] ARM: DRA7: Add support for DRA718-evm
From: Tony Lindgren @ 2016-11-09 23:00 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1b9c4d14-78d7-841f-776c-63a8c1ae1fb1@ti.com>

* Lokesh Vutla <lokeshvutla@ti.com> [161106 20:50]:
> Hi Tony,
> 
> On Friday 21 October 2016 04:08 PM, Lokesh Vutla wrote:
> > This series does minor dts cleanup for dra72-evm and adds support for
> > DRA718-evm.
> 
> Do you have any comments on this series?

Looks good to me except for the regulator patch that you need to repost.
Applying the rest split into various topic branches for soc/dt/defconfig.

Regards,

Tony

^ permalink raw reply

* [PATCH] PCI: mvebu: Take control of mbus windows setup by the firmware
From: Jason Gunthorpe @ 2016-11-09 23:01 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161026174440.GA24717@obsidianresearch.com>

Hey Thomas,

Could you take a look at this? Thanks

Jason

On Wed, Oct 26, 2016 at 11:44:40AM -0600, Jason Gunthorpe wrote:
> The firmware may setup the mbus to access PCI-E and indicate this
> has happened with a ranges mapping for the PCI-E ID. If this happens
> then the mbus setup and the pci dynamic setup conflict, creating
> problems.
> 
> Have PCI-E assume control of the firmware specified default mapping by
> setting the value of the bridge window to match the firmware mapping.
> 
> Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
>  drivers/bus/mvebu-mbus.c     | 36 ++++++++++++++++++++++++++++++++++++
>  drivers/pci/host/pci-mvebu.c | 22 ++++++++++++++++++++++
>  include/linux/mbus.h         |  3 +++
>  3 files changed, 61 insertions(+)
> 
> In the DT this could look like:
> 
> 	mbus {
> 		ranges = <MBUS_ID(0x04, 0xe8) 0 0xe0000000 0x8000000 /* PEX 0 MEM */
> 		pex@e0000000 {
> 			compatible = "marvell,kirkwood-pcie";
> 			ranges = <0x82000000 0 0x40000 MBUS_ID(0xf0, 0x01) 0x40000 0 0x00002000 /* Controller regs */
> 				  0x82000000 1 0       MBUS_ID(0x04, 0xe8) 0 1 0 /* Port 0.0 MEM */
> 				  >;
> 			pcie@1,0 {
> 				ranges = <0x82000000 0 0 0x82000000 0x1 0 1 0>;
> 				reg = <0x0800 0 0  0 0>; // 0000:00:01.0
> 
> 				device@0 {
> 					reg = <0x0 0 0	0 0>; // 0000:01:00.0
> 					ranges = <0x00000000  0x82000000 0x00000000 0x00000000	0x8000000>;
> 
> Which is basically the OF way to describe PCI devices downstream of
> the interface and give information about what their BARs should be.
> 
> This is useful if there is even more DT stuff below the explicitly
> declared device as it allows standard DT address translation to work
> properly.
> 
> diff --git a/drivers/bus/mvebu-mbus.c b/drivers/bus/mvebu-mbus.c
> index c7f396903184..ce0ac5049c1f 100644
> +++ b/drivers/bus/mvebu-mbus.c
> @@ -922,6 +922,42 @@ int mvebu_mbus_add_window_by_id(unsigned int target, unsigned int attribute,
>  						 size, MVEBU_MBUS_NO_REMAP);
>  }
>  
> +/**
> + * mvebu_mbus_get_single_window_by_id() - return the location of the window if
> + * the target/attribute has a single enabled mapping.
> + *
> + * RETURNS:
> + *	-ENODEV if no mapping and -E2BIG if there is more than one mapping
> + */
> +int mvebu_mbus_get_single_window_by_id(unsigned int target,
> +				       unsigned int attribute,
> +				       phys_addr_t *base, phys_addr_t *size)
> +{
> +	unsigned int win;
> +	unsigned int count = 0;
> +
> +	for (win = 0; win < mbus_state.soc->num_wins; win++) {
> +		u64 wbase;
> +		u32 wsize;
> +		u8 wtarget, wattr;
> +		int enabled;
> +
> +		mvebu_mbus_read_window(&mbus_state, win, &enabled, &wbase,
> +				       &wsize, &wtarget, &wattr, NULL);
> +		if (enabled && wtarget == target && wattr == attribute) {
> +			*base = wbase;
> +			*size = wsize;
> +			count++;
> +		}
> +	}
> +
> +	if (count == 1)
> +		return 0;
> +	if (count == 0)
> +		return -ENODEV;
> +	return -E2BIG;
> +}
> +
>  int mvebu_mbus_del_window(phys_addr_t base, size_t size)
>  {
>  	int win;
> diff --git a/drivers/pci/host/pci-mvebu.c b/drivers/pci/host/pci-mvebu.c
> index 307f81d6b479..5d5a2687b73e 100644
> +++ b/drivers/pci/host/pci-mvebu.c
> @@ -464,6 +464,8 @@ static void mvebu_pcie_handle_membase_change(struct mvebu_pcie_port *port)
>  static void mvebu_sw_pci_bridge_init(struct mvebu_pcie_port *port)
>  {
>  	struct mvebu_sw_pci_bridge *bridge = &port->bridge;
> +	int rc;
> +	phys_addr_t base, size;
>  
>  	memset(bridge, 0, sizeof(struct mvebu_sw_pci_bridge));
>  
> @@ -480,6 +482,26 @@ static void mvebu_sw_pci_bridge_init(struct mvebu_pcie_port *port)
>  
>  	/* Add capabilities */
>  	bridge->status = PCI_STATUS_CAP_LIST;
> +
> +	/*
> +	 * If the firmware has already setup a window for PCIe then assume
> +	 * control of it by defaulting the BAR to the window setting.
> +	 */
> +	rc = mvebu_mbus_get_single_window_by_id(port->mem_target,
> +						port->mem_attr, &base, &size);
> +	if (rc == -E2BIG)
> +		pr_err(FW_BUG "%s: Too many pre-existing mbus mappings\n",
> +		       port->dn->name);
> +	if (!rc) {
> +		if ((base & 0xFFFFF) != 0 || ((size + base) & 0xFFFFF) != 0)
> +			pr_err(FW_BUG "%s: Invalid pre-existing mbus mapping\n",
> +			       port->dn->name);
> +		port->memwin_base = base;
> +		port->memwin_size = size;
> +		port->bridge.membase = (base >> 16) & 0xFFF0;
> +		port->bridge.memlimit = ((size - 1 + base) >> 16) & 0xFFF0;
> +		port->bridge.command |= PCI_COMMAND_MEMORY;
> +	}
>  }
>  
>  /*
> diff --git a/include/linux/mbus.h b/include/linux/mbus.h
> index d610232762e3..dfafc27b41b5 100644
> +++ b/include/linux/mbus.h
> @@ -83,5 +83,8 @@ int mvebu_mbus_init(const char *soc, phys_addr_t mbus_phys_base,
>  		    size_t mbus_size, phys_addr_t sdram_phys_base,
>  		    size_t sdram_size);
>  int mvebu_mbus_dt_init(bool is_coherent);
> +int mvebu_mbus_get_single_window_by_id(unsigned int target,
> +				       unsigned int attribute,
> +				       phys_addr_t *base, phys_addr_t *size);
>  
>  #endif /* __LINUX_MBUS_H */

^ permalink raw reply

* Summary of LPC guest MSI discussion in Santa Fe
From: Alex Williamson @ 2016-11-09 23:24 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161109222522.GS17771@arm.com>

On Wed, 9 Nov 2016 22:25:22 +0000
Will Deacon <will.deacon@arm.com> wrote:

> On Wed, Nov 09, 2016 at 03:17:09PM -0700, Alex Williamson wrote:
> > On Wed, 9 Nov 2016 20:31:45 +0000
> > Will Deacon <will.deacon@arm.com> wrote:  
> > > On Wed, Nov 09, 2016 at 08:23:03PM +0100, Christoffer Dall wrote:  
> > > > 
> > > > (I suppose it's technically possible to get around this issue by letting
> > > > QEMU place RAM wherever it wants but tell the guest to never use a
> > > > particular subset of its RAM for DMA, because that would conflict with
> > > > the doorbell IOVA or be seen as p2p transactions.  But I think we all
> > > > probably agree that it's a disgusting idea.)    
> > > 
> > > Disgusting, yes, but Ben's idea of hotplugging on the host controller with
> > > firmware tables describing the reserved regions is something that we could
> > > do in the distant future. In the meantime, I don't think that VFIO should
> > > explicitly reject overlapping mappings if userspace asks for them.  
> > 
> > I'm confused by the last sentence here, rejecting user mappings that
> > overlap reserved ranges, such as MSI doorbell pages, is exactly how
> > we'd reject hot-adding a device when we meet such a conflict.  If we
> > don't reject such a mapping, we're knowingly creating a situation that
> > potentially leads to data loss.  Minimally, QEMU would need to know
> > about the reserved region, map around it through VFIO, and take
> > responsibility (somehow) for making sure that region is never used for
> > DMA.  Thanks,  
> 
> Yes, but my point is that it should be up to QEMU to abort the hotplug, not
> the host kernel, since there may be ways in which a guest can tolerate the
> overlapping region (e.g. by avoiding that range of memory for DMA).

The VFIO_IOMMU_MAP_DMA ioctl is a contract: the user asks to map a range
of IOVAs to a range of virtual addresses for a given device.  If VFIO
cannot reasonably fulfill that contract, it must fail.  It's up to QEMU
how to manage the hotplug and what memory regions it asks VFIO to map
for a device, but VFIO must reject mappings that it (or the SMMU by
virtue of using the IOMMU API) know to overlap reserved ranges.  So I
still disagree with the referenced statement.  Thanks,

Alex
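
The contract described above can be sketched as a plain overlap test.
This is a minimal userspace illustration, not VFIO code: struct
resv_region, overlaps() and check_map_request() are hypothetical names
invented for this sketch, and it assumes the list of reserved windows
is already known and that no range wraps past the end of the address
space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of a reserved IOVA window (e.g. an MSI doorbell). */
struct resv_region {
	uint64_t start;   /* first reserved IOVA */
	uint64_t length;  /* size in bytes, must be non-zero */
};

/* True if [iova, iova + size) intersects the reserved window. */
bool overlaps(uint64_t iova, uint64_t size, const struct resv_region *r)
{
	uint64_t last = iova + size - 1;
	uint64_t r_last = r->start + r->length - 1;

	return iova <= r_last && r->start <= last;
}

/*
 * Returns 0 if the request can be honoured, -1 if it touches any
 * reserved window and therefore has to be refused.
 */
int check_map_request(uint64_t iova, uint64_t size,
		      const struct resv_region *resv, size_t n)
{
	assert(size != 0);

	for (size_t i = 0; i < n; i++)
		if (overlaps(iova, size, &resv[i]))
			return -1;
	return 0;
}
```

With a single 4k window at an arbitrary example address of 0x8000000,
a request covering [0, 0x8000000) passes, while one that extends a
single byte further is refused.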

^ permalink raw reply

* [PATCH v7 00/16] ACPI IORT ARM SMMU support
From: Rafael J. Wysocki @ 2016-11-09 23:36 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161109141948.19244-1-lorenzo.pieralisi@arm.com>

Hi Lorenzo,

On Wed, Nov 9, 2016 at 3:19 PM, Lorenzo Pieralisi
<lorenzo.pieralisi@arm.com> wrote:
> This patch series is v7 of a previous posting:
>
> https://lkml.org/lkml/2016/10/18/506

I don't see anything objectionable in this series.

Please let me know which patches in particular to look at in detail.

Thanks,
Rafael

^ permalink raw reply

* [PATCH v7 01/16] drivers: acpi: add FWNODE_ACPI_STATIC fwnode type
From: Rafael J. Wysocki @ 2016-11-09 23:37 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161109141948.19244-2-lorenzo.pieralisi@arm.com>

On Wed, Nov 9, 2016 at 3:19 PM, Lorenzo Pieralisi
<lorenzo.pieralisi@arm.com> wrote:
> On systems booting with a device tree, every struct device is associated
> with a struct device_node, that provides its DT firmware representation.
> The device node can be used in generic kernel contexts (eg IRQ
> translation, IOMMU streamid mapping), to retrieve the properties
> associated with the device and carry out kernel operations accordingly.
> Owing to the 1:1 relationship between the device and its device_node,
> the device_node can also be used as a look-up token for the device (eg
> looking up a device through its device_node), to retrieve the device in
> kernel paths where the device_node is available.
>
> On systems booting with ACPI, the same abstraction provided by
> the device_node is required to provide look-up functionality.
>
> The struct acpi_device, that represents firmware objects in the
> ACPI namespace already includes a struct fwnode_handle of
> type FWNODE_ACPI as their member; the same abstraction is missing
> though for devices that are instantiated out of static ACPI tables
> entries (eg ARM SMMU devices).
>
> Add a new fwnode_handle type to associate devices created out
> of static ACPI table entries to the respective firmware components
> and create a simple ACPI core layer interface to dynamically allocate
> and free the corresponding firmware nodes so that kernel subsystems
> can use it to instantiate the nodes and associate them with the
> respective devices.
>
> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org>
> Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
> Tested-by: Tomasz Nowicki <tn@semihalf.com>
> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>

Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Thanks!

^ permalink raw reply

* Summary of LPC guest MSI discussion in Santa Fe
From: Will Deacon @ 2016-11-09 23:38 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161109162458.39594fdb@t450s.home>

On Wed, Nov 09, 2016 at 04:24:58PM -0700, Alex Williamson wrote:
> On Wed, 9 Nov 2016 22:25:22 +0000
> Will Deacon <will.deacon@arm.com> wrote:
> 
> > On Wed, Nov 09, 2016 at 03:17:09PM -0700, Alex Williamson wrote:
> > > On Wed, 9 Nov 2016 20:31:45 +0000
> > > Will Deacon <will.deacon@arm.com> wrote:  
> > > > On Wed, Nov 09, 2016 at 08:23:03PM +0100, Christoffer Dall wrote:  
> > > > > 
> > > > > (I suppose it's technically possible to get around this issue by letting
> > > > > QEMU place RAM wherever it wants but tell the guest to never use a
> > > > > particular subset of its RAM for DMA, because that would conflict with
> > > > > the doorbell IOVA or be seen as p2p transactions.  But I think we all
> > > > > probably agree that it's a disgusting idea.)    
> > > > 
> > > > Disgusting, yes, but Ben's idea of hotplugging on the host controller with
> > > > firmware tables describing the reserved regions is something that we could
> > > > do in the distant future. In the meantime, I don't think that VFIO should
> > > > explicitly reject overlapping mappings if userspace asks for them.  
> > > 
> > > I'm confused by the last sentence here, rejecting user mappings that
> > > overlap reserved ranges, such as MSI doorbell pages, is exactly how
> > > we'd reject hot-adding a device when we meet such a conflict.  If we
> > > don't reject such a mapping, we're knowingly creating a situation that
> > > potentially leads to data loss.  Minimally, QEMU would need to know
> > > about the reserved region, map around it through VFIO, and take
> > > responsibility (somehow) for making sure that region is never used for
> > > DMA.  Thanks,  
> > 
> > Yes, but my point is that it should be up to QEMU to abort the hotplug, not
> > the host kernel, since there may be ways in which a guest can tolerate the
> > overlapping region (e.g. by avoiding that range of memory for DMA).
> 
> The VFIO_IOMMU_MAP_DMA ioctl is a contract: the user asks to map a range
> of IOVAs to a range of virtual addresses for a given device.  If VFIO
> cannot reasonably fulfill that contract, it must fail.  It's up to QEMU
> how to manage the hotplug and what memory regions it asks VFIO to map
> for a device, but VFIO must reject mappings that it (or the SMMU by
> virtue of using the IOMMU API) knows to overlap reserved ranges.  So I
> still disagree with the referenced statement.  Thanks,

I think that's a pity. Not only does it mean that both QEMU and the kernel
have more work to do (the former has to carve up its mapping requests,
whilst the latter has to check that it is indeed doing this), but it also
precludes the use of hugepage mappings on the IOMMU because of reserved
regions. For example, a 4k hole someplace may mean we can't put down 1GB
table entries for the guest memory in the SMMU.

All this seems to do is add complexity and decrease performance. For what?
QEMU has to go read the reserved regions from someplace anyway. It's also
the way that VFIO works *today* on arm64 wrt reserved regions, it just has
no way to identify those holes at present.

Will
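
The hugepage cost mentioned above can be made concrete with a small
counting sketch. It is an illustration only, under stated assumptions:
usable_1g_blocks() is a hypothetical helper, the addresses are
arbitrary, and address ranges are assumed not to wrap. It counts how
many 1GB-aligned blocks of a guest RAM range could still be covered by
a single 1GB table entry once a reserved hole has to be mapped around.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SZ_4K (1ULL << 12)
#define SZ_1G (1ULL << 30)

/*
 * Count the 1GB-aligned, 1GB-sized blocks inside [base, base + size)
 * that do not intersect the hole [hole, hole + hole_size).  A
 * hole_size of 0 means "no hole".
 */
unsigned int usable_1g_blocks(uint64_t base, uint64_t size,
			      uint64_t hole, uint64_t hole_size)
{
	uint64_t first = (base + SZ_1G - 1) & ~(SZ_1G - 1); /* round up */
	uint64_t end = base + size;
	unsigned int count = 0;

	assert(size != 0);

	for (uint64_t blk = first; blk + SZ_1G <= end; blk += SZ_1G) {
		bool hit = hole_size != 0 &&
			   hole < blk + SZ_1G && blk < hole + hole_size;

		if (!hit)
			count++;
	}
	return count;
}
```

For 4GB of RAM starting at 1GB, a 4k hole anywhere in the first
gigabyte drops the count from four blocks to three. Assuming a 4k
translation granule, the affected gigabyte then needs 511 2MB entries
plus 511 4k entries in place of one 1GB entry.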

^ permalink raw reply

* [PATCH V10 0/6] Enable PMUs in ACPI Systems
From: Jeremy Linton @ 2016-11-09 23:39 UTC (permalink / raw)
  To: linux-arm-kernel

This patch set expands and reworks the patches published by Mark Salter
in order to address a few of the previous review comments, as well as
add support for newer CPUs and big/little configurations.

v10:
- Rebase to 4.9
- Rework the arm_perf_start_cpu changes to support the 4.9 hotplug
  changes.
- Remove the call to acpi_register_gsi() from the cpu online code path.
  Instead the GSI's are registered during the initcall. This changes
  the error handling a bit because we now try to clean up the
  previously registered GSIs in a couple important places. This
  was also a result of the rebase.
- Dropped the MIDR partnumber usage, it's no longer necessary to
  differentiate by only the partnum, so this helps to clarify the code
  a bit.
- Shuffle some code around and rename a few variables.
- Added a few comments to hopefully clarify some questions people have
  previously had about unused MADT entries, skipping processing cores
  with MIDR=0, etc.

v9:
- Add/cleanup an additional hotplug patch I've had sitting around. This
  patch brings the ACPI PMU mostly on par with the DT functionality with
  respect to having CPUs offline during boot. This should help clarify
  some of the code structuring.
- Cleanup the list of PMU types early if we fail to allocate memory for an
  additional pmu type.

v8:
- Rebase to 4.8rc4
- Assorted minor comment/hunk placement/etc tweaks per Punit Agrawal

v7:
- Rebase to 4.8rc3
- Remove cpu affinity sysfs entry. While providing a CPU mask for
  ARMv8 PMU's is really helpful in big/little environments, reworking
  the PMU code to support the cpumask attribute for !arm64 PMUs is out
  of the scope of this patch set.
- Fix CPU miscount problem where an alloc failure followed by successfully
  allocating the structure can result in under counting the CPUs associated
  with the PMU. This bug was created in v6 with the conversion to a linked
  list.
- Remove initial platform device creation code by Mark Salter, and re-squash
  multiple platform device creation code together with helper routines.
  Other minor tweakage.

v6:
- Added cpu affinity sysfs entry
- Converted pmu_types array, to linked list
- Restrict use of the armv8_pmu_probe_table to ACPI systems
- Rename MADT parsing routines in smp.c
- Convert sysfs PMU name to use index rather than partnum
- Remove pr_devel statements
- Other Minor cleanups
- Add Partial Ack-by Will Deacon

v5:
- Remove list of CPU types for ACPI systems. We now match a generic
  event list, and use the PMCEID[01] registers to select events which
  exist on the given PMU. This avoids the need to update the kernel
  every time a new CPU is released.
- Update the maintainers list to include the new file.
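
The event selection described in the v5 notes can be sketched roughly
as follows. This is a userspace illustration, not the code from this
series: event_implemented() and filter_events() are hypothetical
names, and the PMCEID values are passed in as plain integers rather
than read from the hardware. It relies on the ARMv8 PMUv3 convention
that bit n of PMCEID0 reports common event n and bit n of PMCEID1
reports common event 32 + n.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A few ARMv8 PMUv3 common event numbers. */
#define EVT_SW_INCR		0x00
#define EVT_L1D_CACHE_REFILL	0x03
#define EVT_INST_RETIRED	0x08
#define EVT_CPU_CYCLES		0x11

/* True if the PMCEID bitmaps advertise common event 'evt'. */
bool event_implemented(uint32_t pmceid0, uint32_t pmceid1, unsigned int evt)
{
	if (evt < 32)
		return (pmceid0 >> evt) & 1u;
	if (evt < 64)
		return (pmceid1 >> (evt - 32)) & 1u;
	return false;
}

/*
 * Copy the entries of a generic event list that this PMU implements
 * into 'out' (which must have room for n entries) and return how many
 * were kept.
 */
size_t filter_events(uint32_t pmceid0, uint32_t pmceid1,
		     const unsigned int *generic, size_t n,
		     unsigned int *out)
{
	size_t kept = 0;

	assert(generic != NULL && out != NULL);

	for (size_t i = 0; i < n; i++)
		if (event_implemented(pmceid0, pmceid1, generic[i]))
			out[kept++] = generic[i];
	return kept;
}
```

Because the list is filtered against what the PMU itself reports, a
new CPU needs no table entry of its own, which is the benefit the v5
note points out.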

v4:
- Correct build issues with ARM (!ARM64) kernels.
- Add ThunderX to list of PMU types.

v3:
- Enable ARM performance monitoring units on ACPI/arm64 machines.

Jeremy Linton (5):
  arm64: Rename the common MADT parse routine
  arm: arm64: Add routine to determine cpuid of other cpus
  arm: arm64: pmu: Assign platform PMU CPU affinity
  arm64: pmu: Detect and enable multiple PMUs in an ACPI system
  arm: pmu: Add PMU definitions for cores not initially online

Mark Salter (1):
  arm64: pmu: Cache PMU interrupt numbers from MADT parse

 arch/arm/include/asm/cputype.h   |   2 +
 arch/arm64/include/asm/cputype.h |   3 +
 arch/arm64/kernel/smp.c          |  18 ++-
 drivers/perf/Kconfig             |   4 +
 drivers/perf/Makefile            |   1 +
 drivers/perf/arm_pmu.c           | 107 +++++++++++++--
 drivers/perf/arm_pmu_acpi.c      | 272 +++++++++++++++++++++++++++++++++++++++
 include/linux/perf/arm_pmu.h     |  11 ++
 8 files changed, 400 insertions(+), 18 deletions(-)
 create mode 100644 drivers/perf/arm_pmu_acpi.c

-- 
2.5.5

^ permalink raw reply

* [PATCH V10 1/6] arm64: Rename the common MADT parse routine
From: Jeremy Linton @ 2016-11-09 23:39 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1478734793-6341-1-git-send-email-jeremy.linton@arm.com>

The MADT parser in smp.c is now being used to parse
out NUMA, PMU and ACPI parking protocol information as
well as the GIC information for which it was originally
created. Rename it to avoid a misleading name.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
 arch/arm64/kernel/smp.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 8507703..f3f1c90 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -511,13 +511,14 @@ static unsigned int cpu_count = 1;
 
 #ifdef CONFIG_ACPI
 /*
- * acpi_map_gic_cpu_interface - parse processor MADT entry
+ * acpi_verify_and_map_madt - parse processor MADT entry
  *
  * Carry out sanity checks on MADT processor entry and initialize
- * cpu_logical_map on success
+ * cpu_logical_map, the ACPI parking protocol, NUMA mapping
+ * and the PMU interrupts on success
  */
 static void __init
-acpi_map_gic_cpu_interface(struct acpi_madt_generic_interrupt *processor)
+acpi_verify_and_map_madt(struct acpi_madt_generic_interrupt *processor)
 {
 	u64 hwid = processor->arm_mpidr;
 
@@ -571,7 +572,7 @@ acpi_map_gic_cpu_interface(struct acpi_madt_generic_interrupt *processor)
 }
 
 static int __init
-acpi_parse_gic_cpu_interface(struct acpi_subtable_header *header,
+acpi_parse_madt_common(struct acpi_subtable_header *header,
 			     const unsigned long end)
 {
 	struct acpi_madt_generic_interrupt *processor;
@@ -582,7 +583,7 @@ acpi_parse_gic_cpu_interface(struct acpi_subtable_header *header,
 
 	acpi_table_print_madt_entry(header);
 
-	acpi_map_gic_cpu_interface(processor);
+	acpi_verify_and_map_madt(processor);
 
 	return 0;
 }
@@ -666,7 +667,7 @@ void __init smp_init_cpus(void)
 		 * we need for SMP init
 		 */
 		acpi_table_parse_madt(ACPI_MADT_TYPE_GENERIC_INTERRUPT,
-				      acpi_parse_gic_cpu_interface, 0);
+				      acpi_parse_madt_common, 0);
 
 	if (cpu_count > nr_cpu_ids)
 		pr_warn("Number of cores (%d) exceeds configured maximum of %d - clipping\n",
-- 
2.5.5

^ permalink raw reply related

* [PATCH V10 2/6] arm: arm64: Add routine to determine cpuid of other cpus
From: Jeremy Linton @ 2016-11-09 23:39 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1478734793-6341-1-git-send-email-jeremy.linton@arm.com>

It is helpful if we can read the cpuid/midr of other CPUs
in the system independent of arm/arm64.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
 arch/arm/include/asm/cputype.h   | 2 ++
 arch/arm64/include/asm/cputype.h | 3 +++
 2 files changed, 5 insertions(+)

diff --git a/arch/arm/include/asm/cputype.h b/arch/arm/include/asm/cputype.h
index 522b5fe..31fb273 100644
--- a/arch/arm/include/asm/cputype.h
+++ b/arch/arm/include/asm/cputype.h
@@ -235,6 +235,8 @@ static inline unsigned int __attribute_const__ read_cpuid_mpidr(void)
 #define cpu_is_sa1100() (read_cpuid_part() == ARM_CPU_PART_SA1100)
 #define cpu_is_sa1110() (read_cpuid_part() == ARM_CPU_PART_SA1110)
 
+#define read_specific_cpuid(cpu_num) per_cpu_ptr(&cpu_data, cpu_num)->cpuid
+
 /*
  * Intel's XScale3 core supports some v6 features (supersections, L2)
  * but advertises itself as v5 as it does not support the v6 ISA.  For
diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h
index 26a68dd..a6d26e1 100644
--- a/arch/arm64/include/asm/cputype.h
+++ b/arch/arm64/include/asm/cputype.h
@@ -124,6 +124,9 @@ static inline u32 __attribute_const__ read_cpuid_cachetype(void)
 {
 	return read_cpuid(CTR_EL0);
 }
+
+#define read_specific_cpuid(cpu_num) per_cpu_ptr(&cpu_data, cpu_num)->reg_midr
+
 #endif /* __ASSEMBLY__ */
 
 #endif
-- 
2.5.5
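
As an illustration of what a caller might do with a MIDR value read
for another CPU, here is a minimal userspace sketch. It does not use
the kernel's per-CPU machinery: struct cpu_info, cpu_midr() and the
midr_*() helpers are hypothetical stand-ins, and the sample values are
the commonly documented MIDRs for Cortex-A53 r0p4 and Cortex-A57 r1p0.
The field layout (implementer in bits [31:24], part number in bits
[15:4]) follows the architectural MIDR definition. A value of 0 is
treated as "not recorded yet", mirroring the MIDR=0 case mentioned in
the cover letter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's per-CPU cpu_data entries. */
struct cpu_info {
	uint32_t reg_midr;	/* 0 if the CPU has not been seen yet */
};

/* Return the recorded MIDR of 'cpu' from a plain array. */
uint32_t cpu_midr(const struct cpu_info *cpus, size_t ncpus, size_t cpu)
{
	assert(cpu < ncpus);
	return cpus[cpu].reg_midr;
}

/* MIDR implementer code, bits [31:24]. */
unsigned int midr_implementer(uint32_t midr)
{
	return (midr >> 24) & 0xffu;
}

/* MIDR primary part number, bits [15:4]. */
unsigned int midr_partnum(uint32_t midr)
{
	return (midr >> 4) & 0xfffu;
}

/* A MIDR of 0 means no value has been recorded for that CPU. */
bool midr_valid(uint32_t midr)
{
	return midr != 0;
}
```

On a big/little system the two clusters report different part numbers,
which is one reason a per-CPU accessor is useful.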

^ permalink raw reply related
