Linux-ARM-Kernel Archive on lore.kernel.org
* Tearing down DMA transfer setup after DMA client has finished
From: Måns Rullgård @ 2016-11-25 14:37 UTC
  To: linux-arm-kernel
In-Reply-To: <28bb9925-d0d7-9261-27fb-aa79345a19f1@free.fr>

Mason <slash.tmp@free.fr> writes:

> On 25/11/2016 14:11, Måns Rullgård wrote:
>
>> Mason writes:
>> 
>>> It seems there is a disconnect between what Linux expects - an IRQ
>>> when the transfer is complete - and the quirks of this HW :-(
>>>
>>> On this system, there are MBUS "agents" connected via a "switch box".
>>> An agent fires an IRQ when it has dealt with its *half* of the transfer.
>>>
>>> SOURCE_AGENT <---> SBOX <---> DESTINATION_AGENT
>>>
>>> Here are the steps for a transfer, in the general case:
>>>
>>> 1) setup the sbox to connect SOURCE TO DEST
>>> 2) configure source to send N bytes
>>> 3) configure dest to receive N bytes
>>>
>>> When SOURCE_AGENT has sent N bytes, it fires an IRQ
>>> When DEST_AGENT has received N bytes, it fires an IRQ
>>> The sbox connection can be torn down only when the destination
>>> agent has received all bytes.
>>> (And the twist is that some agents do not have an IRQ line.)
>>>
>>> The system provides 3 RAM-to-sbox agents (read channels)
>>> and 3 sbox-to-RAM agents (write channels).
>>>
>>> The NAND Flash controller read and write agents do not have
>>> IRQ lines.
>>>
>>> So for a NAND-to-memory transfer (read from device)
>>> - nothing happens when the NFC has finished sending N bytes to the sbox
>>> - the write channel fires an IRQ when it has received N bytes
>>>
>>> In that case, one IRQ fires when the transfer is complete,
>>> like Linux expects.
>>>
>>> For a memory-to-NAND transfer (write to device)
>>> - the read channel fires an IRQ when it has sent N bytes
>>> - the NFC driver is supposed to poll the NFC to determine
>>> when the controller has finished writing N bytes
>>>
>>> In that case, the IRQ does not indicate that the transfer
>>> is complete, merely that the sending half has finished
>>> its part.
>> 
>> When does your NAND controller signal completion?  When it has received
>> the DMA data, or only when it has finished the actual write operation?
>
> The NAND controller provides a STATUS register.
> Bit 31 is the CMD_READY bit.
> This bit goes to 0 when the controller is busy, and to 1
> when the controller is ready to accept the next command.
>
> The NFC driver is doing:
>
> 	res = wait_for_completion_timeout(&tx_done, HZ);
> 	if (res > 0)
> 		err = readl_poll_timeout(addr, val, val & CMD_READY, 0, 1000);
>
> So basically, sleep until the memory agent IRQ falls,
> then spin until the controller is idle.

This doesn't answer my question.  Waiting for the entire operation to
finish isn't necessary.  The dma driver only needs to wait until all the
data has been received by the nand controller, not until the controller
is completely finished with the command.  Does the nand controller
provide an indication for completion of the dma independently of the
progress of the write command?  The dma glue Sigma added to the
Designware sata controller does this.

> Did you see that adding a 10 µs delay at the start of
> tangox_dma_pchan_detach() makes the system no longer
> fail (passes an mtd_speedtest).

Yes, but maybe that's much longer than is actually necessary.

>>> I think it is possible to have a generic solution:
>>> Right now, the callback is called from tasklet context.
>>> If we can have a new flag to have the callback invoked
>>> directly from the ISR, then the driver for the client
>>> device can do what is required.
>> 
>> No, that won't work.  The callback shouldn't run in interrupt context.
>
> What if the callback only spun for, at most, 10 µs?
>
> 	readl_poll_timeout(addr, val, val & CMD_READY, 0, 10);

That's far too long to wait in interrupt or tasklet context.
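
To make the cost of spinning concrete, the polling pattern can be modelled in
plain userspace C. This is a minimal sketch, not kernel code: `fake_nfc`,
`fake_status_read` and `poll_cmd_ready` are hypothetical names invented for
illustration, and only `CMD_READY` (bit 31 of STATUS) comes from the thread.
The poll budget stands in for the timeout argument of the real helper.

```c
#include <stdbool.h>
#include <stdint.h>

#define CMD_READY (UINT32_C(1) << 31)	/* bit 31 of STATUS, per the thread */

/* Hypothetical stand-in for the NFC STATUS register: it reports "busy"
 * for a fixed number of reads, then "ready". Not the real hardware. */
struct fake_nfc {
	unsigned int busy_reads_left;
};

static uint32_t fake_status_read(struct fake_nfc *nfc)
{
	if (nfc->busy_reads_left > 0) {
		nfc->busy_reads_left--;
		return 0;
	}
	return CMD_READY;
}

/* Poll until CMD_READY is set or the budget of max_polls reads is used up.
 * Returns true if the controller became ready; *polls receives the number
 * of reads performed. */
static bool poll_cmd_ready(struct fake_nfc *nfc, unsigned int max_polls,
			   unsigned int *polls)
{
	for (*polls = 1; *polls <= max_polls; (*polls)++) {
		if (fake_status_read(nfc) & CMD_READY)
			return true;
	}
	*polls = max_polls;
	return false;
}
```

The worst case is always the full budget: a controller that stays busy costs
`max_polls` reads before the caller learns anything, and in interrupt or
tasklet context that whole budget is latency imposed on everything else.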

-- 
Måns Rullgård


* Tearing down DMA transfer setup after DMA client has finished
From: Måns Rullgård @ 2016-11-25 14:40 UTC
  To: linux-arm-kernel
In-Reply-To: <20161125141708.GM14217@n2100.armlinux.org.uk>

Russell King - ARM Linux <linux@armlinux.org.uk> writes:

> On Fri, Nov 25, 2016 at 02:03:20PM +0000, Måns Rullgård wrote:
>> Russell King - ARM Linux <linux@armlinux.org.uk> writes:
>> 
>> > On Fri, Nov 25, 2016 at 01:50:35PM +0000, Måns Rullgård wrote:
>> >> Russell King - ARM Linux <linux@armlinux.org.uk> writes:
>> >> > It would be unfair to augment the API and add the burden on everyone
>> >> > for the new API when 99.999% of the world doesn't require it.
>> >> 
>> >> I don't think making this particular dma driver wait for the descriptor
>> >> callback to return before reusing a channel quite amounts to a horrid
>> >> hack.  It certainly wouldn't burden anyone other than the poor drivers
>> >> for devices connected to it, all of which are specific to Sigma AFAIK.
>> >
>> > Except when you stop to think that delaying in a tasklet is exactly
>> > the same as randomly delaying in an interrupt handler - the tasklet
>> > runs on the return path back to the parent context of an interrupt
>> > handler.  Even if you sleep in the tasklet, you're sleeping on behalf
>> > of the currently executing thread - if it's a RT thread, you effectively
>> > destroy the RT-ness of the thread.  Let's hope no one cares about RT
>> > performance on that hardware...
>> 
>> That's why I suggested to do this only if the needed delay is known to
>> be no more than a few bus cycles.  The completion callback is currently
>> the only post-transfer interaction we have between the dma and device
>> drivers.  To handle an arbitrarily long delay, some new interface will
>> be required.
>
> And now we're back at the point I made a few emails ago about undue
> burden which is just about quoted above...

So what do you suggest?  Stick our heads in the sand and pretend
everything is perfect?

-- 
Måns Rullgård


* Tearing down DMA transfer setup after DMA client has finished
From: Måns Rullgård @ 2016-11-25 14:42 UTC
  To: linux-arm-kernel
In-Reply-To: <da3ddae9-eca2-f67a-1933-25bc7953c201@free.fr>

Mason <slash.tmp@free.fr> writes:

> On 25/11/2016 15:12, Måns Rullgård wrote:
>
>> Mason writes:
>> 
>>> On 25/11/2016 12:57, Måns Rullgård wrote:
>>>
>>>> The same DMA unit is also used for SATA, which is an off the shelf
>>>> Designware controller with an in-kernel driver.  This interrupt timing
>>>> glitch can actually explain some intermittent errors I've observed with
>>>> it.
>>>
>>> FWIW, newer chips embed an AHCI controller, with a dedicated
>>> memory channel.
>>>
>>> FWIW2, the HW dev said memory channels are "almost free", and he
>>> would have no problem giving each device their own private channel
>>> read/write pair.
>> 
>> We still need to deal with the existing hardware.
>
> Can you confirm that your MBUS driver, in its current form,
> does not support memcpy-type transfers, which generate two
> IRQs (one from send agent, one from receive agent)?

It does not.

> Do you plan to support that, or is it just too quirky?

I hadn't planned on doing that, but I'm not ruling it out entirely.
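
For a memcpy-type transfer, where both the send agent and the receive agent
raise an IRQ, the bookkeeping could look roughly like the following. This is
a userspace sketch under stated assumptions, not code from the MBUS driver:
`xfer_state`, `xfer_start` and `xfer_half_done` are hypothetical names, and
the "tear down" step is reduced to setting a flag.

```c
#include <stdbool.h>

/* One bit per half of the transfer (hypothetical encoding). */
enum agent_half {
	HALF_SOURCE = 1 << 0,
	HALF_DEST   = 1 << 1,
};

struct xfer_state {
	unsigned int pending;	/* halves that have not signalled yet */
	bool torn_down;		/* set once the sbox route may be released */
};

static void xfer_start(struct xfer_state *x)
{
	x->pending = HALF_SOURCE | HALF_DEST;
	x->torn_down = false;
}

/* Called from the (simulated) IRQ of one agent.
 * Returns true only for the call that completes the whole transfer. */
static bool xfer_half_done(struct xfer_state *x, enum agent_half half)
{
	if (!(x->pending & (unsigned int)half))
		return false;		/* duplicate or spurious IRQ */

	x->pending &= ~(unsigned int)half;
	if (x->pending)
		return false;		/* the other half is still running */

	x->torn_down = true;		/* both halves done: safe to disconnect */
	return true;
}
```

The route is released only when the second of the two IRQs arrives, whichever
agent that happens to be. Agents without an IRQ line, such as the NAND
controller discussed in this thread, do not fit this scheme without polling.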

-- 
Måns Rullgård


* [PATCH 0/2] usb: ohci: s3c2410: add device tree support
From: Sergio Prado @ 2016-11-25 14:47 UTC
  To: linux-arm-kernel

This series adds support for configuring Samsung's s3c2410 and
compatible USB OHCI controller via devicetree.

Tested on FriendlyARM mini2440, based on s3c2440 SoC.

Sergio Prado (2):
  dt-bindings: usb: add DT binding for s3c2410 USB OHCI controller
  usb: ohci: s3c2410: allow probing from device tree

 .../devicetree/bindings/usb/s3c2410-usb.txt        | 22 ++++++++++++++++++++++
 drivers/usb/host/ohci-s3c2410.c                    |  8 ++++++++
 2 files changed, 30 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/usb/s3c2410-usb.txt

-- 
1.9.1


* [PATCH 1/2] dt-bindings: usb: add DT binding for s3c2410 USB OHCI controller
From: Sergio Prado @ 2016-11-25 14:47 UTC
  To: linux-arm-kernel
In-Reply-To: <1480085249-25014-1-git-send-email-sergio.prado@e-labworks.com>

Adds the device tree bindings description for Samsung S3C2410 and
compatible USB OHCI controller.

Signed-off-by: Sergio Prado <sergio.prado@e-labworks.com>
---
 .../devicetree/bindings/usb/s3c2410-usb.txt        | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/usb/s3c2410-usb.txt

diff --git a/Documentation/devicetree/bindings/usb/s3c2410-usb.txt b/Documentation/devicetree/bindings/usb/s3c2410-usb.txt
new file mode 100644
index 000000000000..e45b38ce2986
--- /dev/null
+++ b/Documentation/devicetree/bindings/usb/s3c2410-usb.txt
@@ -0,0 +1,22 @@
+Samsung S3C2410 and compatible SoC USB controller
+
+OHCI
+
+Required properties:
+ - compatible: should be "samsung,s3c2410-ohci" for USB host controller
+ - reg: address and length of the controller memory mapped region
+ - interrupts: interrupt number for the USB OHCI controller
+ - clocks: Should reference the bus and host clocks
+ - clock-names: Should contain two strings
+		"usb-bus-host" for the USB bus clock
+		"usb-host" for the USB host clock
+
+Example:
+
+usb0: ohci@49000000 {
+	compatible = "samsung,s3c2410-ohci";
+	reg = <0x49000000 0x100>;
+	interrupts = <0 0 26 3>;
+	clocks = <&clocks UCLK>, <&clocks HCLK_USBH>;
+	clock-names = "usb-bus-host", "usb-host";
+};
-- 
1.9.1


* [PATCH 2/2] usb: ohci: s3c2410: allow probing from device tree
From: Sergio Prado @ 2016-11-25 14:47 UTC
  To: linux-arm-kernel
In-Reply-To: <1480085249-25014-1-git-send-email-sergio.prado@e-labworks.com>

Allows configuring Samsung's s3c2410 USB OHCI controller using a
devicetree.

Signed-off-by: Sergio Prado <sergio.prado@e-labworks.com>
---
 drivers/usb/host/ohci-s3c2410.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/usb/host/ohci-s3c2410.c b/drivers/usb/host/ohci-s3c2410.c
index 7a1919ca543a..d8e03a801f2e 100644
--- a/drivers/usb/host/ohci-s3c2410.c
+++ b/drivers/usb/host/ohci-s3c2410.c
@@ -457,6 +457,13 @@ static int ohci_hcd_s3c2410_drv_resume(struct device *dev)
 	.resume		= ohci_hcd_s3c2410_drv_resume,
 };
 
+static const struct of_device_id ohci_hcd_s3c2410_dt_ids[] = {
+	{ .compatible = "samsung,s3c2410-ohci" },
+	{ /* sentinel */ }
+};
+
+MODULE_DEVICE_TABLE(of, ohci_hcd_s3c2410_dt_ids);
+
 static struct platform_driver ohci_hcd_s3c2410_driver = {
 	.probe		= ohci_hcd_s3c2410_drv_probe,
 	.remove		= ohci_hcd_s3c2410_drv_remove,
@@ -464,6 +471,7 @@ static int ohci_hcd_s3c2410_drv_resume(struct device *dev)
 	.driver		= {
 		.name	= "s3c2410-ohci",
 		.pm	= &ohci_hcd_s3c2410_pm_ops,
+		.of_match_table	= ohci_hcd_s3c2410_dt_ids,
 	},
 };
 
-- 
1.9.1


* Tearing down DMA transfer setup after DMA client has finished
From: Russell King - ARM Linux @ 2016-11-25 14:56 UTC
  To: linux-arm-kernel
In-Reply-To: <yw1x1sxz8vxm.fsf@unicorn.mansr.com>

On Fri, Nov 25, 2016 at 02:40:21PM +0000, Måns Rullgård wrote:
> Russell King - ARM Linux <linux@armlinux.org.uk> writes:
> 
> > On Fri, Nov 25, 2016 at 02:03:20PM +0000, Måns Rullgård wrote:
> >> Russell King - ARM Linux <linux@armlinux.org.uk> writes:
> >> 
> >> > On Fri, Nov 25, 2016 at 01:50:35PM +0000, Måns Rullgård wrote:
> >> >> Russell King - ARM Linux <linux@armlinux.org.uk> writes:
> >> >> > It would be unfair to augment the API and add the burden on everyone
> >> >> > for the new API when 99.999% of the world doesn't require it.
> >> >> 
> >> >> I don't think making this particular dma driver wait for the descriptor
> >> >> callback to return before reusing a channel quite amounts to a horrid
> >> >> hack.  It certainly wouldn't burden anyone other than the poor drivers
> >> >> for devices connected to it, all of which are specific to Sigma AFAIK.
> >> >
> >> > Except when you stop to think that delaying in a tasklet is exactly
> >> > the same as randomly delaying in an interrupt handler - the tasklet
> >> > runs on the return path back to the parent context of an interrupt
> >> > handler.  Even if you sleep in the tasklet, you're sleeping on behalf
> >> > of the currently executing thread - if it's a RT thread, you effectively
> >> > destroy the RT-ness of the thread.  Let's hope no one cares about RT
> >> > performance on that hardware...
> >> 
> >> That's why I suggested to do this only if the needed delay is known to
> >> be no more than a few bus cycles.  The completion callback is currently
> >> the only post-transfer interaction we have between the dma and device
> >> drivers.  To handle an arbitrarily long delay, some new interface will
> >> be required.
> >
> > And now we're back at the point I made a few emails ago about undue
> > burden which is just about quoted above...
> 
> So what do you suggest?  Stick our heads in the sand and pretend
> everything is perfect?

Look, if you're going to be arsey, don't be surprised if I start getting
the urge to repeat previous comments.

Let's try and keep this on a technical basis for once, rather than
descending into insults.

So, wind back to my original email where I started talking about PL08x
already doing something along these lines.  Before a DMA user can make
use of a DMA channel, it has to be requested.  Once a DMA user has
finished, it can free up the channel.

What this means is that there's already a solution here - but it depends
how many DMA channels and how many active DMA users there are.  It's
entirely possible to set the mapping up when a DMA user requests a
DMA channel, leave it setup, and only tear it down when the channel
is eventually freed.

At that point, there's no need to spin-wait or sleep to delay the
tear-down of the channel - and I'd suggest that approach _until_
such time that there are more users than there are DMA channels.  This
has minimal overhead, it doesn't screw up RT threads (which include
IRQ threads), and it doesn't spread the maintenance burden across
drivers with a new custom API just for one SoC.

If (or when) the number of active users exceeds the number of hardware
DMA channels, then there's a decision to be made:

1) either limit the number of peripherals that we support DMA on for
   the SoC.
2) add the delay or API as necessary and switch to dynamic channel
   allocation to incoming requests.

Until that point is reached, there's no point inventing new APIs for
something that isn't actually a problem yet.
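
The request-time mapping described above can be sketched in userspace C. This
is an illustration only, not dmaengine code: `dma_dev`, `chan_request`,
`chan_submit` and `chan_free` are hypothetical names, `NUM_PCHAN` is set to 3
purely as an example value, and the two counters stand in for programming and
clearing the hardware route. Client ids are assumed to be non-negative.

```c
#include <stddef.h>

#define NUM_PCHAN 3	/* example value only */
#define NO_OWNER  (-1)

struct dma_dev {
	int owner[NUM_PCHAN];		/* client id bound to each channel */
	unsigned int route_setups;	/* times a route was programmed */
	unsigned int route_teardowns;	/* times a route was torn down */
};

static void dma_dev_init(struct dma_dev *d)
{
	for (size_t i = 0; i < NUM_PCHAN; i++)
		d->owner[i] = NO_OWNER;
	d->route_setups = 0;
	d->route_teardowns = 0;
}

/* Request time: bind a free channel and program the route once.
 * Returns the channel index, or -1 if every channel is taken. */
static int chan_request(struct dma_dev *d, int client)
{
	for (int i = 0; i < NUM_PCHAN; i++) {
		if (d->owner[i] == NO_OWNER) {
			d->owner[i] = client;
			d->route_setups++;
			return i;
		}
	}
	return -1;
}

/* Per transfer: the route already exists, so nothing is set up or torn
 * down here. Returns 0 on success, -1 if the client does not own ch. */
static int chan_submit(const struct dma_dev *d, int ch, int client)
{
	if (ch < 0 || ch >= NUM_PCHAN || d->owner[ch] != client)
		return -1;
	return 0;
}

/* Free time: the only place where the route is torn down. */
static void chan_free(struct dma_dev *d, int ch)
{
	if (ch < 0 || ch >= NUM_PCHAN || d->owner[ch] == NO_OWNER)
		return;
	d->owner[ch] = NO_OWNER;
	d->route_teardowns++;
}
```

Because no teardown happens between transfers, there is nothing to delay after
a completion interrupt. The limitation is equally visible: once every channel
has an owner, `chan_request` fails, which is the decision point between
options 1 and 2 above.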

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.


* Tearing down DMA transfer setup after DMA client has finished
From: Mason @ 2016-11-25 15:02 UTC
  To: linux-arm-kernel
In-Reply-To: <20161125141708.GM14217@n2100.armlinux.org.uk>

On 25/11/2016 15:17, Russell King - ARM Linux wrote:
> On Fri, Nov 25, 2016 at 02:03:20PM +0000, Måns Rullgård wrote:
>> Russell King - ARM Linux <linux@armlinux.org.uk> writes:
>>
>>> On Fri, Nov 25, 2016 at 01:50:35PM +0000, Måns Rullgård wrote:
>>>> Russell King - ARM Linux <linux@armlinux.org.uk> writes:
>>>>> It would be unfair to augment the API and add the burden on everyone
>>>>> for the new API when 99.999% of the world doesn't require it.
>>>>
>>>> I don't think making this particular dma driver wait for the descriptor
>>>> callback to return before reusing a channel quite amounts to a horrid
>>>> hack.  It certainly wouldn't burden anyone other than the poor drivers
>>>> for devices connected to it, all of which are specific to Sigma AFAIK.
>>>
>>> Except when you stop to think that delaying in a tasklet is exactly
>>> the same as randomly delaying in an interrupt handler - the tasklet
>>> runs on the return path back to the parent context of an interrupt
>>> handler.  Even if you sleep in the tasklet, you're sleeping on behalf
>>> of the currently executing thread - if it's a RT thread, you effectively
>>> destroy the RT-ness of the thread.  Let's hope no one cares about RT
>>> performance on that hardware...
>>
>> That's why I suggested to do this only if the needed delay is known to
>> be no more than a few bus cycles.  The completion callback is currently
>> the only post-transfer interaction we have between the dma and device
>> drivers.  To handle an arbitrarily long delay, some new interface will
>> be required.
> 
> And now we're back at the point I made a few emails ago about undue
> burden which is just about quoted above...

I've had several talks with the HW dev, and I don't think they
anticipated the need to mux the 3 channels. In their minds,
customers would choose at most 3 devices to support, and
assign one channel to each device statically.

In fact, in tango4, supported devices are:
A) NAND Flash controllers 0 and 1
NB: the upstream driver only uses controller 0
B) IDE or SATA controllers 0 and 1
C) a few crypto HW blocks which do not work as expected (unused)

Customers typically use 1 channel for NAND, maybe 1 for SATA,
and 1 channel remains unused.

I understand the desire to solve the general case in the
driver, but actual use-cases are much more trivial.

Regards.


* [PATCH v17 08/15] clocksource/drivers/arm_arch_timer: move arch_timer_needs_of_probing into DT init call
From: Fu Wei @ 2016-11-25 15:06 UTC
  To: linux-arm-kernel
In-Reply-To: <201611252221.UyJ9iWPV%fengguang.wu@intel.com>

Hi,

On 25 November 2016 at 22:32, kbuild test robot <lkp@intel.com> wrote:
> Hi Fu,
>
> [auto build test ERROR on pm/linux-next]
> [also build test ERROR on v4.9-rc6]
> [cannot apply to tip/timers/core next-20161125]
> [if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
>
> url:    https://github.com/0day-ci/linux/commits/fu-wei-linaro-org/acpi-clocksource-add-GTDT-driver-and-GTDT-support-in-arm_arch_timer/20161125-171111
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git linux-next
> config: arm64-defconfig (attached as .config)
> compiler: aarch64-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705
> reproduce:
>         wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
>         chmod +x ~/bin/make.cross
>         # save the attached .config to linux build tree
>         make.cross ARCH=arm64
>
> Note: the linux-review/fu-wei-linaro-org/acpi-clocksource-add-GTDT-driver-and-GTDT-support-in-arm_arch_timer/20161125-171111 HEAD 498f1f2503da21841b0e7679ddbdb86a40451bdb builds fine.
>       It only hurts bisectibility.
>
> All errors (new ones prefixed by >>):
>
>    drivers/clocksource/arm_arch_timer.c: In function 'arch_timer_acpi_init':

Sorry, again,

The "+ int ret;" line should be moved from [12/15] to this patch. I have fixed
the problem in my repo, and the fix will be in the next patchset:

https://git.linaro.org/people/fu.wei/linux.git/log/?h=topic-gtdt-wakeup-timer_upstream_v18_devel

>>> drivers/clocksource/arm_arch_timer.c:1071:2: error: 'ret' undeclared (first use in this function)
>      ret = arch_timer_register();
>      ^~~
>    drivers/clocksource/arm_arch_timer.c:1071:2: note: each undeclared identifier is reported only once for each function it appears in
>
> vim +/ret +1071 drivers/clocksource/arm_arch_timer.c
>
>   1065                  return -EINVAL;
>   1066          }
>   1067
>   1068          /* Always-on capability */
>   1069          arch_timer_c3stop = !(gtdt->non_secure_el1_flags & ACPI_GTDT_ALWAYS_ON);
>   1070
>> 1071          ret = arch_timer_register();
>   1072          if (ret)
>   1073                  return ret;
>   1074
>
> ---
> 0-DAY kernel test infrastructure                Open Source Technology Center
> https://lists.01.org/pipermail/kbuild-all                   Intel Corporation



-- 
Best regards,

Fu Wei
Software Engineer
Red Hat


* Tearing down DMA transfer setup after DMA client has finished
From: Måns Rullgård @ 2016-11-25 15:08 UTC
  To: linux-arm-kernel
In-Reply-To: <20161125145635.GN14217@n2100.armlinux.org.uk>

Russell King - ARM Linux <linux@armlinux.org.uk> writes:

> On Fri, Nov 25, 2016 at 02:40:21PM +0000, Måns Rullgård wrote:
>> Russell King - ARM Linux <linux@armlinux.org.uk> writes:
>> 
>> > On Fri, Nov 25, 2016 at 02:03:20PM +0000, Måns Rullgård wrote:
>> >> Russell King - ARM Linux <linux@armlinux.org.uk> writes:
>> >> 
>> >> > On Fri, Nov 25, 2016 at 01:50:35PM +0000, Måns Rullgård wrote:
>> >> >> Russell King - ARM Linux <linux@armlinux.org.uk> writes:
>> >> >> > It would be unfair to augment the API and add the burden on everyone
>> >> >> > for the new API when 99.999% of the world doesn't require it.
>> >> >> 
>> >> >> I don't think making this particular dma driver wait for the descriptor
>> >> >> callback to return before reusing a channel quite amounts to a horrid
>> >> >> hack.  It certainly wouldn't burden anyone other than the poor drivers
>> >> >> for devices connected to it, all of which are specific to Sigma AFAIK.
>> >> >
>> >> > Except when you stop to think that delaying in a tasklet is exactly
>> >> > the same as randomly delaying in an interrupt handler - the tasklet
>> >> > runs on the return path back to the parent context of an interrupt
>> >> > handler.  Even if you sleep in the tasklet, you're sleeping on behalf
>> >> > of the currently executing thread - if it's a RT thread, you effectively
>> >> > destroy the RT-ness of the thread.  Let's hope no one cares about RT
>> >> > performance on that hardware...
>> >> 
>> >> That's why I suggested to do this only if the needed delay is known to
>> >> be no more than a few bus cycles.  The completion callback is currently
>> >> the only post-transfer interaction we have between the dma and device
>> >> drivers.  To handle an arbitrarily long delay, some new interface will
>> >> be required.
>> >
>> > And now we're back at the point I made a few emails ago about undue
>> > burden which is just about quoted above...
>> 
>> So what do you suggest?  Stick our heads in the sand and pretend
>> everything is perfect?
>
> Look, if you're going to be arsey, don't be surprised if I start getting
> the urge to repeat previous comments.
>
> Let's try and keep this on a technical basis for once, rather than
> descending into insults.

You're the one who constantly insults people.  I'd be happy for you to
stop.

> So, wind back to my original email where I started talking about PL08x
> already doing something along these lines.  Before a DMA user can make
> use of a DMA channel, it has to be requested.  Once a DMA user has
> finished, it can free up the channel.
>
> What this means is that there's already a solution here - but it depends
> how many DMA channels and how many active DMA users there are.  It's
> entirely possible to set the mapping up when a DMA user requests a
> DMA channel, leave it setup, and only tear it down when the channel
> is eventually freed.
>
> At that point, there's no need to spin-wait or sleep to delay the
> tear-down of the channel - and I'd suggest that approach _until_
> such time that there are more users than there are DMA channels.  This
> has minimal overhead, it doesn't screw up RT threads (which include
> IRQ threads), and it doesn't spread the maintenance burden across
> drivers with a new custom API just for one SoC.

I never suggested a custom API for one SoC.

> If (or when) the number of active users exceeds the number of hardware
> DMA channels, then there's a decision to be made:
>
> 1) either limit the number of peripherals that we support DMA on for
>    the SoC.

I don't think people would like being forced to choose between, say,
SATA and NAND flash.

> 2) add the delay or API as necessary and switch to dynamic channel
>    allocation to incoming requests.

A fixed delay doesn't seem right.  Since we don't know the exact amount
required, we'll need to make a guess and make it conservative enough
that it never ends up being too short.  This will most likely end up
delaying things far more than is actually necessary.

The reality of the situation is that the current dmaengine api doesn't
adequately cover all real hardware situations.  You seem to be of the
opinion that fixing this is an "undue burden."

> Until that point is reached, there's no point inventing new APIs for
> something that isn't actually a problem yet.

We're already at that point.  The hardware has many more devices than
physical channels.

-- 
Måns Rullgård


* Tearing down DMA transfer setup after DMA client has finished
From: Måns Rullgård @ 2016-11-25 15:12 UTC
  To: linux-arm-kernel
In-Reply-To: <d996bcdb-2413-770c-84d1-1b12ccd74477@free.fr>

Mason <slash.tmp@free.fr> writes:

> On 25/11/2016 15:17, Russell King - ARM Linux wrote:
>> On Fri, Nov 25, 2016 at 02:03:20PM +0000, Måns Rullgård wrote:
>>> Russell King - ARM Linux <linux@armlinux.org.uk> writes:
>>>
>>>> On Fri, Nov 25, 2016 at 01:50:35PM +0000, Måns Rullgård wrote:
>>>>> Russell King - ARM Linux <linux@armlinux.org.uk> writes:
>>>>>> It would be unfair to augment the API and add the burden on everyone
>>>>>> for the new API when 99.999% of the world doesn't require it.
>>>>>
>>>>> I don't think making this particular dma driver wait for the descriptor
>>>>> callback to return before reusing a channel quite amounts to a horrid
>>>>> hack.  It certainly wouldn't burden anyone other than the poor drivers
>>>>> for devices connected to it, all of which are specific to Sigma AFAIK.
>>>>
>>>> Except when you stop to think that delaying in a tasklet is exactly
>>>> the same as randomly delaying in an interrupt handler - the tasklet
>>>> runs on the return path back to the parent context of an interrupt
>>>> handler.  Even if you sleep in the tasklet, you're sleeping on behalf
>>>> of the currently executing thread - if it's a RT thread, you effectively
>>>> destroy the RT-ness of the thread.  Let's hope no one cares about RT
>>>> performance on that hardware...
>>>
>>> That's why I suggested to do this only if the needed delay is known to
>>> be no more than a few bus cycles.  The completion callback is currently
>>> the only post-transfer interaction we have between the dma and device
>>> drivers.  To handle an arbitrarily long delay, some new interface will
>>> be required.
>> 
>> And now we're back at the point I made a few emails ago about undue
>> burden which is just about quoted above...
>
> I've had several talks with the HW dev, and I don't think they
> anticipated the need to mux the 3 channels. In their minds,
> customers would choose at most 3 devices to support, and
> assign one channel to each device statically.
>
> In fact, in tango4, supported devices are:
> A) NAND Flash controllers 0 and 1
> NB: the upstream driver only uses controller 0
> B) IDE or SATA controllers 0 and 1
> C) a few crypto HW blocks which do not work as expected (unused)
>
> Customers typically use 1 channel for NAND, maybe 1 for SATA,
> and 1 channel remains unused.

The hardware has two sata controllers, and I have a board that uses both.

-- 
Måns Rullgård


* [PATCH RESEND 2/2] gpio: axp209: add pinctrl support
From: Maxime Ripard @ 2016-11-25 15:17 UTC
  To: linux-arm-kernel
In-Reply-To: <20161123141151.25315-3-quentin.schulz@free-electrons.com>

Hi,

On Wed, Nov 23, 2016 at 03:11:51PM +0100, Quentin Schulz wrote:
> The GPIOs present in the AXP209 PMIC have multiple functions. They
> typically allow a pin to be used as GPIO input or output and can also be
> used as ADC or regulator for example.[1]
> 
> This adds the possibility to use all functions of the GPIOs present in
> the AXP209 PMIC thanks to pinctrl subsystem.
> 
> [1] see registers 90H, 92H and 93H at
>     http://dl.linux-sunxi.org/AXP/AXP209_Datasheet_v1.0en.pdf
> 
> Signed-off-by: Quentin Schulz <quentin.schulz@free-electrons.com>

I've said it already face to face, but ideally you should split that
patch into logical changes.

I can see here at least three:
  - Adding the pinctrl features
  - Renaming the structure and functions
  - Removal of a few functions

Maxime

-- 
Maxime Ripard, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com


* [PATCH] ARM: dts: sunxi: Add num-cs for A20 spi nodes
From: Maxime Ripard @ 2016-11-25 15:20 UTC
  To: linux-arm-kernel
In-Reply-To: <20161124210509.be743aae84c26c6c2e666c6e@bidouilliste.com>

On Thu, Nov 24, 2016 at 09:05:09PM +0100, Emmanuel Vadot wrote:
> On Thu, 24 Nov 2016 20:55:17 +0100
> Maxime Ripard <maxime.ripard@free-electrons.com> wrote:
> 
> > On Tue, Nov 22, 2016 at 06:06:16PM +0100, Emmanuel Vadot wrote:
> > > The spi0 controller on the A20 has up to 4 CS (Chip Select) lines, while
> > > the other three have only 1.
> > > Add the num-cs property to each node.
> > > 
> > > Signed-off-by: Emmanuel Vadot <manu@bidouilliste.com>
> > 
> > I don't think we have any code that uses it at the moment. What is the
> > rationale behind this patch?
> > 
> > Thanks!
> > Maxime
> > 
> > -- 
> > Maxime Ripard, Free Electrons
> > Embedded Linux and Kernel engineering
> > http://free-electrons.com
> 
>  Hi Maxime,
> 
>  If num-cs isn't present, nothing prevents starting a transfer with an
> invalid CS pin, resulting in an error.
>  num-cs is a default property made especially for this, and an SPI driver
> should try to get the property at probe/attach time.

Yes, but as far as I know, our driver doesn't. I'm all in for having
support for that in our driver, but without it, that patch is kind of
useless.

Maxime

-- 
Maxime Ripard, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com


* Tearing down DMA transfer setup after DMA client has finished
From: Mason @ 2016-11-25 15:21 UTC
  To: linux-arm-kernel
In-Reply-To: <yw1xoa137fvr.fsf@unicorn.mansr.com>

On 25/11/2016 16:12, Måns Rullgård wrote:

> Mason writes:
> 
>> I've had several talks with the HW dev, and I don't think they
>> anticipated the need to mux the 3 channels. In their minds,
>> customers would choose at most 3 devices to support, and
>> assign one channel to each device statically.
>>
>> In fact, in tango4, supported devices are:
>> A) NAND Flash controllers 0 and 1
>> NB: the upstream driver only uses controller 0
>> B) IDE or SATA controllers 0 and 1
>> C) a few crypto HW blocks which do not work as expected (unused)
>>
>> Customers typically use 1 channel for NAND, maybe 1 for SATA,
>> and 1 channel remains unused.
> 
> The hardware has two sata controllers, and I have a board that uses both.

I don't have the tango3 client devices in mind, but
1 NAND + 2 SATA works out alright for 3 channels, right?

Regards.


* Adding a .platform_init callback to sdhci_arasan_ops
From: Sebastian Frias @ 2016-11-25 15:24 UTC
  To: linux-arm-kernel

Hi,

When using the Arasan SDHCI HW IP, there is a set of parameters called
"Hardware initialized registers"

(Table 7, Section "Pin Signals", page 56 of Arasan "SD3.0/SDIO3.0/eMMC4.4
AHB Host Controller", revision 6.0 document)

In some platforms those signals are connected to registers that need to
be programmed at some point for proper driver/HW initialisation.

I found that the 'struct sdhci_ops' contains a '.platform_init' callback
that is called from within 'sdhci_pltfm_init', and that seems a good
candidate for a place to program those registers (*).
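
A minimal sketch of that idea, using simplified stand-in types rather than
the real kernel definitions. The register indices, the values written and
every example_* name are hypothetical; only the optional '.platform_init'
hook and the "call it if it is set" pattern come from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel's struct sdhci_host / sdhci_ops. */
struct sdhci_host;

struct sdhci_ops {
	void (*platform_init)(struct sdhci_host *host);
};

struct sdhci_host {
	const struct sdhci_ops *ops;
	uint32_t corecfg[2];	/* hypothetical "hardware initialized" registers */
};

/* Hypothetical register indices, for illustration only. */
enum { CORECFG_BASECLKFREQ, CORECFG_SLOTTYPE };

/* Platform hook: program the registers wired to the IP's input signals. */
static void example_arasan_platform_init(struct sdhci_host *host)
{
	host->corecfg[CORECFG_BASECLKFREQ] = 200;	/* made-up value */
	host->corecfg[CORECFG_SLOTTYPE] = 1;		/* made-up encoding */
}

static const struct sdhci_ops example_arasan_ops = {
	.platform_init = example_arasan_platform_init,
};

/* Mirrors the optional-callback pattern: call the hook only if it is set. */
static void example_pltfm_init(struct sdhci_host *host,
			       const struct sdhci_ops *ops)
{
	assert(host != NULL);
	host->ops = ops;
	if (host->ops && host->ops->platform_init)
		host->ops->platform_init(host);
}
```

Platforms that do not need the extra programming would simply leave the
hook unset, and the init path then skips it.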

Do you agree?

Best regards,

Sebastian


(*): This has been prototyped on 4.7 and works properly.
However, upstream commit:

commit 3ea4666e8d429223fbb39c1dccee7599ef7657d5
Author: Douglas Anderson <dianders@chromium.org>
Date:   Mon Jun 20 10:56:47 2016 -0700

    mmc: sdhci-of-arasan: Properly set corecfg_baseclkfreq on rk3399
...

could affect this solution because of the way the 'sdhci_arasan_of_match'
struct is used after that commit.

^ permalink raw reply

* Tearing down DMA transfer setup after DMA client has finished
From: Måns Rullgård @ 2016-11-25 15:28 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <cba10848-57b1-c28c-2a60-83eb9da4cc63@free.fr>

Mason <slash.tmp@free.fr> writes:

> On 25/11/2016 16:12, Måns Rullgård wrote:
>
>> Mason writes:
>> 
>>> I've had several talks with the HW dev, and I don't think they
>>> anticipated the need to mux the 3 channels. In their minds,
>>> customers would choose at most 3 devices to support, and
>>> assign one channel to each device statically.
>>>
>>> In fact, in tango4, supported devices are:
>>> A) NAND Flash controllers 0 and 1
>>> NB: the upstream driver only uses controller 0
>>> B) IDE or SATA controllers 0 and 1
>>> C) a few crypto HW blocks which do not work as expected (unused)
>>>
>>> Customers typically use 1 channel for NAND, maybe 1 for SATA,
>>> and 1 channel remains unused.
>> 
>> The hardware has two sata controllers, and I have a board that uses both.
>
> I don't have the tango3 client devices in mind, but
> 1 NAND + 2 SATA works out alright for 3 channels, right?

There are only two usable channels.

Besides, your 3.4 kernel allocates the channels dynamically, sort of,
but since it has a completely custom api, this particular timing issue
doesn't arise there.

-- 
Måns Rullgård

^ permalink raw reply

* [PATCH net-next 0/5] Support Armada 37xx SoC (ARMv8 64-bits) in mvneta driver
From: Gregory CLEMENT @ 2016-11-25 15:30 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

The Armada 37xx is a new ARMv8 SoC from Marvell using the same network
controller as the older Armada 370/38x/XP SoCs. This series adapts the
driver so that it can be used on this new SoC. The main changes
are:

- 64-bit support: the first patches allow using the driver on a 64-bit
  architecture.

- MBUS support: the mbus configuration is different on Armada 37xx
  from the older SoCs.

- per-CPU interrupt: Armada 37xx does not support per-CPU interrupts for
  the NETA IP, so the non-per-CPU behavior was added back.

The first item is solved by patches 1 to 3.
The last two items are solved by patch 4.
In patch 5 the dt support is added.

Besides Armada 37xx, the series has been tested on Armada XP and
Armada 38x (with Hardware Buffer Management and with Software Buffer
Management).

Thanks,

Gregory

Gregory CLEMENT (3):
  net: mvneta: Use cacheable memory to store the rx buffer virtual address
  net: mvneta: Only disable mvneta_bm for 64-bits
  ARM64: dts: marvell: Add network support for Armada 3700

Marcin Wojtas (2):
  net: mvneta: Convert to be 64 bits compatible
  net: mvneta: Add network support for Armada 3700 SoC

 Documentation/devicetree/bindings/net/marvell-armada-370-neta.txt |   7 +-
 arch/arm64/boot/dts/marvell/armada-3720-db.dts                    |  23 ++++-
 arch/arm64/boot/dts/marvell/armada-37xx.dtsi                      |  23 ++++-
 drivers/net/ethernet/marvell/Kconfig                              |  10 +-
 drivers/net/ethernet/marvell/mvneta.c                             | 400 ++++++++++++++++++++++++++++++++++++++++++++++++++++++------------------
 5 files changed, 362 insertions(+), 101 deletions(-)

base-commit: 436accebb53021ef7c63535f60bda410aa87c136
-- 
git-series 0.8.10

^ permalink raw reply

* [PATCH net-next 1/5] net: mvneta: Use cacheable memory to store the rx buffer virtual address
From: Gregory CLEMENT @ 2016-11-25 15:30 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <cover.2b146800967005632cd02d4da77397e6e2fdf51f.1480087510.git-series.gregory.clement@free-electrons.com>

Until now the virtual address of the received buffer was stored in the
cookie field of the rx descriptor. However, this field is only 32 bits
wide, which prevents using the driver on a 64-bit architecture.

With this patch the virtual address is stored in an array not shared with
the hardware (so there is no longer any need to use the DMA API for it).
Thanks to this, the array can live in cacheable memory, unlike the rx
descriptor members.

The change is done in the swbm path only, because the hwbm still uses the
cookie field; this also means that the hwbm is currently not usable in
64-bit.
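
The side-array lookup described above can be sketched as follows. This is
an illustration with simplified stand-in structures, not the driver code:
struct rx_desc, struct rx_queue, RXQ_SIZE and the helper names are
assumptions made for the example. The point is that a descriptor's position
in the ring, obtained by pointer subtraction, indexes a CPU-only array that
holds full-width pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the hardware RX descriptor (32-bit fields). */
struct rx_desc {
	uint32_t buf_phys_addr;
	uint32_t buf_cookie;	/* too narrow for a 64-bit pointer */
};

#define RXQ_SIZE 4	/* illustrative ring size */

/* Simplified stand-in for struct mvneta_rx_queue. */
struct rx_queue {
	struct rx_desc descs[RXQ_SIZE];	/* shared with the hardware */
	void *buf_virt_addr[RXQ_SIZE];	/* CPU-only, one slot per descriptor */
};

/* Record the buffer: DMA address in the descriptor, pointer in the array. */
static void rx_desc_fill(struct rx_queue *rxq, struct rx_desc *desc,
			 uint32_t phys_addr, void *virt_addr)
{
	ptrdiff_t i = desc - rxq->descs;	/* descriptor index in the ring */

	assert(i >= 0 && i < RXQ_SIZE);
	desc->buf_phys_addr = phys_addr;
	rxq->buf_virt_addr[i] = virt_addr;
}

/* Recover the buffer's virtual address from a descriptor pointer. */
static void *rx_desc_virt(struct rx_queue *rxq, const struct rx_desc *desc)
{
	ptrdiff_t i = desc - rxq->descs;

	assert(i >= 0 && i < RXQ_SIZE);
	return rxq->buf_virt_addr[i];
}
```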

Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
---
 drivers/net/ethernet/marvell/mvneta.c | 96 ++++++++++++++++++++++++----
 1 file changed, 84 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index 87274d4ab102..b6849f88cab7 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -561,6 +561,9 @@ struct mvneta_rx_queue {
 	u32 pkts_coal;
 	u32 time_coal;
 
+	/* Virtual address of the RX buffer */
+	void  **buf_virt_addr;
+
 	/* Virtual address of the RX DMA descriptors array */
 	struct mvneta_rx_desc *descs;
 
@@ -1573,10 +1576,14 @@ static void mvneta_tx_done_pkts_coal_set(struct mvneta_port *pp,
 
 /* Handle rx descriptor fill by setting buf_cookie and buf_phys_addr */
 static void mvneta_rx_desc_fill(struct mvneta_rx_desc *rx_desc,
-				u32 phys_addr, u32 cookie)
+				u32 phys_addr, void *virt_addr,
+				struct mvneta_rx_queue *rxq)
 {
-	rx_desc->buf_cookie = cookie;
+	int i;
+
 	rx_desc->buf_phys_addr = phys_addr;
+	i = rx_desc - rxq->descs;
+	rxq->buf_virt_addr[i] = virt_addr;
 }
 
 /* Decrement sent descriptors counter */
@@ -1781,7 +1788,8 @@ EXPORT_SYMBOL_GPL(mvneta_frag_free);
 
 /* Refill processing for SW buffer management */
 static int mvneta_rx_refill(struct mvneta_port *pp,
-			    struct mvneta_rx_desc *rx_desc)
+			    struct mvneta_rx_desc *rx_desc,
+			    struct mvneta_rx_queue *rxq)
 
 {
 	dma_addr_t phys_addr;
@@ -1799,7 +1807,7 @@ static int mvneta_rx_refill(struct mvneta_port *pp,
 		return -ENOMEM;
 	}
 
-	mvneta_rx_desc_fill(rx_desc, phys_addr, (u32)data);
+	mvneta_rx_desc_fill(rx_desc, phys_addr, data, rxq);
 	return 0;
 }
 
@@ -1861,7 +1869,12 @@ static void mvneta_rxq_drop_pkts(struct mvneta_port *pp,
 
 	for (i = 0; i < rxq->size; i++) {
 		struct mvneta_rx_desc *rx_desc = rxq->descs + i;
-		void *data = (void *)rx_desc->buf_cookie;
+		void *data;
+
+		if (!pp->bm_priv)
+			data = rxq->buf_virt_addr[i];
+		else
+			data = (void *)(uintptr_t)rx_desc->buf_cookie;
 
 		dma_unmap_single(pp->dev->dev.parent, rx_desc->buf_phys_addr,
 				 MVNETA_RX_BUF_SIZE(pp->pkt_size), DMA_FROM_DEVICE);
@@ -1894,12 +1907,13 @@ static int mvneta_rx_swbm(struct mvneta_port *pp, int rx_todo,
 		unsigned char *data;
 		dma_addr_t phys_addr;
 		u32 rx_status, frag_size;
-		int rx_bytes, err;
+		int rx_bytes, err, index;
 
 		rx_done++;
 		rx_status = rx_desc->status;
 		rx_bytes = rx_desc->data_size - (ETH_FCS_LEN + MVNETA_MH_SIZE);
-		data = (unsigned char *)rx_desc->buf_cookie;
+		index = rx_desc - rxq->descs;
+		data = (unsigned char *)rxq->buf_virt_addr[index];
 		phys_addr = rx_desc->buf_phys_addr;
 
 		if (!mvneta_rxq_desc_is_first_last(rx_status) ||
@@ -1938,7 +1952,7 @@ static int mvneta_rx_swbm(struct mvneta_port *pp, int rx_todo,
 		}
 
 		/* Refill processing */
-		err = mvneta_rx_refill(pp, rx_desc);
+		err = mvneta_rx_refill(pp, rx_desc, rxq);
 		if (err) {
 			netdev_err(dev, "Linux processing - Can't refill\n");
 			rxq->missed++;
@@ -2020,7 +2034,7 @@ static int mvneta_rx_hwbm(struct mvneta_port *pp, int rx_todo,
 		rx_done++;
 		rx_status = rx_desc->status;
 		rx_bytes = rx_desc->data_size - (ETH_FCS_LEN + MVNETA_MH_SIZE);
-		data = (unsigned char *)rx_desc->buf_cookie;
+		data = (u8 *)(uintptr_t)rx_desc->buf_cookie;
 		phys_addr = rx_desc->buf_phys_addr;
 		pool_id = MVNETA_RX_GET_BM_POOL_ID(rx_desc);
 		bm_pool = &pp->bm_priv->bm_pools[pool_id];
@@ -2708,6 +2722,57 @@ static int mvneta_poll(struct napi_struct *napi, int budget)
 	return rx_done;
 }
 
+/* Refill processing for HW buffer management */
+static int mvneta_rx_hwbm_refill(struct mvneta_port *pp,
+				 struct mvneta_rx_desc *rx_desc)
+
+{
+	dma_addr_t phys_addr;
+	void *data;
+
+	data = mvneta_frag_alloc(pp->frag_size);
+	if (!data)
+		return -ENOMEM;
+
+	phys_addr = dma_map_single(pp->dev->dev.parent, data,
+				   MVNETA_RX_BUF_SIZE(pp->pkt_size),
+				   DMA_FROM_DEVICE);
+	if (unlikely(dma_mapping_error(pp->dev->dev.parent, phys_addr))) {
+		mvneta_frag_free(pp->frag_size, data);
+		return -ENOMEM;
+	}
+
+	phys_addr += pp->rx_offset_correction;
+	rx_desc->buf_phys_addr = phys_addr;
+	rx_desc->buf_cookie = (uintptr_t)data;
+
+	return 0;
+}
+
+/* Handle rxq fill: allocates rxq skbs; called when initializing a port */
+static int mvneta_rxq_bm_fill(struct mvneta_port *pp,
+			      struct mvneta_rx_queue *rxq,
+			      int num)
+{
+	int i;
+
+	for (i = 0; i < num; i++) {
+		memset(rxq->descs + i, 0, sizeof(struct mvneta_rx_desc));
+		if (mvneta_rx_hwbm_refill(pp, rxq->descs + i) != 0) {
+			netdev_err(pp->dev, "%s:rxq %d, %d of %d buffs  filled\n",
+				   __func__, rxq->id, i, num);
+			break;
+		}
+	}
+
+	/* Add this number of RX descriptors as non occupied (ready to
+	 * get packets)
+	 */
+	mvneta_rxq_non_occup_desc_add(pp, rxq, i);
+
+	return i;
+}
+
 /* Handle rxq fill: allocates rxq skbs; called when initializing a port */
 static int mvneta_rxq_fill(struct mvneta_port *pp, struct mvneta_rx_queue *rxq,
 			   int num)
@@ -2716,7 +2781,7 @@ static int mvneta_rxq_fill(struct mvneta_port *pp, struct mvneta_rx_queue *rxq,
 
 	for (i = 0; i < num; i++) {
 		memset(rxq->descs + i, 0, sizeof(struct mvneta_rx_desc));
-		if (mvneta_rx_refill(pp, rxq->descs + i) != 0) {
+		if (mvneta_rx_refill(pp, rxq->descs + i, rxq) != 0) {
 			netdev_err(pp->dev, "%s:rxq %d, %d of %d buffs  filled\n",
 				__func__, rxq->id, i, num);
 			break;
@@ -2784,14 +2849,21 @@ static int mvneta_rxq_init(struct mvneta_port *pp,
 		mvneta_rxq_buf_size_set(pp, rxq,
 					MVNETA_RX_BUF_SIZE(pp->pkt_size));
 		mvneta_rxq_bm_disable(pp, rxq);
+
+		rxq->buf_virt_addr = devm_kmalloc(pp->dev->dev.parent,
+						  rxq->size * sizeof(void *),
+						  GFP_KERNEL);
+		if (!rxq->buf_virt_addr)
+			return -ENOMEM;
+
+		mvneta_rxq_fill(pp, rxq, rxq->size);
 	} else {
 		mvneta_rxq_bm_enable(pp, rxq);
 		mvneta_rxq_long_pool_set(pp, rxq);
 		mvneta_rxq_short_pool_set(pp, rxq);
+		mvneta_rxq_bm_fill(pp, rxq, rxq->size);
 	}
 
-	mvneta_rxq_fill(pp, rxq, rxq->size);
-
 	return 0;
 }
 
-- 
git-series 0.8.10

^ permalink raw reply related

* [PATCH net-next 2/5] net: mvneta: Convert to be 64 bits compatible
From: Gregory CLEMENT @ 2016-11-25 15:30 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <cover.2b146800967005632cd02d4da77397e6e2fdf51f.1480087510.git-series.gregory.clement@free-electrons.com>

From: Marcin Wojtas <mw@semihalf.com>

Prepare the mvneta driver so that it can be used on 64-bit platforms
such as the Armada 3700.
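
As an illustration of the offset handling added by this patch, the sketch
below splits the padding between a software correction (added to the DMA
address) and the offset programmed into the hardware. The struct and
function names are made up for the example, and skb_pad stands in for
NET_SKB_PAD, whose value depends on the platform; only the 64-byte limit
and the max(0, pad - 64) rule are taken from the patch.

```c
#include <assert.h>

/* Mirrors MVNETA_RX_PKT_OFFSET_CORRECTION from the patch. */
#define RX_PKT_OFFSET_CORRECTION 64

struct rx_offsets {
	unsigned int correction;	/* added to the DMA address by software */
	unsigned int hw_offset;		/* programmed into the RXQ config register */
};

/* Split the padding so that the hardware part never exceeds 64 bytes. */
static struct rx_offsets rx_split_offset(unsigned int skb_pad)
{
	struct rx_offsets o;

	o.correction = skb_pad > RX_PKT_OFFSET_CORRECTION ?
		       skb_pad - RX_PKT_OFFSET_CORRECTION : 0;
	o.hw_offset = skb_pad - o.correction;

	/* The two parts together always add up to the full padding. */
	assert(o.correction + o.hw_offset == skb_pad);
	return o;
}
```

With a padding of 32 the correction is 0 and the hardware offset is 32;
with a padding of 128 the correction is 64 and the hardware offset is 64.
In both cases the data ends up at the full padding distance from the start
of the buffer.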

[gregory.clement@free-electrons.com]: this patch was extracted from a larger
one to ease review and maintenance.

Signed-off-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
---
 drivers/net/ethernet/marvell/mvneta.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index b6849f88cab7..ad3872e07a93 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -296,6 +296,12 @@
 /* descriptor aligned size */
 #define MVNETA_DESC_ALIGNED_SIZE	32
 
+/* Number of bytes to be taken into account by HW when putting incoming data
+ * to the buffers. It is needed in case NET_SKB_PAD exceeds maximum packet
+ * offset supported in MVNETA_RXQ_CONFIG_REG(q) registers.
+ */
+#define MVNETA_RX_PKT_OFFSET_CORRECTION		64
+
 #define MVNETA_RX_PKT_SIZE(mtu) \
 	ALIGN((mtu) + MVNETA_MH_SIZE + MVNETA_VLAN_TAG_LEN + \
 	      ETH_HLEN + ETH_FCS_LEN,			     \
@@ -416,6 +422,7 @@ struct mvneta_port {
 	u64 ethtool_stats[ARRAY_SIZE(mvneta_statistics)];
 
 	u32 indir[MVNETA_RSS_LU_TABLE_SIZE];
+	u16 rx_offset_correction;
 };
 
 /* The mvneta_tx_desc and mvneta_rx_desc structures describe the
@@ -1807,6 +1814,7 @@ static int mvneta_rx_refill(struct mvneta_port *pp,
 		return -ENOMEM;
 	}
 
+	phys_addr += pp->rx_offset_correction;
 	mvneta_rx_desc_fill(rx_desc, phys_addr, data, rxq);
 	return 0;
 }
@@ -2838,7 +2846,7 @@ static int mvneta_rxq_init(struct mvneta_port *pp,
 	mvreg_write(pp, MVNETA_RXQ_SIZE_REG(rxq->id), rxq->size);
 
 	/* Set Offset */
-	mvneta_rxq_offset_set(pp, rxq, NET_SKB_PAD);
+	mvneta_rxq_offset_set(pp, rxq, NET_SKB_PAD - pp->rx_offset_correction);
 
 	/* Set coalescing pkts and time */
 	mvneta_rx_pkts_coal_set(pp, rxq, rxq->pkts_coal);
@@ -4091,6 +4099,13 @@ static int mvneta_probe(struct platform_device *pdev)
 
 	pp->rxq_def = rxq_def;
 
+	/* Set RX packet offset correction for platforms, whose
+	 * NET_SKB_PAD, exceeds 64B. It should be 64B for 64-bit
+	 * platforms and 0B for 32-bit ones.
+	 */
+	pp->rx_offset_correction =
+		max(0, NET_SKB_PAD - MVNETA_RX_PKT_OFFSET_CORRECTION);
+
 	pp->indir[0] = rxq_def;
 
 	pp->clk = devm_clk_get(&pdev->dev, "core");
-- 
git-series 0.8.10

^ permalink raw reply related

* [PATCH net-next 3/5] net: mvneta: Only disable mvneta_bm for 64-bits
From: Gregory CLEMENT @ 2016-11-25 15:30 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <cover.2b146800967005632cd02d4da77397e6e2fdf51f.1480087510.git-series.gregory.clement@free-electrons.com>

Only the mvneta_bm support is not 64-bit compatible.
The mvneta code itself can run on a 64-bit architecture.

Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
---
 drivers/net/ethernet/marvell/Kconfig | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/marvell/Kconfig b/drivers/net/ethernet/marvell/Kconfig
index 66fd9dbb2ca7..2ccea9dd9248 100644
--- a/drivers/net/ethernet/marvell/Kconfig
+++ b/drivers/net/ethernet/marvell/Kconfig
@@ -44,6 +44,7 @@ config MVMDIO
 config MVNETA_BM_ENABLE
 	tristate "Marvell Armada 38x/XP network interface BM support"
 	depends on MVNETA
+	depends on !64BIT
 	---help---
 	  This driver supports auxiliary block of the network
 	  interface units in the Marvell ARMADA XP and ARMADA 38x SoC
@@ -58,7 +59,6 @@ config MVNETA
 	tristate "Marvell Armada 370/38x/XP network interface support"
 	depends on PLAT_ORION || COMPILE_TEST
 	depends on HAS_DMA
-	depends on !64BIT
 	select MVMDIO
 	select FIXED_PHY
 	---help---
@@ -71,6 +71,7 @@ config MVNETA
 
 config MVNETA_BM
 	tristate
+	depends on !64BIT
 	default y if MVNETA=y && MVNETA_BM_ENABLE!=n
 	default MVNETA_BM_ENABLE
 	select HWBM
-- 
git-series 0.8.10

^ permalink raw reply related

* [PATCH net-next 4/5] net: mvneta: Add network support for Armada 3700 SoC
From: Gregory CLEMENT @ 2016-11-25 15:30 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <cover.2b146800967005632cd02d4da77397e6e2fdf51f.1480087510.git-series.gregory.clement@free-electrons.com>

From: Marcin Wojtas <mw@semihalf.com>

Armada 3700 is a new ARMv8 SoC from Marvell using the same network
controller as the older Armada 370/38x/XP. There are however some
differences that need to be taken into account when adding support for it:

* open default MBUS window to 4GB of DRAM - the Armada 3700 SoC's Mbus
  configuration for the network controller has to be done on two levels:
  global and per-port. The first one is inherited from the
  bootloader. The latter can be opened in a default way, leaving
  arbitration to the bus controller. Hence a filled mbus_dram_target_info
  structure is not needed.

* make per-CPU operation optional - Recent patches adding RSS and XPS
  support for Armada 38x/XP enabled per-CPU operation of the controller
  by default. Contrary to older SoCs, the Armada 3700 SoC's network
  controller is not capable of per-CPU processing due to the way its
  interrupt lines are connected. This patch restores non-per-CPU
  operation, which is now optional and depends on the neta_armada3700
  flag value in the mvneta_port structure. In order not to complicate the
  code, a separate interrupt subroutine is implemented.

For now, on the Armada 3700, RSS is disabled as the current
implementation depends on the per-CPU interrupts.
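
The MBUS window handling from the first item can be sketched outside the
driver as below. This is only an illustration: struct win_regs, struct
dram_info and the function name are stand-ins invented for the example, and
the register file is a plain struct instead of mvreg_write() calls.
Clearing all windows first is a choice made for the sketch. The values
computed (0x3f initial enable mask, 0xffff0000 default size, protect bits)
follow the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_WINS 6	/* win_enable starts as 0x3f: six windows, all off */

/* Stand-ins for the DRAM description normally provided by the mbus code. */
struct dram_cs { uint32_t base; uint32_t size; uint8_t mbus_attr; };
struct dram_info { uint8_t target_id; int num_cs; struct dram_cs cs[4]; };

/* Stand-in for the per-port window registers. */
struct win_regs {
	uint32_t base[MAX_WINS];
	uint32_t size[MAX_WINS];
	uint32_t enable;	/* a set bit means "window disabled" */
	uint32_t protect;
};

static void conf_mbus_windows(struct win_regs *r, const struct dram_info *dram)
{
	uint32_t win_enable = 0x3f, win_protect = 0;
	int i;

	assert(r != NULL);
	for (i = 0; i < MAX_WINS; i++) {
		r->base[i] = 0;
		r->size[i] = 0;
	}

	if (dram) {
		/* Older SoCs: one window per DRAM chip select. */
		for (i = 0; i < dram->num_cs && i < MAX_WINS; i++) {
			const struct dram_cs *cs = &dram->cs[i];

			r->base[i] = (cs->base & 0xffff0000u) |
				     ((uint32_t)cs->mbus_attr << 8) |
				     dram->target_id;
			r->size[i] = (cs->size - 1) & 0xffff0000u;
			win_enable &= ~(1u << i);
			win_protect |= 3u << (2 * i);
		}
	} else {
		/* Armada 3700 case: open one default 4GB window. */
		r->size[0] = 0xffff0000u;
		win_enable &= ~1u;
		win_protect = 3;
	}

	r->enable = win_enable;
	r->protect = win_protect;
}
```

Passing NULL for the DRAM description selects the default-window branch,
which is how the probe path in the patch handles a missing
mbus_dram_target_info on Armada 3700.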

[gregory.clement@free-electrons.com: extracted from a larger patch, replaced
some ifdefs and ported to net-next for v4.10]

Signed-off-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
---
 Documentation/devicetree/bindings/net/marvell-armada-370-neta.txt |   7 +-
 drivers/net/ethernet/marvell/Kconfig                              |   7 +-
 drivers/net/ethernet/marvell/mvneta.c                             | 287 +++++++++++++++++++++++++++++++++++++++++++++++++++---------------------
 3 files changed, 214 insertions(+), 87 deletions(-)

diff --git a/Documentation/devicetree/bindings/net/marvell-armada-370-neta.txt b/Documentation/devicetree/bindings/net/marvell-armada-370-neta.txt
index 73be8970815e..7aa840c8768d 100644
--- a/Documentation/devicetree/bindings/net/marvell-armada-370-neta.txt
+++ b/Documentation/devicetree/bindings/net/marvell-armada-370-neta.txt
@@ -1,7 +1,10 @@
-* Marvell Armada 370 / Armada XP Ethernet Controller (NETA)
+* Marvell Armada 370 / Armada XP / Armada 3700 Ethernet Controller (NETA)
 
 Required properties:
-- compatible: "marvell,armada-370-neta" or "marvell,armada-xp-neta".
+- compatible: could be one of the followings
+	"marvell,armada-370-neta"
+	"marvell,armada-xp-neta"
+	"marvell,armada-3700-neta"
 - reg: address and length of the register set for the device.
 - interrupts: interrupt for the device
 - phy: See ethernet.txt file in the same directory.
diff --git a/drivers/net/ethernet/marvell/Kconfig b/drivers/net/ethernet/marvell/Kconfig
index 2ccea9dd9248..3b8f11fe5e13 100644
--- a/drivers/net/ethernet/marvell/Kconfig
+++ b/drivers/net/ethernet/marvell/Kconfig
@@ -56,14 +56,15 @@ config MVNETA_BM_ENABLE
 	  buffer management.
 
 config MVNETA
-	tristate "Marvell Armada 370/38x/XP network interface support"
-	depends on PLAT_ORION || COMPILE_TEST
+	tristate "Marvell Armada 370/38x/XP/37xx network interface support"
+	depends on ARCH_MVEBU || COMPILE_TEST
 	depends on HAS_DMA
 	select MVMDIO
 	select FIXED_PHY
 	---help---
 	  This driver supports the network interface units in the
-	  Marvell ARMADA XP, ARMADA 370 and ARMADA 38x SoC family.
+	  Marvell ARMADA XP, ARMADA 370, ARMADA 38x and
+	  ARMADA 37xx SoC family.
 
 	  Note that this driver is distinct from the mv643xx_eth
 	  driver, which should be used for the older Marvell SoCs
diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index ad3872e07a93..77cef5a9de7b 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -397,6 +397,9 @@ struct mvneta_port {
 	spinlock_t lock;
 	bool is_stopped;
 
+	u32 cause_rx_tx;
+	struct napi_struct napi;
+
 	/* Core clock */
 	struct clk *clk;
 	/* AXI clock */
@@ -422,6 +425,9 @@ struct mvneta_port {
 	u64 ethtool_stats[ARRAY_SIZE(mvneta_statistics)];
 
 	u32 indir[MVNETA_RSS_LU_TABLE_SIZE];
+
+	/* Flags for special SoC configurations */
+	bool neta_armada3700;
 	u16 rx_offset_correction;
 };
 
@@ -965,14 +971,9 @@ static int mvneta_mbus_io_win_set(struct mvneta_port *pp, u32 base, u32 wsize,
 	return 0;
 }
 
-/* Assign and initialize pools for port. In case of fail
- * buffer manager will remain disabled for current port.
- */
-static int mvneta_bm_port_init(struct platform_device *pdev,
-			       struct mvneta_port *pp)
+static  int mvneta_bm_port_mbus_init(struct mvneta_port *pp)
 {
-	struct device_node *dn = pdev->dev.of_node;
-	u32 long_pool_id, short_pool_id, wsize;
+	u32 wsize;
 	u8 target, attr;
 	int err;
 
@@ -991,6 +992,25 @@ static int mvneta_bm_port_init(struct platform_device *pdev,
 		netdev_info(pp->dev, "fail to configure mbus window to BM\n");
 		return err;
 	}
+	return 0;
+}
+
+/* Assign and initialize pools for port. In case of fail
+ * buffer manager will remain disabled for current port.
+ */
+static int mvneta_bm_port_init(struct platform_device *pdev,
+			       struct mvneta_port *pp)
+{
+	struct device_node *dn = pdev->dev.of_node;
+	u32 long_pool_id, short_pool_id;
+
+	if (!pp->neta_armada3700) {
+		int ret;
+
+		ret = mvneta_bm_port_mbus_init(pp);
+		if (ret)
+			return ret;
+	}
 
 	if (of_property_read_u32(dn, "bm,pool-long", &long_pool_id)) {
 		netdev_info(pp->dev, "missing long pool id\n");
@@ -1359,22 +1379,27 @@ static void mvneta_defaults_set(struct mvneta_port *pp)
 	for_each_present_cpu(cpu) {
 		int rxq_map = 0, txq_map = 0;
 		int rxq, txq;
+		if (!pp->neta_armada3700) {
+			for (rxq = 0; rxq < rxq_number; rxq++)
+				if ((rxq % max_cpu) == cpu)
+					rxq_map |= MVNETA_CPU_RXQ_ACCESS(rxq);
+
+			for (txq = 0; txq < txq_number; txq++)
+				if ((txq % max_cpu) == cpu)
+					txq_map |= MVNETA_CPU_TXQ_ACCESS(txq);
+
+			/* With only one TX queue we configure a special case
+			 * which will allow to get all the irq on a single
+			 * CPU
+			 */
+			if (txq_number == 1)
+				txq_map = (cpu == pp->rxq_def) ?
+					MVNETA_CPU_TXQ_ACCESS(1) : 0;
 
-		for (rxq = 0; rxq < rxq_number; rxq++)
-			if ((rxq % max_cpu) == cpu)
-				rxq_map |= MVNETA_CPU_RXQ_ACCESS(rxq);
-
-		for (txq = 0; txq < txq_number; txq++)
-			if ((txq % max_cpu) == cpu)
-				txq_map |= MVNETA_CPU_TXQ_ACCESS(txq);
-
-		/* With only one TX queue we configure a special case
-		 * which will allow to get all the irq on a single
-		 * CPU
-		 */
-		if (txq_number == 1)
-			txq_map = (cpu == pp->rxq_def) ?
-				MVNETA_CPU_TXQ_ACCESS(1) : 0;
+		} else {
+			txq_map = MVNETA_CPU_TXQ_ACCESS_ALL_MASK;
+			rxq_map = MVNETA_CPU_RXQ_ACCESS_ALL_MASK;
+		}
 
 		mvreg_write(pp, MVNETA_CPU_MAP(cpu), rxq_map | txq_map);
 	}
@@ -2632,6 +2657,17 @@ static void mvneta_set_rx_mode(struct net_device *dev)
 /* Interrupt handling - the callback for request_irq() */
 static irqreturn_t mvneta_isr(int irq, void *dev_id)
 {
+	struct mvneta_port *pp = (struct mvneta_port *)dev_id;
+
+	mvreg_write(pp, MVNETA_INTR_NEW_MASK, 0);
+	napi_schedule(&pp->napi);
+
+	return IRQ_HANDLED;
+}
+
+/* Interrupt handling - the callback for request_percpu_irq() */
+static irqreturn_t mvneta_percpu_isr(int irq, void *dev_id)
+{
 	struct mvneta_pcpu_port *port = (struct mvneta_pcpu_port *)dev_id;
 
 	disable_percpu_irq(port->pp->dev->irq);
@@ -2679,7 +2715,7 @@ static int mvneta_poll(struct napi_struct *napi, int budget)
 	struct mvneta_pcpu_port *port = this_cpu_ptr(pp->ports);
 
 	if (!netif_running(pp->dev)) {
-		napi_complete(&port->napi);
+		napi_complete(napi);
 		return rx_done;
 	}
 
@@ -2708,7 +2744,8 @@ static int mvneta_poll(struct napi_struct *napi, int budget)
 	 */
 	rx_queue = fls(((cause_rx_tx >> 8) & 0xff));
 
-	cause_rx_tx |= port->cause_rx_tx;
+	cause_rx_tx |= pp->neta_armada3700 ? pp->cause_rx_tx :
+		port->cause_rx_tx;
 
 	if (rx_queue) {
 		rx_queue = rx_queue - 1;
@@ -2722,11 +2759,27 @@ static int mvneta_poll(struct napi_struct *napi, int budget)
 
 	if (budget > 0) {
 		cause_rx_tx = 0;
-		napi_complete(&port->napi);
-		enable_percpu_irq(pp->dev->irq, 0);
+		napi_complete(napi);
+
+		if (pp->neta_armada3700) {
+			unsigned long flags;
+
+			local_irq_save(flags);
+			mvreg_write(pp, MVNETA_INTR_NEW_MASK,
+				    MVNETA_RX_INTR_MASK(rxq_number) |
+				    MVNETA_TX_INTR_MASK(txq_number) |
+				    MVNETA_MISCINTR_INTR_MASK);
+			local_irq_restore(flags);
+		} else {
+			enable_percpu_irq(pp->dev->irq, 0);
+		}
 	}
 
-	port->cause_rx_tx = cause_rx_tx;
+	if (pp->neta_armada3700)
+		pp->cause_rx_tx = cause_rx_tx;
+	else
+		port->cause_rx_tx = cause_rx_tx;
+
 	return rx_done;
 }
 
@@ -3054,11 +3107,16 @@ static void mvneta_start_dev(struct mvneta_port *pp)
 	/* start the Rx/Tx activity */
 	mvneta_port_enable(pp);
 
-	/* Enable polling on the port */
-	for_each_online_cpu(cpu) {
-		struct mvneta_pcpu_port *port = per_cpu_ptr(pp->ports, cpu);
+	if (!pp->neta_armada3700) {
+		/* Enable polling on the port */
+		for_each_online_cpu(cpu) {
+			struct mvneta_pcpu_port *port =
+				per_cpu_ptr(pp->ports, cpu);
 
-		napi_enable(&port->napi);
+			napi_enable(&port->napi);
+		}
+	} else {
+		napi_enable(&pp->napi);
 	}
 
 	/* Unmask interrupts. It has to be done from each CPU */
@@ -3080,10 +3138,15 @@ static void mvneta_stop_dev(struct mvneta_port *pp)
 
 	phy_stop(ndev->phydev);
 
-	for_each_online_cpu(cpu) {
-		struct mvneta_pcpu_port *port = per_cpu_ptr(pp->ports, cpu);
+	if (!pp->neta_armada3700) {
+		for_each_online_cpu(cpu) {
+			struct mvneta_pcpu_port *port =
+				per_cpu_ptr(pp->ports, cpu);
 
-		napi_disable(&port->napi);
+			napi_disable(&port->napi);
+		}
+	} else {
+		napi_disable(&pp->napi);
 	}
 
 	netif_carrier_off(pp->dev);
@@ -3493,31 +3556,37 @@ static int mvneta_open(struct net_device *dev)
 		goto err_cleanup_rxqs;
 
 	/* Connect to port interrupt line */
-	ret = request_percpu_irq(pp->dev->irq, mvneta_isr,
-				 MVNETA_DRIVER_NAME, pp->ports);
+	if (pp->neta_armada3700)
+		ret = request_irq(pp->dev->irq, mvneta_isr, 0,
+				  dev->name, pp);
+	else
+		ret = request_percpu_irq(pp->dev->irq, mvneta_percpu_isr,
+					 dev->name, pp->ports);
 	if (ret) {
 		netdev_err(pp->dev, "cannot request irq %d\n", pp->dev->irq);
 		goto err_cleanup_txqs;
 	}
 
-	/* Enable per-CPU interrupt on all the CPU to handle our RX
-	 * queue interrupts
-	 */
-	on_each_cpu(mvneta_percpu_enable, pp, true);
+	if (!pp->neta_armada3700) {
+		/* Enable per-CPU interrupt on all the CPU to handle our RX
+		 * queue interrupts
+		 */
+		on_each_cpu(mvneta_percpu_enable, pp, true);
 
-	pp->is_stopped = false;
-	/* Register a CPU notifier to handle the case where our CPU
-	 * might be taken offline.
-	 */
-	ret = cpuhp_state_add_instance_nocalls(online_hpstate,
-					       &pp->node_online);
-	if (ret)
-		goto err_free_irq;
+		pp->is_stopped = false;
+		/* Register a CPU notifier to handle the case where our CPU
+		 * might be taken offline.
+		 */
+		ret = cpuhp_state_add_instance_nocalls(online_hpstate,
+						       &pp->node_online);
+		if (ret)
+			goto err_free_irq;
 
-	ret = cpuhp_state_add_instance_nocalls(CPUHP_NET_MVNETA_DEAD,
-					       &pp->node_dead);
-	if (ret)
-		goto err_free_online_hp;
+		ret = cpuhp_state_add_instance_nocalls(CPUHP_NET_MVNETA_DEAD,
+						       &pp->node_dead);
+		if (ret)
+			goto err_free_online_hp;
+	}
 
 	/* In default link is down */
 	netif_carrier_off(pp->dev);
@@ -3533,13 +3602,20 @@ static int mvneta_open(struct net_device *dev)
 	return 0;
 
 err_free_dead_hp:
-	cpuhp_state_remove_instance_nocalls(CPUHP_NET_MVNETA_DEAD,
-					    &pp->node_dead);
+	if (!pp->neta_armada3700)
+		cpuhp_state_remove_instance_nocalls(CPUHP_NET_MVNETA_DEAD,
+						    &pp->node_dead);
 err_free_online_hp:
-	cpuhp_state_remove_instance_nocalls(online_hpstate, &pp->node_online);
+	if (!pp->neta_armada3700)
+		cpuhp_state_remove_instance_nocalls(online_hpstate,
+						    &pp->node_online);
 err_free_irq:
-	on_each_cpu(mvneta_percpu_disable, pp, true);
-	free_percpu_irq(pp->dev->irq, pp->ports);
+	if (pp->neta_armada3700) {
+		free_irq(pp->dev->irq, pp);
+	} else {
+		on_each_cpu(mvneta_percpu_disable, pp, true);
+		free_percpu_irq(pp->dev->irq, pp->ports);
+	}
 err_cleanup_txqs:
 	mvneta_cleanup_txqs(pp);
 err_cleanup_rxqs:
@@ -3552,23 +3628,30 @@ static int mvneta_stop(struct net_device *dev)
 {
 	struct mvneta_port *pp = netdev_priv(dev);
 
-	/* Inform that we are stopping so we don't want to setup the
-	 * driver for new CPUs in the notifiers. The code of the
-	 * notifier for CPU online is protected by the same spinlock,
-	 * so when we get the lock, the notifer work is done.
-	 */
-	spin_lock(&pp->lock);
-	pp->is_stopped = true;
-	spin_unlock(&pp->lock);
+	if (!pp->neta_armada3700) {
+		/* Inform that we are stopping so we don't want to setup the
+		 * driver for new CPUs in the notifiers. The code of the
+		 * notifier for CPU online is protected by the same spinlock,
+		 * so when we get the lock, the notifer work is done.
+		 */
+		spin_lock(&pp->lock);
+		pp->is_stopped = true;
+		spin_unlock(&pp->lock);
 
-	mvneta_stop_dev(pp);
-	mvneta_mdio_remove(pp);
+		mvneta_stop_dev(pp);
+		mvneta_mdio_remove(pp);
 
 	cpuhp_state_remove_instance_nocalls(online_hpstate, &pp->node_online);
 	cpuhp_state_remove_instance_nocalls(CPUHP_NET_MVNETA_DEAD,
 					    &pp->node_dead);
-	on_each_cpu(mvneta_percpu_disable, pp, true);
-	free_percpu_irq(dev->irq, pp->ports);
+		on_each_cpu(mvneta_percpu_disable, pp, true);
+		free_percpu_irq(dev->irq, pp->ports);
+	} else {
+		mvneta_stop_dev(pp);
+		mvneta_mdio_remove(pp);
+		free_irq(dev->irq, pp);
+	}
+
 	mvneta_cleanup_rxqs(pp);
 	mvneta_cleanup_txqs(pp);
 
@@ -3847,6 +3930,11 @@ static int mvneta_ethtool_set_rxfh(struct net_device *dev, const u32 *indir,
 				   const u8 *key, const u8 hfunc)
 {
 	struct mvneta_port *pp = netdev_priv(dev);
+
+	/* Current code for Armada 3700 doesn't support RSS features yet */
+	if (pp->neta_armada3700)
+		return -EOPNOTSUPP;
+
 	/* We require at least one supported parameter to be changed
 	 * and no change in any of the unsupported parameters
 	 */
@@ -3867,6 +3955,10 @@ static int mvneta_ethtool_get_rxfh(struct net_device *dev, u32 *indir, u8 *key,
 {
 	struct mvneta_port *pp = netdev_priv(dev);
 
+	/* Current code for Armada 3700 doesn't support RSS features yet */
+	if (pp->neta_armada3700)
+		return -EOPNOTSUPP;
+
 	if (hfunc)
 		*hfunc = ETH_RSS_HASH_TOP;
 
@@ -3969,16 +4061,29 @@ static void mvneta_conf_mbus_windows(struct mvneta_port *pp,
 	win_enable = 0x3f;
 	win_protect = 0;
 
-	for (i = 0; i < dram->num_cs; i++) {
-		const struct mbus_dram_window *cs = dram->cs + i;
-		mvreg_write(pp, MVNETA_WIN_BASE(i), (cs->base & 0xffff0000) |
-			    (cs->mbus_attr << 8) | dram->mbus_dram_target_id);
+	if (dram) {
+		for (i = 0; i < dram->num_cs; i++) {
+			const struct mbus_dram_window *cs = dram->cs + i;
+
+			mvreg_write(pp, MVNETA_WIN_BASE(i),
+				    (cs->base & 0xffff0000) |
+				    (cs->mbus_attr << 8) |
+				    dram->mbus_dram_target_id);
 
-		mvreg_write(pp, MVNETA_WIN_SIZE(i),
-			    (cs->size - 1) & 0xffff0000);
+			mvreg_write(pp, MVNETA_WIN_SIZE(i),
+				    (cs->size - 1) & 0xffff0000);
 
-		win_enable &= ~(1 << i);
-		win_protect |= 3 << (2 * i);
+			win_enable &= ~(1 << i);
+			win_protect |= 3 << (2 * i);
+		}
+	} else {
+		/* For Armada3700 open default 4GB Mbus window, leaving
+		 * arbitration of target/attribute to a different layer
+		 * of configuration.
+		 */
+		mvreg_write(pp, MVNETA_WIN_SIZE(0), 0xffff0000);
+		win_enable &= ~BIT(0);
+		win_protect = 3;
 	}
 
 	mvreg_write(pp, MVNETA_BASE_ADDR_ENABLE, win_enable);
@@ -4108,6 +4213,10 @@ static int mvneta_probe(struct platform_device *pdev)
 
 	pp->indir[0] = rxq_def;
 
+	/* Get special SoC configurations */
+	if (of_device_is_compatible(dn, "marvell,armada-3700-neta"))
+		pp->neta_armada3700 = true;
+
 	pp->clk = devm_clk_get(&pdev->dev, "core");
 	if (IS_ERR(pp->clk))
 		pp->clk = devm_clk_get(&pdev->dev, NULL);
@@ -4175,7 +4284,11 @@ static int mvneta_probe(struct platform_device *pdev)
 	pp->tx_csum_limit = tx_csum_limit;
 
 	dram_target_info = mv_mbus_dram_info();
-	if (dram_target_info)
+	/* Armada3700 requires setting default configuration of Mbus
+	 * windows, however without using filled mbus_dram_target_info
+	 * structure.
+	 */
+	if (dram_target_info || pp->neta_armada3700)
 		mvneta_conf_mbus_windows(pp, dram_target_info);
 
 	pp->tx_ring_size = MVNETA_MAX_TXD;
@@ -4208,11 +4321,20 @@ static int mvneta_probe(struct platform_device *pdev)
 		goto err_netdev;
 	}
 
-	for_each_present_cpu(cpu) {
-		struct mvneta_pcpu_port *port = per_cpu_ptr(pp->ports, cpu);
+	/* Armada3700 network controller does not support per-cpu
+	 * operation, so only single NAPI should be initialized.
+	 */
+	if (pp->neta_armada3700) {
+		netif_napi_add(dev, &pp->napi, mvneta_poll, NAPI_POLL_WEIGHT);
+	} else {
+		for_each_present_cpu(cpu) {
+			struct mvneta_pcpu_port *port =
+				per_cpu_ptr(pp->ports, cpu);
 
-		netif_napi_add(dev, &port->napi, mvneta_poll, NAPI_POLL_WEIGHT);
-		port->pp = pp;
+			netif_napi_add(dev, &port->napi, mvneta_poll,
+				       NAPI_POLL_WEIGHT);
+			port->pp = pp;
+		}
 	}
 
 	dev->features = NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_TSO;
@@ -4297,6 +4419,7 @@ static int mvneta_remove(struct platform_device *pdev)
 static const struct of_device_id mvneta_match[] = {
 	{ .compatible = "marvell,armada-370-neta" },
 	{ .compatible = "marvell,armada-xp-neta" },
+	{ .compatible = "marvell,armada-3700-neta" },
 	{ }
 };
 MODULE_DEVICE_TABLE(of, mvneta_match);
-- 
git-series 0.8.10


* [PATCH net-next 5/5] ARM64: dts: marvell: Add network support for Armada 3700
From: Gregory CLEMENT @ 2016-11-25 15:30 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <cover.2b146800967005632cd02d4da77397e6e2fdf51f.1480087510.git-series.gregory.clement@free-electrons.com>

Add neta nodes for network support both in device tree for the SoC and
the board.

Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
---
 arch/arm64/boot/dts/marvell/armada-3720-db.dts | 23 +++++++++++++++++++-
 arch/arm64/boot/dts/marvell/armada-37xx.dtsi   | 23 +++++++++++++++++++-
 2 files changed, 46 insertions(+), 0 deletions(-)

diff --git a/arch/arm64/boot/dts/marvell/armada-3720-db.dts b/arch/arm64/boot/dts/marvell/armada-3720-db.dts
index 1372e9a6aaa4..c8b82e4145de 100644
--- a/arch/arm64/boot/dts/marvell/armada-3720-db.dts
+++ b/arch/arm64/boot/dts/marvell/armada-3720-db.dts
@@ -81,3 +81,26 @@
 &pcie0 {
 	status = "okay";
 };
+
+&mdio {
+	status = "okay";
+	phy0: ethernet-phy@0 {
+		reg = <0>;
+	};
+
+	phy1: ethernet-phy@1 {
+		reg = <1>;
+	};
+};
+
+&eth0 {
+	phy-mode = "rgmii-id";
+	phy = <&phy0>;
+	status = "okay";
+};
+
+&eth1 {
+	phy-mode = "rgmii-id";
+	phy = <&phy1>;
+	status = "okay";
+};
diff --git a/arch/arm64/boot/dts/marvell/armada-37xx.dtsi b/arch/arm64/boot/dts/marvell/armada-37xx.dtsi
index e9bd58793464..3b8eb45bdc76 100644
--- a/arch/arm64/boot/dts/marvell/armada-37xx.dtsi
+++ b/arch/arm64/boot/dts/marvell/armada-37xx.dtsi
@@ -140,6 +140,29 @@
 				};
 			};
 
+			eth0: ethernet@30000 {
+				   compatible = "marvell,armada-3700-neta";
+				   reg = <0x30000 0x4000>;
+				   interrupts = <GIC_SPI 42 IRQ_TYPE_LEVEL_HIGH>;
+				   clocks = <&sb_periph_clk 8>;
+				   status = "disabled";
+			};
+
+			mdio: mdio@32004 {
+				#address-cells = <1>;
+				#size-cells = <0>;
+				compatible = "marvell,orion-mdio";
+				reg = <0x32004 0x4>;
+			};
+
+			eth1: ethernet@40000 {
+				compatible = "marvell,armada-3700-neta";
+				reg = <0x40000 0x4000>;
+				interrupts = <GIC_SPI 45 IRQ_TYPE_LEVEL_HIGH>;
+				clocks = <&sb_periph_clk 7>;
+				status = "disabled";
+			};
+
 			usb3: usb@58000 {
 				compatible = "marvell,armada3700-xhci",
 				"generic-xhci";
-- 
git-series 0.8.10


* Tearing down DMA transfer setup after DMA client has finished
From: Mason @ 2016-11-25 15:35 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <yw1x8ts78w33.fsf@unicorn.mansr.com>

On 25/11/2016 15:37, Måns Rullgård wrote:

> Mason writes:
> 
>> On 25/11/2016 14:11, Måns Rullgård wrote:
>>
>>> Mason writes:
>>>
>>>> It seems there is a disconnect between what Linux expects - an IRQ
>>>> when the transfer is complete - and the quirks of this HW :-(
>>>>
>>>> On this system, there are MBUS "agents" connected via a "switch box".
>>>> An agent fires an IRQ when it has dealt with its *half* of the transfer.
>>>>
>>>> SOURCE_AGENT <---> SBOX <---> DESTINATION_AGENT
>>>>
>>>> Here are the steps for a transfer, in the general case:
>>>>
>>>> 1) setup the sbox to connect SOURCE TO DEST
>>>> 2) configure source to send N bytes
>>>> 3) configure dest to receive N bytes
>>>>
>>>> When SOURCE_AGENT has sent N bytes, it fires an IRQ
>>>> When DEST_AGENT has received N bytes, it fires an IRQ
>>>> The sbox connection can be torn down only when the destination
>>>> agent has received all bytes.
>>>> (And the twist is that some agents do not have an IRQ line.)
>>>>
>>>> The system provides 3 RAM-to-sbox agents (read channels)
>>>> and 3 sbox-to-RAM agents (write channels).
>>>>
>>>> The NAND Flash controller read and write agents do not have
>>>> IRQ lines.
>>>>
>>>> So for a NAND-to-memory transfer (read from device)
>>>> - nothing happens when the NFC has finished sending N bytes to the sbox
>>>> - the write channel fires an IRQ when it has received N bytes
>>>>
>>>> In that case, one IRQ fires when the transfer is complete,
>>>> like Linux expects.
>>>>
>>>> For a memory-to-NAND transfer (write to device)
>>>> - the read channel fires an IRQ when it has sent N bytes
>>>> - the NFC driver is supposed to poll the NFC to determine
>>>> when the controller has finished writing N bytes
>>>>
>>>> In that case, the IRQ does not indicate that the transfer
>>>> is complete, merely that the sending half has finished
>>>> its part.
>>>
>>> When does your NAND controller signal completion?  When it has received
>>> the DMA data, or only when it has finished the actual write operation?
>>
>> The NAND controller provides a STATUS register.
>> Bit 31 is the CMD_READY bit.
>> This bit goes to 0 when the controller is busy, and to 1
>> when the controller is ready to accept the next command.
>>
>> The NFC driver is doing:
>>
>> 	res = wait_for_completion_timeout(&tx_done, HZ);
>> 	if (res > 0)
>> 		err = readl_poll_timeout(addr, val, val & CMD_READY, 0, 1000);
>>
>> So basically, sleep until the memory agent IRQ falls,
>> then spin until the controller is idle.
> 
> This doesn't answer my question.  Waiting for the entire operation to
> finish isn't necessary.  The dma driver only needs to wait until all the
> data has been received by the nand controller, not until the controller
> is completely finished with the command.  Does the nand controller
> provide an indication for completion of the dma independently of the
> progress of the write command?  The dma glue Sigma added to the
> Designware sata controller does this.

I called the HW dev. He told me the NFC block does not have
buffers to store the incoming data; so they remain in the
MBUS FIFOs until the NFC consumes them, i.e. when it has
finished writing them to a NAND chip, which could take
a "long time" when writing to a slow chip.

So the answer to your question is: "the NAND controller
signals completion only when it has finished the actual
write operation."

>> Did you see that adding a 10 µs delay at the start of
>> tangox_dma_pchan_detach() makes the system no longer
>> fail (passes an mtd_speedtest).
> 
> Yes, but maybe that's much longer than is actually necessary.

I could instrument my spin loop to record how long we had
to wait between the IRQ and CMD_READY.
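
A minimal sketch of that instrumentation, written as a userspace model
rather than driver code. It assumes CMD_READY is bit 31 of STATUS as
described above; read_status_fn, poll_cmd_ready() and the fake_nfc
controller are hypothetical names used only for illustration. In the
driver itself the timestamps would more likely come from ktime_get()
taken around the existing readl_poll_timeout() call.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdint.h>
#include <time.h>

/* Bit 31 of the STATUS register: 1 = ready for the next command. */
#define CMD_READY (UINT32_C(1) << 31)

/* Hypothetical accessor: stands in for readl(addr) in the real driver. */
typedef uint32_t (*read_status_fn)(void *ctx);

static int64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/*
 * Spin until CMD_READY is set or timeout_ns has elapsed.
 * Returns 0 on success, -1 on timeout. *waited_ns receives the time
 * spent in the loop, i.e. the gap between "IRQ seen" and "NFC idle".
 */
static int poll_cmd_ready(read_status_fn rd, void *ctx,
			  int64_t timeout_ns, int64_t *waited_ns)
{
	int64_t start = now_ns();
	int ret = -1;

	for (;;) {
		if (rd(ctx) & CMD_READY) {
			ret = 0;
			break;
		}
		if (now_ns() - start > timeout_ns)
			break;
	}

	*waited_ns = now_ns() - start;
	return ret;
}

/* Fake controller for experiments: busy for 'busy_reads' polls. */
struct fake_nfc {
	unsigned int busy_reads;
};

static uint32_t fake_read_status(void *ctx)
{
	struct fake_nfc *nfc = ctx;

	if (nfc->busy_reads) {
		nfc->busy_reads--;
		return 0;
	}
	return CMD_READY;
}
```

Logging the maximum of waited_ns over an mtd_speedtest run would show
whether the 10 µs delay is generous or barely sufficient.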

Regards.


* [PATCH] ARM: dts: da850: specify the maximum bandwidth for tilcdc
From: Bartosz Golaszewski @ 2016-11-25 15:37 UTC (permalink / raw)
  To: linux-arm-kernel

It has been determined that the maximum resolution supported correctly
by tilcdc rev1 on da850 SoCs is 800x600@60. Due to memory throughput
constraints we must filter out higher modes.

Specify the max-bandwidth property for the display node for
da850-based boards.
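
As a quick check on the value used in the patch below, here is a
minimal sketch. It assumes max-bandwidth is expressed in pixels per
second and that a mode is filtered out when
hdisplay * vdisplay * vrefresh exceeds it, with blanking intervals
ignored. The helper names are hypothetical and not taken from the
tilcdc driver.

```c
#include <stdint.h>

/* Pixels per second needed to scan out a mode (blanking ignored). */
static uint64_t mode_pixel_rate(uint32_t hdisplay, uint32_t vdisplay,
				uint32_t vrefresh)
{
	return (uint64_t)hdisplay * vdisplay * vrefresh;
}

/* Returns 1 if the mode fits within max_bandwidth, 0 otherwise. */
static int mode_fits(uint32_t hdisplay, uint32_t vdisplay,
		     uint32_t vrefresh, uint64_t max_bandwidth)
{
	return mode_pixel_rate(hdisplay, vdisplay, vrefresh) <= max_bandwidth;
}
```

Under these assumptions 800 * 600 * 60 = 28800000, which matches the
property value, while 1024x768@60 (47185920 pixels per second) would
be rejected.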

Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
---
 arch/arm/boot/dts/da850.dtsi | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/boot/dts/da850.dtsi b/arch/arm/boot/dts/da850.dtsi
index 8e30d9b..9b7c444 100644
--- a/arch/arm/boot/dts/da850.dtsi
+++ b/arch/arm/boot/dts/da850.dtsi
@@ -452,6 +452,7 @@
 			compatible = "ti,da850-tilcdc";
 			reg = <0x213000 0x1000>;
 			interrupts = <52>;
+			max-bandwidth = <28800000>;
 			status = "disabled";
 
 			ports {
-- 
2.9.3


* [PATCH 5/7] efi: Get the secure boot status [ver #3]
From: David Howells @ 2016-11-25 15:59 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <CAKv+Gu9qkPrq1rXN_7=BbZ=8v14Px+oHBqST0OnarUWVwhtZyg@mail.gmail.com>

Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:

> > +               if (val != 1)
> > +                       return 0;
> 
> val == 0 is better imo, since that will prevent unexpected values from
> being interpreted as 'secure boot disabled'

I've made that change.
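
To illustrate the difference between the two checks, a simplified
sketch. It models only this one test and treats "did not return 0
here" as "not classified as disabled"; the real function goes on to
perform further checks. The enum and function names are hypothetical.

```c
#include <stdint.h>

enum sb_result {
	SB_DISABLED = 0,	/* early "return 0" taken */
	SB_CONTINUE = 1		/* falls through to the later checks */
};

/* Original test: any value other than 1 is treated as disabled. */
static enum sb_result sb_check_ne1(uint8_t val)
{
	if (val != 1)
		return SB_DISABLED;
	return SB_CONTINUE;
}

/* Suggested test: only an explicit 0 is treated as disabled. */
static enum sb_result sb_check_eq0(uint8_t val)
{
	if (val == 0)
		return SB_DISABLED;
	return SB_CONTINUE;
}
```

Both variants agree for val equal to 0 or 1. They differ only for
unexpected values such as 2, which the original test reports as
disabled and the suggested test does not.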

David


