Linux-ARM-Kernel Archive on lore.kernel.org

Linux-ARM-Kernel Archive on lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH v2] arm64: fpsimd: improve stacking logic in non-interruptible context
From: Ard Biesheuvel @ 2016-12-08 15:53 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161208155051.GA35383@MBP.local>

On 8 December 2016 at 15:50, Catalin Marinas <catalin.marinas@arm.com> wrote:
> Hi Ard,
>
> On Wed, Dec 07, 2016 at 10:14:08AM +0000, Ard Biesheuvel wrote:
>>  void kernel_neon_begin_partial(u32 num_regs)
>>  {
>> -     if (in_interrupt()) {
>> -             struct fpsimd_partial_state *s = this_cpu_ptr(
>> -                     in_irq() ? &hardirq_fpsimdstate : &softirq_fpsimdstate);
>> +     struct fpsimd_partial_state *s;
>> +     int level;
>> +
>> +     preempt_disable();
>> +
>> +     level = this_cpu_read(kernel_neon_nesting_level);
>> +     BUG_ON(level > 2);
>> +
>> +     if (level > 0) {
>> +             s = this_cpu_ptr(nested_fpsimdstate);
>>
>> -             BUG_ON(num_regs > 32);
>> -             fpsimd_save_partial_state(s, roundup(num_regs, 2));
>> +             WARN_ON_ONCE(num_regs > 32);
>> +             num_regs = min(roundup(num_regs, 2), 32U);
>> +
>> +             fpsimd_save_partial_state(&s[level - 1], num_regs);
>>       } else {
>>               /*
>>                * Save the userland FPSIMD state if we have one and if we
>> @@ -241,24 +256,29 @@ void kernel_neon_begin_partial(u32 num_regs)
>>                * that there is no longer userland FPSIMD state in the
>>                * registers.
>>                */
>> -             preempt_disable();
>>               if (current->mm &&
>>                   !test_and_set_thread_flag(TIF_FOREIGN_FPSTATE))
>>                       fpsimd_save_state(&current->thread.fpsimd_state);
>>               this_cpu_write(fpsimd_last_state, NULL);
>>       }
>> +     this_cpu_write(kernel_neon_nesting_level, level + 1);
>>  }
>
> I'm slightly confused with the potential race with an interrupt here.
> Let's say the above is running in the process context, sets the
> TIF_FOREIGN_FPSTATE but is interrupted before fpsimd_save_state(). The
> interrupt handler calling kernel_neon_begin_partial() is seeing level 0
> and TIF_FOREIGN_FPSTATE and decides that it is safe to corrupt the Neon
> state without any further saving.
>
> I think the kernel_neon_nesting_level should be incremented early on in
> this function.
>

Good point, I hadn't considered that.

^ permalink raw reply

* [bug report v4.8] fs/locks.c: kernel oops during posix lock stress test
From: Ming Lei @ 2016-12-08 15:57 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1480340400.2606.10.camel@poochiereds.net>

Hi,

On Mon, Nov 28, 2016 at 9:40 PM, Jeff Layton <jlayton@poochiereds.net> wrote:
> On Mon, 2016-11-28 at 11:10 +0800, Ming Lei wrote:
>> Hi Guys,
>>
>> When I run stress-ng via the following steps on one ARM64 dual
>> socket system(Cavium Thunder), the kernel oops[1] can often be
>> triggered after running the stress test for several hours(sometimes
>> it may take longer):
>>
>> - git clone git://kernel.ubuntu.com/cking/stress-ng.git
>> - apply the attachment patch which just makes the posix file
>> lock stress test more aggressive
>> - run the test via '~/git/stress-ng$./stress-ng --lockf 128 --aggressive'
>>
>>
>> From the oops log, looks one garbage file_lock node is got
>> from the linked list of 'ctx->flc_posix' when the issue happens.
>>
>> BTW, the issue isn't observed on single socket Cavium Thunder yet,
>> and the same issue can be seen on Ubuntu Xenial(v4.4 based kernel)
>> too.
>>
>> Thanks,
>> Ming
>>
>
> Some questions just for clarification:
>
> - I assume this is being run on a local fs of some sort? ext4 or xfs or
> something?
>
> - have you seen this on any other arch, besides ARM?
>
> The file locking code does do some lockless checking to see whether the
> i_flctx is even present and whether the list is empty in
> locks_remove_posix. It's possible we have some barrier problems there,

I have used ebpf trace to see what is going on when 'stress-ng --lockf'
is running, and almost all exported symbols in fs/locks.c are covered.

Except for locks_alloc/locks_free/locks_copy/locks_init, the only observable
symbols are fcntl_setlk, vfs_lock_file and locks_remove_posix, but
locks_remove_posix() is just run at the begining and ending of the
test.

So seems not related with locks_remove_posix().

Then looks only fcntl_setlk() is running from different contexts
during the test,
but in this path, the 'ctx->flc_lock' is always held when operating the list.
That said it is very strange to see the list corrupted even though it is
protected by the lock.

Thanks,
Ming

> but I don't quite see how that would cause us to have a corrupt lock on
> the flc_posix list.
>

^ permalink raw reply

* [PATCH] drm: zte: add overlay plane support
From: Ville Syrjälä @ 2016-12-08 15:57 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1481166761-5100-1-git-send-email-shawnguo@kernel.org>

On Thu, Dec 08, 2016 at 11:12:41AM +0800, Shawn Guo wrote:
<snip>
> +static void zx_vl_plane_atomic_update(struct drm_plane *plane,
> +				      struct drm_plane_state *old_state)
> +{
> +	struct zx_plane *zplane = to_zx_plane(plane);
> +	struct drm_framebuffer *fb = plane->state->fb;
> +	struct drm_gem_cma_object *cma_obj;
> +	void __iomem *layer = zplane->layer;
> +	void __iomem *hbsc = zplane->hbsc;
> +	void __iomem *paddr_reg;
> +	dma_addr_t paddr;
> +	u32 src_x, src_y, src_w, src_h;
> +	u32 dst_x, dst_y, dst_w, dst_h;
> +	uint32_t format;
> +	u32 fmt;
> +	int num_planes;
> +	int i;
> +
> +	if (!fb)
> +		return;
> +
> +	format = fb->pixel_format;
> +
> +	src_x = plane->state->src_x >> 16;
> +	src_y = plane->state->src_y >> 16;
> +	src_w = plane->state->src_w >> 16;
> +	src_h = plane->state->src_h >> 16;
> +
> +	dst_x = plane->state->crtc_x;
> +	dst_y = plane->state->crtc_y;
> +	dst_w = plane->state->crtc_w;
> +	dst_h = plane->state->crtc_h;

This shouls use the clipped coordiantes.

> +
> +	/* Set up data address registers for Y, Cb and Cr planes */
> +	num_planes = drm_format_num_planes(format);
> +	paddr_reg = layer + VL_Y;
> +	for (i = 0; i < num_planes; i++) {
> +		cma_obj = drm_fb_cma_get_gem_obj(fb, i);
> +		paddr = cma_obj->paddr + fb->offsets[i];
> +		paddr += src_y * fb->pitches[i];
> +		paddr += src_x * drm_format_plane_cpp(format, i);
> +		zx_writel(paddr_reg, paddr);
> +		paddr_reg += 4;
> +	}
> +


-- 
Ville Syrj?l?
Intel OTC

^ permalink raw reply

* [PATCH] dmaengine: pl330: do not generate unaligned access
From: Vinod Koul @ 2016-12-08 15:59 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1481116660-27837-1-git-send-email-vladimir.murzin@arm.com>

On Wed, Dec 07, 2016 at 01:17:40PM +0000, Vladimir Murzin wrote:
> When PL330 is used with !MMU the following fault is seen:
> 
> Unhandled fault: alignment exception (0x801) at 0x8f26a002
> Internal error: : 801 [#1] ARM
> Modules linked in:
> CPU: 0 PID: 640 Comm: dma0chan0-copy0 Not tainted 4.8.0-6a82063-clean+ #1600
> Hardware name: ARM-Versatile Express
> task: 8f1baa80 task.stack: 8e6fe000
> PC is at _setup_req+0x4c/0x350
> LR is at 0x8f2cbc00
> pc : [<801ea538>]    lr : [<8f2cbc00>]    psr: 60000093
> sp : 8e6ffdc0  ip : 00000000  fp : 00000000
> r10: 00000000  r9 : 8f2cba10  r8 : 8f2cbc00
> r7 : 80000013  r6 : 8f21a050  r5 : 8f21a000  r4 : 8f2ac800
> r3 : 8e6ffe18  r2 : 00944251  r1 : ffffffbc  r0 : 8f26a000
> Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment none
> Control: 00c5387c
> Process dma0chan0-copy0 (pid: 640, stack limit = 0x8e6fe210)
> Stack: (0x8e6ffdc0 to 0x8e700000)
> fdc0: 00000001 60000093 00000000 8f2cba10 8f26a000 00000004 8f0ae000 8f2cbc00
> fde0: 8f0ae000 8f2ac800 8f21a000 8f21a050 80000013 8f2cbc00 8f2cba10 00000000
> fe00: 60000093 801ebca0 8e6ffe18 000013ff 40000093 00000000 00944251 8f2ac800
> fe20: a0000013 8f2b1320 00001986 00000000 00000001 000013ff 8f1e4f00 8f2cba10
> fe40: 8e6fff6c 801e9044 00000003 00000000 fef98c80 002faf07 8e6ffe7c 00000000
> fe60: 00000002 00000000 00001986 8f1f158d 8f1e4f00 80568de4 00000002 00000000
> fe80: 00001986 8f1f53ff 40000001 80580500 8f1f158d 8001e00c 00000000 cfdfdfdf
> fea0: fdae2a25 00000001 00000004 8e6fe000 00000008 00000010 00000000 00000005
> fec0: 8f2b1330 8f2b1334 8e6ffe80 8e6ffe8c 00001986 00000000 8f21a014 00000001
> fee0: 8e6ffe60 8e6ffe78 00000002 00000000 000013ff 00000001 80568de4 8f1e8018
> ff00: 0000158d 8055ec30 00000001 803f6b00 00001986 8f2cba10 fdae2a25 00000001
> ff20: 8f1baca8 8e6fff24 8e6fff24 00000000 8e6fff24 ac6f3037 00000000 00000000
> ff40: 00000000 8e6fe000 8f1e4f40 00000000 8f1e4f40 8f1e4f00 801e84ec 00000000
> ff60: 00000000 00000000 00000000 80031714 dfdfdfcf 00000000 dfdfdfcf 8f1e4f00
> ff80: 00000000 8e6fff84 8e6fff84 00000000 8e6fff90 8e6fff90 8e6fffac 8f1e4f40
> ffa0: 80031640 00000000 00000000 8000f548 00000000 00000000 00000000 00000000
> ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> ffe0: 00000000 00000000 00000000 00000000 00000013 00000000 dfdfdfcf cfdfdfdf
> [<801ea538>] (_setup_req) from [<801ebca0>] (pl330_tasklet+0x41c/0x490)
> [<801ebca0>] (pl330_tasklet) from [<801e9044>] (dmatest_func+0xb58/0x149c)
> [<801e9044>] (dmatest_func) from [<80031714>] (kthread+0xd4/0xec)
> [<80031714>] (kthread) from [<8000f548>] (ret_from_fork+0x14/0x2c)
> Code: e3a03001 e3e01043 e5c03001 e59d3048 (e5802002)
> 
> This happens because _emit_{ADDH,MOV,GO) accessing to unaligned data
> while writing to buffer. Fix it with writing to buffer byte by byte.

Applied, now.

Although I didn't really like duplicating code for writing bytes, that could
be made a common fn

Thanks
-- 
~Vinod

^ permalink raw reply

* [PATCH] PM / Domains: Fix compatible for domain idle state
From: Rob Herring @ 2016-12-08 16:07 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1526421.aCsca4sLnR@aspire.rjw.lan>

On Wed, Dec 7, 2016 at 6:13 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Thursday, December 01, 2016 10:21:59 PM Rafael J. Wysocki wrote:
>> On Tuesday, November 29, 2016 09:47:03 AM Ulf Hansson wrote:
>> > On 10 November 2016 at 20:58, Rob Herring <robh@kernel.org> wrote:
>> > > On Mon, Nov 07, 2016 at 12:14:28PM +0100, Ulf Hansson wrote:
>> > >> On 3 November 2016 at 22:54, Lina Iyer <lina.iyer@linaro.org> wrote:
>> > >> > Re-using idle state definition provided by arm,idle-state for domain
>> > >> > idle states creates a lot of confusion and limits further evolution of
>> > >> > the domain idle definition. To keep things clear and simple, define a
>> > >> > idle states for domain using a new compatible "domain-idle-state".
>> > >> >
>> > >> > Fix existing PM domains code to look for the newly defined compatible.
>> > >> >
>> > >> > Cc: <devicetree@vger.kernel.org>
>> > >> > Cc: Rob Herring <robh@kernel.org>
>> > >> > Signed-off-by: Lina Iyer <lina.iyer@linaro.org>
>> > >> > ---
>> > >> >  .../bindings/power/domain-idle-state.txt           | 33 ++++++++++++++++++++++
>> > >> >  .../devicetree/bindings/power/power_domain.txt     |  8 +++---
>> > >> >  drivers/base/power/domain.c                        |  2 +-
>> > >> >  3 files changed, 38 insertions(+), 5 deletions(-)
>> > >> >  create mode 100644 Documentation/devicetree/bindings/power/domain-idle-state.txt
>> > >> >
>> > >> > diff --git a/Documentation/devicetree/bindings/power/domain-idle-state.txt b/Documentation/devicetree/bindings/power/domain-idle-state.txt
>> > >> > new file mode 100644
>> > >> > index 0000000..eefc7ed
>> > >> > --- /dev/null
>> > >> > +++ b/Documentation/devicetree/bindings/power/domain-idle-state.txt
>> > >> > @@ -0,0 +1,33 @@
>> > >> > +PM Domain Idle State Node:
>> > >> > +
>> > >> > +A domain idle state node represents the state parameters that will be used to
>> > >> > +select the state when there are no active components in the domain.
>> > >> > +
>> > >> > +The state node has the following parameters -
>> > >> > +
>> > >> > +- compatible:
>> > >> > +       Usage: Required
>> > >> > +       Value type: <string>
>> > >> > +       Definition: Must be "domain-idle-state".
>> > >> > +
>> > >> > +- entry-latency-us
>> > >> > +       Usage: Required
>> > >> > +       Value type: <prop-encoded-array>
>> > >> > +       Definition: u32 value representing worst case latency in
>> > >> > +                   microseconds required to enter the idle state.
>> > >> > +                   The exit-latency-us duration may be guaranteed
>> > >> > +                   only after entry-latency-us has passed.
>> > >>
>> > >> As we anyway are going to change this, why not use an u64 and have the
>> > >> value in ns instead of us?
>> > >
>> > > I can't imagine that you would need more resolution or range. For times
>> > > less than 1us, s/w and register access times are going to dominate the
>> > > time.
>> > >
>> > > Unless there is a real need, I'd keep alignment with the existing
>> > > binding.
>> >
>> > Rob, are you fine with this? I thought it would be great to get this
>> > in for 4.10 rc1.
>>
>> Rob, any objections here?
>
> Well, no objections, so applied.

Sorry, just found my ack sitting in my drafts. Thought I had sent it.

Rob

^ permalink raw reply

* Tearing down DMA transfer setup after DMA client has finished
From: Vinod Koul @ 2016-12-08 16:21 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <3cbad5e4-0569-fdab-3ebd-28fae7d2a174@free.fr>

On Thu, Dec 08, 2016 at 04:43:53PM +0100, Mason wrote:
> On 08/12/2016 16:40, Vinod Koul wrote:
> 
> > FWIW look at ALSA-dmaengine lib, thereby every audio driver that uses it. I
> > could find other examples and go on and on, but that's besides the point and
> > looks like you don't want to listen to people telling you something..
> 
> Hello Vinod,
> 
> FWIW, your system clock seems to be 10 minutes ahead ;-)

Ah thanks for pointing, fixed now :)
> 
> I am willing to learn. I have a few outstanding questions
> in the thread. Could you have a look at them?

Sure thing, I have replied/ing to few things, feel free to ping me if I miss
something today..

-- 
~Vinod

^ permalink raw reply

* [GIT PULL] ARM: mvebu: fixes for v4.9 (#2)
From: Gregory CLEMENT @ 2016-12-08 16:22 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

Here is the second pull request for fixes for mvebu for v4.9.

Gregory

The following changes since commit 8d897006fe9206d64cbe353310be26d7c911e69d:

  arm64: dts: marvell: add unique identifiers for Armada A8k SPI controllers (2016-11-09 09:44:08 +0100)

are available in the git repository at:

  git://git.infradead.org/linux-mvebu.git tags/mvebu-fixes-4.9-2

for you to fetch changes up to b150e59f1d217473b3eab354a9e5e9fe907868ff:

  ARM: dts: orion5x: fix number of sata port for linkstation ls-gl (2016-12-05 16:43:05 +0100)

----------------------------------------------------------------
mvebu fixes for 4.9 (part 2)

Fix number of sata port for linkstation ls-gl: without this the hard
drive is not detected as it was before the dt conversion

----------------------------------------------------------------
Roger Shimizu (1):
      ARM: dts: orion5x: fix number of sata port for linkstation ls-gl

 arch/arm/boot/dts/orion5x-linkstation-lsgl.dts | 4 ++++
 1 file changed, 4 insertions(+)

^ permalink raw reply

* [PATCH 1/1] arm64: mm: add config options for page table configuration
From: Scott Branden @ 2016-12-08 16:30 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161208100014.GE33075@MBP.local>

Hi Catalin,

On 16-12-08 02:00 AM, Catalin Marinas wrote:
> On Wed, Dec 07, 2016 at 11:40:00AM -0800, Scott Branden wrote:
>> Make MAX_PHYSMEM_BITS and SECTIONS_SIZE_BITS configurable by adding
>> config options.
>> Default to current settings currently defined in sparesmem.h.
>> For systems wishing to save memory the config options can be overridden.
>> Example, changing MAX_PHYSMEM_BITS from 48 to 36 at the same time as
>> changing SECTION_SIZE_BITS from 30 to 26 frees 13MB of memory.
>
> I'm not keen on such change, it's a big departure from the single Image
> aims.
A single Image is not entirely possible when the system needs to be 
tuned for memory usage, boot time, and other performance related issues. 
  These are key features in embedded systems vs. general purpose computers.

I would rather reduce SECTION_SIZE_BITS permanently where
> feasible, like in this patch:
>
> http://lkml.kernel.org/r/1465821119-3384-1-git-send-email-jszhang at marvell.com
>
This patch does not meet my requirements as I need SECTION_SIZE_BITS to 
be set to 28 to reduce memory and to allow memory hotplug to allocate a 
256 MB section.  My patch future proofs the tuning of the parameters by 
allowing any section size to be made.  I could combine the patch you 
list such that SECTION_SIZE_BITS defaults to 30 when 
CONFIG_ARM64_64_PAGES is selected and 27 otherwise.  Should it default 
to something else for 16K and 4K pages?

In terms of MAX_PHYSMEM_BITS, if our SoCs only use 40 (or less) bits I 
would also like the configuration functionality.  This allows us to make 
the SECTION_SIZE_BITS smaller.

Regards,
Scott

^ permalink raw reply

* [Linaro-acpi] [PATCH V1 1/2] PCI: thunder: Enable ACPI PCI controller for ThunderX pass2.x silicon version
From: Robert Richter @ 2016-12-08 16:34 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161202162743.GB9903@bhelgaas-glaptop.roam.corp.google.com>

On 02.12.16 10:27:43, Bjorn Helgaas wrote:
> On Fri, Dec 02, 2016 at 11:45:00AM +0100, Robert Richter wrote:
> > On 02.12.16 11:06:24, Tomasz Nowicki wrote:
> > > On 02.12.2016 07:42, Duc Dang wrote:
> > 
> > > >@@ -98,16 +98,16 @@ struct mcfg_fixup {
> > > >        { "CAVIUM", "THUNDERX", rev, seg, MCFG_BUS_ANY,                 \
> > > >        &pci_thunder_ecam_ops }
> > > >        /* SoC pass1.x */
> > > >-   THUNDER_PEM_QUIRK(2,  0),       /* off-chip devices */
> > > >-   THUNDER_PEM_QUIRK(2,  1),       /* off-chip devices */
> > > >-   THUNDER_ECAM_QUIRK(2,  0),
> > > >-   THUNDER_ECAM_QUIRK(2,  1),
> > > >-   THUNDER_ECAM_QUIRK(2,  2),
> > > >-   THUNDER_ECAM_QUIRK(2,  3),
> > > >-   THUNDER_ECAM_QUIRK(2, 10),
> > > >-   THUNDER_ECAM_QUIRK(2, 11),
> > > >-   THUNDER_ECAM_QUIRK(2, 12),
> > > >-   THUNDER_ECAM_QUIRK(2, 13),
> > > >+ THUNDER_PEM_QUIRK(2, 0UL),  /* off-chip devices */
> > > >+ THUNDER_PEM_QUIRK(2, 1UL),  /* off-chip devices */
> > > >+ THUNDER_ECAM_QUIRK(2, 0UL),
> > > >+ THUNDER_ECAM_QUIRK(2, 1UL),
> > > >+ THUNDER_ECAM_QUIRK(2, 2UL),
> > > >+ THUNDER_ECAM_QUIRK(2, 3UL),
> > > >+ THUNDER_ECAM_QUIRK(2, 10UL),
> > > >+ THUNDER_ECAM_QUIRK(2, 11UL),
> > > >+ THUNDER_ECAM_QUIRK(2, 12UL),
> > > >+ THUNDER_ECAM_QUIRK(2, 13UL),
> > > >
> > > 
> > > The UL suffix is needed for *THUNDER_PEM_QUIRK* only. THUNDER_ECAM_QUIRK is
> > > fine.
> > 
> > We should better make the type cast part of the macro.
> > 
> > + this:
> > 
> > ---
> > #define THUNDER_MCFG_RES(addr, node) \
> >        DEFINE_RES_MEM(addr + (node << 44), 0x39 * SZ_16M)
> > ---
> > 
> > The args in the macro need parentheses.
> 
> Would you mind sending me a little incremental patch doing what you
> want?  I could try myself, but since I don't have an arm64 cross-build
> setup, I'm working in the dark.

Your current branch looks good.

 5d06f9125ec0 PCI: Explain ARM64 ACPI/MCFG quirk Kconfig and build strategy

Thanks,

-Robert

^ permalink raw reply

* Tearing down DMA transfer setup after DMA client has finished
From: Måns Rullgård @ 2016-12-08 16:36 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161208155043.GE6408@localhost>

Vinod Koul <vinod.koul@intel.com> writes:

> On Thu, Dec 08, 2016 at 12:20:30PM +0000, M?ns Rullg?rd wrote:
>> >
>> > I'm far from claiming that drivers/tty/serial/sh-sci.c is perfect, but
>> > it does request DMA channels at open time, not at probe time.
>> 
>> In the part quoted above, I said most drivers request dma channels in
>> their probe or open functions.  For the purposes of this discussion,
>> that distinction is irrelevant.  In either case, the channel is held
>> indefinitely.  
>
> And the answer was it is wrong and not _all_ do that!!

Show me one that doesn't.

>> If this wasn't the correct way to use the dmaengine,
>> there would be no need for the virt-dma helpers which are specifically
>> designed for cases the one currently at hand.
>
> That is incorrect.
>
> virt-dma helps to have multiple request from various clients. For many
> controllers which implement a SW mux, they can transfer data for one client
> and then next transfer can be for some other one.
>
> This allows better utilization of dma channels and helps in case where we
> have fewer dma channels than users.

Which is *exactly* the situation we have here.

I have no idea what you're arguing for or against any more.  Perhaps
you're just arguing.

>> The only problem we have is that nobody envisioned hardware where the
>> dma engine indicates completion slightly too soon.  I suspect there's a
>> fifo or such somewhere, and the interrupt is triggered when the last
>> byte has been placed in the fifo rather than when it has been removed
>> which would have been more correct.
>
> That is pretty common hardware optimization but usually hardware shows up
> with flush commands to let in flight transactions be completed.

Well, this hardware seems to lack that particular feature.  Pretending
otherwise isn't helping.

-- 
M?ns Rullg?rd

^ permalink raw reply

* Tearing down DMA transfer setup after DMA client has finished
From: Vinod Koul @ 2016-12-08 16:37 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <91b0d10c-1bc2-c3e1-4088-f4ad9adcd6c0@free.fr>

On Thu, Dec 08, 2016 at 11:54:51AM +0100, Mason wrote:
> On 08/12/2016 11:39, Vinod Koul wrote:
> 
> > On Wed, Dec 07, 2016 at 04:45:58PM +0000, M?ns Rullg?rd wrote:
> >
> >> Vinod Koul <vinod.koul@intel.com> writes:
> >>
> >>> On Tue, Dec 06, 2016 at 01:14:20PM +0000, M?ns Rullg?rd wrote:
> >>>>
> >>>> That's not going to work very well.  Device drivers typically request
> >>>> dma channels in their probe functions or when the device is opened.
> >>>> This means that reserving one of the few channels there will inevitably
> >>>> make some other device fail to operate.
> >>>
> >>> No that doesn't make sense at all, you should get a channel only when you
> >>> want to use it and not in probe!
> >>
> >> Tell that to just about every single driver ever written.
> > 
> > Not really, few do yes which is wrong but not _all_ do that.
> 
> Vinod,
> 
> Could you explain something to me in layman's terms?
> 
> I have a NAND Flash Controller driver that depends on the
> DMA driver under discussion.
> 
> Suppose I move the dma_request_chan() call from the driver's
> probe function, to the actual DMA transfer function.
> 
> I would want dma_request_chan() to put the calling thread
> to sleep until a channel becomes available (possibly with
> a timeout value).
> 
> But Maxime told me dma_request_chan() will just return
> -EBUSY if no channels are available.

That is correct

> Am I supposed to busy wait in my driver's DMA function
> until a channel becomes available?

If someone else is using the channels then the bust wait will not help, so
in this case you should fall back to PIO, few drivers do that

> I don't understand how the multiplexing of few memory
> channels to many clients is supposed to happen efficiently?

To make it efficient, disregarding your Sbox HW issue, the solution is
virtual channels. You can delink physical channels and virtual channels. If
one has SW controlled MUX then a channel can service any client. For few
controllers request lines are hard wired so they cant use any channel. But
if you dont have this restriction then driver can queue up many transactions
from different controllers.

-- 
~Vinod

^ permalink raw reply

* [PATCH] MAINTAINERS: Add Patchwork URL to Samsung Exynos entry
From: Krzysztof Kozlowski @ 2016-12-08 16:38 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <F4BA1C22-7E6C-4791-9B0D-37B3F647E0EB@gmail.com>

On Thu, Dec 08, 2016 at 11:24:43AM +0900, Kukjin Kim wrote:
> 2016. 12. 8. 02:18 Krzysztof Kozlowski <krzk@kernel.org> wrote:
> 
> > I use Patchwork for handling incoming patches. Put its address here so
> > submitters could know what is in the queue.
> 
> > Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
> > ---
> > MAINTAINERS | 1 +
> > 1 file changed, 1 insertion(+)
> > 
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 191887bdc49b..ec5137c39572 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -1689,6 +1689,7 @@ M:    Krzysztof Kozlowski <krzk@kernel.org>
> > R:    Javier Martinez Canillas <javier@osg.samsung.com>
> > L:    linux-arm-kernel at lists.infradead.org (moderated for non-subscribers)
> > L:    linux-samsung-soc at vger.kernel.org (moderated for non-subscribers)
> > +Q:    https://patchwork.kernel.org/project/linux-samsung-soc/list/
> 
> to use http would be better instead of https?

I think HTTPS is the preferred way. HTTP is insecure, easy to eavesdrop
so whenever it is possible we should choose HTTPS.
> 
> then,
> 
> Acked-by: Kukjin Kim <kgene@kernel.org>

Thanks!

Best regards,
Krzysztof

^ permalink raw reply

* Tearing down DMA transfer setup after DMA client has finished
From: Måns Rullgård @ 2016-12-08 16:46 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161208154018.GD6408@localhost>

Vinod Koul <vinod.koul@intel.com> writes:

> On Thu, Dec 08, 2016 at 11:44:53AM +0000, M?ns Rullg?rd wrote:
>> Vinod Koul <vinod.koul@intel.com> writes:
>> 
>> > On Wed, Dec 07, 2016 at 04:45:58PM +0000, M?ns Rullg?rd wrote:
>> >> Vinod Koul <vinod.koul@intel.com> writes:
>> >> 
>> >> > On Tue, Dec 06, 2016 at 01:14:20PM +0000, M?ns Rullg?rd wrote:
>> >> >> 
>> >> >> That's not going to work very well.  Device drivers typically request
>> >> >> dma channels in their probe functions or when the device is opened.
>> >> >> This means that reserving one of the few channels there will inevitably
>> >> >> make some other device fail to operate.
>> >> >
>> >> > No that doesnt make sense at all, you should get a channel only when you
>> >> > want to use it and not in probe!
>> >> 
>> >> Tell that to just about every single driver ever written.
>> >
>> > Not really, few do yes which is wrong but not _all_ do that.
>> 
>> Every driver I ever looked at does.  Name one you consider "correct."
>
> That only tells me you haven't looked enough and want to rant!
>
> FWIW look at ALSA-dmaengine lib, thereby every audio driver that uses it. I
> could find other examples and go on and on, but that's besides the point and
> looks like you don't want to listen to people telling you something..

ALSA is a particularly poor example.  ALSA drivers request a dma channel
at some point between the device being opened and playback (or capture)
starting.  They then set up a cyclic transfer that runs continuously
until playback is stopped.  It's a completely different model of
operation than we're talking about here.

-- 
M?ns Rullg?rd

^ permalink raw reply

* Tearing down DMA transfer setup after DMA client has finished
From: Måns Rullgård @ 2016-12-08 16:48 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161208163755.GH6408@localhost>

Vinod Koul <vinod.koul@intel.com> writes:

> To make it efficient, disregarding your Sbox HW issue, the solution is
> virtual channels. You can delink physical channels and virtual channels. If
> one has SW controlled MUX then a channel can service any client. For few
> controllers request lines are hard wired so they cant use any channel. But
> if you dont have this restriction then driver can queue up many transactions
> from different controllers.

Have you been paying attention at all?  This exactly what the driver
ALREADY DOES.

-- 
M?ns Rullg?rd

^ permalink raw reply

* [PATCH v3] arm64: fpsimd: improve stacking logic in non-interruptible context
From: Ard Biesheuvel @ 2016-12-08 16:49 UTC (permalink / raw)
  To: linux-arm-kernel

Currently, we allow kernel mode NEON in softirq or hardirq context by
stacking and unstacking a slice of the NEON register file for each call
to kernel_neon_begin() and kernel_neon_end(), respectively.

Given that
a) a CPU typically spends most of its time in userland, during which time
   no kernel mode NEON in process context is in progress,
b) a CPU spends most of its time in the kernel doing other things than
   kernel mode NEON when it gets interrupted to perform kernel mode NEON
   in softirq context

the stacking and subsequent unstacking is only necessary if we are
interrupting a thread while it is performing kernel mode NEON in process
context, which means that in all other cases, we can simply preserve the
userland FPSIMD state once, and only restore it upon return to userland,
even if we are being invoked from softirq or hardirq context.

So instead of checking whether we are running in interrupt context, keep
track of the level of nested kernel mode NEON calls in progress, and only
perform the eager stack/unstack if the level exceeds 1.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
v3:
- avoid corruption by concurrent invocations of kernel_neon_begin()/_end()

v2:
- BUG() on unexpected values of the nesting level
- relax the BUG() on num_regs>32 to a WARN, given that nothing actually
  breaks in that case

 arch/arm64/kernel/fpsimd.c | 47 ++++++++++++++------
 1 file changed, 33 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 394c61db5566..536ddab9316d 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -220,20 +220,35 @@ void fpsimd_flush_task_state(struct task_struct *t)
 
 #ifdef CONFIG_KERNEL_MODE_NEON
 
-static DEFINE_PER_CPU(struct fpsimd_partial_state, hardirq_fpsimdstate);
-static DEFINE_PER_CPU(struct fpsimd_partial_state, softirq_fpsimdstate);
+/*
+ * Although unlikely, it is possible for three kernel mode NEON contexts to
+ * be live at the same time: process context, softirq context and hardirq
+ * context. So while the userland context is stashed in the thread's fpsimd
+ * state structure, we need two additional levels of storage.
+ */
+static DEFINE_PER_CPU(struct fpsimd_partial_state, nested_fpsimdstate[2]);
+static DEFINE_PER_CPU(atomic_t, kernel_neon_nesting_level);
 
 /*
  * Kernel-side NEON support functions
  */
 void kernel_neon_begin_partial(u32 num_regs)
 {
-	if (in_interrupt()) {
-		struct fpsimd_partial_state *s = this_cpu_ptr(
-			in_irq() ? &hardirq_fpsimdstate : &softirq_fpsimdstate);
+	struct fpsimd_partial_state *s;
+	int level;
+
+	preempt_disable();
+
+	level = atomic_inc_return(this_cpu_ptr(&kernel_neon_nesting_level));
+	BUG_ON(level > 3);
+
+	if (level > 1) {
+		s = this_cpu_ptr(nested_fpsimdstate);
 
-		BUG_ON(num_regs > 32);
-		fpsimd_save_partial_state(s, roundup(num_regs, 2));
+		WARN_ON_ONCE(num_regs > 32);
+		num_regs = min(roundup(num_regs, 2), 32U);
+
+		fpsimd_save_partial_state(&s[level - 2], num_regs);
 	} else {
 		/*
 		 * Save the userland FPSIMD state if we have one and if we
@@ -241,7 +256,6 @@ void kernel_neon_begin_partial(u32 num_regs)
 		 * that there is no longer userland FPSIMD state in the
 		 * registers.
 		 */
-		preempt_disable();
 		if (current->mm &&
 		    !test_and_set_thread_flag(TIF_FOREIGN_FPSTATE))
 			fpsimd_save_state(&current->thread.fpsimd_state);
@@ -252,13 +266,18 @@ EXPORT_SYMBOL(kernel_neon_begin_partial);
 
 void kernel_neon_end(void)
 {
-	if (in_interrupt()) {
-		struct fpsimd_partial_state *s = this_cpu_ptr(
-			in_irq() ? &hardirq_fpsimdstate : &softirq_fpsimdstate);
-		fpsimd_load_partial_state(s);
-	} else {
-		preempt_enable();
+	struct fpsimd_partial_state *s;
+	int level;
+
+	level = atomic_read(this_cpu_ptr(&kernel_neon_nesting_level));
+	BUG_ON(level < 1);
+
+	if (level > 1) {
+		s = this_cpu_ptr(nested_fpsimdstate);
+		fpsimd_load_partial_state(&s[level - 2]);
 	}
+	atomic_dec(this_cpu_ptr(&kernel_neon_nesting_level));
+	preempt_enable();
 }
 EXPORT_SYMBOL(kernel_neon_end);
 
-- 
2.7.4

^ permalink raw reply related

* Applied "ASoC: zte: spdif: correct ZX_SPDIF_CLK_RAT define" to the asoc tree
From: Mark Brown @ 2016-12-08 16:55 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1481186655-8213-2-git-send-email-shawnguo@kernel.org>

The patch

   ASoC: zte: spdif: correct ZX_SPDIF_CLK_RAT define

has been applied to the asoc tree at

   git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git 

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.  

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark

>From 44b1c9a6e7a2692e761e35ada2ffe84b20c2a377 Mon Sep 17 00:00:00 2001
From: Shawn Guo <shawn.guo@linaro.org>
Date: Thu, 8 Dec 2016 16:44:15 +0800
Subject: [PATCH] ASoC: zte: spdif: correct ZX_SPDIF_CLK_RAT define

The macro ZX_SPDIF_CLK_RAT should be 2 instead of 4.  With this
fix, we can get correct audio output on HDMI through SPDIF interface.

Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Acked-by: Jun Nie <jun.nie@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
---
 sound/soc/zte/zx-spdif.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sound/soc/zte/zx-spdif.c b/sound/soc/zte/zx-spdif.c
index 26265ce4caca..9fa6463ce5d7 100644
--- a/sound/soc/zte/zx-spdif.c
+++ b/sound/soc/zte/zx-spdif.c
@@ -71,7 +71,7 @@
 #define ZX_VALID_RIGHT_TRACK		(2 << 0)
 #define ZX_VALID_TRACK_MASK		(3 << 0)

-#define ZX_SPDIF_CLK_RAT		(4 * 32)
+#define ZX_SPDIF_CLK_RAT		(2 * 32)

 struct zx_spdif_info {
 	struct snd_dmaengine_dai_dma_data	dma_data;
-- 
2.10.2

^ permalink raw reply related

* Applied "ASoC: zte: spdif and i2s drivers are not zx296702 specific" to the asoc tree
From: Mark Brown @ 2016-12-08 16:55 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1481186655-8213-1-git-send-email-shawnguo@kernel.org>

The patch

   ASoC: zte: spdif and i2s drivers are not zx296702 specific

has been applied to the asoc tree at

   git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git 

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.  

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark

>From de7975c2a42de889e2b3fd2f7d46f899ad8ccd45 Mon Sep 17 00:00:00 2001
From: Shawn Guo <shawn.guo@linaro.org>
Date: Thu, 8 Dec 2016 16:44:14 +0800
Subject: [PATCH] ASoC: zte: spdif and i2s drivers are not zx296702 specific

ZTE ZX SPDIF and I2S drivers can work on not only ZX296702 but also
other ZTE ZX family SoCs like ZX296718, which is an arm64 platform.
Let's make a few renaming and tweak the Kconfig a bit to get the drivers
available for other ZTE ZX platforms.

Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Reviewed-by: Jun Nie <jun.nie@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
---
 sound/soc/zte/Kconfig                          | 16 ++++++++--------
 sound/soc/zte/Makefile                         |  4 ++--
 sound/soc/zte/{zx296702-i2s.c => zx-i2s.c}     |  0
 sound/soc/zte/{zx296702-spdif.c => zx-spdif.c} |  0
 4 files changed, 10 insertions(+), 10 deletions(-)
 rename sound/soc/zte/{zx296702-i2s.c => zx-i2s.c} (100%)
 rename sound/soc/zte/{zx296702-spdif.c => zx-spdif.c} (100%)

diff --git a/sound/soc/zte/Kconfig b/sound/soc/zte/Kconfig
index c47eb25e441f..6d8a90d36315 100644
--- a/sound/soc/zte/Kconfig
+++ b/sound/soc/zte/Kconfig
@@ -1,17 +1,17 @@
-config ZX296702_SPDIF
-	tristate "ZX296702 spdif"
-	depends on SOC_ZX296702 || COMPILE_TEST
+config ZX_SPDIF
+	tristate "ZTE ZX SPDIF Driver Support"
+	depends on ARCH_ZX || COMPILE_TEST
 	depends on COMMON_CLK
 	select SND_SOC_GENERIC_DMAENGINE_PCM
 	help
 	  Say Y or M if you want to add support for codecs attached to the
-	  zx296702 spdif interface
+	  ZTE ZX SPDIF interface
 
-config ZX296702_I2S
-	tristate "ZX296702 i2s"
-	depends on SOC_ZX296702 || COMPILE_TEST
+config ZX_I2S
+	tristate "ZTE ZX I2S Driver Support"
+	depends on ARCH_ZX || COMPILE_TEST
 	depends on COMMON_CLK
 	select SND_SOC_GENERIC_DMAENGINE_PCM
 	help
 	  Say Y or M if you want to add support for codecs attached to the
-	  zx296702 i2s interface
+	  ZTE ZX I2S interface
diff --git a/sound/soc/zte/Makefile b/sound/soc/zte/Makefile
index 254ed2c8c1a0..77768f5fd10c 100644
--- a/sound/soc/zte/Makefile
+++ b/sound/soc/zte/Makefile
@@ -1,2 +1,2 @@
-obj-$(CONFIG_ZX296702_SPDIF)	+= zx296702-spdif.o
-obj-$(CONFIG_ZX296702_I2S)	+= zx296702-i2s.o
+obj-$(CONFIG_ZX_SPDIF)	+= zx-spdif.o
+obj-$(CONFIG_ZX_I2S)	+= zx-i2s.o
diff --git a/sound/soc/zte/zx296702-i2s.c b/sound/soc/zte/zx-i2s.c
similarity index 100%
rename from sound/soc/zte/zx296702-i2s.c
rename to sound/soc/zte/zx-i2s.c
diff --git a/sound/soc/zte/zx296702-spdif.c b/sound/soc/zte/zx-spdif.c
similarity index 100%
rename from sound/soc/zte/zx296702-spdif.c
rename to sound/soc/zte/zx-spdif.c
-- 
2.10.2

^ permalink raw reply related

* Applied "spi: Add support for Armada 3700 SPI Controller" to the spi tree
From: Mark Brown @ 2016-12-08 16:55 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161208145847.7794-2-romain.perier@free-electrons.com>

The patch

   spi: Add support for Armada 3700 SPI Controller

has been applied to the spi tree at

   git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git 

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.  

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark

>From 5762ab71eb24a8393a167361510d7ca5e18c99f9 Mon Sep 17 00:00:00 2001
From: Romain Perier <romain.perier@free-electrons.com>
Date: Thu, 8 Dec 2016 15:58:44 +0100
Subject: [PATCH] spi: Add support for Armada 3700 SPI Controller

Marvell Armada 3700 SoC comprises an SPI Controller. This Controller
supports up to 4 SPI slave devices, with dedicated chip selects,supports
SPI mode 0/1/2 and 3, CPIO or Fifo mode with DMA transfers and different
SPI transfer mode (Single, Dual or Quad).

This commit adds basic driver support for FIFO mode. In this mode,
dedicated registers are used to store the instruction, the address, the
read mode and the data. Write and Read FIFO are used to store the
outcoming or incoming data. The data FIFOs are accessible via DMA or by
the CPU. Only the CPU is supported for now.

Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Tested-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
---
 drivers/spi/Kconfig           |   7 +
 drivers/spi/Makefile          |   1 +
 drivers/spi/spi-armada-3700.c | 923 ++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 931 insertions(+)
 create mode 100644 drivers/spi/spi-armada-3700.c

diff --git a/drivers/spi/Kconfig b/drivers/spi/Kconfig
index b7995474148c..1faad2ce4b4b 100644
--- a/drivers/spi/Kconfig
+++ b/drivers/spi/Kconfig
@@ -67,6 +67,13 @@ config SPI_ATH79
 	  This enables support for the SPI controller present on the
 	  Atheros AR71XX/AR724X/AR913X SoCs.
 
+config SPI_ARMADA_3700
+	tristate "Marvell Armada 3700 SPI Controller"
+	depends on (ARCH_MVEBU && OF) || COMPILE_TEST
+	help
+	  This enables support for the SPI controller present on the
+	  Marvell Armada 3700 SoCs.
+
 config SPI_ATMEL
 	tristate "Atmel SPI Controller"
 	depends on HAS_DMA
diff --git a/drivers/spi/Makefile b/drivers/spi/Makefile
index aa939d955521..140ca45aa9d2 100644
--- a/drivers/spi/Makefile
+++ b/drivers/spi/Makefile
@@ -12,6 +12,7 @@ obj-$(CONFIG_SPI_LOOPBACK_TEST)		+= spi-loopback-test.o
 
 # SPI master controller drivers (bus)
 obj-$(CONFIG_SPI_ALTERA)		+= spi-altera.o
+obj-$(CONFIG_SPI_ARMADA_3700)		+= spi-armada-3700.o
 obj-$(CONFIG_SPI_ATMEL)			+= spi-atmel.o
 obj-$(CONFIG_SPI_ATH79)			+= spi-ath79.o
 obj-$(CONFIG_SPI_AU1550)		+= spi-au1550.o
diff --git a/drivers/spi/spi-armada-3700.c b/drivers/spi/spi-armada-3700.c
new file mode 100644
index 000000000000..e89da0af45d2
--- /dev/null
+++ b/drivers/spi/spi-armada-3700.c
@@ -0,0 +1,923 @@
+/*
+ * Marvell Armada-3700 SPI controller driver
+ *
+ * Copyright (C) 2016 Marvell Ltd.
+ *
+ * Author: Wilson Ding <dingwei@marvell.com>
+ * Author: Romain Perier <romain.perier@free-electrons.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/clk.h>
+#include <linux/completion.h>
+#include <linux/delay.h>
+#include <linux/err.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/of_irq.h>
+#include <linux/of_device.h>
+#include <linux/pinctrl/consumer.h>
+#include <linux/spi/spi.h>
+
+#define DRIVER_NAME			"armada_3700_spi"
+
+#define A3700_SPI_TIMEOUT		10
+
+/* SPI Register Offest */
+#define A3700_SPI_IF_CTRL_REG		0x00
+#define A3700_SPI_IF_CFG_REG		0x04
+#define A3700_SPI_DATA_OUT_REG		0x08
+#define A3700_SPI_DATA_IN_REG		0x0C
+#define A3700_SPI_IF_INST_REG		0x10
+#define A3700_SPI_IF_ADDR_REG		0x14
+#define A3700_SPI_IF_RMODE_REG		0x18
+#define A3700_SPI_IF_HDR_CNT_REG	0x1C
+#define A3700_SPI_IF_DIN_CNT_REG	0x20
+#define A3700_SPI_IF_TIME_REG		0x24
+#define A3700_SPI_INT_STAT_REG		0x28
+#define A3700_SPI_INT_MASK_REG		0x2C
+
+/* A3700_SPI_IF_CTRL_REG */
+#define A3700_SPI_EN			BIT(16)
+#define A3700_SPI_ADDR_NOT_CONFIG	BIT(12)
+#define A3700_SPI_WFIFO_OVERFLOW	BIT(11)
+#define A3700_SPI_WFIFO_UNDERFLOW	BIT(10)
+#define A3700_SPI_RFIFO_OVERFLOW	BIT(9)
+#define A3700_SPI_RFIFO_UNDERFLOW	BIT(8)
+#define A3700_SPI_WFIFO_FULL		BIT(7)
+#define A3700_SPI_WFIFO_EMPTY		BIT(6)
+#define A3700_SPI_RFIFO_FULL		BIT(5)
+#define A3700_SPI_RFIFO_EMPTY		BIT(4)
+#define A3700_SPI_WFIFO_RDY		BIT(3)
+#define A3700_SPI_RFIFO_RDY		BIT(2)
+#define A3700_SPI_XFER_RDY		BIT(1)
+#define A3700_SPI_XFER_DONE		BIT(0)
+
+/* A3700_SPI_IF_CFG_REG */
+#define A3700_SPI_WFIFO_THRS		BIT(28)
+#define A3700_SPI_RFIFO_THRS		BIT(24)
+#define A3700_SPI_AUTO_CS		BIT(20)
+#define A3700_SPI_DMA_RD_EN		BIT(18)
+#define A3700_SPI_FIFO_MODE		BIT(17)
+#define A3700_SPI_SRST			BIT(16)
+#define A3700_SPI_XFER_START		BIT(15)
+#define A3700_SPI_XFER_STOP		BIT(14)
+#define A3700_SPI_INST_PIN		BIT(13)
+#define A3700_SPI_ADDR_PIN		BIT(12)
+#define A3700_SPI_DATA_PIN1		BIT(11)
+#define A3700_SPI_DATA_PIN0		BIT(10)
+#define A3700_SPI_FIFO_FLUSH		BIT(9)
+#define A3700_SPI_RW_EN			BIT(8)
+#define A3700_SPI_CLK_POL		BIT(7)
+#define A3700_SPI_CLK_PHA		BIT(6)
+#define A3700_SPI_BYTE_LEN		BIT(5)
+#define A3700_SPI_CLK_PRESCALE		BIT(0)
+#define A3700_SPI_CLK_PRESCALE_MASK	(0x1f)
+
+#define A3700_SPI_WFIFO_THRS_BIT	28
+#define A3700_SPI_RFIFO_THRS_BIT	24
+#define A3700_SPI_FIFO_THRS_MASK	0x7
+
+#define A3700_SPI_DATA_PIN_MASK		0x3
+
+/* A3700_SPI_IF_HDR_CNT_REG */
+#define A3700_SPI_DUMMY_CNT_BIT		12
+#define A3700_SPI_DUMMY_CNT_MASK	0x7
+#define A3700_SPI_RMODE_CNT_BIT		8
+#define A3700_SPI_RMODE_CNT_MASK	0x3
+#define A3700_SPI_ADDR_CNT_BIT		4
+#define A3700_SPI_ADDR_CNT_MASK		0x7
+#define A3700_SPI_INSTR_CNT_BIT		0
+#define A3700_SPI_INSTR_CNT_MASK	0x3
+
+/* A3700_SPI_IF_TIME_REG */
+#define A3700_SPI_CLK_CAPT_EDGE		BIT(7)
+
+/* Flags and macros for struct a3700_spi */
+#define A3700_INSTR_CNT			1
+#define A3700_ADDR_CNT			3
+#define A3700_DUMMY_CNT			1
+
+struct a3700_spi {
+	struct spi_master *master;
+	void __iomem *base;
+	struct clk *clk;
+	unsigned int irq;
+	unsigned int flags;
+	bool xmit_data;
+	const u8 *tx_buf;
+	u8 *rx_buf;
+	size_t buf_len;
+	u8 byte_len;
+	u32 wait_mask;
+	struct completion done;
+	u32 addr_cnt;
+	u32 instr_cnt;
+	size_t hdr_cnt;
+};
+
+static u32 spireg_read(struct a3700_spi *a3700_spi, u32 offset)
+{
+	return readl(a3700_spi->base + offset);
+}
+
+static void spireg_write(struct a3700_spi *a3700_spi, u32 offset, u32 data)
+{
+	writel(data, a3700_spi->base + offset);
+}
+
+static void a3700_spi_auto_cs_unset(struct a3700_spi *a3700_spi)
+{
+	u32 val;
+
+	val = spireg_read(a3700_spi, A3700_SPI_IF_CFG_REG);
+	val &= ~A3700_SPI_AUTO_CS;
+	spireg_write(a3700_spi, A3700_SPI_IF_CFG_REG, val);
+}
+
+static void a3700_spi_activate_cs(struct a3700_spi *a3700_spi, unsigned int cs)
+{
+	u32 val;
+
+	val = spireg_read(a3700_spi, A3700_SPI_IF_CTRL_REG);
+	val |= (A3700_SPI_EN << cs);
+	spireg_write(a3700_spi, A3700_SPI_IF_CTRL_REG, val);
+}
+
+static void a3700_spi_deactivate_cs(struct a3700_spi *a3700_spi,
+				    unsigned int cs)
+{
+	u32 val;
+
+	val = spireg_read(a3700_spi, A3700_SPI_IF_CTRL_REG);
+	val &= ~(A3700_SPI_EN << cs);
+	spireg_write(a3700_spi, A3700_SPI_IF_CTRL_REG, val);
+}
+
+static int a3700_spi_pin_mode_set(struct a3700_spi *a3700_spi,
+				  unsigned int pin_mode)
+{
+	u32 val;
+
+	val = spireg_read(a3700_spi, A3700_SPI_IF_CFG_REG);
+	val &= ~(A3700_SPI_INST_PIN | A3700_SPI_ADDR_PIN);
+	val &= ~(A3700_SPI_DATA_PIN0 | A3700_SPI_DATA_PIN1);
+
+	switch (pin_mode) {
+	case 1:
+		break;
+	case 2:
+		val |= A3700_SPI_DATA_PIN0;
+		break;
+	case 4:
+		val |= A3700_SPI_DATA_PIN1;
+		break;
+	default:
+		dev_err(&a3700_spi->master->dev, "wrong pin mode %u", pin_mode);
+		return -EINVAL;
+	}
+
+	spireg_write(a3700_spi, A3700_SPI_IF_CFG_REG, val);
+
+	return 0;
+}
+
+static void a3700_spi_fifo_mode_set(struct a3700_spi *a3700_spi)
+{
+	u32 val;
+
+	val = spireg_read(a3700_spi, A3700_SPI_IF_CFG_REG);
+	val |= A3700_SPI_FIFO_MODE;
+	spireg_write(a3700_spi, A3700_SPI_IF_CFG_REG, val);
+}
+
+static void a3700_spi_mode_set(struct a3700_spi *a3700_spi,
+			       unsigned int mode_bits)
+{
+	u32 val;
+
+	val = spireg_read(a3700_spi, A3700_SPI_IF_CFG_REG);
+
+	if (mode_bits & SPI_CPOL)
+		val |= A3700_SPI_CLK_POL;
+	else
+		val &= ~A3700_SPI_CLK_POL;
+
+	if (mode_bits & SPI_CPHA)
+		val |= A3700_SPI_CLK_PHA;
+	else
+		val &= ~A3700_SPI_CLK_PHA;
+
+	spireg_write(a3700_spi, A3700_SPI_IF_CFG_REG, val);
+}
+
+static void a3700_spi_clock_set(struct a3700_spi *a3700_spi,
+				unsigned int speed_hz, u16 mode)
+{
+	u32 val;
+	u32 prescale;
+
+	prescale = DIV_ROUND_UP(clk_get_rate(a3700_spi->clk), speed_hz);
+
+	val = spireg_read(a3700_spi, A3700_SPI_IF_CFG_REG);
+	val = val & ~A3700_SPI_CLK_PRESCALE_MASK;
+
+	val = val | (prescale & A3700_SPI_CLK_PRESCALE_MASK);
+	spireg_write(a3700_spi, A3700_SPI_IF_CFG_REG, val);
+
+	if (prescale <= 2) {
+		val = spireg_read(a3700_spi, A3700_SPI_IF_TIME_REG);
+		val |= A3700_SPI_CLK_CAPT_EDGE;
+		spireg_write(a3700_spi, A3700_SPI_IF_TIME_REG, val);
+	}
+
+	val = spireg_read(a3700_spi, A3700_SPI_IF_CFG_REG);
+	val &= ~(A3700_SPI_CLK_POL | A3700_SPI_CLK_PHA);
+
+	if (mode & SPI_CPOL)
+		val |= A3700_SPI_CLK_POL;
+
+	if (mode & SPI_CPHA)
+		val |= A3700_SPI_CLK_PHA;
+
+	spireg_write(a3700_spi, A3700_SPI_IF_CFG_REG, val);
+}
+
+static void a3700_spi_bytelen_set(struct a3700_spi *a3700_spi, unsigned int len)
+{
+	u32 val;
+
+	val = spireg_read(a3700_spi, A3700_SPI_IF_CFG_REG);
+	if (len == 4)
+		val |= A3700_SPI_BYTE_LEN;
+	else
+		val &= ~A3700_SPI_BYTE_LEN;
+	spireg_write(a3700_spi, A3700_SPI_IF_CFG_REG, val);
+
+	a3700_spi->byte_len = len;
+}
+
+static int a3700_spi_fifo_flush(struct a3700_spi *a3700_spi)
+{
+	int timeout = A3700_SPI_TIMEOUT;
+	u32 val;
+
+	val = spireg_read(a3700_spi, A3700_SPI_IF_CFG_REG);
+	val |= A3700_SPI_FIFO_FLUSH;
+	spireg_write(a3700_spi, A3700_SPI_IF_CFG_REG, val);
+
+	while (--timeout) {
+		val = spireg_read(a3700_spi, A3700_SPI_IF_CFG_REG);
+		if (!(val & A3700_SPI_FIFO_FLUSH))
+			return 0;
+		udelay(1);
+	}
+
+	return -ETIMEDOUT;
+}
+
+static int a3700_spi_init(struct a3700_spi *a3700_spi)
+{
+	struct spi_master *master = a3700_spi->master;
+	u32 val;
+	int i, ret = 0;
+
+	/* Reset SPI unit */
+	val = spireg_read(a3700_spi, A3700_SPI_IF_CFG_REG);
+	val |= A3700_SPI_SRST;
+	spireg_write(a3700_spi, A3700_SPI_IF_CFG_REG, val);
+
+	udelay(A3700_SPI_TIMEOUT);
+
+	val = spireg_read(a3700_spi, A3700_SPI_IF_CFG_REG);
+	val &= ~A3700_SPI_SRST;
+	spireg_write(a3700_spi, A3700_SPI_IF_CFG_REG, val);
+
+	/* Disable AUTO_CS and deactivate all chip-selects */
+	a3700_spi_auto_cs_unset(a3700_spi);
+	for (i = 0; i < master->num_chipselect; i++)
+		a3700_spi_deactivate_cs(a3700_spi, i);
+
+	/* Enable FIFO mode */
+	a3700_spi_fifo_mode_set(a3700_spi);
+
+	/* Set SPI mode */
+	a3700_spi_mode_set(a3700_spi, master->mode_bits);
+
+	/* Reset counters */
+	spireg_write(a3700_spi, A3700_SPI_IF_HDR_CNT_REG, 0);
+	spireg_write(a3700_spi, A3700_SPI_IF_DIN_CNT_REG, 0);
+
+	/* Mask the interrupts and clear cause bits */
+	spireg_write(a3700_spi, A3700_SPI_INT_MASK_REG, 0);
+	spireg_write(a3700_spi, A3700_SPI_INT_STAT_REG, ~0U);
+
+	return ret;
+}
+
+static irqreturn_t a3700_spi_interrupt(int irq, void *dev_id)
+{
+	struct spi_master *master = dev_id;
+	struct a3700_spi *a3700_spi;
+	u32 cause;
+
+	a3700_spi = spi_master_get_devdata(master);
+
+	/* Get interrupt causes */
+	cause = spireg_read(a3700_spi, A3700_SPI_INT_STAT_REG);
+
+	if (!cause || !(a3700_spi->wait_mask & cause))
+		return IRQ_NONE;
+
+	/* mask and acknowledge the SPI interrupts */
+	spireg_write(a3700_spi, A3700_SPI_INT_MASK_REG, 0);
+	spireg_write(a3700_spi, A3700_SPI_INT_STAT_REG, cause);
+
+	/* Wake up the transfer */
+	if (a3700_spi->wait_mask & cause)
+		complete(&a3700_spi->done);
+
+	return IRQ_HANDLED;
+}
+
+static bool a3700_spi_wait_completion(struct spi_device *spi)
+{
+	struct a3700_spi *a3700_spi;
+	unsigned int timeout;
+	unsigned int ctrl_reg;
+	unsigned long timeout_jiffies;
+
+	a3700_spi = spi_master_get_devdata(spi->master);
+
+	/* SPI interrupt is edge-triggered, which means an interrupt will
+	 * be generated only when detecting a specific status bit changed
+	 * from '0' to '1'. So when we start waiting for a interrupt, we
+	 * need to check status bit in control reg first, if it is already 1,
+	 * then we do not need to wait for interrupt
+	 */
+	ctrl_reg = spireg_read(a3700_spi, A3700_SPI_IF_CTRL_REG);
+	if (a3700_spi->wait_mask & ctrl_reg)
+		return true;
+
+	reinit_completion(&a3700_spi->done);
+
+	spireg_write(a3700_spi, A3700_SPI_INT_MASK_REG,
+		     a3700_spi->wait_mask);
+
+	timeout_jiffies = msecs_to_jiffies(A3700_SPI_TIMEOUT);
+	timeout = wait_for_completion_timeout(&a3700_spi->done,
+					      timeout_jiffies);
+
+	a3700_spi->wait_mask = 0;
+
+	if (timeout)
+		return true;
+
+	/* there might be the case that right after we checked the
+	 * status bits in this routine and before start to wait for
+	 * interrupt by wait_for_completion_timeout, the interrupt
+	 * happens, to avoid missing it we need to double check
+	 * status bits in control reg, if it is already 1, then
+	 * consider that we have the interrupt successfully and
+	 * return true.
+	 */
+	ctrl_reg = spireg_read(a3700_spi, A3700_SPI_IF_CTRL_REG);
+	if (a3700_spi->wait_mask & ctrl_reg)
+		return true;
+
+	spireg_write(a3700_spi, A3700_SPI_INT_MASK_REG, 0);
+
+	return true;
+}
+
+static bool a3700_spi_transfer_wait(struct spi_device *spi,
+				    unsigned int bit_mask)
+{
+	struct a3700_spi *a3700_spi;
+
+	a3700_spi = spi_master_get_devdata(spi->master);
+	a3700_spi->wait_mask = bit_mask;
+
+	return a3700_spi_wait_completion(spi);
+}
+
+static void a3700_spi_fifo_thres_set(struct a3700_spi *a3700_spi,
+				     unsigned int bytes)
+{
+	u32 val;
+
+	val = spireg_read(a3700_spi, A3700_SPI_IF_CFG_REG);
+	val &= ~(A3700_SPI_FIFO_THRS_MASK << A3700_SPI_RFIFO_THRS_BIT);
+	val |= (bytes - 1) << A3700_SPI_RFIFO_THRS_BIT;
+	val &= ~(A3700_SPI_FIFO_THRS_MASK << A3700_SPI_WFIFO_THRS_BIT);
+	val |= (7 - bytes) << A3700_SPI_WFIFO_THRS_BIT;
+	spireg_write(a3700_spi, A3700_SPI_IF_CFG_REG, val);
+}
+
+static void a3700_spi_transfer_setup(struct spi_device *spi,
+				    struct spi_transfer *xfer)
+{
+	struct a3700_spi *a3700_spi;
+	unsigned int byte_len;
+
+	a3700_spi = spi_master_get_devdata(spi->master);
+
+	a3700_spi_clock_set(a3700_spi, xfer->speed_hz, spi->mode);
+
+	byte_len = xfer->bits_per_word >> 3;
+
+	a3700_spi_fifo_thres_set(a3700_spi, byte_len);
+}
+
+static void a3700_spi_set_cs(struct spi_device *spi, bool enable)
+{
+	struct a3700_spi *a3700_spi = spi_master_get_devdata(spi->master);
+
+	if (!enable)
+		a3700_spi_activate_cs(a3700_spi, spi->chip_select);
+	else
+		a3700_spi_deactivate_cs(a3700_spi, spi->chip_select);
+}
+
+static void a3700_spi_header_set(struct a3700_spi *a3700_spi)
+{
+	u32 instr_cnt = 0, addr_cnt = 0, dummy_cnt = 0;
+	u32 val = 0;
+
+	/* Clear the header registers */
+	spireg_write(a3700_spi, A3700_SPI_IF_INST_REG, 0);
+	spireg_write(a3700_spi, A3700_SPI_IF_ADDR_REG, 0);
+	spireg_write(a3700_spi, A3700_SPI_IF_RMODE_REG, 0);
+
+	/* Set header counters */
+	if (a3700_spi->tx_buf) {
+		if (a3700_spi->buf_len <= a3700_spi->instr_cnt) {
+			instr_cnt = a3700_spi->buf_len;
+		} else if (a3700_spi->buf_len <= (a3700_spi->instr_cnt +
+						  a3700_spi->addr_cnt)) {
+			instr_cnt = a3700_spi->instr_cnt;
+			addr_cnt = a3700_spi->buf_len - instr_cnt;
+		} else if (a3700_spi->buf_len <= a3700_spi->hdr_cnt) {
+			instr_cnt = a3700_spi->instr_cnt;
+			addr_cnt = a3700_spi->addr_cnt;
+			/* Need to handle the normal write case with 1 byte
+			 * data
+			 */
+			if (!a3700_spi->tx_buf[instr_cnt + addr_cnt])
+				dummy_cnt = a3700_spi->buf_len - instr_cnt -
+					    addr_cnt;
+		}
+		val |= ((instr_cnt & A3700_SPI_INSTR_CNT_MASK)
+			<< A3700_SPI_INSTR_CNT_BIT);
+		val |= ((addr_cnt & A3700_SPI_ADDR_CNT_MASK)
+			<< A3700_SPI_ADDR_CNT_BIT);
+		val |= ((dummy_cnt & A3700_SPI_DUMMY_CNT_MASK)
+			<< A3700_SPI_DUMMY_CNT_BIT);
+	}
+	spireg_write(a3700_spi, A3700_SPI_IF_HDR_CNT_REG, val);
+
+	/* Update the buffer length to be transferred */
+	a3700_spi->buf_len -= (instr_cnt + addr_cnt + dummy_cnt);
+
+	/* Set Instruction */
+	val = 0;
+	while (instr_cnt--) {
+		val = (val << 8) | a3700_spi->tx_buf[0];
+		a3700_spi->tx_buf++;
+	}
+	spireg_write(a3700_spi, A3700_SPI_IF_INST_REG, val);
+
+	/* Set Address */
+	val = 0;
+	while (addr_cnt--) {
+		val = (val << 8) | a3700_spi->tx_buf[0];
+		a3700_spi->tx_buf++;
+	}
+	spireg_write(a3700_spi, A3700_SPI_IF_ADDR_REG, val);
+}
+
+static int a3700_is_wfifo_full(struct a3700_spi *a3700_spi)
+{
+	u32 val;
+
+	val = spireg_read(a3700_spi, A3700_SPI_IF_CTRL_REG);
+	return (val & A3700_SPI_WFIFO_FULL);
+}
+
+static int a3700_spi_fifo_write(struct a3700_spi *a3700_spi)
+{
+	u32 val;
+	int i = 0;
+
+	while (!a3700_is_wfifo_full(a3700_spi) && a3700_spi->buf_len) {
+		val = 0;
+		if (a3700_spi->buf_len >= 4) {
+			val = cpu_to_le32(*(u32 *)a3700_spi->tx_buf);
+			spireg_write(a3700_spi, A3700_SPI_DATA_OUT_REG, val);
+
+			a3700_spi->buf_len -= 4;
+			a3700_spi->tx_buf += 4;
+		} else {
+			/*
+			 * If the remained buffer length is less than 4-bytes,
+			 * we should pad the write buffer with all ones. So that
+			 * it avoids overwrite the unexpected bytes following
+			 * the last one.
+			 */
+			val = GENMASK(31, 0);
+			while (a3700_spi->buf_len) {
+				val &= ~(0xff << (8 * i));
+				val |= *a3700_spi->tx_buf++ << (8 * i);
+				i++;
+				a3700_spi->buf_len--;
+
+				spireg_write(a3700_spi, A3700_SPI_DATA_OUT_REG,
+					     val);
+			}
+			break;
+		}
+	}
+
+	return 0;
+}
+
+static int a3700_is_rfifo_empty(struct a3700_spi *a3700_spi)
+{
+	u32 val = spireg_read(a3700_spi, A3700_SPI_IF_CTRL_REG);
+
+	return (val & A3700_SPI_RFIFO_EMPTY);
+}
+
+static int a3700_spi_fifo_read(struct a3700_spi *a3700_spi)
+{
+	u32 val;
+
+	while (!a3700_is_rfifo_empty(a3700_spi) && a3700_spi->buf_len) {
+		val = spireg_read(a3700_spi, A3700_SPI_DATA_IN_REG);
+		if (a3700_spi->buf_len >= 4) {
+			u32 data = le32_to_cpu(val);
+			memcpy(a3700_spi->rx_buf, &data, 4);
+
+			a3700_spi->buf_len -= 4;
+			a3700_spi->rx_buf += 4;
+		} else {
+			/*
+			 * When remain bytes is not larger than 4, we should
+			 * avoid memory overwriting and just write the left rx
+			 * buffer bytes.
+			 */
+			while (a3700_spi->buf_len) {
+				*a3700_spi->rx_buf = val & 0xff;
+				val >>= 8;
+
+				a3700_spi->buf_len--;
+				a3700_spi->rx_buf++;
+			}
+		}
+	}
+
+	return 0;
+}
+
+static void a3700_spi_transfer_abort_fifo(struct a3700_spi *a3700_spi)
+{
+	int timeout = A3700_SPI_TIMEOUT;
+	u32 val;
+
+	val = spireg_read(a3700_spi, A3700_SPI_IF_CFG_REG);
+	val |= A3700_SPI_XFER_STOP;
+	spireg_write(a3700_spi, A3700_SPI_IF_CFG_REG, val);
+
+	while (--timeout) {
+		val = spireg_read(a3700_spi, A3700_SPI_IF_CFG_REG);
+		if (!(val & A3700_SPI_XFER_START))
+			break;
+		udelay(1);
+	}
+
+	a3700_spi_fifo_flush(a3700_spi);
+
+	val &= ~A3700_SPI_XFER_STOP;
+	spireg_write(a3700_spi, A3700_SPI_IF_CFG_REG, val);
+}
+
+static int a3700_spi_prepare_message(struct spi_master *master,
+				     struct spi_message *message)
+{
+	struct a3700_spi *a3700_spi = spi_master_get_devdata(master);
+	struct spi_device *spi = message->spi;
+	int ret;
+
+	ret = clk_enable(a3700_spi->clk);
+	if (ret) {
+		dev_err(&spi->dev, "failed to enable clk with error %d\n", ret);
+		return ret;
+	}
+
+	/* Flush the FIFOs */
+	ret = a3700_spi_fifo_flush(a3700_spi);
+	if (ret)
+		return ret;
+
+	a3700_spi_bytelen_set(a3700_spi, 4);
+
+	return 0;
+}
+
+static int a3700_spi_transfer_one(struct spi_master *master,
+				  struct spi_device *spi,
+				  struct spi_transfer *xfer)
+{
+	struct a3700_spi *a3700_spi = spi_master_get_devdata(master);
+	int ret = 0, timeout = A3700_SPI_TIMEOUT;
+	unsigned int nbits = 0;
+	u32 val;
+
+	a3700_spi_transfer_setup(spi, xfer);
+
+	a3700_spi->tx_buf  = xfer->tx_buf;
+	a3700_spi->rx_buf  = xfer->rx_buf;
+	a3700_spi->buf_len = xfer->len;
+
+	/* SPI transfer headers */
+	a3700_spi_header_set(a3700_spi);
+
+	if (xfer->tx_buf)
+		nbits = xfer->tx_nbits;
+	else if (xfer->rx_buf)
+		nbits = xfer->rx_nbits;
+
+	a3700_spi_pin_mode_set(a3700_spi, nbits);
+
+	if (xfer->rx_buf) {
+		/* Set read data length */
+		spireg_write(a3700_spi, A3700_SPI_IF_DIN_CNT_REG,
+			     a3700_spi->buf_len);
+		/* Start READ transfer */
+		val = spireg_read(a3700_spi, A3700_SPI_IF_CFG_REG);
+		val &= ~A3700_SPI_RW_EN;
+		val |= A3700_SPI_XFER_START;
+		spireg_write(a3700_spi, A3700_SPI_IF_CFG_REG, val);
+	} else if (xfer->tx_buf) {
+		/* Start Write transfer */
+		val = spireg_read(a3700_spi, A3700_SPI_IF_CFG_REG);
+		val |= (A3700_SPI_XFER_START | A3700_SPI_RW_EN);
+		spireg_write(a3700_spi, A3700_SPI_IF_CFG_REG, val);
+
+		/*
+		 * If there are data to be written to the SPI device, xmit_data
+		 * flag is set true; otherwise the instruction in SPI_INSTR does
+		 * not require data to be written to the SPI device, then
+		 * xmit_data flag is set false.
+		 */
+		a3700_spi->xmit_data = (a3700_spi->buf_len != 0);
+	}
+
+	while (a3700_spi->buf_len) {
+		if (a3700_spi->tx_buf) {
+			/* Wait wfifo ready */
+			if (!a3700_spi_transfer_wait(spi,
+						     A3700_SPI_WFIFO_RDY)) {
+				dev_err(&spi->dev,
+					"wait wfifo ready timed out\n");
+				ret = -ETIMEDOUT;
+				goto error;
+			}
+			/* Fill up the wfifo */
+			ret = a3700_spi_fifo_write(a3700_spi);
+			if (ret)
+				goto error;
+		} else if (a3700_spi->rx_buf) {
+			/* Wait rfifo ready */
+			if (!a3700_spi_transfer_wait(spi,
+						     A3700_SPI_RFIFO_RDY)) {
+				dev_err(&spi->dev,
+					"wait rfifo ready timed out\n");
+				ret = -ETIMEDOUT;
+				goto error;
+			}
+			/* Drain out the rfifo */
+			ret = a3700_spi_fifo_read(a3700_spi);
+			if (ret)
+				goto error;
+		}
+	}
+
+	/*
+	 * Stop a write transfer in fifo mode:
+	 *	- wait all the bytes in wfifo to be shifted out
+	 *	 - set XFER_STOP bit
+	 *	- wait XFER_START bit clear
+	 *	- clear XFER_STOP bit
+	 * Stop a read transfer in fifo mode:
+	 *	- the hardware is to reset the XFER_START bit
+	 *	   after the number of bytes indicated in DIN_CNT
+	 *	   register
+	 *	- just wait XFER_START bit clear
+	 */
+	if (a3700_spi->tx_buf) {
+		if (a3700_spi->xmit_data) {
+			/*
+			 * If there are data written to the SPI device, wait
+			 * until SPI_WFIFO_EMPTY is 1 to wait for all data to
+			 * transfer out of write FIFO.
+			 */
+			if (!a3700_spi_transfer_wait(spi,
+						     A3700_SPI_WFIFO_EMPTY)) {
+				dev_err(&spi->dev, "wait wfifo empty timed out\n");
+				return -ETIMEDOUT;
+			}
+		} else {
+			/*
+			 * If the instruction in SPI_INSTR does not require data
+			 * to be written to the SPI device, wait until SPI_RDY
+			 * is 1 for the SPI interface to be in idle.
+			 */
+			if (!a3700_spi_transfer_wait(spi, A3700_SPI_XFER_RDY)) {
+				dev_err(&spi->dev, "wait xfer ready timed out\n");
+				return -ETIMEDOUT;
+			}
+		}
+
+		val = spireg_read(a3700_spi, A3700_SPI_IF_CFG_REG);
+		val |= A3700_SPI_XFER_STOP;
+		spireg_write(a3700_spi, A3700_SPI_IF_CFG_REG, val);
+	}
+
+	while (--timeout) {
+		val = spireg_read(a3700_spi, A3700_SPI_IF_CFG_REG);
+		if (!(val & A3700_SPI_XFER_START))
+			break;
+		udelay(1);
+	}
+
+	if (timeout == 0) {
+		dev_err(&spi->dev, "wait transfer start clear timed out\n");
+		ret = -ETIMEDOUT;
+		goto error;
+	}
+
+	val &= ~A3700_SPI_XFER_STOP;
+	spireg_write(a3700_spi, A3700_SPI_IF_CFG_REG, val);
+	goto out;
+
+error:
+	a3700_spi_transfer_abort_fifo(a3700_spi);
+out:
+	spi_finalize_current_transfer(master);
+
+	return ret;
+}
+
+static int a3700_spi_unprepare_message(struct spi_master *master,
+				       struct spi_message *message)
+{
+	struct a3700_spi *a3700_spi = spi_master_get_devdata(master);
+
+	clk_disable(a3700_spi->clk);
+
+	return 0;
+}
+
+static const struct of_device_id a3700_spi_dt_ids[] = {
+	{ .compatible = "marvell,armada-3700-spi", .data = NULL },
+	{},
+};
+
+MODULE_DEVICE_TABLE(of, a3700_spi_dt_ids);
+
+static int a3700_spi_probe(struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct device_node *of_node = dev->of_node;
+	struct resource *res;
+	struct spi_master *master;
+	struct a3700_spi *spi;
+	u32 num_cs = 0;
+	int ret = 0;
+
+	master = spi_alloc_master(dev, sizeof(*spi));
+	if (!master) {
+		dev_err(dev, "master allocation failed\n");
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	if (of_property_read_u32(of_node, "num-cs", &num_cs)) {
+		dev_err(dev, "could not find num-cs\n");
+		ret = -ENXIO;
+		goto error;
+	}
+
+	master->bus_num = pdev->id;
+	master->dev.of_node = of_node;
+	master->mode_bits = SPI_MODE_3;
+	master->num_chipselect = num_cs;
+	master->bits_per_word_mask = SPI_BPW_MASK(8) | SPI_BPW_MASK(32);
+	master->prepare_message =  a3700_spi_prepare_message;
+	master->transfer_one = a3700_spi_transfer_one;
+	master->unprepare_message = a3700_spi_unprepare_message;
+	master->set_cs = a3700_spi_set_cs;
+	master->flags = SPI_MASTER_HALF_DUPLEX;
+	master->mode_bits |= (SPI_RX_DUAL | SPI_RX_DUAL |
+			      SPI_RX_QUAD | SPI_TX_QUAD);
+
+	platform_set_drvdata(pdev, master);
+
+	spi = spi_master_get_devdata(master);
+	memset(spi, 0, sizeof(struct a3700_spi));
+
+	spi->master = master;
+	spi->instr_cnt = A3700_INSTR_CNT;
+	spi->addr_cnt = A3700_ADDR_CNT;
+	spi->hdr_cnt = A3700_INSTR_CNT + A3700_ADDR_CNT +
+		       A3700_DUMMY_CNT;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	spi->base = devm_ioremap_resource(dev, res);
+	if (IS_ERR(spi->base)) {
+		ret = PTR_ERR(spi->base);
+		goto error;
+	}
+
+	spi->irq = platform_get_irq(pdev, 0);
+	if (spi->irq < 0) {
+		dev_err(dev, "could not get irq: %d\n", spi->irq);
+		ret = -ENXIO;
+		goto error;
+	}
+
+	init_completion(&spi->done);
+
+	spi->clk = devm_clk_get(dev, NULL);
+	if (IS_ERR(spi->clk)) {
+		dev_err(dev, "could not find clk: %ld\n", PTR_ERR(spi->clk));
+		goto error;
+	}
+
+	ret = clk_prepare(spi->clk);
+	if (ret) {
+		dev_err(dev, "could not prepare clk: %d\n", ret);
+		goto error;
+	}
+
+	ret = a3700_spi_init(spi);
+	if (ret)
+		goto error_clk;
+
+	ret = devm_request_irq(dev, spi->irq, a3700_spi_interrupt, 0,
+			       dev_name(dev), master);
+	if (ret) {
+		dev_err(dev, "could not request IRQ: %d\n", ret);
+		goto error_clk;
+	}
+
+	ret = devm_spi_register_master(dev, master);
+	if (ret) {
+		dev_err(dev, "Failed to register master\n");
+		goto error_clk;
+	}
+
+	return 0;
+
+error_clk:
+	clk_disable_unprepare(spi->clk);
+error:
+	spi_master_put(master);
+out:
+	return ret;
+}
+
+static int a3700_spi_remove(struct platform_device *pdev)
+{
+	struct spi_master *master = platform_get_drvdata(pdev);
+	struct a3700_spi *spi = spi_master_get_devdata(master);
+
+	clk_unprepare(spi->clk);
+	spi_master_put(master);
+
+	return 0;
+}
+
+static struct platform_driver a3700_spi_driver = {
+	.driver = {
+		.name	= DRIVER_NAME,
+		.owner	= THIS_MODULE,
+		.of_match_table = of_match_ptr(a3700_spi_dt_ids),
+	},
+	.probe		= a3700_spi_probe,
+	.remove		= a3700_spi_remove,
+};
+
+module_platform_driver(a3700_spi_driver);
+
+MODULE_DESCRIPTION("Armada-3700 SPI driver");
+MODULE_AUTHOR("Wilson Ding <dingwei@marvell.com>");
+MODULE_LICENSE("GPL");
+MODULE_ALIAS("platform:" DRIVER_NAME);
-- 
2.10.2

^ permalink raw reply related

* Applied "spi: armada-3700: Add documentation for the Armada 3700 SPI Controller" to the spi tree
From: Mark Brown @ 2016-12-08 16:55 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161129143939.3191-4-romain.perier@free-electrons.com>

The patch

   spi: armada-3700: Add documentation for the Armada 3700 SPI Controller

has been applied to the spi tree at

   git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git 

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.  

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark

>From 4049537742b3ed39fac4da10d31f3171a2ee9a3e Mon Sep 17 00:00:00 2001
From: Romain Perier <romain.perier@free-electrons.com>
Date: Thu, 8 Dec 2016 15:58:45 +0100
Subject: [PATCH] spi: armada-3700: Add documentation for the Armada 3700 SPI
 Controller

This adds the devicetree bindings documentation for the SPI controller
present in the Marvell Armada 3700 SoCs.

Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Tested-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
---
 .../devicetree/bindings/spi/spi-armada-3700.txt    | 25 ++++++++++++++++++++++
 1 file changed, 25 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/spi/spi-armada-3700.txt

diff --git a/Documentation/devicetree/bindings/spi/spi-armada-3700.txt b/Documentation/devicetree/bindings/spi/spi-armada-3700.txt
new file mode 100644
index 000000000000..1564aa8c02cd
--- /dev/null
+++ b/Documentation/devicetree/bindings/spi/spi-armada-3700.txt
@@ -0,0 +1,25 @@
+* Marvell Armada 3700 SPI Controller
+
+Required Properties:
+
+- compatible: should be "marvell,armada-3700-spi"
+- reg: physical base address of the controller and length of memory mapped
+       region.
+- interrupts: The interrupt number. The interrupt specifier format depends on
+	      the interrupt controller and of its driver.
+- clocks: Must contain the clock source, usually from the North Bridge clocks.
+- num-cs: The number of chip selects that is supported by this SPI Controller
+- #address-cells: should be 1.
+- #size-cells: should be 0.
+
+Example:
+
+	spi0: spi at 10600 {
+		compatible = "marvell,armada-3700-spi";
+		#address-cells = <1>;
+		#size-cells = <0>;
+		reg = <0x10600 0x5d>;
+		clocks = <&nb_perih_clk 7>;
+		interrupts = <GIC_SPI 0 IRQ_TYPE_LEVEL_HIGH>;
+		num-cs = <4>;
+	};
-- 
2.10.2

^ permalink raw reply related

* [RFC v3 00/10] KVM PCIe/MSI passthrough on ARM/ARM64 and IOVA reserved regions
From: Alex Williamson @ 2016-12-08 17:01 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <cd16fc5c-8649-0bfa-d67d-8f257aa38bd6@arm.com>

On Thu, 8 Dec 2016 13:14:04 +0000
Robin Murphy <robin.murphy@arm.com> wrote:
> On 08/12/16 09:36, Auger Eric wrote:
> > 3) RMRR reporting in the iommu group sysfs? Joerg: yes; Don: no
> >    My current series does not expose them in iommu group sysfs.
> >    I understand we can expose the RMRR regions in the iomm group sysfs
> >    without necessarily supporting RMRR requiring device assignment.
> >    We can also add this support later.  
> 
> As you say, reporting them doesn't necessitate allowing device
> assignment, and it's information which can already be easily grovelled
> out of dmesg (for intel-iommu at least) - there doesn't seem to be any
> need to hide them, but the x86 folks can have the final word on that.

Eric and I talked about this and I don't see the value in identifying
an RMRR as anything other than a reserved range for a device.  It's not
userspace's job to maintain an identify mapped range for the device,
and it can't be trusted to do so anyway.  It does throw a kink in the
machinery though as an RMRR is a reserved memory range unique to a
device.  It doesn't really fit into a monolithic /sys/class/iommu view
of global reserved regions as an RMRR is only relevant to the device
paths affected.

Another kink is that sometimes we know what the RMRR is for, know that
it's irrelevant for our use case, and ignore it.  This is true for USB
and Intel graphics use cases of RMRRs.

Also, aside from the above mentioned cases, devices with RMRRs are
currently excluded from participating in the IOMMU API by the
intel-iommu driver and I expect this to continue in the general case
regardless of whether the ranges are more easily exposed to userspace.
ARM may have to deal with mangling a guest memory map due to lack of
any standard layout, de facto or otherwise, but for x86 I don't think
it's worth the migration and hotplug implications.  Thanks,

Alex

^ permalink raw reply

* [RFC PATCH v3 07/11] arm64: compat: Add a 32-bit vDSO
From: Nathan Lynch @ 2016-12-08 17:02 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <0707aecb-6638-2447-cf4a-9c30fc08dae5@arm.com>

Kevin Brodsky <kevin.brodsky@arm.com> writes:
> On 07/12/16 18:51, Nathan Lynch wrote:
>> Kevin Brodsky <kevin.brodsky@arm.com> writes:
>>> diff --git a/arch/arm64/kernel/vdso32/vgettimeofday.c b/arch/arm64/kernel/vdso32/vgettimeofday.c
>>> new file mode 100644
>>> index 000000000000..53c3d1f82b26
>>> --- /dev/null
>>> +++ b/arch/arm64/kernel/vdso32/vgettimeofday.c
>> As I said at LPC last month, I'm not excited to have arch/arm's
>> vgettimeofday.c copied into arch/arm64 and tweaked; I'd rather see this
>> part of the implementation shared between arch/arm and arch/arm64
>> somehow, even if there's not an elegant way to do so.
>>
>> The situation which must be avoided is one where the arch/arm64 compat
>> VDSO incompatibly diverges from the arch/arm VDSO.  That becomes much
>> less likely if there's only one copy of the userspace-exposed code to
>> maintain.
>
> I still agree this is very suboptimal. However, I also think this is by far the most 
> straightforward solution, and I would like to stick to it *as a first step*. If you 
> diff this "tweaked" vgettimeofday.c with the original one, you'll see that it is 
> really not obvious to share even parts of vgettimeofday.c in the current state of arm 
> and arm64.

After adapting your get_vdso_data() to arch/arm/vdso, I come up with the
differences below, which seems surmountable mostly with the addition of
appropriate accessor functions for the data page, or changing the field
names to match.  So I can't say I'm persuaded.

I'm happy to review and facilitate changes to code in arch/arm/vdso to
make it possible to share it with arm64 for its compat VDSO.

 vgettimeofday.c |   84 ++++++++++++++++++++++++--------------------------------
 1 file changed, 37 insertions(+), 47 deletions(-)

--- arch/arm/vdso/vgettimeofday.c	2016-12-07 11:24:35.043856904 -0600
+++ kb-vgettimeofday.c	2016-12-08 10:02:14.031896779 -0600
@@ -15,20 +15,23 @@
  * along with this program.  If not, see <http://www.gnu.org/licenses/>.
  */
 
+#include <linux/clocksource.h>
 #include <linux/compiler.h>
-#include <linux/hrtimer.h>
 #include <linux/time.h>
-#include <asm/arch_timer.h>
-#include <asm/barrier.h>
-#include <asm/bug.h>
-#include <asm/page.h>
 #include <asm/unistd.h>
 #include <asm/vdso_datapage.h>
 
-#ifndef CONFIG_AEABI
-#error This code depends on AEABI system call conventions
-#endif
+#include "aarch32-barrier.h"
 
+/*
+ * We use the hidden visibility to prevent the compiler from generating a GOT
+ * relocation. Not only is going through a GOT useless (the entry couldn't and
+ * musn't be overridden by another library), it does not even work: the linker
+ * cannot generate an absolute address to the data page.
+ *
+ * With the hidden visibility, the compiler simply generates a PC-relative
+ * relocation (R_ARM_REL32), and this is what we need.
+ */
 extern const struct vdso_data _vdso_data __attribute__((visibility("hidden")));
 
 static inline const struct vdso_data *get_vdso_data(void)
@@ -52,13 +55,11 @@
 	return ret;
 }
 
-#define __get_datapage() get_vdso_data()
-
 static notrace u32 __vdso_read_begin(const struct vdso_data *vdata)
 {
 	u32 seq;
 repeat:
-	seq = ACCESS_ONCE(vdata->seq_count);
+	seq = ACCESS_ONCE(vdata->tb_seq_count);
 	if (seq & 1) {
 		cpu_relax();
 		goto repeat;
@@ -72,26 +73,30 @@
 
 	seq = __vdso_read_begin(vdata);
 
-	smp_rmb(); /* Pairs with smp_wmb in vdso_write_end */
+	aarch32_smp_rmb();
 	return seq;
 }
 
 static notrace int vdso_read_retry(const struct vdso_data *vdata, u32 start)
 {
-	smp_rmb(); /* Pairs with smp_wmb in vdso_write_begin */
-	return vdata->seq_count != start;
+	aarch32_smp_rmb();
+	return vdata->tb_seq_count != start;
 }
 
+/*
+ * Note: only AEABI is supported by the compat layer, we can assume AEABI
+ * syscall conventions are used.
+ */
 static notrace long clock_gettime_fallback(clockid_t _clkid,
 					   struct timespec *_ts)
 {
 	register struct timespec *ts asm("r1") = _ts;
 	register clockid_t clkid asm("r0") = _clkid;
 	register long ret asm ("r0");
-	register long nr asm("r7") = __NR_clock_gettime;
+	register long nr asm("r7") = __NR_compat_clock_gettime;
 
 	asm volatile(
-	"	swi #0\n"
+	"	svc #0\n"
 	: "=r" (ret)
 	: "r" (clkid), "r" (ts), "r" (nr)
 	: "memory");
@@ -138,25 +143,27 @@
 	return 0;
 }
 
-#ifdef CONFIG_ARM_ARCH_TIMER
-
 static notrace u64 get_ns(const struct vdso_data *vdata)
 {
 	u64 cycle_delta;
 	u64 cycle_now;
 	u64 nsec;
 
-	cycle_now = arch_counter_get_cntvct();
+	/* AArch32 implementation of arch_counter_get_cntvct() */
+	isb();
+	asm volatile("mrrc p15, 1, %Q0, %R0, c14" : "=r" (cycle_now));
 
-	cycle_delta = (cycle_now - vdata->cs_cycle_last) & vdata->cs_mask;
+	/* The virtual counter provides 56 significant bits. */
+	cycle_delta = (cycle_now - vdata->cs_cycle_last) & CLOCKSOURCE_MASK(56);
 
-	nsec = (cycle_delta * vdata->cs_mult) + vdata->xtime_clock_snsec;
+	nsec = (cycle_delta * vdata->cs_mono_mult) + vdata->xtime_clock_nsec;
 	nsec >>= vdata->cs_shift;
 
 	return nsec;
 }
 
-static notrace int do_realtime(struct timespec *ts, const struct vdso_data *vdata)
+static notrace int do_realtime(struct timespec *ts,
+			       const struct vdso_data *vdata)
 {
 	u64 nsecs;
 	u32 seq;
@@ -164,7 +171,7 @@
 	do {
 		seq = vdso_read_begin(vdata);
 
-		if (!vdata->tk_is_cntvct)
+		if (vdata->use_syscall)
 			return -1;
 
 		ts->tv_sec = vdata->xtime_clock_sec;
@@ -178,7 +185,8 @@
 	return 0;
 }
 
-static notrace int do_monotonic(struct timespec *ts, const struct vdso_data *vdata)
+static notrace int do_monotonic(struct timespec *ts,
+				const struct vdso_data *vdata)
 {
 	struct timespec tomono;
 	u64 nsecs;
@@ -187,7 +195,7 @@
 	do {
 		seq = vdso_read_begin(vdata);
 
-		if (!vdata->tk_is_cntvct)
+		if (vdata->use_syscall)
 			return -1;
 
 		ts->tv_sec = vdata->xtime_clock_sec;
@@ -205,27 +213,11 @@
 	return 0;
 }
 
-#else /* CONFIG_ARM_ARCH_TIMER */
-
-static notrace int do_realtime(struct timespec *ts, const struct vdso_data *vdata)
-{
-	return -1;
-}
-
-static notrace int do_monotonic(struct timespec *ts, const struct vdso_data *vdata)
-{
-	return -1;
-}
-
-#endif /* CONFIG_ARM_ARCH_TIMER */
-
 notrace int __vdso_clock_gettime(clockid_t clkid, struct timespec *ts)
 {
-	const struct vdso_data *vdata;
+	const struct vdso_data *vdata = get_vdso_data();
 	int ret = -1;
 
-	vdata = __get_datapage();
-
 	switch (clkid) {
 	case CLOCK_REALTIME_COARSE:
 		ret = do_realtime_coarse(ts, vdata);
@@ -255,10 +247,10 @@
 	register struct timezone *tz asm("r1") = _tz;
 	register struct timeval *tv asm("r0") = _tv;
 	register long ret asm ("r0");
-	register long nr asm("r7") = __NR_gettimeofday;
+	register long nr asm("r7") = __NR_compat_gettimeofday;
 
 	asm volatile(
-	"	swi #0\n"
+	"	svc #0\n"
 	: "=r" (ret)
 	: "r" (tv), "r" (tz), "r" (nr)
 	: "memory");
@@ -269,11 +261,9 @@
 notrace int __vdso_gettimeofday(struct timeval *tv, struct timezone *tz)
 {
 	struct timespec ts;
-	const struct vdso_data *vdata;
+	const struct vdso_data *vdata = get_vdso_data();
 	int ret;
 
-	vdata = __get_datapage();
-
 	ret = do_realtime(&ts, vdata);
 	if (ret)
 		return gettimeofday_fallback(tv, tz);

^ permalink raw reply

* [PATCH 0/3] rtc: armada38x: Few improvement and cleanup
From: Gregory CLEMENT @ 2016-12-08 17:10 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

this series brings some improvement and cleanup for the armada 38x
RTC.

The errata for this RTC gave us more information on how to work around
it. The first patch implement it.

The second patch convert the driver to the time64_t usage.

And the last patch make the driver really usable on 64 bits
architecture.

These last two patches are a preliminary step allowing using this
driver on the Armada 7K/8K which are 64 bits ARM SoCs which use the
same RTC IP.

Thanks,

Gregory

Gregory CLEMENT (2):
  rtc: armada38x: Convert to time64_t
  rtc: armada38x: Prepare for being use on 64 bits

Shaker Daibes (1):
  rtc: armada38x: improve RTC errata implementation

 drivers/rtc/rtc-armada38x.c | 145 ++++++++++++++++++++++++++++++--------------
 1 file changed, 100 insertions(+), 45 deletions(-)

-- 
2.10.2

^ permalink raw reply

* [PATCH 1/3] rtc: armada38x: improve RTC errata implementation
From: Gregory CLEMENT @ 2016-12-08 17:10 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161208171010.29446-1-gregory.clement@free-electrons.com>

From: Shaker Daibes <shaker@marvell.com>

According to FE-3124064:
The device supports CPU write and read access to the RTC time register.
However, due to this errata, read from RTC TIME register may fail.

Workaround:
General configuration:
1. Configure the RTC Mbus Bridge Timing Control register (offset 0x184A0)
   to value 0xFD4D4FFF
   Write RTC WRCLK Period to its maximum value (0x3FF)
   Write RTC WRCLK setup to 0x53 (default value )
   Write RTC WRCLK High Time to 0x53 (default value )
   Write RTC Read Output Delay to its maximum value (0x1F)
   Mbus - Read All Byte Enable to 0x1 (default value )
2. Configure the RTC Test Configuration Register (offset 0xA381C) bit3
   to '1' (Reserved, Marvell internal)

For any RTC register read operation:
1. Read the requested register 100 times.
2. Find the result that appears most frequently and use this result
   as the correct value.

For any RTC register write operation:
1. Issue two dummy writes of 0x0 to the RTC Status register (offset
   0xA3800).
2. Write the time to the RTC Time register (offset 0xA380C).

[gregory.clement at free-electrons.com: cosmetic changes and fix issues for
interrupt in original patch]

Reviewed-by: Lior Amsalem <alior@marvell.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
---
 drivers/rtc/rtc-armada38x.c | 109 ++++++++++++++++++++++++++++++++++----------
 1 file changed, 85 insertions(+), 24 deletions(-)

diff --git a/drivers/rtc/rtc-armada38x.c b/drivers/rtc/rtc-armada38x.c
index 9a3f2a6f512e..a0859286a4c4 100644
--- a/drivers/rtc/rtc-armada38x.c
+++ b/drivers/rtc/rtc-armada38x.c
@@ -29,12 +29,21 @@
 #define RTC_TIME	    0xC
 #define RTC_ALARM1	    0x10
 
+#define SOC_RTC_BRIDGE_TIMING_CTL   0x0
+#define SOC_RTC_PERIOD_OFFS		0
+#define SOC_RTC_PERIOD_MASK		(0x3FF << SOC_RTC_PERIOD_OFFS)
+#define SOC_RTC_READ_DELAY_OFFS		26
+#define SOC_RTC_READ_DELAY_MASK		(0x1F << SOC_RTC_READ_DELAY_OFFS)
+
 #define SOC_RTC_INTERRUPT   0x8
 #define SOC_RTC_ALARM1		BIT(0)
 #define SOC_RTC_ALARM2		BIT(1)
 #define SOC_RTC_ALARM1_MASK	BIT(2)
 #define SOC_RTC_ALARM2_MASK	BIT(3)
 
+
+#define SAMPLE_NR 100
+
 struct armada38x_rtc {
 	struct rtc_device   *rtc_dev;
 	void __iomem	    *regs;
@@ -47,32 +56,85 @@ struct armada38x_rtc {
  * According to the datasheet, the OS should wait 5us after every
  * register write to the RTC hard macro so that the required update
  * can occur without holding off the system bus
+ * According to errata FE-3124064, Write to any RTC register
+ * may fail. As a workaround, before writing to RTC
+ * register, issue a dummy write of 0x0 twice to RTC Status
+ * register.
  */
+
 static void rtc_delayed_write(u32 val, struct armada38x_rtc *rtc, int offset)
 {
+	writel(0, rtc->regs + RTC_STATUS);
+	writel(0, rtc->regs + RTC_STATUS);
 	writel(val, rtc->regs + offset);
 	udelay(5);
 }
 
+/* Update RTC-MBUS bridge timing parameters */
+static void rtc_update_mbus_timing_params(struct armada38x_rtc *rtc)
+{
+	uint32_t reg;
+
+	reg = readl(rtc->regs_soc + SOC_RTC_BRIDGE_TIMING_CTL);
+	reg &= ~SOC_RTC_PERIOD_MASK;
+	reg |= 0x3FF << SOC_RTC_PERIOD_OFFS; /* Maximum value */
+	reg &= ~SOC_RTC_READ_DELAY_MASK;
+	reg |= 0x1F << SOC_RTC_READ_DELAY_OFFS; /* Maximum value */
+	writel(reg, rtc->regs_soc + SOC_RTC_BRIDGE_TIMING_CTL);
+}
+
+struct str_value_to_freq {
+	unsigned long value;
+	u8 freq;
+} __packed;
+
+static unsigned long read_rtc_register_wa(struct armada38x_rtc *rtc, u8 rtc_reg)
+{
+	unsigned long value_array[SAMPLE_NR], i, j, value;
+	unsigned long max = 0, index_max = SAMPLE_NR - 1;
+	struct str_value_to_freq value_to_freq[SAMPLE_NR];
+
+	for (i = 0; i < SAMPLE_NR; i++) {
+		value_to_freq[i].freq = 0;
+		value_array[i] = readl(rtc->regs + rtc_reg);
+	}
+	for (i = 0; i < SAMPLE_NR; i++) {
+		value = value_array[i];
+		/*
+		 * if value appears in value_to_freq array so add the
+		 * counter of value, if didn't appear yet in counters
+		 * array then allocate new member of value_to_freq
+		 * array with counter = 1
+		 */
+		for (j = 0; j < SAMPLE_NR; j++) {
+			if (value_to_freq[j].freq == 0 ||
+					value_to_freq[j].value == value)
+				break;
+			if (j == (SAMPLE_NR - 1))
+				break;
+		}
+		if (value_to_freq[j].freq == 0)
+			value_to_freq[j].value = value;
+		value_to_freq[j].freq++;
+		/* find the most common result */
+		if (max < value_to_freq[j].freq) {
+			index_max = j;
+			max = value_to_freq[j].freq;
+		}
+	}
+	return value_to_freq[index_max].value;
+}
+
 static int armada38x_rtc_read_time(struct device *dev, struct rtc_time *tm)
 {
 	struct armada38x_rtc *rtc = dev_get_drvdata(dev);
-	unsigned long time, time_check, flags;
+	unsigned long time, flags;
 
 	spin_lock_irqsave(&rtc->lock, flags);
-	time = readl(rtc->regs + RTC_TIME);
-	/*
-	 * WA for failing time set attempts. As stated in HW ERRATA if
-	 * more than one second between two time reads is detected
-	 * then read once again.
-	 */
-	time_check = readl(rtc->regs + RTC_TIME);
-	if ((time_check - time) > 1)
-		time_check = readl(rtc->regs + RTC_TIME);
-
+	time = read_rtc_register_wa(rtc, RTC_TIME);
 	spin_unlock_irqrestore(&rtc->lock, flags);
 
-	rtc_time_to_tm(time_check, tm);
+	rtc_time_to_tm(time, tm);
 
 	return 0;
 }
@@ -87,16 +149,9 @@ static int armada38x_rtc_set_time(struct device *dev, struct rtc_time *tm)
 
 	if (ret)
 		goto out;
-	/*
-	 * According to errata FE-3124064, Write to RTC TIME register
-	 * may fail. As a workaround, after writing to RTC TIME
-	 * register, issue a dummy write of 0x0 twice to RTC Status
-	 * register.
-	 */
+
 	spin_lock_irqsave(&rtc->lock, flags);
 	rtc_delayed_write(time, rtc, RTC_TIME);
-	rtc_delayed_write(0, rtc, RTC_STATUS);
-	rtc_delayed_write(0, rtc, RTC_STATUS);
 	spin_unlock_irqrestore(&rtc->lock, flags);
 
 out:
@@ -111,8 +166,8 @@ static int armada38x_rtc_read_alarm(struct device *dev, struct rtc_wkalrm *alrm)
 
 	spin_lock_irqsave(&rtc->lock, flags);
 
-	time = readl(rtc->regs + RTC_ALARM1);
-	val = readl(rtc->regs + RTC_IRQ1_CONF) & RTC_IRQ1_AL_EN;
+	time = read_rtc_register_wa(rtc, RTC_ALARM1);
+	val = read_rtc_register_wa(rtc, RTC_IRQ1_CONF) & RTC_IRQ1_AL_EN;
 
 	spin_unlock_irqrestore(&rtc->lock, flags);
 
@@ -182,7 +237,7 @@ static irqreturn_t armada38x_rtc_alarm_irq(int irq, void *data)
 	val = readl(rtc->regs_soc + SOC_RTC_INTERRUPT);
 
 	writel(val & ~SOC_RTC_ALARM1, rtc->regs_soc + SOC_RTC_INTERRUPT);
-	val = readl(rtc->regs + RTC_IRQ1_CONF);
+	val = read_rtc_register_wa(rtc, RTC_IRQ1_CONF);
 	/* disable all the interrupts for alarm 1 */
 	rtc_delayed_write(0, rtc, RTC_IRQ1_CONF);
 	/* Ack the event */
@@ -196,7 +251,6 @@ static irqreturn_t armada38x_rtc_alarm_irq(int irq, void *data)
 		else
 			event |= RTC_PF;
 	}
-
 	rtc_update_irq(rtc->rtc_dev, 1, event);
 
 	return IRQ_HANDLED;
@@ -253,6 +307,9 @@ static __init int armada38x_rtc_probe(struct platform_device *pdev)
 	if (rtc->irq != -1)
 		device_init_wakeup(&pdev->dev, 1);
 
+	/* Update RTC-MBUS bridge timing parameters */
+	rtc_update_mbus_timing_params(rtc);
+
 	rtc->rtc_dev = devm_rtc_device_register(&pdev->dev, pdev->name,
 					&armada38x_rtc_ops, THIS_MODULE);
 	if (IS_ERR(rtc->rtc_dev)) {
@@ -260,6 +317,7 @@ static __init int armada38x_rtc_probe(struct platform_device *pdev)
 		dev_err(&pdev->dev, "Failed to register RTC device: %d\n", ret);
 		return ret;
 	}
+
 	return 0;
 }
 
@@ -280,6 +338,9 @@ static int armada38x_rtc_resume(struct device *dev)
 	if (device_may_wakeup(dev)) {
 		struct armada38x_rtc *rtc = dev_get_drvdata(dev);
 
+		/* Update RTC-MBUS bridge timing parameters */
+		rtc_update_mbus_timing_params(rtc);
+
 		return disable_irq_wake(rtc->irq);
 	}
 
-- 
2.10.2

^ permalink raw reply related

* [PATCH 2/3] rtc: armada38x: Convert to time64_t
From: Gregory CLEMENT @ 2016-12-08 17:10 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161208171010.29446-1-gregory.clement@free-electrons.com>

It is one more step to remove the deprecated functions rtc_time_to_tm
and rtc_tm_to_time.

And as bonus it simplifies a little the driver.

Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
---
 drivers/rtc/rtc-armada38x.c | 42 ++++++++++++++++++------------------------
 1 file changed, 18 insertions(+), 24 deletions(-)

diff --git a/drivers/rtc/rtc-armada38x.c b/drivers/rtc/rtc-armada38x.c
index a0859286a4c4..b8a74ffaae80 100644
--- a/drivers/rtc/rtc-armada38x.c
+++ b/drivers/rtc/rtc-armada38x.c
@@ -128,13 +128,14 @@ static unsigned long read_rtc_register_wa(struct armada38x_rtc *rtc, u8 rtc_reg)
 static int armada38x_rtc_read_time(struct device *dev, struct rtc_time *tm)
 {
 	struct armada38x_rtc *rtc = dev_get_drvdata(dev);
-	unsigned long time, flags;
+	unsigned long flags;
+	time64_t time;
 
 	spin_lock_irqsave(&rtc->lock, flags);
-	time = read_rtc_register_wa(rtc, RTC_TIME);
+	time = (time64_t)read_rtc_register_wa(rtc, RTC_TIME);
 	spin_unlock_irqrestore(&rtc->lock, flags);
 
-	rtc_time_to_tm(time, tm);
+	rtc_time64_to_tm(time, tm);
 
 	return 0;
 }
@@ -142,37 +143,34 @@ static int armada38x_rtc_read_time(struct device *dev, struct rtc_time *tm)
 static int armada38x_rtc_set_time(struct device *dev, struct rtc_time *tm)
 {
 	struct armada38x_rtc *rtc = dev_get_drvdata(dev);
-	int ret = 0;
-	unsigned long time, flags;
-
-	ret = rtc_tm_to_time(tm, &time);
+	unsigned long flags;
+	time64_t time;
 
-	if (ret)
-		goto out;
+	time = rtc_tm_to_time64(tm);
 
 	spin_lock_irqsave(&rtc->lock, flags);
-	rtc_delayed_write(time, rtc, RTC_TIME);
+	rtc_delayed_write((u32)time, rtc, RTC_TIME);
 	spin_unlock_irqrestore(&rtc->lock, flags);
 
-out:
-	return ret;
+	return 0;
 }
 
 static int armada38x_rtc_read_alarm(struct device *dev, struct rtc_wkalrm *alrm)
 {
 	struct armada38x_rtc *rtc = dev_get_drvdata(dev);
-	unsigned long time, flags;
+	unsigned long flags;
+	time64_t time;
 	u32 val;
 
 	spin_lock_irqsave(&rtc->lock, flags);
 
-	time = read_rtc_register_wa(rtc, RTC_ALARM1);
+	time = (time64_t)read_rtc_register_wa(rtc, RTC_ALARM1);
 	val = read_rtc_register_wa(rtc, RTC_IRQ1_CONF) & RTC_IRQ1_AL_EN;
 
 	spin_unlock_irqrestore(&rtc->lock, flags);
 
 	alrm->enabled = val ? 1 : 0;
-	rtc_time_to_tm(time,  &alrm->time);
+	rtc_time64_to_tm(time,  &alrm->time);
 
 	return 0;
 }
@@ -180,18 +178,15 @@ static int armada38x_rtc_read_alarm(struct device *dev, struct rtc_wkalrm *alrm)
 static int armada38x_rtc_set_alarm(struct device *dev, struct rtc_wkalrm *alrm)
 {
 	struct armada38x_rtc *rtc = dev_get_drvdata(dev);
-	unsigned long time, flags;
-	int ret = 0;
+	unsigned long flags;
+	time64_t time;
 	u32 val;
 
-	ret = rtc_tm_to_time(&alrm->time, &time);
-
-	if (ret)
-		goto out;
+	time = rtc_tm_to_time64(&alrm->time);
 
 	spin_lock_irqsave(&rtc->lock, flags);
 
-	rtc_delayed_write(time, rtc, RTC_ALARM1);
+	rtc_delayed_write((u32)time, rtc, RTC_ALARM1);
 
 	if (alrm->enabled) {
 			rtc_delayed_write(RTC_IRQ1_AL_EN, rtc, RTC_IRQ1_CONF);
@@ -202,8 +197,7 @@ static int armada38x_rtc_set_alarm(struct device *dev, struct rtc_wkalrm *alrm)
 
 	spin_unlock_irqrestore(&rtc->lock, flags);
 
-out:
-	return ret;
+	return 0;
 }
 
 static int armada38x_rtc_alarm_irq_enable(struct device *dev,
-- 
2.10.2

^ permalink raw reply related

* [PATCH 3/3] rtc: armada38x: Prepare for being use on 64 bits
From: Gregory CLEMENT @ 2016-12-08 17:10 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161208171010.29446-1-gregory.clement@free-electrons.com>

The drivers are supposed to be portable, however there are few assumption
done here about the unsigned long size. Make sure we use the accurate
width for the variable.

Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
---
 drivers/rtc/rtc-armada38x.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/rtc/rtc-armada38x.c b/drivers/rtc/rtc-armada38x.c
index b8a74ffaae80..c4138130febf 100644
--- a/drivers/rtc/rtc-armada38x.c
+++ b/drivers/rtc/rtc-armada38x.c
@@ -84,14 +84,14 @@ static void rtc_update_mbus_timing_params(struct armada38x_rtc *rtc)
 }
 
 struct str_value_to_freq {
-	unsigned long value;
+	u32 value;
 	u8 freq;
 } __packed;
 
-static unsigned long read_rtc_register_wa(struct armada38x_rtc *rtc, u8 rtc_reg)
+static u32 read_rtc_register_wa(struct armada38x_rtc *rtc, u8 rtc_reg)
 {
-	unsigned long value_array[SAMPLE_NR], i, j, value;
-	unsigned long max = 0, index_max = SAMPLE_NR - 1;
+	int i, j, index_max = SAMPLE_NR - 1;
+	u32 value_array[SAMPLE_NR], value, max = 0;
 	struct str_value_to_freq value_to_freq[SAMPLE_NR];
 
 	for (i = 0; i < SAMPLE_NR; i++) {
-- 
2.10.2

^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox