LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Bug in reclaim logic with exhausted nodes?
From: Nishanth Aravamudan @ 2014-03-11 21:06 UTC (permalink / raw)
  To: linux-mm; +Cc: cl, rientjes, linuxppc-dev, anton, mgorman

We have seen the following situation on a test system:

2-node system, each node has 32GB of memory.

2 gigantic (16GB) pages reserved at boot-time, both of which are
allocated from node 1.

SLUB notices this:

[    0.000000] SLUB: Unable to allocate memory from node 1
[    0.000000] SLUB: Allocating a useless per node structure in order to
be able to continue

After boot, user then did:

echo 24 > /proc/sys/vm/nr_hugepages

And tasks are stuck:

[<c0000000010980b8>] kexec_stack+0xb8/0x8000
[<c0000000000144d0>] .__switch_to+0x1c0/0x390
[<c0000000001ac708>] .throttle_direct_reclaim.isra.31+0x238/0x2c0
[<c0000000001b0b34>] .try_to_free_pages+0xb4/0x210
[<c0000000001a2f1c>] .__alloc_pages_nodemask+0x75c/0xb00
[<c0000000001eafb0>] .alloc_fresh_huge_page+0x70/0x150
[<c0000000001eb2d0>] .set_max_huge_pages.part.37+0x130/0x2f0
[<c0000000001eb7c8>] .hugetlb_sysctl_handler_common+0x168/0x180
[<c0000000002ae21c>] .proc_sys_call_handler+0xfc/0x120
[<c00000000021dcc0>] .vfs_write+0xe0/0x260
[<c00000000021e8c8>] .SyS_write+0x58/0xd0
[<c000000000009e7c>] syscall_exit+0x0/0x7c

[<c00000004f9334b0>] 0xc00000004f9334b0
[<c0000000000144d0>] .__switch_to+0x1c0/0x390
[<c0000000001ac708>] .throttle_direct_reclaim.isra.31+0x238/0x2c0
[<c0000000001b0b34>] .try_to_free_pages+0xb4/0x210
[<c0000000001a2f1c>] .__alloc_pages_nodemask+0x75c/0xb00
[<c0000000001eafb0>] .alloc_fresh_huge_page+0x70/0x150
[<c0000000001eb2d0>] .set_max_huge_pages.part.37+0x130/0x2f0
[<c0000000001eb7c8>] .hugetlb_sysctl_handler_common+0x168/0x180
[<c0000000002ae21c>] .proc_sys_call_handler+0xfc/0x120
[<c00000000021dcc0>] .vfs_write+0xe0/0x260
[<c00000000021e8c8>] .SyS_write+0x58/0xd0
[<c000000000009e7c>] syscall_exit+0x0/0x7c

[<c00000004f91f440>] 0xc00000004f91f440
[<c0000000000144d0>] .__switch_to+0x1c0/0x390
[<c0000000001ac708>] .throttle_direct_reclaim.isra.31+0x238/0x2c0
[<c0000000001b0b34>] .try_to_free_pages+0xb4/0x210
[<c0000000001a2f1c>] .__alloc_pages_nodemask+0x75c/0xb00
[<c0000000001eafb0>] .alloc_fresh_huge_page+0x70/0x150
[<c0000000001eb2d0>] .set_max_huge_pages.part.37+0x130/0x2f0
[<c0000000001eb54c>] .nr_hugepages_store_common.isra.39+0xbc/0x1b0
[<c0000000003662cc>] .kobj_attr_store+0x2c/0x50
[<c0000000002b2c2c>] .sysfs_write_file+0xec/0x1c0
[<c00000000021dcc0>] .vfs_write+0xe0/0x260
[<c00000000021e8c8>] .SyS_write+0x58/0xd0
[<c000000000009e7c>] syscall_exit+0x0/0x7c

kswapd1 is also pegged at this point at 100% cpu.

If we go in and manually:

echo 24 >
/sys/devices/system/node/node0/hugepages/hugepages-16384kB/nr_hugepages

rather than relying on the interleaving allocator from the sysctl, the
allocation succeeds (and the echo returns immediately).

I think we are hitting the following:

mm/hugetlb.c::alloc_fresh_huge_page_node():

        page = alloc_pages_exact_node(nid,
                htlb_alloc_mask(h)|__GFP_COMP|__GFP_THISNODE|
                                                __GFP_REPEAT|__GFP_NOWARN,
                huge_page_order(h));

include/linux/gfp.h:

#define GFP_THISNODE    (__GFP_THISNODE | __GFP_NOWARN | __GFP_NORETRY)

and mm/page_alloc.c::__alloc_pages_slowpath():

        /*
         * GFP_THISNODE (meaning __GFP_THISNODE, __GFP_NORETRY and
         * __GFP_NOWARN set) should not cause reclaim since the subsystem
         * (f.e. slab) using GFP_THISNODE may choose to trigger reclaim
         * using a larger set of nodes after it has established that the
         * allowed per node queues are empty and that nodes are
         * over allocated.
         */
        if (IS_ENABLED(CONFIG_NUMA) &&
                        (gfp_mask & GFP_THISNODE) == GFP_THISNODE)
                goto nopage;

so we *do* reclaim in this callpath. Under my reading, since node1 is
exhausted, no matter how much work kswapd1 does, it will never reclaim
memory from node1 to satisfy a 16M page allocation request (or any
other, for that matter).

I see the following possible changes/fixes, but am unsure if
a) my analysis is right
b) which is best.

1) Since we did notice early in boot that (in this case) node 1 was
exhausted, perhaps we should mark it as such there somehow, and if a
__GFP_THISNODE allocation request comes through on such a node, we
immediately fallthrough to nopage?

2) There is the following check
        /*
         * For order > PAGE_ALLOC_COSTLY_ORDER, if __GFP_REPEAT is
         * specified, then we retry until we no longer reclaim any pages
         * (above), or we've reclaimed an order of pages at least as
         * large as the allocation's order. In both cases, if the
         * allocation still fails, we stop retrying.
         */
        if (gfp_mask & __GFP_REPEAT && pages_reclaimed < (1 << order))
                return 1;

I wonder if we should add a check to also be sure that the pages we are
reclaiming, if __GFP_THISNODE is set, are from the right node?

       if (gfp_mask & __GFP_THISNODE && the progress we have made is on
       		the node requested?)

3) did_some_progress could be updated to track where the progress is
occuring, and if we are in __GFP_THISNODE allocation request and we
didn't make any progress on the correct node, we fail the allocation?

I think this situation could be reproduced (and am working on it) by
exhausting a NUMA node with 16M hugepages and then using the generic
RR allocator to ask for more. Other node exhaustion cases probably
exist, but since we can't swap the hugepages, it seems like the most
straightforward way to try and reproduce it.

Any thoughts on this? Am I way off base?

Thanks,
Nish

^ permalink raw reply

* Re: [PATCH v3 00/52] CPU hotplug: Fix issues with callback registration
From: Andrew Morton @ 2014-03-11 22:07 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: linux-arch, ego, walken, linux, linux-pm, peterz, rusty, rjw,
	oleg, linux-kernel, linuxppc-dev, paulus, tj, tglx, paulmck,
	mingo
In-Reply-To: <20140310203312.10746.310.stgit@srivatsabhat.in.ibm.com>

On Tue, 11 Mar 2014 02:03:52 +0530 "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com> wrote:

> Hi,
> 
> Many subsystems and drivers have the need to register CPU hotplug callbacks
> from their init routines and also perform initialization for the CPUs that are
> already online. But unfortunately there is no race-free way to achieve this
> today.
> 
> For example, consider this piece of code:
> 
> 	get_online_cpus();
> 
> 	for_each_online_cpu(cpu)
> 		init_cpu(cpu);
> 
> 	register_cpu_notifier(&foobar_cpu_notifier);
> 
> 	put_online_cpus();
> 
> This is not safe because there is a possibility of an ABBA deadlock involving
> the cpu_add_remove_lock and the cpu_hotplug.lock.
> 
>           CPU 0                                         CPU 1
>           -----                                         -----
> 
>    Acquire cpu_hotplug.lock
>    [via get_online_cpus()]
> 
>                                               CPU online/offline operation
>                                               takes cpu_add_remove_lock
>                                               [via cpu_maps_update_begin()]
> 
>    Try to acquire
>    cpu_add_remove_lock
>    [via register_cpu_notifier()]
> 
>                                               CPU online/offline operation
>                                               tries to acquire cpu_hotplug.lock
>                                               [via cpu_hotplug_begin()]

Can't we fix this by using a different (ie: new) lock to protect
cpu_chain?

^ permalink raw reply

* Re: [PATCH] powerpc/powernv: Infrastructure to support OPAL async completion
From: Stewart Smith @ 2014-03-11 23:34 UTC (permalink / raw)
  To: Neelesh Gupta, linuxppc-dev
In-Reply-To: <20140307053010.3365.28513.stgit@neelegup-tp-t420.in.ibm.com>

Neelesh Gupta <neelegup@linux.vnet.ibm.com> writes:
> diff --git a/arch/powerpc/platforms/powernv/opal-async.c b/arch/powerpc/platforms/powernv/opal-async.c
> new file mode 100644
> index 0000000..cd0c135
> --- /dev/null
> +++ b/arch/powerpc/platforms/powernv/opal-async.c
> @@ -0,0 +1,203 @@
> +/*
> + * PowerNV OPAL asynchronous completion interfaces
> + *
> + * Copyright 2013 IBM Corp.
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License
> + * as published by the Free Software Foundation; either version
> + * 2 of the License, or (at your option) any later version.
> + */
> +
> +#undef DEBUG

this strikes me as slightly odd... are you sure it's correct?

^ permalink raw reply

* Re: [PATCH 3/9] powerpc/rcpm: add RCPM driver
From: Scott Wood @ 2014-03-11 23:42 UTC (permalink / raw)
  To: Chenhui Zhao; +Cc: linuxppc-dev, linux-kernel, Jason.Jin
In-Reply-To: <1394168285-32275-3-git-send-email-chenhui.zhao@freescale.com>

On Fri, 2014-03-07 at 12:57 +0800, Chenhui Zhao wrote:
> diff --git a/arch/powerpc/platforms/85xx/corenet_generic.c b/arch/powerpc/platforms/85xx/corenet_generic.c
> index b756f3d..3fdf9f3 100644
> --- a/arch/powerpc/platforms/85xx/corenet_generic.c
> +++ b/arch/powerpc/platforms/85xx/corenet_generic.c
> @@ -56,6 +56,8 @@ void __init corenet_gen_setup_arch(void)
>  
>  	swiotlb_detect_4g();
>  
> +	fsl_rcpm_init();
> +
>  	pr_info("%s board from Freescale Semiconductor\n", ppc_md.name);

RCPM is not board-specific.  Why is this in board code?

> +static void rcpm_v1_cpu_enter_state(int cpu, int state)
> +{
> +	unsigned int hw_cpu = get_hard_smp_processor_id(cpu);
> +	unsigned int mask = 1 << hw_cpu;
> +
> +	switch (state) {
> +	case E500_PM_PH10:
> +		setbits32(&rcpm_v1_regs->cdozcr, mask);
> +		break;
> +	case E500_PM_PH15:
> +		setbits32(&rcpm_v1_regs->cnapcr, mask);
> +		break;
> +	default:
> +		pr_err("Unknown cpu PM state\n");
> +		break;
> +	}
> +}

Put __func__ in error messages -- and for "unknown value" type messages,
print the value.


> +static int rcpm_v1_plat_enter_state(int state)
> +{
> +	u32 *pmcsr_reg = &rcpm_v1_regs->powmgtcsr;
> +	int ret = 0;
> +	int result;
> +
> +	switch (state) {
> +	case PLAT_PM_SLEEP:
> +		setbits32(pmcsr_reg, RCPM_POWMGTCSR_SLP);
> +
> +		/* At this point, the device is in sleep mode. */
> +
> +		/* Upon resume, wait for RCPM_POWMGTCSR_SLP bit to be clear. */
> +		result = spin_event_timeout(
> +		  !(in_be32(pmcsr_reg) & RCPM_POWMGTCSR_SLP), 10000, 10);
> +		if (!result) {
> +			pr_err("%s: timeout waiting for SLP bit to be cleared\n",
> +			  __func__);

Why are you indenting continuation lines with only two spaces (and yet
still not aligning with anything)?

> +			ret = -ETIMEDOUT;
> +		}
> +		break;
> +	default:
> +		pr_err("Unsupported platform PM state\n");
> +		ret = -EINVAL;
> +	}
> +
> +	return ret;
> +}
> +
> +static void rcpm_v1_freeze_time_base(int freeze)
> +{
> +	u32 *tben_reg = &rcpm_v1_regs->ctbenr;
> +	static u32 mask;
> +
> +	if (freeze) {
> +		mask = in_be32(tben_reg);
> +		clrbits32(tben_reg, mask);
> +	} else {
> +		setbits32(tben_reg, mask);
> +	}
> +
> +	/* read back to push the previous write */
> +	in_be32(tben_reg);
> +}
> +
> +static void rcpm_v2_freeze_time_base(int freeze)
> +{
> +	u32 *tben_reg = &rcpm_v2_regs->pctbenr;
> +	static u32 mask;
> +
> +	if (freeze) {
> +		mask = in_be32(tben_reg);
> +		clrbits32(tben_reg, mask);
> +	} else {
> +		setbits32(tben_reg, mask);
> +	}
> +
> +	/* read back to push the previous write */
> +	in_be32(tben_reg);
> +}

It looks like the only difference between these two functions is how you
calculate tben_reg -- factor the rest out into a single function.

> +int fsl_rcpm_init(void)
> +{
> +	struct device_node *np;
> +
> +	np = of_find_compatible_node(NULL, NULL, "fsl,qoriq-rcpm-2.0");
> +	if (np) {
> +		rcpm_v2_regs = of_iomap(np, 0);
> +		of_node_put(np);
> +		if (!rcpm_v2_regs)
> +			return -ENOMEM;
> +
> +		qoriq_pm_ops = &qoriq_rcpm_v2_ops;
> +
> +	} else {
> +		np = of_find_compatible_node(NULL, NULL, "fsl,qoriq-rcpm-1.0");
> +		if (np) {
> +			rcpm_v1_regs = of_iomap(np, 0);
> +			of_node_put(np);
> +			if (!rcpm_v1_regs)
> +				return -ENOMEM;
> +
> +			qoriq_pm_ops = &qoriq_rcpm_v1_ops;
> +
> +		} else {
> +			pr_err("%s: can't find the rcpm node.\n", __func__);
> +			return -EINVAL;
> +		}
> +	}
> +
> +	return 0;
> +}

Why isn't this a proper platform driver?

-Scott

^ permalink raw reply

* Re: [PATCH 4/9] powerpc/85xx: support CPU hotplug for e500mc and e5500
From: Scott Wood @ 2014-03-11 23:48 UTC (permalink / raw)
  To: Chenhui Zhao; +Cc: linuxppc-dev, linux-kernel, Jason.Jin
In-Reply-To: <1394168285-32275-4-git-send-email-chenhui.zhao@freescale.com>

On Fri, 2014-03-07 at 12:58 +0800, Chenhui Zhao wrote:
> diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
> index ac2621a..f3f4401 100644
> --- a/arch/powerpc/kernel/smp.c
> +++ b/arch/powerpc/kernel/smp.c
> @@ -405,8 +405,12 @@ void generic_cpu_die(unsigned int cpu)
>  
>  	for (i = 0; i < 100; i++) {
>  		smp_rmb();
> -		if (per_cpu(cpu_state, cpu) == CPU_DEAD)
> +		if (per_cpu(cpu_state, cpu) == CPU_DEAD) {
> +#ifdef CONFIG_PPC64
> +			paca[cpu].cpu_start = 0;
> +#endif

Why wasn't this needed by previous ppc64 machines?

> diff --git a/arch/powerpc/platforms/85xx/smp.c b/arch/powerpc/platforms/85xx/smp.c
> index 2e5911e..0047883 100644
> --- a/arch/powerpc/platforms/85xx/smp.c
> +++ b/arch/powerpc/platforms/85xx/smp.c
> @@ -19,6 +19,7 @@
>  #include <linux/kexec.h>
>  #include <linux/highmem.h>
>  #include <linux/cpu.h>
> +#include <linux/smp.h>
>  
>  #include <asm/machdep.h>
>  #include <asm/pgtable.h>
> @@ -46,6 +47,17 @@ static u64 timebase;
>  static int tb_req;
>  static int tb_valid;
>  
> +#ifdef CONFIG_PPC_E500MC
> +/* specify the cpu PM state when cpu dies, PH15/NAP is the default */
> +int qoriq_cpu_die_state = E500_PM_PH15;
> +#endif

static?  Is there any way to modify this other than modifying source
code?

BTW, QorIQ doesn't imply an e500mc derivative.

> @@ -125,6 +138,34 @@ static void mpc85xx_take_timebase(void)
>  }
>  
>  #ifdef CONFIG_HOTPLUG_CPU
> +#ifdef CONFIG_PPC_E500MC
> +static void qoriq_cpu_die(void)
> +{
> +	unsigned int cpu = smp_processor_id();
> +
> +	local_irq_disable();
> +#ifdef CONFIG_PPC64
> +	__hard_irq_disable();
> +#endif

Why this instead of one call to hard_irq_disable() (no leading
underscores)?

-Scott

^ permalink raw reply

* Re: [PATCH 5/9] powerpc/85xx: disable irq by hardware when suspend for 64-bit
From: Scott Wood @ 2014-03-11 23:51 UTC (permalink / raw)
  To: Chenhui Zhao; +Cc: linuxppc-dev, linux-kernel, Jason.Jin
In-Reply-To: <1394168285-32275-5-git-send-email-chenhui.zhao@freescale.com>

On Fri, 2014-03-07 at 12:58 +0800, Chenhui Zhao wrote:
> In 64-bit mode, kernel just clears the irq soft-enable flag
> in struct paca_struct to disable external irqs. But, in
> the case of suspend, irqs should be disabled by hardware.
> Therefore, hook a function to ppc_md.suspend_disable_irqs
> to really disable irqs.
> 
> Signed-off-by: Chenhui Zhao <chenhui.zhao@freescale.com>
> ---
>  arch/powerpc/platforms/85xx/corenet_generic.c |   12 ++++++++++++
>  1 files changed, 12 insertions(+), 0 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/85xx/corenet_generic.c b/arch/powerpc/platforms/85xx/corenet_generic.c
> index 3fdf9f3..983d81f 100644
> --- a/arch/powerpc/platforms/85xx/corenet_generic.c
> +++ b/arch/powerpc/platforms/85xx/corenet_generic.c
> @@ -32,6 +32,13 @@
>  #include <sysdev/fsl_pci.h>
>  #include "smp.h"
>  
> +#if defined(CONFIG_PPC64) && defined(CONFIG_SUSPEND)
> +static void fsl_suspend_disable_irqs(void)
> +{
> +	__hard_irq_disable();
> +}
> +#endif

Why the underscore version?  Don't you want PACA_IRQ_HARD_DIS to be set?

If hard disabling is appropriate here, shouldn't we do it in
generic_suspend_disable_irqs()?

Are there any existing platforms that supply a
ppc_md.suspend_disable_irqs()?  I don't see any when grepping.

-Scott

^ permalink raw reply

* Re: [PATCH 6/9] powerpc/85xx: support sleep feature on QorIQ SoCs with RCPM
From: Scott Wood @ 2014-03-12  0:00 UTC (permalink / raw)
  To: Chenhui Zhao; +Cc: linuxppc-dev, linux-kernel, Jason.Jin
In-Reply-To: <1394168285-32275-6-git-send-email-chenhui.zhao@freescale.com>

On Fri, 2014-03-07 at 12:58 +0800, Chenhui Zhao wrote:
> In sleep mode, the clocks of e500 cores and unused IP blocks is
> turned off. The IP blocks which are allowed to wake up the processor
> are still running.
> 
> The sleep mode is equal to the Standby state in Linux. Use the
> command to enter sleep mode:
>   echo standby > /sys/power/state
> 
> Signed-off-by: Chenhui Zhao <chenhui.zhao@freescale.com>
> ---
>  arch/powerpc/Kconfig                   |    4 +-
>  arch/powerpc/platforms/85xx/Makefile   |    3 +
>  arch/powerpc/platforms/85xx/qoriq_pm.c |   78 ++++++++++++++++++++++++++++++++
>  3 files changed, 83 insertions(+), 2 deletions(-)
>  create mode 100644 arch/powerpc/platforms/85xx/qoriq_pm.c
> 
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 05f6323..e1d6510 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -222,7 +222,7 @@ config ARCH_HIBERNATION_POSSIBLE
>  config ARCH_SUSPEND_POSSIBLE
>  	def_bool y
>  	depends on ADB_PMU || PPC_EFIKA || PPC_LITE5200 || PPC_83xx || \
> -		   (PPC_85xx && !PPC_E500MC) || PPC_86xx || PPC_PSERIES \
> +		   FSL_SOC_BOOKE || PPC_86xx || PPC_PSERIES \
>  		   || 44x || 40x
>  
>  config PPC_DCR_NATIVE
> @@ -709,7 +709,7 @@ config FSL_PCI
>  config FSL_PMC
>  	bool
>  	default y
> -	depends on SUSPEND && (PPC_85xx || PPC_86xx)
> +	depends on SUSPEND && (PPC_85xx && !PPC_E500MC || PPC_86xx)

Don't mix && and || without parentheses.

Maybe convert this into being selected (similar to FSL_RCPM), rather
than default y?

> diff --git a/arch/powerpc/platforms/85xx/Makefile b/arch/powerpc/platforms/85xx/Makefile
> index 25cebe7..7fae817 100644
> --- a/arch/powerpc/platforms/85xx/Makefile
> +++ b/arch/powerpc/platforms/85xx/Makefile
> @@ -2,6 +2,9 @@
>  # Makefile for the PowerPC 85xx linux kernel.
>  #
>  obj-$(CONFIG_SMP) += smp.o
> +ifeq ($(CONFIG_FSL_CORENET_RCPM), y)
> +obj-$(CONFIG_SUSPEND)	+= qoriq_pm.o
> +endif

There should probably be a kconfig symbol for this.

> diff --git a/arch/powerpc/platforms/85xx/qoriq_pm.c b/arch/powerpc/platforms/85xx/qoriq_pm.c
> new file mode 100644
> index 0000000..915b13b
> --- /dev/null
> +++ b/arch/powerpc/platforms/85xx/qoriq_pm.c
> @@ -0,0 +1,78 @@
> +/*
> + * Support Power Management feature
> + *
> + * Copyright 2014 Freescale Semiconductor Inc.
> + *
> + * Author: Chenhui Zhao <chenhui.zhao@freescale.com>
> + *
> + * This program is free software; you can redistribute	it and/or modify it
> + * under  the terms of	the GNU General	 Public License as published by the
> + * Free Software Foundation;  either version 2 of the  License, or (at your
> + * option) any later version.
> + */
> +
> +#include <linux/kernel.h>
> +#include <linux/suspend.h>
> +#include <linux/of_platform.h>
> +
> +#include <sysdev/fsl_soc.h>
> +
> +#define FSL_SLEEP		0x1
> +#define FSL_DEEP_SLEEP		0x2

FSL_DEEP_SLEEP is unused.

> +
> +/* specify the sleep state of the present platform */
> +int sleep_pm_state;
> +/* supported sleep modes by the present platform */
> +static unsigned int sleep_modes;

Why is one signed and the other unsigned?

> +
> +static int qoriq_suspend_enter(suspend_state_t state)
> +{
> +	int ret = 0;
> +
> +	switch (state) {
> +	case PM_SUSPEND_STANDBY:
> +
> +		if (cur_cpu_spec->cpu_flush_caches)
> +			cur_cpu_spec->cpu_flush_caches();
> +
> +		ret = qoriq_pm_ops->plat_enter_state(sleep_pm_state);
> +
> +		break;
> +
> +	default:
> +		ret = -EINVAL;
> +
> +	}
> +
> +	return ret;
> +}
> +
> +static int qoriq_suspend_valid(suspend_state_t state)
> +{
> +	if (state == PM_SUSPEND_STANDBY && (sleep_modes & FSL_SLEEP))
> +		return 1;
> +
> +	return 0;
> +}
> +
> +static const struct platform_suspend_ops qoriq_suspend_ops = {
> +	.valid = qoriq_suspend_valid,
> +	.enter = qoriq_suspend_enter,
> +};
> +
> +static int __init qoriq_suspend_init(void)
> +{
> +	struct device_node *np;
> +
> +	sleep_modes = FSL_SLEEP;
> +	sleep_pm_state = PLAT_PM_SLEEP;
> +
> +	np = of_find_compatible_node(NULL, NULL, "fsl,qoriq-rcpm-2.0");
> +	if (np)
> +		sleep_pm_state = PLAT_PM_LPM20;
> +
> +	suspend_set_ops(&qoriq_suspend_ops);
> +
> +	return 0;
> +}
> +arch_initcall(qoriq_suspend_init);

Why is this not a platform driver?  If fsl_pmc can do it...

-Scott

^ permalink raw reply

* Re: [PATCH 7/9] fsl: add EPU FSM configuration for deep sleep
From: Scott Wood @ 2014-03-12  0:08 UTC (permalink / raw)
  To: Chenhui Zhao; +Cc: linuxppc-dev, linux-kernel, Jason.Jin
In-Reply-To: <1394168285-32275-7-git-send-email-chenhui.zhao@freescale.com>

On Fri, 2014-03-07 at 12:58 +0800, Chenhui Zhao wrote:
> From: Hongbo Zhang <hongbo.zhang@freescale.com>
> 
> In the last stage of deep sleep, software will trigger a Finite
> State Machine (FSM) to control the hardware precedure, such as
> board isolation, killing PLLs, removing power, and so on.
> 
> When the system is waked up by an interrupt, the FSM controls the
> hardware to complete the early resume precedure.
> 
> This patch configure the EPU FSM preparing for deep sleep.
> 
> Signed-off-by: Hongbo Zhang <hongbo.zhang@freescale.com>
> Signed-off-by: Chenhui Zhao <chenhui.zhao@freescale.com>

Couldn't this be part of qoriq_pm.c?

> diff --git a/arch/powerpc/sysdev/fsl_soc.h b/arch/powerpc/sysdev/fsl_soc.h
> index 9b9a34a..eb83a30 100644
> --- a/arch/powerpc/sysdev/fsl_soc.h
> +++ b/arch/powerpc/sysdev/fsl_soc.h
> @@ -69,5 +69,8 @@ extern const struct fsl_pm_ops *qoriq_pm_ops;
>  
>  extern int fsl_rcpm_init(void);
>  
> +extern void fsl_dp_fsm_setup(void *dcsr_base);
> +extern void fsl_dp_fsm_clean(void *dcsr_base);

__iomem

> +
>  #endif
>  #endif
> diff --git a/drivers/platform/Kconfig b/drivers/platform/Kconfig
> index 09fde58..6539e6d 100644
> --- a/drivers/platform/Kconfig
> +++ b/drivers/platform/Kconfig
> @@ -6,3 +6,7 @@ source "drivers/platform/goldfish/Kconfig"
>  endif
>  
>  source "drivers/platform/chrome/Kconfig"
> +
> +if FSL_SOC
> +source "drivers/platform/fsl/Kconfig"
> +endif

Chrome doesn't need an ifdef -- why does this?

> diff --git a/drivers/platform/Makefile b/drivers/platform/Makefile
> index 3656b7b..37c6f72 100644
> --- a/drivers/platform/Makefile
> +++ b/drivers/platform/Makefile
> @@ -6,3 +6,4 @@ obj-$(CONFIG_X86)		+= x86/
>  obj-$(CONFIG_OLPC)		+= olpc/
>  obj-$(CONFIG_GOLDFISH)		+= goldfish/
>  obj-$(CONFIG_CHROME_PLATFORMS)	+= chrome/
> +obj-$(CONFIG_FSL_SOC)		+= fsl/
> diff --git a/drivers/platform/fsl/Kconfig b/drivers/platform/fsl/Kconfig
> new file mode 100644
> index 0000000..72ed053
> --- /dev/null
> +++ b/drivers/platform/fsl/Kconfig
> @@ -0,0 +1,10 @@
> +#
> +# Freescale Specific Power Management Drivers
> +#
> +
> +config FSL_SLEEP_FSM
> +	bool
> +	help
> +	  This driver configures a hardware FSM (Finite State Machine) for deep sleep.
> +	  The FSM is used to finish clean-ups at the last stage of system entering deep
> +	  sleep, and also wakes up system when a wake up event happens.
> diff --git a/drivers/platform/fsl/Makefile b/drivers/platform/fsl/Makefile
> new file mode 100644
> index 0000000..d99ca0e
> --- /dev/null
> +++ b/drivers/platform/fsl/Makefile
> @@ -0,0 +1,5 @@
> +#
> +# Makefile for linux/drivers/platform/fsl
> +# Freescale Specific Power Management Drivers
> +#
> +obj-$(CONFIG_FSL_SLEEP_FSM)	+= sleep_fsm.o

Why is this here while the other stuff is in arch/powerpc/sysdev?

> +/* Block offsets */
> +#define	RCPM_BLOCK_OFFSET	0x00022000
> +#define	EPU_BLOCK_OFFSET	0x00000000
> +#define	NPC_BLOCK_OFFSET	0x00001000

Why don't these block offsets come from the device tree?

> +static void *g_dcsr_base;

__iomem

> +	/* Configure the EPU Counters */
> +	epu_write(EPCCR15, 0x92840000);
> +	epu_write(EPCCR14, 0x92840000);
> +	epu_write(EPCCR12, 0x92840000);
> +	epu_write(EPCCR11, 0x92840000);
> +	epu_write(EPCCR10, 0x92840000);
> +	epu_write(EPCCR9, 0x92840000);
> +	epu_write(EPCCR8, 0x92840000);
> +	epu_write(EPCCR5, 0x92840000);
> +	epu_write(EPCCR4, 0x92840000);
> +	epu_write(EPCCR2, 0x92840000);
> +
> +	/* Configure the SCUs Inputs */
> +	epu_write(EPSMCR15, 0x76000000);
> +	epu_write(EPSMCR14, 0x00000031);
> +	epu_write(EPSMCR13, 0x00003100);
> +	epu_write(EPSMCR12, 0x7F000000);
> +	epu_write(EPSMCR11, 0x31740000);
> +	epu_write(EPSMCR10, 0x65000030);
> +	epu_write(EPSMCR9, 0x00003000);
> +	epu_write(EPSMCR8, 0x64300000);
> +	epu_write(EPSMCR7, 0x30000000);
> +	epu_write(EPSMCR6, 0x7C000000);
> +	epu_write(EPSMCR5, 0x00002E00);
> +	epu_write(EPSMCR4, 0x002F0000);
> +	epu_write(EPSMCR3, 0x2F000000);
> +	epu_write(EPSMCR2, 0x6C700000);

Where do these magic numbers come from?  Which chips are they valid for?

> +void fsl_dp_fsm_clean(void *dcsr_base)
> +{
> +
> +	epu_write(EPEVTCR2, 0);
> +	epu_write(EPEVTCR9, 0);
> +
> +	epu_write(EPGCR, 0);
> +	epu_write(EPECR15, 0);
> +
> +	rcpm_write(CSTTACR0, 0);
> +	rcpm_write(CG1CR0, 0);
> +
> +}

Don't put blank lines at the beginning/end of a block.

-Scott

^ permalink raw reply

* Re: [PATCH 8/9] powerpc/85xx: add save/restore functions for core registers
From: Scott Wood @ 2014-03-12  0:45 UTC (permalink / raw)
  To: Chenhui Zhao; +Cc: linuxppc-dev, linux-kernel, Jason.Jin
In-Reply-To: <1394168285-32275-8-git-send-email-chenhui.zhao@freescale.com>

On Fri, 2014-03-07 at 12:58 +0800, Chenhui Zhao wrote:
> From: Wang Dongsheng <dongsheng.wang@freescale.com>
> 
> Add booke_cpu_state_save() and booke_cpu_state_restore() functions which can be
> used to save/restore CPU's registers in the case of deep sleep and hibernation.
> 
> Supported processors: E6500, E5500, E500MC, E500v2 and E500v1.
> 
> Signed-off-by: Wang Dongsheng <dongsheng.wang@freescale.com>
> Signed-off-by: Chenhui Zhao <chenhui.zhao@freescale.com>
> ---
>  arch/powerpc/include/asm/booke_save_regs.h |   96 ++++++++
>  arch/powerpc/kernel/Makefile               |    1 +
>  arch/powerpc/kernel/booke_save_regs.S      |  361 ++++++++++++++++++++++++++++
>  3 files changed, 458 insertions(+), 0 deletions(-)
>  create mode 100644 arch/powerpc/include/asm/booke_save_regs.h
>  create mode 100644 arch/powerpc/kernel/booke_save_regs.S
> 
> diff --git a/arch/powerpc/include/asm/booke_save_regs.h b/arch/powerpc/include/asm/booke_save_regs.h
> new file mode 100644
> index 0000000..87c357a
> --- /dev/null
> +++ b/arch/powerpc/include/asm/booke_save_regs.h
> @@ -0,0 +1,96 @@
> +/*
> + *  Save/restore e500 series core registers

Filename says booke, comment says e500.

Filename and comment also fail to point out that this is specifically
for standby/suspend, not for hibernate which is implemented in
swsusp_booke.S/swsusp_asm64.S.

> + *
> + * Author: Wang Dongsheng <dongsheng.wang@freescale.com>
> + *
> + * Copyright 2014 Freescale Semiconductor Inc.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#ifndef __ASM_FSL_SLEEP_H
> +#define __ASM_FSL_SLEEP_H
> +
> +/*
> + * 8 bytes for each register, which is compatible with
> + * both 32-bit and 64-bit registers
> + *
> + * Acronyms:
> + *	dw(data width)	0x08
> + *
> + * Map:
> + * General-Purpose Registers
> + *	GPR1(sp)		0
> + *	GPR2			0x8		(dw * 1)
> + *	GPR13 - GPR31		0x10 ~ 0xa0	(dw * 2 ~ dw * 20)

Putting these values in a comment separate from the code that defines it
sounds like a good way to get a mismatch between the two.

> + * Foating-point registers
> + *	FPR14 - FPR31		0xa8 ~ 0x130	(dw * 21 ~ dw * 38)

Call enable_kernel_fp() or similar, rather than saving FP regs here.
Likewise with altivec.  And why starting at FPR14?  Volatile versus
nonvolatile is irrelevant because Linux doesn't participate in the FP
ABI.  Everything is non-volatile *if* we have a user FP context
resident, and everything is volatile otherwise.

> + * Timer Registers
> + *	TCR			0x168		(dw * 45)
> + *	TB(64bit)		0x170		(dw * 46)
> + *	TBU(32bit)		0x178		(dw * 47)
> + *	TBL(32bit)		0x180		(dw * 48)

Why are you saving TBU/L separate from TB?  They're the same thing.

> + * Interrupt Registers
> + *	IVPR			0x188		(dw * 49)
> + *	IVOR0 - IVOR15		0x190 ~ 0x208	(dw * 50 ~ dw * 65)
> + *	IVOR32 - IVOR41		0x210 ~ 0x258	(dw * 66 ~ dw * 75)

What about IVOR42 (LRAT error)?

> + * Software-Use Registers
> + *	SPRG1			0x260		(dw * 76), 64-bit need to save.
> + *	SPRG3			0x268		(dw * 77), 32-bit need to save.

What about "CPU and NUMA node for VDSO getcpu" on 64-bit?  Currently
SPRG3, but it will need to change for critical interrupt support.

> + * MMU Registers
> + *	PID0 - PID2		0x270 ~ 0x280	(dw * 78 ~ dw * 80)

PID1/PID2 are e500v1/v2 only -- and Linux doesn't use them outside of
KVM (and you're not in KVM when you're running this code).

Are we ever going to have a non-zero PID at this point?

> + * Debug Registers
> + *	DBCR0 - DBCR2		0x288 ~ 0x298	(dw * 81 ~ dw * 83)
> + *	IAC1 - IAC4		0x2a0 ~ 0x2b8	(dw * 84 ~ dw * 87)
> + *	DAC1 - DAC2		0x2c0 ~ 0x2c8	(dw * 88 ~ dw * 89)
> + *
> + */

IAC3-4 are not implemented on e500.

Do we really need to save the debug registers?  We're not going to be in
a debugged process when we do suspend.  If the concern is kgdb, it
probably needs to be told to get out of the way during suspend for other
reasons.

> +#define SR_GPR1			0x000
> +#define SR_GPR2			0x008
> +#define SR_GPR13		0x010
> +#define SR_FPR14		0x0a8
> +#define SR_CR			0x138
> +#define SR_LR			0x140
> +#define SR_MSR			0x148

These are very vague names to be putting in a header file.

> +/*
> + * hibernation and deepsleep save/restore different number of registers,
> + * use these flags to indicate.
> + */
> +#define HIBERNATION_FLAG	1
> +#define DEEPSLEEP_FLAG		2

Again, namespacing -- but why is hibernation using this at all?  What's
wrong with the existing hibernation support?

> diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
> index fcc9a89..64acae6 100644
> --- a/arch/powerpc/kernel/Makefile
> +++ b/arch/powerpc/kernel/Makefile
> @@ -73,6 +73,7 @@ obj-$(CONFIG_HIBERNATION)	+= swsusp_booke.o
>  else
>  obj-$(CONFIG_HIBERNATION)	+= swsusp_$(CONFIG_WORD_SIZE).o
>  endif
> +obj-$(CONFIG_E500)		+= booke_save_regs.o

Shouldn't this depend on whether suspend is enabled?

> diff --git a/arch/powerpc/kernel/booke_save_regs.S b/arch/powerpc/kernel/booke_save_regs.S
> new file mode 100644
> index 0000000..9ccd576
> --- /dev/null
> +++ b/arch/powerpc/kernel/booke_save_regs.S
> @@ -0,0 +1,361 @@
> +/*
> + * Freescale Power Management, Save/Restore core state
> + *
> + * Copyright 2014 Freescale Semiconductor, Inc.
> + * Author: Wang Dongsheng <dongsheng.wang@freescale.com>
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License
> + * as published by the Free Software Foundation; either version
> + * 2 of the License, or (at your option) any later version.
> + */
> +
> +#include <asm/ppc_asm.h>
> +#include <asm/booke_save_regs.h>
> +
> +/*
> + * Supported Core List:
> + * E500v1, E500v2, E500MC, E5500, E6500.
> + */
> +
> +/* Save/Restore register operation define */
> +#define do_save_gpr_reg(reg, addr) \
> +	PPC_STL reg, addr(r10)
> +
> +#define do_save_fpr_reg(reg, addr) \
> +	stfd	reg, addr(r10)
> +
> +#define do_save_spr_reg(reg, addr) \
> +	mfspr	r0, SPRN_##reg ;\
> +	PPC_STL r0, addr(r10)
> +
> +#define do_save_special_reg(special, addr) \
> +	mf##special	r0 ;\
> +	PPC_STL r0, addr(r10)
> +
> +#define do_restore_gpr_reg(reg, addr) \
> +	PPC_LL reg, addr(r10)
> +
> +#define do_restore_fpr_reg(reg, addr) \
> +	lfd	reg, addr(r10)
> +
> +#define do_restore_spr_reg(reg, addr) \
> +	PPC_LL r0, addr(r10) ;\
> +	mtspr	SPRN_##reg, r0
> +
> +#define do_restore_special_reg(special, addr) \
> +	PPC_LL r0, addr(r10) ;\
> +	mt##special	r0
> +
> +#define do_sr_general_gpr_regs(func) \
> +	do_##func##_gpr_reg(r1, SR_GPR1) ;\
> +	do_##func##_gpr_reg(r2, SR_GPR2) ;\
> +	do_##func##_gpr_reg(r13, SR_GPR13 + 0x00) ;\
> +	do_##func##_gpr_reg(r14, SR_GPR13 + 0x08) ;\
> +	do_##func##_gpr_reg(r15, SR_GPR13 + 0x10) ;\
> +	do_##func##_gpr_reg(r16, SR_GPR13 + 0x18) ;\
> +	do_##func##_gpr_reg(r17, SR_GPR13 + 0x20) ;\
> +	do_##func##_gpr_reg(r18, SR_GPR13 + 0x28) ;\
> +	do_##func##_gpr_reg(r19, SR_GPR13 + 0x30) ;\
> +	do_##func##_gpr_reg(r20, SR_GPR13 + 0x38) ;\
> +	do_##func##_gpr_reg(r21, SR_GPR13 + 0x40) ;\
> +	do_##func##_gpr_reg(r22, SR_GPR13 + 0x48) ;\
> +	do_##func##_gpr_reg(r23, SR_GPR13 + 0x50) ;\
> +	do_##func##_gpr_reg(r24, SR_GPR13 + 0x58) ;\
> +	do_##func##_gpr_reg(r25, SR_GPR13 + 0x60) ;\
> +	do_##func##_gpr_reg(r26, SR_GPR13 + 0x68) ;\
> +	do_##func##_gpr_reg(r27, SR_GPR13 + 0x70) ;\
> +	do_##func##_gpr_reg(r28, SR_GPR13 + 0x78) ;\
> +	do_##func##_gpr_reg(r29, SR_GPR13 + 0x80) ;\
> +	do_##func##_gpr_reg(r30, SR_GPR13 + 0x88) ;\
> +	do_##func##_gpr_reg(r31, SR_GPR13 + 0x90)
> +
> +#define do_sr_general_pcr_regs(func) \
> +	do_##func##_spr_reg(EPCR, SR_EPCR) ;\
> +	do_##func##_spr_reg(HID0, SR_HID0 + 0x00)
> +
> +#define do_sr_e500_pcr_regs(func) \
> +	do_##func##_spr_reg(HID1, SR_HID0 + 0x08)

PCR?

In the comments you said "Only e500, e500v2 need to save HID0 - HID1",
yet you save HID0 in the "general" macro.

> +#define do_sr_save_tb_regs \
> +	do_save_spr_reg(TBRU, SR_TBU) ;\
> +	do_save_spr_reg(TBRL, SR_TBL)

What if TBU increments between those two reads?  Use the standard
sequence to read the timebase.

> +#define do_sr_interrupt_regs(func) \
> +	do_##func##_spr_reg(IVPR, SR_IVPR) ;\
> +	do_##func##_spr_reg(IVOR0, SR_IVOR0 + 0x00) ;\
> +	do_##func##_spr_reg(IVOR1, SR_IVOR0 + 0x08) ;\
> +	do_##func##_spr_reg(IVOR2, SR_IVOR0 + 0x10) ;\
> +	do_##func##_spr_reg(IVOR3, SR_IVOR0 + 0x18) ;\
> +	do_##func##_spr_reg(IVOR4, SR_IVOR0 + 0x20) ;\
> +	do_##func##_spr_reg(IVOR5, SR_IVOR0 + 0x28) ;\
> +	do_##func##_spr_reg(IVOR6, SR_IVOR0 + 0x30) ;\
> +	do_##func##_spr_reg(IVOR7, SR_IVOR0 + 0x38) ;\
> +	do_##func##_spr_reg(IVOR8, SR_IVOR0 + 0x40) ;\
> +	do_##func##_spr_reg(IVOR10, SR_IVOR0 + 0x50) ;\
> +	do_##func##_spr_reg(IVOR11, SR_IVOR0 + 0x58) ;\
> +	do_##func##_spr_reg(IVOR12, SR_IVOR0 + 0x60) ;\
> +	do_##func##_spr_reg(IVOR13, SR_IVOR0 + 0x68) ;\
> +	do_##func##_spr_reg(IVOR14, SR_IVOR0 + 0x70) ;\
> +	do_##func##_spr_reg(IVOR15, SR_IVOR0 + 0x78)
> +
> +#define do_e500_sr_interrupt_regs(func) \
> +	do_##func##_spr_reg(IVOR32, SR_IVOR32 + 0x00) ;\
> +	do_##func##_spr_reg(IVOR33, SR_IVOR32 + 0x08) ;\
> +	do_##func##_spr_reg(IVOR34, SR_IVOR32 + 0x10)
> +
> +#define do_e500mc_sr_interrupt_regs(func) \
> +	do_##func##_spr_reg(IVOR9, SR_IVOR0 + 0x48) ;\
> +	do_##func##_spr_reg(IVOR35, SR_IVOR32 + 0x18) ;\
> +	do_##func##_spr_reg(IVOR36, SR_IVOR32 + 0x20) ;\
> +	do_##func##_spr_reg(IVOR37, SR_IVOR32 + 0x28) ;\
> +	do_##func##_spr_reg(IVOR38, SR_IVOR32 + 0x30) ;\
> +	do_##func##_spr_reg(IVOR39, SR_IVOR32 + 0x38) ;\
> +	do_##func##_spr_reg(IVOR40, SR_IVOR32 + 0x40) ;\
> +	do_##func##_spr_reg(IVOR41, SR_IVOR32 + 0x48)
> +
> +#define do_e5500_sr_interrupt_regs(func) \
> +	do_e500mc_sr_interrupt_regs(func)
> +
> +#define do_e6500_sr_interrupt_regs(func) \
> +	do_##func##_spr_reg(IVOR32, SR_IVOR32 + 0x00) ;\
> +	do_##func##_spr_reg(IVOR33, SR_IVOR32 + 0x08) ;\
> +	do_e5500_sr_interrupt_regs(func)
> +
> +#define do_sr_general_software_regs(func) \
> +	do_##func##_spr_reg(SPRG1, SR_SPRG1) ;\
> +	do_##func##_spr_reg(SPRG3, SR_SPRG3) ;\
> +	do_##func##_spr_reg(PID0, SR_PID0)
> +
> +#define do_sr_e500_mmu_regs(func) \
> +	do_##func##_spr_reg(PID1, SR_PID0 + 0x08) ;\
> +	do_##func##_spr_reg(PID2, SR_PID0 + 0x10)
> +
> +#define do_sr_debug_regs(func) \
> +	do_##func##_spr_reg(DBCR0, SR_DBCR0 + 0x00) ;\
> +	do_##func##_spr_reg(DBCR1, SR_DBCR0 + 0x08) ;\
> +	do_##func##_spr_reg(DBCR2, SR_DBCR0 + 0x10) ;\
> +	do_##func##_spr_reg(IAC1, SR_IAC1 + 0x00) ;\
> +	do_##func##_spr_reg(IAC2, SR_IAC1 + 0x08) ;\
> +	do_##func##_spr_reg(DAC1, SR_DAC1 + 0x00) ;\
> +	do_##func##_spr_reg(DAC2, SR_DAC1 + 0x08)
> +
> +#define do_e6500_sr_debug_regs(func) \
> +	do_##func##_spr_reg(IAC3, SR_IAC1 + 0x10) ;\
> +	do_##func##_spr_reg(IAC4, SR_IAC1 + 0x18)
> +
> +	.section .text
> +	.align	5
> +booke_cpu_base_save:
> +	do_sr_general_gpr_regs(save)
> +	do_sr_general_pcr_regs(save)
> +	do_sr_general_software_regs(save)
> +1:
> +	mfspr	r5, SPRN_TBRU
> +	do_sr_general_time_regs(save)
> +	mfspr	r6, SPRN_TBRU
> +	cmpw	r5, r6
> +	bne	1b

Oh, here's where you deal with that.  Why?  It just obfuscates things.

> +booke_cpu_base_restore:
> +	do_sr_general_gpr_regs(restore)
> +	do_sr_general_pcr_regs(restore)
> +	do_sr_general_software_regs(restore)
> +
> +	isync
> +
> +	/* Restore Time registers, clear tb lower to avoid wrap */
> +	li	r0, 0
> +	mtspr	SPRN_TBWL, r0
> +	do_sr_general_time_regs(restore)

Why is zeroing TBL not done in the same place as you load the new TB?

> +/* Base registers, e500v1, e500v2 need to do some special save/restore */
> +e500_base_special_save:
> +	lis	r12, 0
> +	ori	r12, r12, PVR_VER_E500V1@l
> +	cmpw	r11, r12
> +	beq	1f
> +
> +	lis	r12, 0
> +	ori	r12, r12, PVR_VER_E500V2@l
> +	cmpw	r11, r12
> +	bne	2f
> +1:
> +	do_sr_e500_pcr_regs(save)
> +	do_sr_e500_mmu_regs(save)
> +2:
> +	blr
> +
> +e500_base_special_restore:
> +	lis	r12, 0
> +	ori	r12, r12, PVR_VER_E500V1@l
> +	cmpw	r11, r12
> +	beq	1f
> +
> +	lis	r12, 0
> +	ori	r12, r12, PVR_VER_E500V2@l
> +	cmpw	r11, r12
> +	bne	2f
> +1:
> +	do_sr_e500_pcr_regs(save)
> +	do_sr_e500_mmu_regs(save)
> +2:
> +	blr

Why is this separate from the other CPU-specific "append" code?

> +booke_cpu_append_save:
> +	mfspr	r0, SPRN_PVR
> +	rlwinm	r11, r0, 16, 16, 31
> +
> +	lis	r12, 0
> +	ori	r12, r12, PVR_VER_E6500@l
> +	cmpw	r11, r12
> +	beq	e6500_append_save
> +
> +	lis	r12, 0
> +	ori	r12, r12, PVR_VER_E5500@l
> +	cmpw	r11, r12
> +	beq	e5500_append_save
> +
> +	lis	r12, 0
> +	ori	r12, r12, PVR_VER_E500MC@l
> +	cmpw	r11, r12
> +	beq	e500mc_append_save
> +
> +	lis	r12, 0
> +	ori	r12, r12, PVR_VER_E500V2@l
> +	cmpw	r11, r12
> +	beq	e500v2_append_save
> +
> +	lis	r12, 0
> +	ori	r12, r12, PVR_VER_E500V1@l
> +	cmpw	r11, r12
> +	beq	e500v1_append_save
> +	b	1f
> +
> +e6500_append_save:
> +	do_e6500_sr_interrupt_regs(save)
> +	do_e6500_sr_debug_regs(save)
> +	b	1f
> +
> +e5500_append_save:
> +	do_e5500_sr_interrupt_regs(save)
> +	b	1f
> +
> +e500mc_append_save:
> +	do_e500mc_sr_interrupt_regs(save)
> +	b	1f
> +
> +e500v2_append_save:
> +e500v1_append_save:
> +	do_e500_sr_interrupt_regs(save)
> +1:
> +	do_sr_interrupt_regs(save)
> +	do_sr_debug_regs(save)
> +
> +	blr

What is meant by "append" here?

> +/*
> + * r3 = the address of buffer
> + * r4 = type: HIBERNATION_FLAG or DEEPSLEEP_FLAG
> + */
> +_GLOBAL(booke_cpu_state_save)
> +	mflr	r9
> +	mr	r10, r3
> +
> +	cmpwi	r4, HIBERNATION_FLAG
> +	beq	1f
> +	bl	booke_cpu_append_save
> +1:
> +	bl	e500_base_special_save
> +	bl	booke_cpu_base_save
> +
> +	mtlr	r9
> +	blr

You're assuming a special ABI from these subfunctions (e.g. r9
non-volatile) but I don't see any comment to that effect on those
functions.

-Scott

^ permalink raw reply

* Re: [PATCH 9/9] powerpc/pm: support deep sleep feature on T1040
From: Scott Wood @ 2014-03-12  1:10 UTC (permalink / raw)
  To: Chenhui Zhao; +Cc: linuxppc-dev, linux-kernel, Jason.Jin
In-Reply-To: <1394168285-32275-9-git-send-email-chenhui.zhao@freescale.com>

On Fri, 2014-03-07 at 12:58 +0800, Chenhui Zhao wrote:
> From: Zhao Chenhui <chenhui.zhao@freescale.com>
> 
> T1040 supports deep sleep feature, which can switch off most parts of
> the SoC when it is in deep sleep mode. This way, it becomes more
> energy-efficient.
> 
> The DDR controller will also be powered off in deep sleep. Therefore,
> the last stage (the latter part of fsl_dp_enter_low) will run without DDR
> access. This piece of code and related TLBs will be prefetched.
> 
> Due to the different initialization code between 32-bit and 64-bit, they
> have seperate resume entry and precedure.
> 
> The feature supports 32-bit and 64-bit kernel mode.
> 
> Signed-off-by: Zhao Chenhui <chenhui.zhao@freescale.com>
> ---
>  arch/powerpc/include/asm/booke_save_regs.h |    3 +
>  arch/powerpc/kernel/cpu_setup_fsl_booke.S  |   17 ++
>  arch/powerpc/kernel/head_fsl_booke.S       |   30 +++
>  arch/powerpc/platforms/85xx/Makefile       |    2 +-
>  arch/powerpc/platforms/85xx/deepsleep.c    |  201 +++++++++++++++++++
>  arch/powerpc/platforms/85xx/qoriq_pm.c     |   38 ++++
>  arch/powerpc/platforms/85xx/sleep.S        |  295 ++++++++++++++++++++++++++++
>  arch/powerpc/sysdev/fsl_soc.h              |    7 +
>  8 files changed, 592 insertions(+), 1 deletions(-)
>  create mode 100644 arch/powerpc/platforms/85xx/deepsleep.c
>  create mode 100644 arch/powerpc/platforms/85xx/sleep.S
> 
> diff --git a/arch/powerpc/include/asm/booke_save_regs.h b/arch/powerpc/include/asm/booke_save_regs.h
> index 87c357a..37c1f6c 100644
> --- a/arch/powerpc/include/asm/booke_save_regs.h
> +++ b/arch/powerpc/include/asm/booke_save_regs.h
> @@ -88,6 +88,9 @@
>  #define HIBERNATION_FLAG	1
>  #define DEEPSLEEP_FLAG		2
>  
> +#define CPLD_FLAG	1
> +#define FPGA_FLAG	2

What is this?

>  #ifndef __ASSEMBLY__
>  extern void booke_cpu_state_save(void *buf, int type);
>  extern void *booke_cpu_state_restore(void *buf, int type);
> diff --git a/arch/powerpc/kernel/cpu_setup_fsl_booke.S b/arch/powerpc/kernel/cpu_setup_fsl_booke.S
> index e59d6de..ea9bc28 100644
> --- a/arch/powerpc/kernel/cpu_setup_fsl_booke.S
> +++ b/arch/powerpc/kernel/cpu_setup_fsl_booke.S
> @@ -318,6 +318,23 @@ flush_backside_L2_cache:
>  2:
>  	blr
>  
> +#define CPC_CPCCSR0		0x0
> +#define CPC_CPCCSR0_CPCFL	0x800

This is supposed to be CPU setup, not platform setup.

> +/* r3 : the base address of CPC  */
> +_GLOBAL(fsl_flush_cpc_cache)
> +	lwz	r6, CPC_CPCCSR0(r3)
> +	ori	r6, r6, CPC_CPCCSR0_CPCFL
> +	stw	r6, CPC_CPCCSR0(r3)
> +	sync
> +
> +	/* Wait until completing the flush */
> +1:	lwz	r6, CPC_CPCCSR0(r3)
> +	andi.	r6, r6, CPC_CPCCSR0_CPCFL
> +	bne	1b
> +
> +	blr
> +
>  _GLOBAL(__flush_caches_e500v2)

I'm not sure that this belongs here either.

>  	mflr r0
>  	bl	flush_dcache_L1
> diff --git a/arch/powerpc/kernel/head_fsl_booke.S b/arch/powerpc/kernel/head_fsl_booke.S
> index 20204fe..3285752 100644
> --- a/arch/powerpc/kernel/head_fsl_booke.S
> +++ b/arch/powerpc/kernel/head_fsl_booke.S
> @@ -162,6 +162,19 @@ _ENTRY(__early_start)
>  #include "fsl_booke_entry_mapping.S"
>  #undef ENTRY_MAPPING_BOOT_SETUP
>  
> +#if defined(CONFIG_SUSPEND) && defined(CONFIG_FSL_CORENET_RCPM)
> +	/* if deep_sleep_flag != 0, jump to the deep sleep resume entry */
> +	LOAD_REG_ADDR(r4, deep_sleep_flag)
> +	lwz	r3, 0(r4)
> +	cmpwi	r3, 0
> +	beq	11f
> +	/* clear deep_sleep_flag */
> +	li	r3, 0
> +	stw	r3, 0(r4)
> +	b	fsl_deepsleep_resume
> +11:
> +#endif

Why do you come in via the normal kernel entry, versus specifying a
direct entry point for deep sleep resume?  How does U-Boot even know
what the normal entry is when resuming?

Be careful of the "beq set_ivor" in the CONFIG_RELOCATABLE section
above.  Also you probably don't want the relocation code to run again
when resuming.

> +#if defined(CONFIG_SUSPEND) && defined(CONFIG_FSL_CORENET_RCPM)
> +_ENTRY(__entry_deep_sleep)
> +/*
> + * Bootloader will jump to here when resuming from deep sleep.
> + * After executing the init code in fsl_booke_entry_mapping.S,
> + * will jump to the real resume entry.
> + */
> +	li	r8, 1
> +	bl	12f
> +12:	mflr	r9
> +	addi	r9, r9, (deep_sleep_flag - 12b)
> +	stw	r8, 0(r9)
> +	b __early_start
> +deep_sleep_flag:
> +	.long	0
> +#endif

It's a bit ambiguous to say "entry_deep_sleep" when it's resuming rather
than entering...

So you do have a special entry point.  Why do you go to __early_start
only to quickly test a flag and branch away?

> diff --git a/arch/powerpc/platforms/85xx/Makefile b/arch/powerpc/platforms/85xx/Makefile
> index 7fae817..9a4ea86 100644
> --- a/arch/powerpc/platforms/85xx/Makefile
> +++ b/arch/powerpc/platforms/85xx/Makefile
> @@ -3,7 +3,7 @@
>  #
>  obj-$(CONFIG_SMP) += smp.o
>  ifeq ($(CONFIG_FSL_CORENET_RCPM), y)
> -obj-$(CONFIG_SUSPEND)	+= qoriq_pm.o
> +obj-$(CONFIG_SUSPEND)	+= qoriq_pm.o deepsleep.o sleep.o
>  endif
>  
>  obj-y += common.o
> diff --git a/arch/powerpc/platforms/85xx/deepsleep.c b/arch/powerpc/platforms/85xx/deepsleep.c
> new file mode 100644
> index 0000000..ddd7185
> --- /dev/null
> +++ b/arch/powerpc/platforms/85xx/deepsleep.c
> @@ -0,0 +1,201 @@
> +/*
> + * Support deep sleep feature

AFAICT this supports deep sleep on T1040, not on all 85xx that has it.

> + *
> + * Copyright 2014 Freescale Semiconductor Inc.
> + *
> + * Author: Chenhui Zhao <chenhui.zhao@freescale.com>
> + *
> + * This program is free software; you can redistribute	it and/or modify it
> + * under  the terms of	the GNU General	 Public License as published by the
> + * Free Software Foundation;  either version 2 of the  License, or (at your
> + * option) any later version.
> + */
> +
> +#include <linux/kernel.h>
> +#include <linux/of_platform.h>
> +#include <linux/of_address.h>
> +#include <asm/machdep.h>
> +#include <sysdev/fsl_soc.h>
> +#include <asm/booke_save_regs.h>
> +
> +#define SIZE_1MB	0x100000
> +#define SIZE_2MB	0x200000
> +
> +#define CCSR_SCFG_DPSLPCR	0xfc000
> +#define CCSR_SCFG_DPSLPCR_WDRR_EN	0x1
> +#define CCSR_SCFG_SPARECR2	0xfc504
> +#define CCSR_SCFG_SPARECR3	0xfc508
> +
> +#define CCSR_GPIO1_GPDIR	0x130000
> +#define CCSR_GPIO1_GPODR	0x130004
> +#define CCSR_GPIO1_GPDAT	0x130008
> +#define CCSR_GPIO1_GPDIR_29		0x4
> +
> +/* 128 bytes buffer for restoring data broke by DDR training initialization */
> +#define DDR_BUF_SIZE	128
> +static u8 ddr_buff[DDR_BUF_SIZE] __aligned(64);
> +
> +static void *dcsr_base, *ccsr_base, *pld_base;
> +static int pld_flag;
> +
> +int fsl_dp_iomap(void)
> +{
> +	struct device_node *np;
> +	const u32 *prop;
> +	int ret = 0;
> +	u64 ccsr_phy_addr, dcsr_phy_addr;
> +
> +	np = of_find_node_by_type(NULL, "soc");
> +	if (!np) {
> +		pr_err("%s: Can't find the node of \"soc\"\n", __func__);
> +		ret = -EINVAL;
> +		goto ccsr_err;
> +	}
> +	prop = of_get_property(np, "ranges", NULL);
> +	if (!prop) {
> +		pr_err("%s: Can't find the property of \"ranges\"\n", __func__);
> +		of_node_put(np);
> +		ret = -EINVAL;
> +		goto ccsr_err;
> +	}

Use get_immrbase(), or better use specific nodes in the device tree.

> +	ccsr_phy_addr = of_translate_address(np, prop + 1);
> +	ccsr_base = ioremap((phys_addr_t)ccsr_phy_addr, SIZE_2MB);
> +	of_node_put(np);
> +	if (!ccsr_base) {
> +		ret = -ENOMEM;
> +		goto ccsr_err;
> +	}

Unnecessary cast.

Why 2 MiB?

> +	np = of_find_compatible_node(NULL, NULL, "fsl,dcsr");
> +	if (!np) {
> +		pr_err("%s: Can't find the node of \"fsl,dcsr\"\n", __func__);
> +		ret = -EINVAL;
> +		goto dcsr_err;
> +	}
> +	prop = of_get_property(np, "ranges", NULL);
> +	if (!prop) {
> +		pr_err("%s: Can't find the property of \"ranges\"\n", __func__);
> +		of_node_put(np);
> +		ret = -EINVAL;
> +		goto dcsr_err;
> +	}
> +	dcsr_phy_addr = of_translate_address(np, prop + 1);
> +	dcsr_base = ioremap((phys_addr_t)dcsr_phy_addr, SIZE_1MB);
> +	of_node_put(np);
> +	if (!dcsr_base) {
> +		ret = -ENOMEM;
> +		goto dcsr_err;
> +	}

If you must do this, add a helper to get the dcsr base -- but do we not
already have dcsr subnodes for what you are using?

> +
> +	np = of_find_compatible_node(NULL, NULL, "fsl,fpga-qixis");
> +	if (np) {
> +		pld_flag = FPGA_FLAG;
> +	} else {
> +		np = of_find_compatible_node(NULL, NULL, "fsl,p104xrdb-cpld");
> +		if (np) {
> +			pld_flag = CPLD_FLAG;
> +		} else {
> +			pr_err("%s: Can't find the FPGA/CPLD node\n",
> +					__func__);
> +			ret = -EINVAL;
> +			goto pld_err;
> +		}
> +	}

OK, so this file isn't even specific to T1040 -- it's specific to our
reference boards?

> +static void fsl_dp_ddr_save(void *ccsr_base)
> +{
> +	u32 ddr_buff_addr;
> +
> +	/*
> +	 * DDR training initialization will break 128 bytes at the beginning
> +	 * of DDR, therefore, save them so that the bootloader will restore
> +	 * them. Assume that DDR is mapped to the address space started with
> +	 * CONFIG_PAGE_OFFSET.
> +	 */
> +	memcpy(ddr_buff, (void *)CONFIG_PAGE_OFFSET, DDR_BUF_SIZE);
> +
> +	/* assume ddr_buff is in the physical address space of 4GB */
> +	ddr_buff_addr = (u32)(__pa(ddr_buff) & 0xffffffff);

That assumption may break with a relocatable kernel.

> +
> +}
> +
> +int fsl_enter_epu_deepsleep(void)
> +{
> +
> +

No blank lines at begin/end of function.

> diff --git a/arch/powerpc/platforms/85xx/qoriq_pm.c b/arch/powerpc/platforms/85xx/qoriq_pm.c
> index 915b13b..5f2c016 100644
> --- a/arch/powerpc/platforms/85xx/qoriq_pm.c
> +++ b/arch/powerpc/platforms/85xx/qoriq_pm.c
> @@ -20,6 +20,8 @@
>  #define FSL_SLEEP		0x1
>  #define FSL_DEEP_SLEEP		0x2
>  
> +int (*fsl_enter_deepsleep)(void);
> +
>  /* specify the sleep state of the present platform */
>  int sleep_pm_state;
>  /* supported sleep modes by the present platform */
> @@ -28,6 +30,7 @@ static unsigned int sleep_modes;
>  static int qoriq_suspend_enter(suspend_state_t state)
>  {
>  	int ret = 0;
> +	int cpu;
>  
>  	switch (state) {
>  	case PM_SUSPEND_STANDBY:
> @@ -39,6 +42,17 @@ static int qoriq_suspend_enter(suspend_state_t state)
>  
>  		break;
>  
> +	case PM_SUSPEND_MEM:
> +
> +		cpu = smp_processor_id();
> +		qoriq_pm_ops->irq_mask(cpu);
> +
> +		ret = fsl_enter_deepsleep();
> +
> +		qoriq_pm_ops->irq_unmask(cpu);
> +
> +		break;
> +
>  	default:
>  		ret = -EINVAL;
>  
> @@ -52,12 +66,30 @@ static int qoriq_suspend_valid(suspend_state_t state)
>  	if (state == PM_SUSPEND_STANDBY && (sleep_modes & FSL_SLEEP))
>  		return 1;
>  
> +	if (state == PM_SUSPEND_MEM && (sleep_modes & FSL_DEEP_SLEEP))
> +		return 1;
> +
>  	return 0;
>  }
>  
> +static int qoriq_suspend_begin(suspend_state_t state)
> +{
> +	if (state == PM_SUSPEND_MEM)
> +		return fsl_dp_iomap();
> +
> +	return 0;
> +}
> +
> +static void qoriq_suspend_end(void)
> +{
> +	fsl_dp_iounmap();
> +}
> +
>  static const struct platform_suspend_ops qoriq_suspend_ops = {
>  	.valid = qoriq_suspend_valid,
>  	.enter = qoriq_suspend_enter,
> +	.begin = qoriq_suspend_begin,
> +	.end = qoriq_suspend_end,
>  };
>  
>  static int __init qoriq_suspend_init(void)
> @@ -71,6 +103,12 @@ static int __init qoriq_suspend_init(void)
>  	if (np)
>  		sleep_pm_state = PLAT_PM_LPM20;
>  
> +	np = of_find_compatible_node(NULL, NULL, "fsl,t1040-rcpm");
> +	if (np) {
> +		fsl_enter_deepsleep = fsl_enter_epu_deepsleep;
> +		sleep_modes |= FSL_DEEP_SLEEP;
> +	}
> +
>  	suspend_set_ops(&qoriq_suspend_ops);
>  
>  	return 0;
> diff --git a/arch/powerpc/platforms/85xx/sleep.S b/arch/powerpc/platforms/85xx/sleep.S
> new file mode 100644
> index 0000000..95a5746
> --- /dev/null
> +++ b/arch/powerpc/platforms/85xx/sleep.S
> @@ -0,0 +1,295 @@
> +/*
> + * Implement the low level part of deep sleep
> + *
> + * Copyright 2014 Freescale Semiconductor Inc.
> + *
> + * This program is free software; you can redistribute	it and/or modify it
> + * under  the terms of	the GNU General	 Public License as published by the
> + * Free Software Foundation;  either version 2 of the  License, or (at your
> + * option) any later version.
> + */
> +
> +#include <asm/page.h>
> +#include <asm/ppc_asm.h>
> +#include <asm/reg.h>
> +#include <asm/asm-offsets.h>
> +#include <asm/booke_save_regs.h>
> +#include <asm/mmu.h>
> +
> +#define FSLDELAY(count)		\
> +	li	r3, (count)@l;	\
> +	slwi	r3, r3, 10;	\
> +	mtctr	r3;		\
> +101:	nop;			\
> +	bdnz	101b;

You don't need a namespace prefix on local macros in a non-header file.

Is the timebase stopped where you're calling this from?

> +#define FSL_DIS_ALL_IRQ		\
> +	mfmsr	r8;			\
> +	rlwinm	r8, r8, 0, ~MSR_CE;	\
> +	rlwinm	r8, r8, 0, ~MSR_ME;	\
> +	rlwinm	r8, r8, 0, ~MSR_EE;	\
> +	rlwinm	r8, r8, 0, ~MSR_DE;	\
> +	mtmsr	r8;			\
> +	isync
> +
> +
> +	.section .data
> +	.align	6
> +booke_regs_buffer:
> +	.space REGS_BUFFER_SIZE
> +
> +	.section .txt
> +	.align 6
> +
> +_GLOBAL(fsl_dp_enter_low)
> +deepsleep_start:
> +	LOAD_REG_ADDR(r9, buf_tmp)
> +	PPC_STL	r3, 0(r9)
> +	PPC_STL	r4, 8(r9)
> +	PPC_STL	r5, 16(r9)
> +	PPC_STL	r6, 24(r9)
> +
> +	LOAD_REG_ADDR(r3, booke_regs_buffer)
> +	/* save the return address */
> +	mflr	r5
> +	PPC_STL r5, SR_LR(r3)
> +	mfmsr	r5
> +	PPC_STL r5, SR_MSR(r3)
> +	li	r4, DEEPSLEEP_FLAG
> +	bl	booke_cpu_state_save
> +
> +	LOAD_REG_ADDR(r9, buf_tmp)
> +	PPC_LL	r31, 0(r9)
> +	PPC_LL	r30, 8(r9)
> +	PPC_LL	r29, 16(r9)
> +	PPC_LL	r28, 24(r9)
> +
> +	/* flush caches */
> +	LOAD_REG_ADDR(r3, cur_cpu_spec)
> +	PPC_LL	r3, 0(r3)
> +	PPC_LL	r3, CPU_FLUSH_CACHES(r3)
> +	PPC_LCMPI  0, r3, 0
> +	beq	6f
> +#ifdef CONFIG_PPC64
> +	PPC_LL	r3, 0(r3)
> +#endif
> +	mtctr	r3
> +	bctrl
> +6:
> +#define CPC_OFFSET	0x10000
> +	mr	r3, r31
> +	addis	r3, r3, CPC_OFFSET@h
> +	bl	fsl_flush_cpc_cache
> +
> +	LOAD_REG_ADDR(r8, deepsleep_start)
> +	LOAD_REG_ADDR(r9, deepsleep_end)
> +
> +	/* prefecth TLB */
> +#define CCSR_GPIO1_GPDAT	0x130008
> +#define CCSR_GPIO1_GPDAT_29	0x4
> +	LOAD_REG_IMMEDIATE(r11, CCSR_GPIO1_GPDAT)
> +	add	r11, r31, r11
> +	lwz	r10, 0(r11)
> +
> +#define CCSR_RCPM_PCPH15SETR	0xe20b4
> +#define CCSR_RCPM_PCPH15SETR_CORE0	0x1
> +	LOAD_REG_IMMEDIATE(r12, CCSR_RCPM_PCPH15SETR)
> +	add	r12, r31, r12
> +	lwz	r10, 0(r12)
> +
> +#define CCSR_DDR_SDRAM_CFG_2	0x8114
> +#define CCSR_DDR_SDRAM_CFG_2_FRC_SR	0x80000000
> +	LOAD_REG_IMMEDIATE(r13, CCSR_DDR_SDRAM_CFG_2)
> +	add	r13, r31, r13
> +	lwz	r10, 0(r13)
> +
> +#define	DCSR_EPU_EPGCR		0x000
> +#define DCSR_EPU_EPGCR_GCE	0x80000000
> +	li	r14, DCSR_EPU_EPGCR
> +	add	r14, r30, r14
> +	lwz	r10, 0(r14)
> +
> +#define	DCSR_EPU_EPECR15	0x33C
> +#define DCSR_EPU_EPECR15_IC0	0x80000000
> +	li	r15, DCSR_EPU_EPECR15
> +	add	r15, r30, r15
> +	lwz	r10, 0(r15)
> +
> +#define CCSR_SCFG_QMCRDTRSTCR		0xfc40c
> +#define CCSR_SCFG_QMCRDTRSTCR_CRDTRST	0x80000000
> +	LOAD_REG_IMMEDIATE(r16, CCSR_SCFG_QMCRDTRSTCR)
> +	add	r16, r31, r16
> +	lwz	r10, 0(r16)
> +
> +/*
> + * There are two kind of register maps, one for CPLD and the other for FPGA
> + */
> +#define CPLD_MISCCSR		0x17
> +#define CPLD_MISCCSR_SLEEPEN	0x40
> +#define QIXIS_PWR_CTL2		0x21
> +#define QIXIS_PWR_CTL2_PCTL	0x2
> +	PPC_LCMPI  0, r28, FPGA_FLAG
> +	beq	20f
> +	addi	r29, r29, CPLD_MISCCSR
> +20:
> +	addi	r29, r29, QIXIS_PWR_CTL2
> +	lbz	r10, 0(r29)


Again, this is not marked as a board-specific file.  How do you expect
customers to adapt this mechanism to their boards?

> +
> +	/* prefecth code to cache so that executing code after disable DDR */
> +1:	lwz	r3, 0(r8)
> +	addi	r8, r8, 4
> +	cmpw	r8, r9
> +	blt	1b
> +	msync

Instructions don't execute from dcache...  I guess you're assuming it
will allocate in, and stay in, L2/CPC.  It'd be safer to lock those
cache lines in icache, or copy the code to SRAM.

> +	FSL_DIS_ALL_IRQ
> +
> +	/*
> +	 * Place DDR controller in self refresh mode.
> +	 * From here on, DDR can't be access any more.
> +	 */
> +	lwz	r10, 0(r13)
> +	oris	r10, r10, CCSR_DDR_SDRAM_CFG_2_FRC_SR@h
> +	stw	r10, 0(r13)
> +
> +	/* can't call udelay() here, so use a macro to delay */
> +	FSLDELAY(50)

A timebase loop doesn't require accessing DDR.

You also probably want to do a "sync, readback, data dependency, isync"
sequence to make sure that the store has hit CCSR before you begin your
delay (or is a delay required at all if you do that?).

> +	/*
> +	 * Enable deep sleep signals by write external CPLD/FPGA register.
> +	 * The bootloader will disable them when wakeup from deep sleep.
> +	 */
> +	lbz	r10, 0(r29)
> +	PPC_LCMPI  0, r28, FPGA_FLAG
> +	beq	22f
> +	ori	r10, r10, CPLD_MISCCSR_SLEEPEN
> +22:
> +	ori	r10, r10, QIXIS_PWR_CTL2_PCTL
> +	stb	r10, 0(r29)
> +
> +	/*
> +	 * Set GPIO1_29 to lock the signal MCKE down during deep sleep.
> +	 * The bootloader will clear it when wakeup.
> +	 */
> +	lwz	r10, 0(r11)
> +	ori	r10, r10, CCSR_GPIO1_GPDAT_29
> +	stw	r10, 0(r11)
> +
> +	FSLDELAY(10)
> +
> +	/* Clear the QMan CITI Credits */
> +	lwz	r10, 0(r16)
> +	oris	r10, r10, CCSR_SCFG_QMCRDTRSTCR_CRDTRST@h
> +	stw	r10, 0(r16)
> +
> +	/* Enable all EPU Counters */
> +	li	r10, 0
> +	oris	r10, r10, DCSR_EPU_EPGCR_GCE@h
> +	stw	r10, 0(r14)
> +
> +	/* Enable SCU15 to trigger on RCPM Concentrator 0 */
> +	lwz	r10, 0(r15)
> +	oris	r10, r10, DCSR_EPU_EPECR15_IC0@h
> +	stw	r10, 0(r15)
> +
> +	/* put Core0 in PH15 mode, trigger EPU FSM */
> +	lwz	r10, 0(r12)
> +	ori	r10, r10, CCSR_RCPM_PCPH15SETR_CORE0
> +	stw	r10, 0(r12)

Shouldn't there be a sync to ensure that the previous I/O happens before
the final store to enter PH15?

-Scott

^ permalink raw reply

* Re: Node 0 not necessary for powerpc?
From: David Rientjes @ 2014-03-12  2:02 UTC (permalink / raw)
  To: Nishanth Aravamudan; +Cc: linux-mm, cl, linuxppc-dev, anton
In-Reply-To: <20140311195632.GA946@linux.vnet.ibm.com>

On Tue, 11 Mar 2014, Nishanth Aravamudan wrote:

> I have a P7 system that has no node0, but a node0 shows up in numactl
> --hardware, which has no cpus and no memory (and no PCI devices):
> 
> numactl --hardware
> available: 4 nodes (0-3)
> node 0 cpus:
> node 0 size: 0 MB
> node 0 free: 0 MB
> node 1 cpus: 0 1 2 3 4 5 6 7 8 9 10 11
> node 1 size: 0 MB
> node 1 free: 0 MB
> node 2 cpus:
> node 2 size: 7935 MB
> node 2 free: 7716 MB
> node 3 cpus:
> node 3 size: 8395 MB
> node 3 free: 8015 MB
> node distances:
> node   0   1   2   3 
>   0:  10  20  10  20 
>   1:  20  10  20  20 
>   2:  10  20  10  20 
>   3:  20  20  20  10 
> 
> This is because we statically initialize N_ONLINE to be [0] in
> mm/page_alloc.c:
> 
>         [N_ONLINE] = { { [0] = 1UL } },
> 
> I'm not sure what the architectural requirements are here, but at least
> on this test system, removing this initialization, it boots fine and is
> running. I've not yet tried stress tests, but it's survived the
> beginnings of kernbench so far.
> 
> numactl --hardware
> available: 3 nodes (1-3)
> node 1 cpus: 0 1 2 3 4 5 6 7 8 9 10 11
> node 1 size: 0 MB
> node 1 free: 0 MB
> node 2 cpus:
> node 2 size: 7935 MB
> node 2 free: 7479 MB
> node 3 cpus:
> node 3 size: 8396 MB
> node 3 free: 8375 MB
> node distances:
> node   1   2   3 
>   1:  10  20  20 
>   2:  20  10  20 
>   3:  20  20  10
> 
> Perhaps we could put in a ARCH_DOES_NOT_NEED_NODE0 and only define it on
> powerpc for now, conditionalizing the above initialization on that?
> 

I don't know if anything has recently changed in the past year or so, but 
I've booted x86 machines with a hacked BIOS so that all memory on node 0 
is hotpluggable and offline, so I believe this is possible on x86 as well.

^ permalink raw reply

* Re: [PATCH 3/9] powerpc/rcpm: add RCPM driver
From: Chenhui Zhao @ 2014-03-12  3:59 UTC (permalink / raw)
  To: Scott Wood; +Cc: linuxppc-dev, linux-kernel, Jason.Jin
In-Reply-To: <1394581371.13761.62.camel@snotra.buserror.net>

On Tue, Mar 11, 2014 at 06:42:51PM -0500, Scott Wood wrote:
> On Fri, 2014-03-07 at 12:57 +0800, Chenhui Zhao wrote:
> > diff --git a/arch/powerpc/platforms/85xx/corenet_generic.c b/arch/powerpc/platforms/85xx/corenet_generic.c
> > index b756f3d..3fdf9f3 100644
> > --- a/arch/powerpc/platforms/85xx/corenet_generic.c
> > +++ b/arch/powerpc/platforms/85xx/corenet_generic.c
> > @@ -56,6 +56,8 @@ void __init corenet_gen_setup_arch(void)
> >  
> >  	swiotlb_detect_4g();
> >  
> > +	fsl_rcpm_init();
> > +
> >  	pr_info("%s board from Freescale Semiconductor\n", ppc_md.name);
> 
> RCPM is not board-specific.  Why is this in board code?

Init the RCPM driver in the early stage before smp_init(). Because
the time base sync calls a callback function .freeze_time_base()
in the RCPM driver.

Will use early_initcall() instead.

> 
> > +static void rcpm_v1_cpu_enter_state(int cpu, int state)
> > +{
> > +	unsigned int hw_cpu = get_hard_smp_processor_id(cpu);
> > +	unsigned int mask = 1 << hw_cpu;
> > +
> > +	switch (state) {
> > +	case E500_PM_PH10:
> > +		setbits32(&rcpm_v1_regs->cdozcr, mask);
> > +		break;
> > +	case E500_PM_PH15:
> > +		setbits32(&rcpm_v1_regs->cnapcr, mask);
> > +		break;
> > +	default:
> > +		pr_err("Unknown cpu PM state\n");
> > +		break;
> > +	}
> > +}
> 
> Put __func__ in error messages -- and for "unknown value" type messages,
> print the value.

OK.

> 
> 
> > +static int rcpm_v1_plat_enter_state(int state)
> > +{
> > +	u32 *pmcsr_reg = &rcpm_v1_regs->powmgtcsr;
> > +	int ret = 0;
> > +	int result;
> > +
> > +	switch (state) {
> > +	case PLAT_PM_SLEEP:
> > +		setbits32(pmcsr_reg, RCPM_POWMGTCSR_SLP);
> > +
> > +		/* At this point, the device is in sleep mode. */
> > +
> > +		/* Upon resume, wait for RCPM_POWMGTCSR_SLP bit to be clear. */
> > +		result = spin_event_timeout(
> > +		  !(in_be32(pmcsr_reg) & RCPM_POWMGTCSR_SLP), 10000, 10);
> > +		if (!result) {
> > +			pr_err("%s: timeout waiting for SLP bit to be cleared\n",
> > +			  __func__);
> 
> Why are you indenting continuation lines with only two spaces (and yet
> still not aligning with anything)?

Will align with the previous parenthesis.

> 
> > +			ret = -ETIMEDOUT;
> > +		}
> > +		break;
> > +	default:
> > +		pr_err("Unsupported platform PM state\n");
> > +		ret = -EINVAL;
> > +	}
> > +
> > +	return ret;
> > +}
> > +
> > +static void rcpm_v1_freeze_time_base(int freeze)
> > +{
> > +	u32 *tben_reg = &rcpm_v1_regs->ctbenr;
> > +	static u32 mask;
> > +
> > +	if (freeze) {
> > +		mask = in_be32(tben_reg);
> > +		clrbits32(tben_reg, mask);
> > +	} else {
> > +		setbits32(tben_reg, mask);
> > +	}
> > +
> > +	/* read back to push the previous write */
> > +	in_be32(tben_reg);
> > +}
> > +
> > +static void rcpm_v2_freeze_time_base(int freeze)
> > +{
> > +	u32 *tben_reg = &rcpm_v2_regs->pctbenr;
> > +	static u32 mask;
> > +
> > +	if (freeze) {
> > +		mask = in_be32(tben_reg);
> > +		clrbits32(tben_reg, mask);
> > +	} else {
> > +		setbits32(tben_reg, mask);
> > +	}
> > +
> > +	/* read back to push the previous write */
> > +	in_be32(tben_reg);
> > +}
> 
> It looks like the only difference between these two functions is how you
> calculate tben_reg -- factor the rest out into a single function.

Yes. Will factor them out into a single function.

> 
> > +int fsl_rcpm_init(void)
> > +{
> > +	struct device_node *np;
> > +
> > +	np = of_find_compatible_node(NULL, NULL, "fsl,qoriq-rcpm-2.0");
> > +	if (np) {
> > +		rcpm_v2_regs = of_iomap(np, 0);
> > +		of_node_put(np);
> > +		if (!rcpm_v2_regs)
> > +			return -ENOMEM;
> > +
> > +		qoriq_pm_ops = &qoriq_rcpm_v2_ops;
> > +
> > +	} else {
> > +		np = of_find_compatible_node(NULL, NULL, "fsl,qoriq-rcpm-1.0");
> > +		if (np) {
> > +			rcpm_v1_regs = of_iomap(np, 0);
> > +			of_node_put(np);
> > +			if (!rcpm_v1_regs)
> > +				return -ENOMEM;
> > +
> > +			qoriq_pm_ops = &qoriq_rcpm_v1_ops;
> > +
> > +		} else {
> > +			pr_err("%s: can't find the rcpm node.\n", __func__);
> > +			return -EINVAL;
> > +		}
> > +	}
> > +
> > +	return 0;
> > +}
> 
> Why isn't this a proper platform driver?
> 
> -Scott

The RCPM is not a single function IP block, instead it is a collection
of device run control and power management. It would be called by other
drivers and functions. For example, the callback .freeze_time_base()
need to be called at early stage of kernel init. Therefore, it would be
better to init it at early stage.

-Chenhui

^ permalink raw reply

* [PATCH] powerpc/le: Avoid creatng R_PPC64_TOCSAVE relocations for modules.
From: Tony Breeds @ 2014-03-12  4:12 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, LinuxPPC-dev; +Cc: Rusty Russell, Alan Modra

[-- Attachment #1: Type: text/plain, Size: 1066 bytes --]

When building modules with a native le toolchain the linker will
generate R_PPC64_TOCSAVE relocations when it's safe to omit saving r2 on
a plt call.  This isn't helpful in the conext of a kernel module and the
kernel will fail to load those modules with an error like:
	nf_conntrack: Unknown ADD relocation: 109

This patch tells the linker to avoid createing R_PPC64_TOCSAVE
relocations allowing modules to load.

Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
---
 arch/powerpc/Makefile | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index 0f4344e..fff3945 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -74,6 +74,9 @@ override CROSS32AS += -mlittle-endian
 LDEMULATION	:= lppc
 GNUTARGET	:= powerpcle
 MULTIPLEWORD	:= -mno-multiple
+ifeq ($(call cc-option-yn,-mno-save-toc-indirect),y)
+       KBUILD_CFLAGS_MODULE += -mno-save-toc-indirect
+endif
 else
 ifeq ($(call cc-option-yn,-mbig-endian),y)
 override CC	+= -mbig-endian
-- 
1.8.5.3

Yours Tony

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply related

* Re: [PATCH 4/9] powerpc/85xx: support CPU hotplug for e500mc and e5500
From: Chenhui Zhao @ 2014-03-12  4:34 UTC (permalink / raw)
  To: Scott Wood; +Cc: linuxppc-dev, linux-kernel, Jason.Jin
In-Reply-To: <1394581693.13761.66.camel@snotra.buserror.net>

On Tue, Mar 11, 2014 at 06:48:13PM -0500, Scott Wood wrote:
> On Fri, 2014-03-07 at 12:58 +0800, Chenhui Zhao wrote:
> > diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
> > index ac2621a..f3f4401 100644
> > --- a/arch/powerpc/kernel/smp.c
> > +++ b/arch/powerpc/kernel/smp.c
> > @@ -405,8 +405,12 @@ void generic_cpu_die(unsigned int cpu)
> >  
> >  	for (i = 0; i < 100; i++) {
> >  		smp_rmb();
> > -		if (per_cpu(cpu_state, cpu) == CPU_DEAD)
> > +		if (per_cpu(cpu_state, cpu) == CPU_DEAD) {
> > +#ifdef CONFIG_PPC64
> > +			paca[cpu].cpu_start = 0;
> > +#endif
> 
> Why wasn't this needed by previous ppc64 machines?

If not clear, cpu can't start in the case of cpu hotpolug.
The function pseries_cpu_die() in arch/powerpc/platforms/pseries/hotplug-cpu.c
also clears the flag.

> 
> > diff --git a/arch/powerpc/platforms/85xx/smp.c b/arch/powerpc/platforms/85xx/smp.c
> > index 2e5911e..0047883 100644
> > --- a/arch/powerpc/platforms/85xx/smp.c
> > +++ b/arch/powerpc/platforms/85xx/smp.c
> > @@ -19,6 +19,7 @@
> >  #include <linux/kexec.h>
> >  #include <linux/highmem.h>
> >  #include <linux/cpu.h>
> > +#include <linux/smp.h>
> >  
> >  #include <asm/machdep.h>
> >  #include <asm/pgtable.h>
> > @@ -46,6 +47,17 @@ static u64 timebase;
> >  static int tb_req;
> >  static int tb_valid;
> >  
> > +#ifdef CONFIG_PPC_E500MC
> > +/* specify the cpu PM state when cpu dies, PH15/NAP is the default */
> > +int qoriq_cpu_die_state = E500_PM_PH15;
> > +#endif
> 
> static?  Is there any way to modify this other than modifying source
> code?
> 
> BTW, QorIQ doesn't imply an e500mc derivative.

Will support e500, but for now these code support e500mc derivative in
advance.

Supposed qoriq_cpu_die_state can be changed by platform init code
if the default value is not proper for a specific platform.

> 
> > @@ -125,6 +138,34 @@ static void mpc85xx_take_timebase(void)
> >  }
> >  
> >  #ifdef CONFIG_HOTPLUG_CPU
> > +#ifdef CONFIG_PPC_E500MC
> > +static void qoriq_cpu_die(void)
> > +{
> > +	unsigned int cpu = smp_processor_id();
> > +
> > +	local_irq_disable();
> > +#ifdef CONFIG_PPC64
> > +	__hard_irq_disable();
> > +#endif
> 
> Why this instead of one call to hard_irq_disable() (no leading
> underscores)?
> 
> -Scott

hard_irq_disable() will clear soft_enabled again. local_irq_disable()
has cleared it.

Will use hard_irq_disable() to replace these lines.

  local_irq_disable();
  #ifdef CONFIG_PPC64
  	__hard_irq_disable();
  #endif

-Chenhui

^ permalink raw reply

* Re: [PATCH 9/9] powerpc/pm: support deep sleep feature on T1040
From: Kevin Hao @ 2014-03-12  5:57 UTC (permalink / raw)
  To: Scott Wood; +Cc: linuxppc-dev, Chenhui Zhao, Jason.Jin, linux-kernel
In-Reply-To: <1394586624.13761.132.camel@snotra.buserror.net>

[-- Attachment #1: Type: text/plain, Size: 1927 bytes --]

On Tue, Mar 11, 2014 at 08:10:24PM -0500, Scott Wood wrote:
> > +	FSL_DIS_ALL_IRQ
> > +
> > +	/*
> > +	 * Place DDR controller in self refresh mode.
> > +	 * From here on, DDR can't be access any more.
> > +	 */
> > +	lwz	r10, 0(r13)
> > +	oris	r10, r10, CCSR_DDR_SDRAM_CFG_2_FRC_SR@h
> > +	stw	r10, 0(r13)
> > +
> > +	/* can't call udelay() here, so use a macro to delay */
> > +	FSLDELAY(50)
> 
> A timebase loop doesn't require accessing DDR.
> 
> You also probably want to do a "sync, readback, data dependency, isync"
> sequence to make sure that the store has hit CCSR before you begin your
> delay (or is a delay required at all if you do that?).

Shouldn't we use "readback, sync" here? The following is quoted form t4240RM:
  To guarantee that the results of any sequence of writes to configuration
  registers are in effect, the final configuration register write should be
  immediately followed by a read of the same register, and that should be
  followed by a SYNC instruction. Then accesses can safely be made to memory
  regions affected by the configuration register write.

> > +
> > +	/* Enable SCU15 to trigger on RCPM Concentrator 0 */
> > +	lwz	r10, 0(r15)
> > +	oris	r10, r10, DCSR_EPU_EPECR15_IC0@h
> > +	stw	r10, 0(r15)
> > +
> > +	/* put Core0 in PH15 mode, trigger EPU FSM */
> > +	lwz	r10, 0(r12)
> > +	ori	r10, r10, CCSR_RCPM_PCPH15SETR_CORE0
> > +	stw	r10, 0(r12)
> 
> Shouldn't there be a sync to ensure that the previous I/O happens before
> the final store to enter PH15?

Do we really need a sync here? According to the PowerISA, the above stores
should be performed in program order.
  If two Store instructions or two Load instructions
  specify storage locations that are both Caching
  Inhibited and Guarded, the corresponding storage
  accesses are performed in program order with
  respect to any processor or mechanism.

Thanks,
Kevin

[-- Attachment #2: Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply

* Re: [PATCH 5/9] powerpc/85xx: disable irq by hardware when suspend for 64-bit
From: Chenhui Zhao @ 2014-03-12  7:46 UTC (permalink / raw)
  To: Scott Wood; +Cc: linuxppc-dev, linux-kernel, Jason.Jin
In-Reply-To: <1394581880.13761.69.camel@snotra.buserror.net>

On Tue, Mar 11, 2014 at 06:51:20PM -0500, Scott Wood wrote:
> On Fri, 2014-03-07 at 12:58 +0800, Chenhui Zhao wrote:
> > In 64-bit mode, kernel just clears the irq soft-enable flag
> > in struct paca_struct to disable external irqs. But, in
> > the case of suspend, irqs should be disabled by hardware.
> > Therefore, hook a function to ppc_md.suspend_disable_irqs
> > to really disable irqs.
> > 
> > Signed-off-by: Chenhui Zhao <chenhui.zhao@freescale.com>
> > ---
> >  arch/powerpc/platforms/85xx/corenet_generic.c |   12 ++++++++++++
> >  1 files changed, 12 insertions(+), 0 deletions(-)
> > 
> > diff --git a/arch/powerpc/platforms/85xx/corenet_generic.c b/arch/powerpc/platforms/85xx/corenet_generic.c
> > index 3fdf9f3..983d81f 100644
> > --- a/arch/powerpc/platforms/85xx/corenet_generic.c
> > +++ b/arch/powerpc/platforms/85xx/corenet_generic.c
> > @@ -32,6 +32,13 @@
> >  #include <sysdev/fsl_pci.h>
> >  #include "smp.h"
> >  
> > +#if defined(CONFIG_PPC64) && defined(CONFIG_SUSPEND)
> > +static void fsl_suspend_disable_irqs(void)
> > +{
> > +	__hard_irq_disable();
> > +}
> > +#endif
> 
> Why the underscore version?  Don't you want PACA_IRQ_HARD_DIS to be set?
> 
> If hard disabling is appropriate here, shouldn't we do it in
> generic_suspend_disable_irqs()?
> 
> Are there any existing platforms that supply a
> ppc_md.suspend_disable_irqs()?  I don't see any when grepping.
> 
> -Scott

Will use hard_irq_disable().

I think this is a general problem for powerpc.
Should clear MSR_EE before suspend. I agree to put it
in generic_suspend_disable_irqs().

-Chenhui

^ permalink raw reply

* Re: [PATCH 6/9] powerpc/85xx: support sleep feature on QorIQ SoCs with RCPM
From: Chenhui Zhao @ 2014-03-12  8:08 UTC (permalink / raw)
  To: Scott Wood; +Cc: linuxppc-dev, linux-kernel, Jason.Jin
In-Reply-To: <1394582427.13761.74.camel@snotra.buserror.net>

On Tue, Mar 11, 2014 at 07:00:27PM -0500, Scott Wood wrote:
> On Fri, 2014-03-07 at 12:58 +0800, Chenhui Zhao wrote:
> > In sleep mode, the clocks of e500 cores and unused IP blocks is
> > turned off. The IP blocks which are allowed to wake up the processor
> > are still running.
> > 
> > The sleep mode is equal to the Standby state in Linux. Use the
> > command to enter sleep mode:
> >   echo standby > /sys/power/state
> > 
> > Signed-off-by: Chenhui Zhao <chenhui.zhao@freescale.com>
> > ---
> >  arch/powerpc/Kconfig                   |    4 +-
> >  arch/powerpc/platforms/85xx/Makefile   |    3 +
> >  arch/powerpc/platforms/85xx/qoriq_pm.c |   78 ++++++++++++++++++++++++++++++++
> >  3 files changed, 83 insertions(+), 2 deletions(-)
> >  create mode 100644 arch/powerpc/platforms/85xx/qoriq_pm.c
> > 
> > diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> > index 05f6323..e1d6510 100644
> > --- a/arch/powerpc/Kconfig
> > +++ b/arch/powerpc/Kconfig
> > @@ -222,7 +222,7 @@ config ARCH_HIBERNATION_POSSIBLE
> >  config ARCH_SUSPEND_POSSIBLE
> >  	def_bool y
> >  	depends on ADB_PMU || PPC_EFIKA || PPC_LITE5200 || PPC_83xx || \
> > -		   (PPC_85xx && !PPC_E500MC) || PPC_86xx || PPC_PSERIES \
> > +		   FSL_SOC_BOOKE || PPC_86xx || PPC_PSERIES \
> >  		   || 44x || 40x
> >  
> >  config PPC_DCR_NATIVE
> > @@ -709,7 +709,7 @@ config FSL_PCI
> >  config FSL_PMC
> >  	bool
> >  	default y
> > -	depends on SUSPEND && (PPC_85xx || PPC_86xx)
> > +	depends on SUSPEND && (PPC_85xx && !PPC_E500MC || PPC_86xx)
> 
> Don't mix && and || without parentheses.
> 
> Maybe convert this into being selected (similar to FSL_RCPM), rather
> than default y?

Yes, will do.

> 
> > diff --git a/arch/powerpc/platforms/85xx/Makefile b/arch/powerpc/platforms/85xx/Makefile
> > index 25cebe7..7fae817 100644
> > --- a/arch/powerpc/platforms/85xx/Makefile
> > +++ b/arch/powerpc/platforms/85xx/Makefile
> > @@ -2,6 +2,9 @@
> >  # Makefile for the PowerPC 85xx linux kernel.
> >  #
> >  obj-$(CONFIG_SMP) += smp.o
> > +ifeq ($(CONFIG_FSL_CORENET_RCPM), y)
> > +obj-$(CONFIG_SUSPEND)	+= qoriq_pm.o
> > +endif
> 
> There should probably be a kconfig symbol for this.

OK.

> 
> > diff --git a/arch/powerpc/platforms/85xx/qoriq_pm.c b/arch/powerpc/platforms/85xx/qoriq_pm.c
> > new file mode 100644
> > index 0000000..915b13b
> > --- /dev/null
> > +++ b/arch/powerpc/platforms/85xx/qoriq_pm.c
> > @@ -0,0 +1,78 @@
> > +/*
> > + * Support Power Management feature
> > + *
> > + * Copyright 2014 Freescale Semiconductor Inc.
> > + *
> > + * Author: Chenhui Zhao <chenhui.zhao@freescale.com>
> > + *
> > + * This program is free software; you can redistribute	it and/or modify it
> > + * under  the terms of	the GNU General	 Public License as published by the
> > + * Free Software Foundation;  either version 2 of the  License, or (at your
> > + * option) any later version.
> > + */
> > +
> > +#include <linux/kernel.h>
> > +#include <linux/suspend.h>
> > +#include <linux/of_platform.h>
> > +
> > +#include <sysdev/fsl_soc.h>
> > +
> > +#define FSL_SLEEP		0x1
> > +#define FSL_DEEP_SLEEP		0x2
> 
> FSL_DEEP_SLEEP is unused.

Will be used in the last patch.
[PATCH 9/9] powerpc/pm: support deep sleep feature on T1040

> 
> > +
> > +/* specify the sleep state of the present platform */
> > +int sleep_pm_state;
> > +/* supported sleep modes by the present platform */
> > +static unsigned int sleep_modes;
> 
> Why is one signed and the other unsigned?

Undesigned. Will change them all to unsigned.

> 
> > +
> > +static int qoriq_suspend_enter(suspend_state_t state)
> > +{
> > +	int ret = 0;
> > +
> > +	switch (state) {
> > +	case PM_SUSPEND_STANDBY:
> > +
> > +		if (cur_cpu_spec->cpu_flush_caches)
> > +			cur_cpu_spec->cpu_flush_caches();
> > +
> > +		ret = qoriq_pm_ops->plat_enter_state(sleep_pm_state);
> > +
> > +		break;
> > +
> > +	default:
> > +		ret = -EINVAL;
> > +
> > +	}
> > +
> > +	return ret;
> > +}
> > +
> > +static int qoriq_suspend_valid(suspend_state_t state)
> > +{
> > +	if (state == PM_SUSPEND_STANDBY && (sleep_modes & FSL_SLEEP))
> > +		return 1;
> > +
> > +	return 0;
> > +}
> > +
> > +static const struct platform_suspend_ops qoriq_suspend_ops = {
> > +	.valid = qoriq_suspend_valid,
> > +	.enter = qoriq_suspend_enter,
> > +};
> > +
> > +static int __init qoriq_suspend_init(void)
> > +{
> > +	struct device_node *np;
> > +
> > +	sleep_modes = FSL_SLEEP;
> > +	sleep_pm_state = PLAT_PM_SLEEP;
> > +
> > +	np = of_find_compatible_node(NULL, NULL, "fsl,qoriq-rcpm-2.0");
> > +	if (np)
> > +		sleep_pm_state = PLAT_PM_LPM20;
> > +
> > +	suspend_set_ops(&qoriq_suspend_ops);
> > +
> > +	return 0;
> > +}
> > +arch_initcall(qoriq_suspend_init);
> 
> Why is this not a platform driver?  If fsl_pmc can do it...
> 
> -Scott

It can be, but what advantage of being a platform driver.

-Chenhui

^ permalink raw reply

* [PATCH] powerpc/le: Show the endianess of the LPAR under PowerVM.
From: Tony Breeds @ 2014-03-12  8:17 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, LinuxPPC-dev

[-- Attachment #1: Type: text/plain, Size: 751 bytes --]

Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
---
 arch/powerpc/platforms/pseries/setup.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index 972df0f..5a32caa 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -516,7 +516,11 @@ static void __init pSeries_setup_arch(void)
 static int __init pSeries_init_panel(void)
 {
 	/* Manually leave the kernel version on the panel. */
+#ifdef __BIG_ENDIAN__
 	ppc_md.progress("Linux ppc64\n", 0);
+#else
+	ppc_md.progress("Linux ppc64le\n", 0);
+#endif
 	ppc_md.progress(init_utsname()->version, 0);
 
 	return 0;
-- 
1.8.5.3


Yours Tony

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply related

* [PATCH] T1040RDB: add qe node for T1040RDB dts
From: Zhao Qiang @ 2014-03-12  8:26 UTC (permalink / raw)
  To: linuxppc-dev, B07421; +Cc: Zhao Qiang, R63061

Signed-off-by: Zhao Qiang <B45475@freescale.com>
---
 arch/powerpc/boot/dts/t1040rdb.dts | 43 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 43 insertions(+)

diff --git a/arch/powerpc/boot/dts/t1040rdb.dts b/arch/powerpc/boot/dts/t1040rdb.dts
index e2eee18..6ff0412 100644
--- a/arch/powerpc/boot/dts/t1040rdb.dts
+++ b/arch/powerpc/boot/dts/t1040rdb.dts
@@ -268,6 +268,49 @@
 			fsl,fman-mac = <&enet4>;
 		};
 	};
+
+	qe: qe@ffe139999 {
+		ranges = <0x0 0xf 0xfe140000 0x40000>;
+		reg = <0xf 0xfe140000 0 0x480>;
+		brg-frequency = <0>;
+		bus-frequency = <0>;
+
+		si1: si@700 {
+			#address-cells = <1>;
+			#size-cells = <0>;
+			compatible = "fsl,qe-si";
+			reg = <0x700 0x80>;
+		};
+
+		siram1: siram@1000 {
+			#address-cells = <1>;
+			#size-cells = <1>;
+			compatible = "fsl,qe-siram";
+			reg = <0x1000 0x800>;
+		};
+
+		tdma: ucc@2000 {
+			compatible = "fsl,ucc-tdm";
+			rx-clock-name = "clk3";
+			tx-clock-name = "clk4";
+			fsl,rx-sync-clock = "rsync_pin";
+			fsl,tx-sync-clock = "tsync_pin";
+			fsl,tx-timeslot = <0xfffffffe>;
+			fsl,rx-timeslot = <0xfffffffe>;
+			fsl,tdm-framer-type = "e1";
+			fsl,tdm-mode = "normal";
+			fsl,tdm-id = <0>;
+			fsl,siram-entry-id = <0>;
+		};
+
+		serial: ucc@2200 {
+			device_type = "serial";
+			compatible = "ucc_uart";
+			port-number = <1>;
+			rx-clock-name = "brg2";
+			tx-clock-name = "brg2";
+		};
+	};
 };
 /include/ "fsl/t1040si-post.dtsi"
 /include/ "fsl/qoriq-dpaa-res3.dtsi"
-- 
1.8.5

^ permalink raw reply related

* Re: [PATCH 7/9] fsl: add EPU FSM configuration for deep sleep
From: Chenhui Zhao @ 2014-03-12  8:34 UTC (permalink / raw)
  To: Scott Wood; +Cc: linuxppc-dev, linux-kernel, Jason.Jin
In-Reply-To: <1394582923.13761.81.camel@snotra.buserror.net>

On Tue, Mar 11, 2014 at 07:08:43PM -0500, Scott Wood wrote:
> On Fri, 2014-03-07 at 12:58 +0800, Chenhui Zhao wrote:
> > From: Hongbo Zhang <hongbo.zhang@freescale.com>
> > 
> > In the last stage of deep sleep, software will trigger a Finite
> > State Machine (FSM) to control the hardware precedure, such as
> > board isolation, killing PLLs, removing power, and so on.
> > 
> > When the system is waked up by an interrupt, the FSM controls the
> > hardware to complete the early resume precedure.
> > 
> > This patch configure the EPU FSM preparing for deep sleep.
> > 
> > Signed-off-by: Hongbo Zhang <hongbo.zhang@freescale.com>
> > Signed-off-by: Chenhui Zhao <chenhui.zhao@freescale.com>
> 
> Couldn't this be part of qoriq_pm.c?

Put the code in drivers/platform/fsl/ so that LS1 can share these code.

> 
> > diff --git a/arch/powerpc/sysdev/fsl_soc.h b/arch/powerpc/sysdev/fsl_soc.h
> > index 9b9a34a..eb83a30 100644
> > --- a/arch/powerpc/sysdev/fsl_soc.h
> > +++ b/arch/powerpc/sysdev/fsl_soc.h
> > @@ -69,5 +69,8 @@ extern const struct fsl_pm_ops *qoriq_pm_ops;
> >  
> >  extern int fsl_rcpm_init(void);
> >  
> > +extern void fsl_dp_fsm_setup(void *dcsr_base);
> > +extern void fsl_dp_fsm_clean(void *dcsr_base);
> 
> __iomem

Thanks. Will add.

> 
> > +
> >  #endif
> >  #endif
> > diff --git a/drivers/platform/Kconfig b/drivers/platform/Kconfig
> > index 09fde58..6539e6d 100644
> > --- a/drivers/platform/Kconfig
> > +++ b/drivers/platform/Kconfig
> > @@ -6,3 +6,7 @@ source "drivers/platform/goldfish/Kconfig"
> >  endif
> >  
> >  source "drivers/platform/chrome/Kconfig"
> > +
> > +if FSL_SOC
> > +source "drivers/platform/fsl/Kconfig"
> > +endif
> 
> Chrome doesn't need an ifdef -- why does this?

Don't wish other platform see these options, and the X86 and GOLDFISH have
ifdefs.

> 
> > diff --git a/drivers/platform/Makefile b/drivers/platform/Makefile
> > index 3656b7b..37c6f72 100644
> > --- a/drivers/platform/Makefile
> > +++ b/drivers/platform/Makefile
> > @@ -6,3 +6,4 @@ obj-$(CONFIG_X86)		+= x86/
> >  obj-$(CONFIG_OLPC)		+= olpc/
> >  obj-$(CONFIG_GOLDFISH)		+= goldfish/
> >  obj-$(CONFIG_CHROME_PLATFORMS)	+= chrome/
> > +obj-$(CONFIG_FSL_SOC)		+= fsl/
> > diff --git a/drivers/platform/fsl/Kconfig b/drivers/platform/fsl/Kconfig
> > new file mode 100644
> > index 0000000..72ed053
> > --- /dev/null
> > +++ b/drivers/platform/fsl/Kconfig
> > @@ -0,0 +1,10 @@
> > +#
> > +# Freescale Specific Power Management Drivers
> > +#
> > +
> > +config FSL_SLEEP_FSM
> > +	bool
> > +	help
> > +	  This driver configures a hardware FSM (Finite State Machine) for deep sleep.
> > +	  The FSM is used to finish clean-ups at the last stage of system entering deep
> > +	  sleep, and also wakes up system when a wake up event happens.
> > diff --git a/drivers/platform/fsl/Makefile b/drivers/platform/fsl/Makefile
> > new file mode 100644
> > index 0000000..d99ca0e
> > --- /dev/null
> > +++ b/drivers/platform/fsl/Makefile
> > @@ -0,0 +1,5 @@
> > +#
> > +# Makefile for linux/drivers/platform/fsl
> > +# Freescale Specific Power Management Drivers
> > +#
> > +obj-$(CONFIG_FSL_SLEEP_FSM)	+= sleep_fsm.o
> 
> Why is this here while the other stuff is in arch/powerpc/sysdev?
> 
> > +/* Block offsets */
> > +#define	RCPM_BLOCK_OFFSET	0x00022000
> > +#define	EPU_BLOCK_OFFSET	0x00000000
> > +#define	NPC_BLOCK_OFFSET	0x00001000
> 
> Why don't these block offsets come from the device tree?

Have maped DCSR registers. Don't wish to remap them.

> 
> > +static void *g_dcsr_base;
> 
> __iomem

OK.

> 
> > +	/* Configure the EPU Counters */
> > +	epu_write(EPCCR15, 0x92840000);
> > +	epu_write(EPCCR14, 0x92840000);
> > +	epu_write(EPCCR12, 0x92840000);
> > +	epu_write(EPCCR11, 0x92840000);
> > +	epu_write(EPCCR10, 0x92840000);
> > +	epu_write(EPCCR9, 0x92840000);
> > +	epu_write(EPCCR8, 0x92840000);
> > +	epu_write(EPCCR5, 0x92840000);
> > +	epu_write(EPCCR4, 0x92840000);
> > +	epu_write(EPCCR2, 0x92840000);
> > +
> > +	/* Configure the SCUs Inputs */
> > +	epu_write(EPSMCR15, 0x76000000);
> > +	epu_write(EPSMCR14, 0x00000031);
> > +	epu_write(EPSMCR13, 0x00003100);
> > +	epu_write(EPSMCR12, 0x7F000000);
> > +	epu_write(EPSMCR11, 0x31740000);
> > +	epu_write(EPSMCR10, 0x65000030);
> > +	epu_write(EPSMCR9, 0x00003000);
> > +	epu_write(EPSMCR8, 0x64300000);
> > +	epu_write(EPSMCR7, 0x30000000);
> > +	epu_write(EPSMCR6, 0x7C000000);
> > +	epu_write(EPSMCR5, 0x00002E00);
> > +	epu_write(EPSMCR4, 0x002F0000);
> > +	epu_write(EPSMCR3, 0x2F000000);
> > +	epu_write(EPSMCR2, 0x6C700000);
> 
> Where do these magic numbers come from?  Which chips are they valid for?

They are for T1040. Can be found in the RCPM chapter of T1040RM.

> 
> > +void fsl_dp_fsm_clean(void *dcsr_base)
> > +{
> > +
> > +	epu_write(EPEVTCR2, 0);
> > +	epu_write(EPEVTCR9, 0);
> > +
> > +	epu_write(EPGCR, 0);
> > +	epu_write(EPECR15, 0);
> > +
> > +	rcpm_write(CSTTACR0, 0);
> > +	rcpm_write(CG1CR0, 0);
> > +
> > +}
> 
> Don't put blank lines at the beginning/end of a block.
> 
> -Scott

Thanks.

-Chenhui

^ permalink raw reply

* Re: [PATCH 8/9] powerpc/85xx: add save/restore functions for core registers
From: Chenhui Zhao @ 2014-03-12  9:42 UTC (permalink / raw)
  To: Scott Wood; +Cc: linuxppc-dev, linux-kernel, Jason.Jin, Wang Dongsheng-B40534
In-Reply-To: <1394585114.13761.112.camel@snotra.buserror.net>

On Tue, Mar 11, 2014 at 07:45:14PM -0500, Scott Wood wrote:
> On Fri, 2014-03-07 at 12:58 +0800, Chenhui Zhao wrote:
> > From: Wang Dongsheng <dongsheng.wang@freescale.com>
> > 
> > Add booke_cpu_state_save() and booke_cpu_state_restore() functions which can be
> > used to save/restore CPU's registers in the case of deep sleep and hibernation.
> > 
> > Supported processors: E6500, E5500, E500MC, E500v2 and E500v1.
> > 
> > Signed-off-by: Wang Dongsheng <dongsheng.wang@freescale.com>
> > Signed-off-by: Chenhui Zhao <chenhui.zhao@freescale.com>
> > ---
> >  arch/powerpc/include/asm/booke_save_regs.h |   96 ++++++++
> >  arch/powerpc/kernel/Makefile               |    1 +
> >  arch/powerpc/kernel/booke_save_regs.S      |  361 ++++++++++++++++++++++++++++
> >  3 files changed, 458 insertions(+), 0 deletions(-)
> >  create mode 100644 arch/powerpc/include/asm/booke_save_regs.h
> >  create mode 100644 arch/powerpc/kernel/booke_save_regs.S
> > 
> > diff --git a/arch/powerpc/include/asm/booke_save_regs.h b/arch/powerpc/include/asm/booke_save_regs.h
> > new file mode 100644
> > index 0000000..87c357a
> > --- /dev/null
> > +++ b/arch/powerpc/include/asm/booke_save_regs.h
> > @@ -0,0 +1,96 @@
> > +/*
> > + *  Save/restore e500 series core registers
> 
> Filename says booke, comment says e500.
> 
> Filename and comment also fail to point out that this is specifically
> for standby/suspend, not for hibernate which is implemented in
> swsusp_booke.S/swsusp_asm64.S.

Sorry for inconsistency. Will changes e500 to booke.
Hibernation and suspend can share the code.

> 
> > + *
> > + * Author: Wang Dongsheng <dongsheng.wang@freescale.com>
> > + *
> > + * Copyright 2014 Freescale Semiconductor Inc.
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License version 2 as
> > + * published by the Free Software Foundation.
> > + */
> > +
> > +#ifndef __ASM_FSL_SLEEP_H
> > +#define __ASM_FSL_SLEEP_H
> > +
> > +/*
> > + * 8 bytes for each register, which is compatible with
> > + * both 32-bit and 64-bit registers
> > + *
> > + * Acronyms:
> > + *	dw(data width)	0x08
> > + *
> > + * Map:
> > + * General-Purpose Registers
> > + *	GPR1(sp)		0
> > + *	GPR2			0x8		(dw * 1)
> > + *	GPR13 - GPR31		0x10 ~ 0xa0	(dw * 2 ~ dw * 20)
> 
> Putting these values in a comment separate from the code that defines it
> sounds like a good way to get a mismatch between the two.

Ok.

> 
> > + * Foating-point registers
> > + *	FPR14 - FPR31		0xa8 ~ 0x130	(dw * 21 ~ dw * 38)
> 
> Call enable_kernel_fp() or similar, rather than saving FP regs here.
> Likewise with altivec.  And why starting at FPR14?  Volatile versus
> nonvolatile is irrelevant because Linux doesn't participate in the FP
> ABI.  Everything is non-volatile *if* we have a user FP context
> resident, and everything is volatile otherwise.

Will remove them.

> 
> > + * Timer Registers
> > + *	TCR			0x168		(dw * 45)
> > + *	TB(64bit)		0x170		(dw * 46)
> > + *	TBU(32bit)		0x178		(dw * 47)
> > + *	TBL(32bit)		0x180		(dw * 48)
> 
> Why are you saving TBU/L separate from TB?  They're the same thing.

Will remove TBU and TBL.

> 
> > + * Interrupt Registers
> > + *	IVPR			0x188		(dw * 49)
> > + *	IVOR0 - IVOR15		0x190 ~ 0x208	(dw * 50 ~ dw * 65)
> > + *	IVOR32 - IVOR41		0x210 ~ 0x258	(dw * 66 ~ dw * 75)
> 
> What about IVOR42 (LRAT error)?

Will add it.

> 
> > + * Software-Use Registers
> > + *	SPRG1			0x260		(dw * 76), 64-bit need to save.
> > + *	SPRG3			0x268		(dw * 77), 32-bit need to save.
> 
> What about "CPU and NUMA node for VDSO getcpu" on 64-bit?  Currently
> SPRG3, but it will need to change for critical interrupt support.
> 
> > + * MMU Registers
> > + *	PID0 - PID2		0x270 ~ 0x280	(dw * 78 ~ dw * 80)
> 
> PID1/PID2 are e500v1/v2 only -- and Linux doesn't use them outside of
> KVM (and you're not in KVM when you're running this code).
> 
> Are we ever going to have a non-zero PID at this point?

I incline to the view that saving all registers regardless of used or
unused. The good point is that it can be compliant to the future
changes of the usage of registers.

What do you think?

> 
> > + * Debug Registers
> > + *	DBCR0 - DBCR2		0x288 ~ 0x298	(dw * 81 ~ dw * 83)
> > + *	IAC1 - IAC4		0x2a0 ~ 0x2b8	(dw * 84 ~ dw * 87)
> > + *	DAC1 - DAC2		0x2c0 ~ 0x2c8	(dw * 88 ~ dw * 89)
> > + *
> > + */
> 
> IAC3-4 are not implemented on e500.
> 
> Do we really need to save the debug registers?  We're not going to be in
> a debugged process when we do suspend.  If the concern is kgdb, it
> probably needs to be told to get out of the way during suspend for other
> reasons.

I think in the ideal case the suspend would not break any context. We
should try to save/restore all cpu state. Of cause, trade-off is
unavoidable in practice.

> 
> > +#define SR_GPR1			0x000
> > +#define SR_GPR2			0x008
> > +#define SR_GPR13		0x010
> > +#define SR_FPR14		0x0a8
> > +#define SR_CR			0x138
> > +#define SR_LR			0x140
> > +#define SR_MSR			0x148
> 
> These are very vague names to be putting in a header file.

How about BOOKE_xx_OFFSET?

> 
> > +/*
> > + * hibernation and deepsleep save/restore different number of registers,
> > + * use these flags to indicate.
> > + */
> > +#define HIBERNATION_FLAG	1
> > +#define DEEPSLEEP_FLAG		2
> 
> Again, namespacing -- but why is hibernation using this at all?  What's
> wrong with the existing hibernation support?

How about BOOKE_HIBERNATION_FLAG?

Just wish to share code between hibernation and suspend.

> 
> > diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
> > index fcc9a89..64acae6 100644
> > --- a/arch/powerpc/kernel/Makefile
> > +++ b/arch/powerpc/kernel/Makefile
> > @@ -73,6 +73,7 @@ obj-$(CONFIG_HIBERNATION)	+= swsusp_booke.o
> >  else
> >  obj-$(CONFIG_HIBERNATION)	+= swsusp_$(CONFIG_WORD_SIZE).o
> >  endif
> > +obj-$(CONFIG_E500)		+= booke_save_regs.o
> 
> Shouldn't this depend on whether suspend is enabled?
> 
> > diff --git a/arch/powerpc/kernel/booke_save_regs.S b/arch/powerpc/kernel/booke_save_regs.S
> > new file mode 100644
> > index 0000000..9ccd576
> > --- /dev/null
> > +++ b/arch/powerpc/kernel/booke_save_regs.S
> > @@ -0,0 +1,361 @@
> > +/*
> > + * Freescale Power Management, Save/Restore core state
> > + *
> > + * Copyright 2014 Freescale Semiconductor, Inc.
> > + * Author: Wang Dongsheng <dongsheng.wang@freescale.com>
> > + *
> > + * This program is free software; you can redistribute it and/or
> > + * modify it under the terms of the GNU General Public License
> > + * as published by the Free Software Foundation; either version
> > + * 2 of the License, or (at your option) any later version.
> > + */
> > +
> > +#include <asm/ppc_asm.h>
> > +#include <asm/booke_save_regs.h>
> > +
> > +/*
> > + * Supported Core List:
> > + * E500v1, E500v2, E500MC, E5500, E6500.
> > + */
> > +
> > +/* Save/Restore register operation define */
> > +#define do_save_gpr_reg(reg, addr) \
> > +	PPC_STL reg, addr(r10)
> > +
> > +#define do_save_fpr_reg(reg, addr) \
> > +	stfd	reg, addr(r10)
> > +
> > +#define do_save_spr_reg(reg, addr) \
> > +	mfspr	r0, SPRN_##reg ;\
> > +	PPC_STL r0, addr(r10)
> > +
> > +#define do_save_special_reg(special, addr) \
> > +	mf##special	r0 ;\
> > +	PPC_STL r0, addr(r10)
> > +
> > +#define do_restore_gpr_reg(reg, addr) \
> > +	PPC_LL reg, addr(r10)
> > +
> > +#define do_restore_fpr_reg(reg, addr) \
> > +	lfd	reg, addr(r10)
> > +
> > +#define do_restore_spr_reg(reg, addr) \
> > +	PPC_LL r0, addr(r10) ;\
> > +	mtspr	SPRN_##reg, r0
> > +
> > +#define do_restore_special_reg(special, addr) \
> > +	PPC_LL r0, addr(r10) ;\
> > +	mt##special	r0
> > +
> > +#define do_sr_general_gpr_regs(func) \
> > +	do_##func##_gpr_reg(r1, SR_GPR1) ;\
> > +	do_##func##_gpr_reg(r2, SR_GPR2) ;\
> > +	do_##func##_gpr_reg(r13, SR_GPR13 + 0x00) ;\
> > +	do_##func##_gpr_reg(r14, SR_GPR13 + 0x08) ;\
> > +	do_##func##_gpr_reg(r15, SR_GPR13 + 0x10) ;\
> > +	do_##func##_gpr_reg(r16, SR_GPR13 + 0x18) ;\
> > +	do_##func##_gpr_reg(r17, SR_GPR13 + 0x20) ;\
> > +	do_##func##_gpr_reg(r18, SR_GPR13 + 0x28) ;\
> > +	do_##func##_gpr_reg(r19, SR_GPR13 + 0x30) ;\
> > +	do_##func##_gpr_reg(r20, SR_GPR13 + 0x38) ;\
> > +	do_##func##_gpr_reg(r21, SR_GPR13 + 0x40) ;\
> > +	do_##func##_gpr_reg(r22, SR_GPR13 + 0x48) ;\
> > +	do_##func##_gpr_reg(r23, SR_GPR13 + 0x50) ;\
> > +	do_##func##_gpr_reg(r24, SR_GPR13 + 0x58) ;\
> > +	do_##func##_gpr_reg(r25, SR_GPR13 + 0x60) ;\
> > +	do_##func##_gpr_reg(r26, SR_GPR13 + 0x68) ;\
> > +	do_##func##_gpr_reg(r27, SR_GPR13 + 0x70) ;\
> > +	do_##func##_gpr_reg(r28, SR_GPR13 + 0x78) ;\
> > +	do_##func##_gpr_reg(r29, SR_GPR13 + 0x80) ;\
> > +	do_##func##_gpr_reg(r30, SR_GPR13 + 0x88) ;\
> > +	do_##func##_gpr_reg(r31, SR_GPR13 + 0x90)
> > +
> > +#define do_sr_general_pcr_regs(func) \
> > +	do_##func##_spr_reg(EPCR, SR_EPCR) ;\
> > +	do_##func##_spr_reg(HID0, SR_HID0 + 0x00)
> > +
> > +#define do_sr_e500_pcr_regs(func) \
> > +	do_##func##_spr_reg(HID1, SR_HID0 + 0x08)
> 
> PCR?
> 
> In the comments you said "Only e500, e500v2 need to save HID0 - HID1",
> yet you save HID0 in the "general" macro.

Will change it.

> 
> > +#define do_sr_save_tb_regs \
> > +	do_save_spr_reg(TBRU, SR_TBU) ;\
> > +	do_save_spr_reg(TBRL, SR_TBL)
> 
> What if TBU increments between those two reads?  Use the standard
> sequence to read the timebase.
> 
> > +#define do_sr_interrupt_regs(func) \
> > +	do_##func##_spr_reg(IVPR, SR_IVPR) ;\
> > +	do_##func##_spr_reg(IVOR0, SR_IVOR0 + 0x00) ;\
> > +	do_##func##_spr_reg(IVOR1, SR_IVOR0 + 0x08) ;\
> > +	do_##func##_spr_reg(IVOR2, SR_IVOR0 + 0x10) ;\
> > +	do_##func##_spr_reg(IVOR3, SR_IVOR0 + 0x18) ;\
> > +	do_##func##_spr_reg(IVOR4, SR_IVOR0 + 0x20) ;\
> > +	do_##func##_spr_reg(IVOR5, SR_IVOR0 + 0x28) ;\
> > +	do_##func##_spr_reg(IVOR6, SR_IVOR0 + 0x30) ;\
> > +	do_##func##_spr_reg(IVOR7, SR_IVOR0 + 0x38) ;\
> > +	do_##func##_spr_reg(IVOR8, SR_IVOR0 + 0x40) ;\
> > +	do_##func##_spr_reg(IVOR10, SR_IVOR0 + 0x50) ;\
> > +	do_##func##_spr_reg(IVOR11, SR_IVOR0 + 0x58) ;\
> > +	do_##func##_spr_reg(IVOR12, SR_IVOR0 + 0x60) ;\
> > +	do_##func##_spr_reg(IVOR13, SR_IVOR0 + 0x68) ;\
> > +	do_##func##_spr_reg(IVOR14, SR_IVOR0 + 0x70) ;\
> > +	do_##func##_spr_reg(IVOR15, SR_IVOR0 + 0x78)
> > +
> > +#define do_e500_sr_interrupt_regs(func) \
> > +	do_##func##_spr_reg(IVOR32, SR_IVOR32 + 0x00) ;\
> > +	do_##func##_spr_reg(IVOR33, SR_IVOR32 + 0x08) ;\
> > +	do_##func##_spr_reg(IVOR34, SR_IVOR32 + 0x10)
> > +
> > +#define do_e500mc_sr_interrupt_regs(func) \
> > +	do_##func##_spr_reg(IVOR9, SR_IVOR0 + 0x48) ;\
> > +	do_##func##_spr_reg(IVOR35, SR_IVOR32 + 0x18) ;\
> > +	do_##func##_spr_reg(IVOR36, SR_IVOR32 + 0x20) ;\
> > +	do_##func##_spr_reg(IVOR37, SR_IVOR32 + 0x28) ;\
> > +	do_##func##_spr_reg(IVOR38, SR_IVOR32 + 0x30) ;\
> > +	do_##func##_spr_reg(IVOR39, SR_IVOR32 + 0x38) ;\
> > +	do_##func##_spr_reg(IVOR40, SR_IVOR32 + 0x40) ;\
> > +	do_##func##_spr_reg(IVOR41, SR_IVOR32 + 0x48)
> > +
> > +#define do_e5500_sr_interrupt_regs(func) \
> > +	do_e500mc_sr_interrupt_regs(func)
> > +
> > +#define do_e6500_sr_interrupt_regs(func) \
> > +	do_##func##_spr_reg(IVOR32, SR_IVOR32 + 0x00) ;\
> > +	do_##func##_spr_reg(IVOR33, SR_IVOR32 + 0x08) ;\
> > +	do_e5500_sr_interrupt_regs(func)
> > +
> > +#define do_sr_general_software_regs(func) \
> > +	do_##func##_spr_reg(SPRG1, SR_SPRG1) ;\
> > +	do_##func##_spr_reg(SPRG3, SR_SPRG3) ;\
> > +	do_##func##_spr_reg(PID0, SR_PID0)
> > +
> > +#define do_sr_e500_mmu_regs(func) \
> > +	do_##func##_spr_reg(PID1, SR_PID0 + 0x08) ;\
> > +	do_##func##_spr_reg(PID2, SR_PID0 + 0x10)
> > +
> > +#define do_sr_debug_regs(func) \
> > +	do_##func##_spr_reg(DBCR0, SR_DBCR0 + 0x00) ;\
> > +	do_##func##_spr_reg(DBCR1, SR_DBCR0 + 0x08) ;\
> > +	do_##func##_spr_reg(DBCR2, SR_DBCR0 + 0x10) ;\
> > +	do_##func##_spr_reg(IAC1, SR_IAC1 + 0x00) ;\
> > +	do_##func##_spr_reg(IAC2, SR_IAC1 + 0x08) ;\
> > +	do_##func##_spr_reg(DAC1, SR_DAC1 + 0x00) ;\
> > +	do_##func##_spr_reg(DAC2, SR_DAC1 + 0x08)
> > +
> > +#define do_e6500_sr_debug_regs(func) \
> > +	do_##func##_spr_reg(IAC3, SR_IAC1 + 0x10) ;\
> > +	do_##func##_spr_reg(IAC4, SR_IAC1 + 0x18)
> > +
> > +	.section .text
> > +	.align	5
> > +booke_cpu_base_save:
> > +	do_sr_general_gpr_regs(save)
> > +	do_sr_general_pcr_regs(save)
> > +	do_sr_general_software_regs(save)
> > +1:
> > +	mfspr	r5, SPRN_TBRU
> > +	do_sr_general_time_regs(save)
> > +	mfspr	r6, SPRN_TBRU
> > +	cmpw	r5, r6
> > +	bne	1b
> 
> Oh, here's where you deal with that.  Why?  It just obfuscates things.

Will make it more readable.

> 
> > +booke_cpu_base_restore:
> > +	do_sr_general_gpr_regs(restore)
> > +	do_sr_general_pcr_regs(restore)
> > +	do_sr_general_software_regs(restore)
> > +
> > +	isync
> > +
> > +	/* Restore Time registers, clear tb lower to avoid wrap */
> > +	li	r0, 0
> > +	mtspr	SPRN_TBWL, r0
> > +	do_sr_general_time_regs(restore)
> 
> Why is zeroing TBL not done in the same place as you load the new TB?

Will fix it.

> 
> > +/* Base registers, e500v1, e500v2 need to do some special save/restore */
> > +e500_base_special_save:
> > +	lis	r12, 0
> > +	ori	r12, r12, PVR_VER_E500V1@l
> > +	cmpw	r11, r12
> > +	beq	1f
> > +
> > +	lis	r12, 0
> > +	ori	r12, r12, PVR_VER_E500V2@l
> > +	cmpw	r11, r12
> > +	bne	2f
> > +1:
> > +	do_sr_e500_pcr_regs(save)
> > +	do_sr_e500_mmu_regs(save)
> > +2:
> > +	blr
> > +
> > +e500_base_special_restore:
> > +	lis	r12, 0
> > +	ori	r12, r12, PVR_VER_E500V1@l
> > +	cmpw	r11, r12
> > +	beq	1f
> > +
> > +	lis	r12, 0
> > +	ori	r12, r12, PVR_VER_E500V2@l
> > +	cmpw	r11, r12
> > +	bne	2f
> > +1:
> > +	do_sr_e500_pcr_regs(save)
> > +	do_sr_e500_mmu_regs(save)
> > +2:
> > +	blr
> 
> Why is this separate from the other CPU-specific "append" code?
> 
> > +booke_cpu_append_save:
> > +	mfspr	r0, SPRN_PVR
> > +	rlwinm	r11, r0, 16, 16, 31
> > +
> > +	lis	r12, 0
> > +	ori	r12, r12, PVR_VER_E6500@l
> > +	cmpw	r11, r12
> > +	beq	e6500_append_save
> > +
> > +	lis	r12, 0
> > +	ori	r12, r12, PVR_VER_E5500@l
> > +	cmpw	r11, r12
> > +	beq	e5500_append_save
> > +
> > +	lis	r12, 0
> > +	ori	r12, r12, PVR_VER_E500MC@l
> > +	cmpw	r11, r12
> > +	beq	e500mc_append_save
> > +
> > +	lis	r12, 0
> > +	ori	r12, r12, PVR_VER_E500V2@l
> > +	cmpw	r11, r12
> > +	beq	e500v2_append_save
> > +
> > +	lis	r12, 0
> > +	ori	r12, r12, PVR_VER_E500V1@l
> > +	cmpw	r11, r12
> > +	beq	e500v1_append_save
> > +	b	1f
> > +
> > +e6500_append_save:
> > +	do_e6500_sr_interrupt_regs(save)
> > +	do_e6500_sr_debug_regs(save)
> > +	b	1f
> > +
> > +e5500_append_save:
> > +	do_e5500_sr_interrupt_regs(save)
> > +	b	1f
> > +
> > +e500mc_append_save:
> > +	do_e500mc_sr_interrupt_regs(save)
> > +	b	1f
> > +
> > +e500v2_append_save:
> > +e500v1_append_save:
> > +	do_e500_sr_interrupt_regs(save)
> > +1:
> > +	do_sr_interrupt_regs(save)
> > +	do_sr_debug_regs(save)
> > +
> > +	blr
> 
> What is meant by "append" here?

Means extra registers for specific e500 derivative core.

> 
> > +/*
> > + * r3 = the address of buffer
> > + * r4 = type: HIBERNATION_FLAG or DEEPSLEEP_FLAG
> > + */
> > +_GLOBAL(booke_cpu_state_save)
> > +	mflr	r9
> > +	mr	r10, r3
> > +
> > +	cmpwi	r4, HIBERNATION_FLAG
> > +	beq	1f
> > +	bl	booke_cpu_append_save
> > +1:
> > +	bl	e500_base_special_save
> > +	bl	booke_cpu_base_save
> > +
> > +	mtlr	r9
> > +	blr
> 
> You're assuming a special ABI from these subfunctions (e.g. r9
> non-volatile) but I don't see any comment to that effect on those
> functions.
> 
> -Scott

Will add comments.

-Chenhui

^ permalink raw reply

* Re: [PATCH 9/9] powerpc/pm: support deep sleep feature on T1040
From: Chenhui Zhao @ 2014-03-12 10:40 UTC (permalink / raw)
  To: Scott Wood; +Cc: linuxppc-dev, linux-kernel, Jason.Jin
In-Reply-To: <1394586624.13761.132.camel@snotra.buserror.net>

On Tue, Mar 11, 2014 at 08:10:24PM -0500, Scott Wood wrote:
> On Fri, 2014-03-07 at 12:58 +0800, Chenhui Zhao wrote:
> > From: Zhao Chenhui <chenhui.zhao@freescale.com>
> > 
> > T1040 supports deep sleep feature, which can switch off most parts of
> > the SoC when it is in deep sleep mode. This way, it becomes more
> > energy-efficient.
> > 
> > The DDR controller will also be powered off in deep sleep. Therefore,
> > the last stage (the latter part of fsl_dp_enter_low) will run without DDR
> > access. This piece of code and related TLBs will be prefetched.
> > 
> > Due to the different initialization code between 32-bit and 64-bit, they
> > have seperate resume entry and precedure.
> > 
> > The feature supports 32-bit and 64-bit kernel mode.
> > 
> > Signed-off-by: Zhao Chenhui <chenhui.zhao@freescale.com>
> > ---
> >  arch/powerpc/include/asm/booke_save_regs.h |    3 +
> >  arch/powerpc/kernel/cpu_setup_fsl_booke.S  |   17 ++
> >  arch/powerpc/kernel/head_fsl_booke.S       |   30 +++
> >  arch/powerpc/platforms/85xx/Makefile       |    2 +-
> >  arch/powerpc/platforms/85xx/deepsleep.c    |  201 +++++++++++++++++++
> >  arch/powerpc/platforms/85xx/qoriq_pm.c     |   38 ++++
> >  arch/powerpc/platforms/85xx/sleep.S        |  295 ++++++++++++++++++++++++++++
> >  arch/powerpc/sysdev/fsl_soc.h              |    7 +
> >  8 files changed, 592 insertions(+), 1 deletions(-)
> >  create mode 100644 arch/powerpc/platforms/85xx/deepsleep.c
> >  create mode 100644 arch/powerpc/platforms/85xx/sleep.S
> > 
> > diff --git a/arch/powerpc/include/asm/booke_save_regs.h b/arch/powerpc/include/asm/booke_save_regs.h
> > index 87c357a..37c1f6c 100644
> > --- a/arch/powerpc/include/asm/booke_save_regs.h
> > +++ b/arch/powerpc/include/asm/booke_save_regs.h
> > @@ -88,6 +88,9 @@
> >  #define HIBERNATION_FLAG	1
> >  #define DEEPSLEEP_FLAG		2
> >  
> > +#define CPLD_FLAG	1
> > +#define FPGA_FLAG	2
> 
> What is this?

We have two kind of boards, QDS and RDB. They have different register
map. Use the flag to indicate the current board is using which kind
of register map.

> 
> >  #ifndef __ASSEMBLY__
> >  extern void booke_cpu_state_save(void *buf, int type);
> >  extern void *booke_cpu_state_restore(void *buf, int type);
> > diff --git a/arch/powerpc/kernel/cpu_setup_fsl_booke.S b/arch/powerpc/kernel/cpu_setup_fsl_booke.S
> > index e59d6de..ea9bc28 100644
> > --- a/arch/powerpc/kernel/cpu_setup_fsl_booke.S
> > +++ b/arch/powerpc/kernel/cpu_setup_fsl_booke.S
> > @@ -318,6 +318,23 @@ flush_backside_L2_cache:
> >  2:
> >  	blr
> >  
> > +#define CPC_CPCCSR0		0x0
> > +#define CPC_CPCCSR0_CPCFL	0x800
> 
> This is supposed to be CPU setup, not platform setup.
> 
> > +/* r3 : the base address of CPC  */
> > +_GLOBAL(fsl_flush_cpc_cache)
> > +	lwz	r6, CPC_CPCCSR0(r3)
> > +	ori	r6, r6, CPC_CPCCSR0_CPCFL
> > +	stw	r6, CPC_CPCCSR0(r3)
> > +	sync
> > +
> > +	/* Wait until completing the flush */
> > +1:	lwz	r6, CPC_CPCCSR0(r3)
> > +	andi.	r6, r6, CPC_CPCCSR0_CPCFL
> > +	bne	1b
> > +
> > +	blr
> > +
> >  _GLOBAL(__flush_caches_e500v2)
> 
> I'm not sure that this belongs here either.

Will find a better place.

> 
> >  	mflr r0
> >  	bl	flush_dcache_L1
> > diff --git a/arch/powerpc/kernel/head_fsl_booke.S b/arch/powerpc/kernel/head_fsl_booke.S
> > index 20204fe..3285752 100644
> > --- a/arch/powerpc/kernel/head_fsl_booke.S
> > +++ b/arch/powerpc/kernel/head_fsl_booke.S
> > @@ -162,6 +162,19 @@ _ENTRY(__early_start)
> >  #include "fsl_booke_entry_mapping.S"
> >  #undef ENTRY_MAPPING_BOOT_SETUP
> >  
> > +#if defined(CONFIG_SUSPEND) && defined(CONFIG_FSL_CORENET_RCPM)
> > +	/* if deep_sleep_flag != 0, jump to the deep sleep resume entry */
> > +	LOAD_REG_ADDR(r4, deep_sleep_flag)
> > +	lwz	r3, 0(r4)
> > +	cmpwi	r3, 0
> > +	beq	11f
> > +	/* clear deep_sleep_flag */
> > +	li	r3, 0
> > +	stw	r3, 0(r4)
> > +	b	fsl_deepsleep_resume
> > +11:
> > +#endif
> 
> Why do you come in via the normal kernel entry, versus specifying a
> direct entry point for deep sleep resume?  How does U-Boot even know
> what the normal entry is when resuming?

I wish to return to a specified point (like 64-bit mode), but the code in
fsl_booke_entry_mapping.S only can run in the first page. Because it
only setups a temp mapping of 4KB.

> 
> Be careful of the "beq set_ivor" in the CONFIG_RELOCATABLE section
> above.  Also you probably don't want the relocation code to run again
> when resuming.

When resuming, will run from the point __early_start. Don't run the code
in the CONFIG_RELOCATABLE section.

> 
> > +#if defined(CONFIG_SUSPEND) && defined(CONFIG_FSL_CORENET_RCPM)
> > +_ENTRY(__entry_deep_sleep)
> > +/*
> > + * Bootloader will jump to here when resuming from deep sleep.
> > + * After executing the init code in fsl_booke_entry_mapping.S,
> > + * will jump to the real resume entry.
> > + */
> > +	li	r8, 1
> > +	bl	12f
> > +12:	mflr	r9
> > +	addi	r9, r9, (deep_sleep_flag - 12b)
> > +	stw	r8, 0(r9)
> > +	b __early_start
> > +deep_sleep_flag:
> > +	.long	0
> > +#endif
> 
> It's a bit ambiguous to say "entry_deep_sleep" when it's resuming rather
> than entering...

How about __fsl_entry_resume?

> 
> So you do have a special entry point.  Why do you go to __early_start
> only to quickly test a flag and branch away?
> 
> > diff --git a/arch/powerpc/platforms/85xx/Makefile b/arch/powerpc/platforms/85xx/Makefile
> > index 7fae817..9a4ea86 100644
> > --- a/arch/powerpc/platforms/85xx/Makefile
> > +++ b/arch/powerpc/platforms/85xx/Makefile
> > @@ -3,7 +3,7 @@
> >  #
> >  obj-$(CONFIG_SMP) += smp.o
> >  ifeq ($(CONFIG_FSL_CORENET_RCPM), y)
> > -obj-$(CONFIG_SUSPEND)	+= qoriq_pm.o
> > +obj-$(CONFIG_SUSPEND)	+= qoriq_pm.o deepsleep.o sleep.o
> >  endif
> >  
> >  obj-y += common.o
> > diff --git a/arch/powerpc/platforms/85xx/deepsleep.c b/arch/powerpc/platforms/85xx/deepsleep.c
> > new file mode 100644
> > index 0000000..ddd7185
> > --- /dev/null
> > +++ b/arch/powerpc/platforms/85xx/deepsleep.c
> > @@ -0,0 +1,201 @@
> > +/*
> > + * Support deep sleep feature
> 
> AFAICT this supports deep sleep on T1040, not on all 85xx that has it.

Yes, for now T1040 and T1042.

> 
> > + *
> > + * Copyright 2014 Freescale Semiconductor Inc.
> > + *
> > + * Author: Chenhui Zhao <chenhui.zhao@freescale.com>
> > + *
> > + * This program is free software; you can redistribute	it and/or modify it
> > + * under  the terms of	the GNU General	 Public License as published by the
> > + * Free Software Foundation;  either version 2 of the  License, or (at your
> > + * option) any later version.
> > + */
> > +
> > +#include <linux/kernel.h>
> > +#include <linux/of_platform.h>
> > +#include <linux/of_address.h>
> > +#include <asm/machdep.h>
> > +#include <sysdev/fsl_soc.h>
> > +#include <asm/booke_save_regs.h>
> > +
> > +#define SIZE_1MB	0x100000
> > +#define SIZE_2MB	0x200000
> > +
> > +#define CCSR_SCFG_DPSLPCR	0xfc000
> > +#define CCSR_SCFG_DPSLPCR_WDRR_EN	0x1
> > +#define CCSR_SCFG_SPARECR2	0xfc504
> > +#define CCSR_SCFG_SPARECR3	0xfc508
> > +
> > +#define CCSR_GPIO1_GPDIR	0x130000
> > +#define CCSR_GPIO1_GPODR	0x130004
> > +#define CCSR_GPIO1_GPDAT	0x130008
> > +#define CCSR_GPIO1_GPDIR_29		0x4
> > +
> > +/* 128 bytes buffer for restoring data broke by DDR training initialization */
> > +#define DDR_BUF_SIZE	128
> > +static u8 ddr_buff[DDR_BUF_SIZE] __aligned(64);
> > +
> > +static void *dcsr_base, *ccsr_base, *pld_base;
> > +static int pld_flag;
> > +
> > +int fsl_dp_iomap(void)
> > +{
> > +	struct device_node *np;
> > +	const u32 *prop;
> > +	int ret = 0;
> > +	u64 ccsr_phy_addr, dcsr_phy_addr;
> > +
> > +	np = of_find_node_by_type(NULL, "soc");
> > +	if (!np) {
> > +		pr_err("%s: Can't find the node of \"soc\"\n", __func__);
> > +		ret = -EINVAL;
> > +		goto ccsr_err;
> > +	}
> > +	prop = of_get_property(np, "ranges", NULL);
> > +	if (!prop) {
> > +		pr_err("%s: Can't find the property of \"ranges\"\n", __func__);
> > +		of_node_put(np);
> > +		ret = -EINVAL;
> > +		goto ccsr_err;
> > +	}
> 
> Use get_immrbase(), or better use specific nodes in the device tree.
> 
> > +	ccsr_phy_addr = of_translate_address(np, prop + 1);
> > +	ccsr_base = ioremap((phys_addr_t)ccsr_phy_addr, SIZE_2MB);
> > +	of_node_put(np);
> > +	if (!ccsr_base) {
> > +		ret = -ENOMEM;
> > +		goto ccsr_err;
> > +	}
> 
> Unnecessary cast.
> 
> Why 2 MiB?

All registers used are inside the 2MB address space.

> 
> > +	np = of_find_compatible_node(NULL, NULL, "fsl,dcsr");
> > +	if (!np) {
> > +		pr_err("%s: Can't find the node of \"fsl,dcsr\"\n", __func__);
> > +		ret = -EINVAL;
> > +		goto dcsr_err;
> > +	}
> > +	prop = of_get_property(np, "ranges", NULL);
> > +	if (!prop) {
> > +		pr_err("%s: Can't find the property of \"ranges\"\n", __func__);
> > +		of_node_put(np);
> > +		ret = -EINVAL;
> > +		goto dcsr_err;
> > +	}
> > +	dcsr_phy_addr = of_translate_address(np, prop + 1);
> > +	dcsr_base = ioremap((phys_addr_t)dcsr_phy_addr, SIZE_1MB);
> > +	of_node_put(np);
> > +	if (!dcsr_base) {
> > +		ret = -ENOMEM;
> > +		goto dcsr_err;
> > +	}
> 
> If you must do this, add a helper to get the dcsr base -- but do we not
> already have dcsr subnodes for what you are using?
> 
> > +
> > +	np = of_find_compatible_node(NULL, NULL, "fsl,fpga-qixis");
> > +	if (np) {
> > +		pld_flag = FPGA_FLAG;
> > +	} else {
> > +		np = of_find_compatible_node(NULL, NULL, "fsl,p104xrdb-cpld");
> > +		if (np) {
> > +			pld_flag = CPLD_FLAG;
> > +		} else {
> > +			pr_err("%s: Can't find the FPGA/CPLD node\n",
> > +					__func__);
> > +			ret = -EINVAL;
> > +			goto pld_err;
> > +		}
> > +	}
> 
> OK, so this file isn't even specific to T1040 -- it's specific to our
> reference boards?

Yes. There are some FPGA/CPLD setting which needed by deep sleep.

> 
> > +static void fsl_dp_ddr_save(void *ccsr_base)
> > +{
> > +	u32 ddr_buff_addr;
> > +
> > +	/*
> > +	 * DDR training initialization will break 128 bytes at the beginning
> > +	 * of DDR, therefore, save them so that the bootloader will restore
> > +	 * them. Assume that DDR is mapped to the address space started with
> > +	 * CONFIG_PAGE_OFFSET.
> > +	 */
> > +	memcpy(ddr_buff, (void *)CONFIG_PAGE_OFFSET, DDR_BUF_SIZE);
> > +
> > +	/* assume ddr_buff is in the physical address space of 4GB */
> > +	ddr_buff_addr = (u32)(__pa(ddr_buff) & 0xffffffff);
> 
> That assumption may break with a relocatable kernel.
> 
> > +
> > +}
> > +
> > +int fsl_enter_epu_deepsleep(void)
> > +{
> > +
> > +
> 
> No blank lines at begin/end of function.
> 
> > diff --git a/arch/powerpc/platforms/85xx/qoriq_pm.c b/arch/powerpc/platforms/85xx/qoriq_pm.c
> > index 915b13b..5f2c016 100644
> > --- a/arch/powerpc/platforms/85xx/qoriq_pm.c
> > +++ b/arch/powerpc/platforms/85xx/qoriq_pm.c
> > @@ -20,6 +20,8 @@
> >  #define FSL_SLEEP		0x1
> >  #define FSL_DEEP_SLEEP		0x2
> >  
> > +int (*fsl_enter_deepsleep)(void);
> > +
> >  /* specify the sleep state of the present platform */
> >  int sleep_pm_state;
> >  /* supported sleep modes by the present platform */
> > @@ -28,6 +30,7 @@ static unsigned int sleep_modes;
> >  static int qoriq_suspend_enter(suspend_state_t state)
> >  {
> >  	int ret = 0;
> > +	int cpu;
> >  
> >  	switch (state) {
> >  	case PM_SUSPEND_STANDBY:
> > @@ -39,6 +42,17 @@ static int qoriq_suspend_enter(suspend_state_t state)
> >  
> >  		break;
> >  
> > +	case PM_SUSPEND_MEM:
> > +
> > +		cpu = smp_processor_id();
> > +		qoriq_pm_ops->irq_mask(cpu);
> > +
> > +		ret = fsl_enter_deepsleep();
> > +
> > +		qoriq_pm_ops->irq_unmask(cpu);
> > +
> > +		break;
> > +
> >  	default:
> >  		ret = -EINVAL;
> >  
> > @@ -52,12 +66,30 @@ static int qoriq_suspend_valid(suspend_state_t state)
> >  	if (state == PM_SUSPEND_STANDBY && (sleep_modes & FSL_SLEEP))
> >  		return 1;
> >  
> > +	if (state == PM_SUSPEND_MEM && (sleep_modes & FSL_DEEP_SLEEP))
> > +		return 1;
> > +
> >  	return 0;
> >  }
> >  
> > +static int qoriq_suspend_begin(suspend_state_t state)
> > +{
> > +	if (state == PM_SUSPEND_MEM)
> > +		return fsl_dp_iomap();
> > +
> > +	return 0;
> > +}
> > +
> > +static void qoriq_suspend_end(void)
> > +{
> > +	fsl_dp_iounmap();
> > +}
> > +
> >  static const struct platform_suspend_ops qoriq_suspend_ops = {
> >  	.valid = qoriq_suspend_valid,
> >  	.enter = qoriq_suspend_enter,
> > +	.begin = qoriq_suspend_begin,
> > +	.end = qoriq_suspend_end,
> >  };
> >  
> >  static int __init qoriq_suspend_init(void)
> > @@ -71,6 +103,12 @@ static int __init qoriq_suspend_init(void)
> >  	if (np)
> >  		sleep_pm_state = PLAT_PM_LPM20;
> >  
> > +	np = of_find_compatible_node(NULL, NULL, "fsl,t1040-rcpm");
> > +	if (np) {
> > +		fsl_enter_deepsleep = fsl_enter_epu_deepsleep;
> > +		sleep_modes |= FSL_DEEP_SLEEP;
> > +	}
> > +
> >  	suspend_set_ops(&qoriq_suspend_ops);
> >  
> >  	return 0;
> > diff --git a/arch/powerpc/platforms/85xx/sleep.S b/arch/powerpc/platforms/85xx/sleep.S
> > new file mode 100644
> > index 0000000..95a5746
> > --- /dev/null
> > +++ b/arch/powerpc/platforms/85xx/sleep.S
> > @@ -0,0 +1,295 @@
> > +/*
> > + * Implement the low level part of deep sleep
> > + *
> > + * Copyright 2014 Freescale Semiconductor Inc.
> > + *
> > + * This program is free software; you can redistribute	it and/or modify it
> > + * under  the terms of	the GNU General	 Public License as published by the
> > + * Free Software Foundation;  either version 2 of the  License, or (at your
> > + * option) any later version.
> > + */
> > +
> > +#include <asm/page.h>
> > +#include <asm/ppc_asm.h>
> > +#include <asm/reg.h>
> > +#include <asm/asm-offsets.h>
> > +#include <asm/booke_save_regs.h>
> > +#include <asm/mmu.h>
> > +
> > +#define FSLDELAY(count)		\
> > +	li	r3, (count)@l;	\
> > +	slwi	r3, r3, 10;	\
> > +	mtctr	r3;		\
> > +101:	nop;			\
> > +	bdnz	101b;
> 
> You don't need a namespace prefix on local macros in a non-header file.
> 
> Is the timebase stopped where you're calling this from?

No. My purpose is to avoid jump in the last stage of entering deep sleep.
Jump may cause problem at that time.

> 
> > +#define FSL_DIS_ALL_IRQ		\
> > +	mfmsr	r8;			\
> > +	rlwinm	r8, r8, 0, ~MSR_CE;	\
> > +	rlwinm	r8, r8, 0, ~MSR_ME;	\
> > +	rlwinm	r8, r8, 0, ~MSR_EE;	\
> > +	rlwinm	r8, r8, 0, ~MSR_DE;	\
> > +	mtmsr	r8;			\
> > +	isync
> > +
> > +
> > +	.section .data
> > +	.align	6
> > +booke_regs_buffer:
> > +	.space REGS_BUFFER_SIZE
> > +
> > +	.section .txt
> > +	.align 6
> > +
> > +_GLOBAL(fsl_dp_enter_low)
> > +deepsleep_start:
> > +	LOAD_REG_ADDR(r9, buf_tmp)
> > +	PPC_STL	r3, 0(r9)
> > +	PPC_STL	r4, 8(r9)
> > +	PPC_STL	r5, 16(r9)
> > +	PPC_STL	r6, 24(r9)
> > +
> > +	LOAD_REG_ADDR(r3, booke_regs_buffer)
> > +	/* save the return address */
> > +	mflr	r5
> > +	PPC_STL r5, SR_LR(r3)
> > +	mfmsr	r5
> > +	PPC_STL r5, SR_MSR(r3)
> > +	li	r4, DEEPSLEEP_FLAG
> > +	bl	booke_cpu_state_save
> > +
> > +	LOAD_REG_ADDR(r9, buf_tmp)
> > +	PPC_LL	r31, 0(r9)
> > +	PPC_LL	r30, 8(r9)
> > +	PPC_LL	r29, 16(r9)
> > +	PPC_LL	r28, 24(r9)
> > +
> > +	/* flush caches */
> > +	LOAD_REG_ADDR(r3, cur_cpu_spec)
> > +	PPC_LL	r3, 0(r3)
> > +	PPC_LL	r3, CPU_FLUSH_CACHES(r3)
> > +	PPC_LCMPI  0, r3, 0
> > +	beq	6f
> > +#ifdef CONFIG_PPC64
> > +	PPC_LL	r3, 0(r3)
> > +#endif
> > +	mtctr	r3
> > +	bctrl
> > +6:
> > +#define CPC_OFFSET	0x10000
> > +	mr	r3, r31
> > +	addis	r3, r3, CPC_OFFSET@h
> > +	bl	fsl_flush_cpc_cache
> > +
> > +	LOAD_REG_ADDR(r8, deepsleep_start)
> > +	LOAD_REG_ADDR(r9, deepsleep_end)
> > +
> > +	/* prefecth TLB */
> > +#define CCSR_GPIO1_GPDAT	0x130008
> > +#define CCSR_GPIO1_GPDAT_29	0x4
> > +	LOAD_REG_IMMEDIATE(r11, CCSR_GPIO1_GPDAT)
> > +	add	r11, r31, r11
> > +	lwz	r10, 0(r11)
> > +
> > +#define CCSR_RCPM_PCPH15SETR	0xe20b4
> > +#define CCSR_RCPM_PCPH15SETR_CORE0	0x1
> > +	LOAD_REG_IMMEDIATE(r12, CCSR_RCPM_PCPH15SETR)
> > +	add	r12, r31, r12
> > +	lwz	r10, 0(r12)
> > +
> > +#define CCSR_DDR_SDRAM_CFG_2	0x8114
> > +#define CCSR_DDR_SDRAM_CFG_2_FRC_SR	0x80000000
> > +	LOAD_REG_IMMEDIATE(r13, CCSR_DDR_SDRAM_CFG_2)
> > +	add	r13, r31, r13
> > +	lwz	r10, 0(r13)
> > +
> > +#define	DCSR_EPU_EPGCR		0x000
> > +#define DCSR_EPU_EPGCR_GCE	0x80000000
> > +	li	r14, DCSR_EPU_EPGCR
> > +	add	r14, r30, r14
> > +	lwz	r10, 0(r14)
> > +
> > +#define	DCSR_EPU_EPECR15	0x33C
> > +#define DCSR_EPU_EPECR15_IC0	0x80000000
> > +	li	r15, DCSR_EPU_EPECR15
> > +	add	r15, r30, r15
> > +	lwz	r10, 0(r15)
> > +
> > +#define CCSR_SCFG_QMCRDTRSTCR		0xfc40c
> > +#define CCSR_SCFG_QMCRDTRSTCR_CRDTRST	0x80000000
> > +	LOAD_REG_IMMEDIATE(r16, CCSR_SCFG_QMCRDTRSTCR)
> > +	add	r16, r31, r16
> > +	lwz	r10, 0(r16)
> > +
> > +/*
> > + * There are two kind of register maps, one for CPLD and the other for FPGA
> > + */
> > +#define CPLD_MISCCSR		0x17
> > +#define CPLD_MISCCSR_SLEEPEN	0x40
> > +#define QIXIS_PWR_CTL2		0x21
> > +#define QIXIS_PWR_CTL2_PCTL	0x2
> > +	PPC_LCMPI  0, r28, FPGA_FLAG
> > +	beq	20f
> > +	addi	r29, r29, CPLD_MISCCSR
> > +20:
> > +	addi	r29, r29, QIXIS_PWR_CTL2
> > +	lbz	r10, 0(r29)
> 
> 
> Again, this is not marked as a board-specific file.  How do you expect
> customers to adapt this mechanism to their boards?

Will add comment.

> 
> > +
> > +	/* prefecth code to cache so that executing code after disable DDR */
> > +1:	lwz	r3, 0(r8)
> > +	addi	r8, r8, 4
> > +	cmpw	r8, r9
> > +	blt	1b
> > +	msync
> 
> Instructions don't execute from dcache...  I guess you're assuming it
> will allocate in, and stay in, L2/CPC.  It'd be safer to lock those
> cache lines in icache, or copy the code to SRAM.

Yes, they should be in L2/CPC. Will try to lock them in icache.

> 
> > +	FSL_DIS_ALL_IRQ
> > +
> > +	/*
> > +	 * Place DDR controller in self refresh mode.
> > +	 * From here on, DDR can't be access any more.
> > +	 */
> > +	lwz	r10, 0(r13)
> > +	oris	r10, r10, CCSR_DDR_SDRAM_CFG_2_FRC_SR@h
> > +	stw	r10, 0(r13)
> > +
> > +	/* can't call udelay() here, so use a macro to delay */
> > +	FSLDELAY(50)
> 
> A timebase loop doesn't require accessing DDR.

Don't wish to jump at this time. Maybe cause problems.

> 
> You also probably want to do a "sync, readback, data dependency, isync"
> sequence to make sure that the store has hit CCSR before you begin your
> delay (or is a delay required at all if you do that?).

Yes. It is safer with a sync sequence.

The DDR controller need some time to signal the external DDR modules to
enter self refresh mode.

-Chenhui

^ permalink raw reply

* [PATCH RFC v9 0/6] MPC512x DMA slave s/g support, OF DMA lookup
From: Alexander Popov @ 2014-03-12 11:47 UTC (permalink / raw)
  To: Gerhard Sittig, Dan Williams, Vinod Koul, Lars-Peter Clausen,
	Arnd Bergmann, Anatolij Gustschin, Andy Shevchenko,
	Alexander Popov, linuxppc-dev, dmaengine
  Cc: devicetree

2013/7/14 Gerhard Sittig <gsi@denx.de>:
> this series
> - introduces slave s/g support (that's support for DMA transfers which
>    involve peripherals in contrast to mem-to-mem transfers)
> - adds device tree based lookup support for DMA channels
> - combines floating patches and related feedback which already covered
>    several aspects of what the suggested LPB driver needs, to demonstrate
>    how integration might be done
> - carries Q&D SD card support to enable another DMA client during test,
>    while this patch needs to get dropped upon pickup

Changes in v2:
> - re-order mpc8308 related code paths for improved readability, no
>    change in behaviour, introduction of symbolic channel names here
>    already
> - squash 'execute() start condition' and 'terminate all' into the
>    introduction of 'slave s/g prep' and 'device control' support; refuse
>    s/g lists with more than one item since slave support is operational
>    yet proper s/g support is missing (can get addressed later)
> - always start transfers from software on MPC8308 as there are no
>    external request lines for peripheral flow control
> - drop dt-bindings header file and symbolic channel names in OF nodes

Changes in v3 and v4:
 Part 1/5:
 - use #define instead of enum since individual channels don't require
    special handling.
 Part 2/5:
 - add a flag "will_access_peripheral" to DMA transfer descriptor
    according recommendations of Gerhard Sittig.
    This flag is set in mpc_dma_prep_memcpy() and mpc_dma_prep_slave_sg()
    and is evaluated in mpc_dma_execute() to choose a type of start for
    the transfer.
 - prevent descriptors of transfers which involve peripherals from
    being chained together;
    each of such transfers needs hardware initiated start.
 - add locking while working with struct mpc_dma_chan
    according recommendations of Lars-Peter Clausen.
 - remove default nbytes value. Client kernel modules must set
    src_maxburst and dst_maxburst fields of struct dma_slave_config (dmaengine.h).

Changes in v5:
 Part 2/5:
 - add and improve comments;
 - improve the code moving transfer descriptors from 'queued' to 'active' list
    in mpc_dma_execute();
 - allow mpc_dma_prep_slave_sg() to run with non-empty 'active' list;
 - take 'mdesc' back to 'free' list in case of error in mpc_dma_prep_slave_sg();
 - improve checks of the transfer parameters;
 - provide the default value for 'maxburst' in mpc_dma_device_control().

Changes in v6:
 Part 2/5:
 - remove doubtful comment;
 - fix coding style issues;
 - set default value for 'maxburst' to 1 which applies to most cases;
 Part 3/5:
 - use dma_get_slave_channel() instead of dma_request_channel()
    in new function of_dma_xlate_by_chan_id() according recommendations of
    Arnd Bergmann;
 Part 4/5:
 - set DMA_PRIVATE flag for MPC512x DMA controller since its driver relies on
    of_dma_xlate_by_chan_id() which doesn't use dma_request_channel()
    any more; (removed in v7)
 - resolve little patch conflict;
 Part 5/5:
 - resolve little patch conflict;

Changes in v7:
 Part 2:
 - improve comment;
 Part 4:
 - split in two separate patches. Part 4/6 contains device tree
    binding document and in part 5/6 MPC512x DMA controller is registered
    for device tree channel lookup;
 - remove setting DMA_PRIVATE flag for MPC512x DMA controller from part 5/6;

Changes in v8:
 Part 2:
 - improve comments;
 - fix style issues;
 Part 6:
 - remove since it has become obsolete;

Changes in v9:
 A new patch (part 3/6) is added to this series according the
   feedback of Andy Shevchenko.
 Part 2/6:
 - keep style of the comments;
 - use is_slave_direction() instead of manual checks;
 - remove redundant else branches of the conditions;
 - make mpc_dma_device_control() return -ENXIO for unknown command;
 Part 6/6:
 - change according the new part 3/6;
 - fix style issues;

> known issues:
> - it's yet to get confirmed whether MPC8308 can use slave support or
>    whether the DMA controller's driver shall actively reject it, the
>    information that's available so far suggests that peripheral transfers
>    to IP bus attached I/O is useful and shall not get blocked right away
 - adding support for transfers which don't increment the RAM address or
    do increment the peripheral "port's" address is easy with
    this implementation; but which options of the common API
    should be used for specifying such transfers?
2014/02/13 Gerhard Sittig <gsi@denx.de>:
> - The MPC512x DMA completely lacks a binding document, so one
>    should get added.
> - The MPC8308 hardware is similar and can re-use the MPC512x
>    binding, which should be stated.
> - The Linux implementation currently has no OF based channel
>    lookup support, so '#dma-cells' is "a future feature".  I guess
>    the binding can and should already discuss the feature,
>    regardless of whether all implementations support it.


Alexander Popov (5):
  dma: mpc512x: reorder mpc8308 specific instructions
  dma: mpc512x: add support for peripheral transfers
  dma: mpc512x: replace devm_request_irq() with request_irq()
  dma: of: Add common xlate function for matching by channel id
  dma: mpc512x: register for device tree channel lookup

Gerhard Sittig (1):
  dma: mpc512x: add device tree binding document

 .../devicetree/bindings/dma/mpc512x-dma.txt        |  55 ++++
 arch/powerpc/boot/dts/mpc5121.dtsi                 |   1 +
 drivers/dma/mpc512x_dma.c                          | 308 +++++++++++++++++++--
 drivers/dma/of-dma.c                               |  35 +++
 include/linux/of_dma.h                             |   4 +
 5 files changed, 373 insertions(+), 30 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/dma/mpc512x-dma.txt

-- 
1.8.4.2

^ permalink raw reply

* [PATCH RFC v9 1/6] dma: mpc512x: reorder mpc8308 specific instructions
From: Alexander Popov @ 2014-03-12 11:47 UTC (permalink / raw)
  To: Gerhard Sittig, Dan Williams, Vinod Koul, Lars-Peter Clausen,
	Arnd Bergmann, Anatolij Gustschin, Andy Shevchenko,
	Alexander Popov, linuxppc-dev, dmaengine
In-Reply-To: <1394624875-24411-1-git-send-email-a13xp0p0v88@gmail.com>

Concentrate the specific code for MPC8308 in the 'if' branch
and handle MPC512x in the 'else' branch.
This modification only reorders instructions but doesn't change behaviour.

Signed-off-by: Alexander Popov <a13xp0p0v88@gmail.com>
Acked-by: Anatolij Gustschin <agust@denx.de>
Acked-by: Gerhard Sittig <gsi@denx.de>
---
 drivers/dma/mpc512x_dma.c | 42 +++++++++++++++++++++++++-----------------
 1 file changed, 25 insertions(+), 17 deletions(-)

diff --git a/drivers/dma/mpc512x_dma.c b/drivers/dma/mpc512x_dma.c
index 448750d..2ce248b 100644
--- a/drivers/dma/mpc512x_dma.c
+++ b/drivers/dma/mpc512x_dma.c
@@ -52,9 +52,17 @@
 #define MPC_DMA_DESCRIPTORS	64
 
 /* Macro definitions */
-#define MPC_DMA_CHANNELS	64
 #define MPC_DMA_TCD_OFFSET	0x1000
 
+/*
+ * Maximum channel counts for individual hardware variants
+ * and the maximum channel count over all supported controllers,
+ * used for data structure size
+ */
+#define MPC8308_DMACHAN_MAX	16
+#define MPC512x_DMACHAN_MAX	64
+#define MPC_DMA_CHANNELS	64
+
 /* Arbitration mode of group and channel */
 #define MPC_DMA_DMACR_EDCG	(1 << 31)
 #define MPC_DMA_DMACR_ERGA	(1 << 3)
@@ -710,10 +718,10 @@ static int mpc_dma_probe(struct platform_device *op)
 
 	dma = &mdma->dma;
 	dma->dev = dev;
-	if (!mdma->is_mpc8308)
-		dma->chancnt = MPC_DMA_CHANNELS;
+	if (mdma->is_mpc8308)
+		dma->chancnt = MPC8308_DMACHAN_MAX;
 	else
-		dma->chancnt = 16; /* MPC8308 DMA has only 16 channels */
+		dma->chancnt = MPC512x_DMACHAN_MAX;
 	dma->device_alloc_chan_resources = mpc_dma_alloc_chan_resources;
 	dma->device_free_chan_resources = mpc_dma_free_chan_resources;
 	dma->device_issue_pending = mpc_dma_issue_pending;
@@ -747,7 +755,19 @@ static int mpc_dma_probe(struct platform_device *op)
 	 * - Round-robin group arbitration,
 	 * - Round-robin channel arbitration.
 	 */
-	if (!mdma->is_mpc8308) {
+	if (mdma->is_mpc8308) {
+		/* MPC8308 has 16 channels and lacks some registers */
+		out_be32(&mdma->regs->dmacr, MPC_DMA_DMACR_ERCA);
+
+		/* enable snooping */
+		out_be32(&mdma->regs->dmagpor, MPC_DMA_DMAGPOR_SNOOP_ENABLE);
+		/* Disable error interrupts */
+		out_be32(&mdma->regs->dmaeeil, 0);
+
+		/* Clear interrupts status */
+		out_be32(&mdma->regs->dmaintl, 0xFFFF);
+		out_be32(&mdma->regs->dmaerrl, 0xFFFF);
+	} else {
 		out_be32(&mdma->regs->dmacr, MPC_DMA_DMACR_EDCG |
 					MPC_DMA_DMACR_ERGA | MPC_DMA_DMACR_ERCA);
 
@@ -768,18 +788,6 @@ static int mpc_dma_probe(struct platform_device *op)
 		/* Route interrupts to IPIC */
 		out_be32(&mdma->regs->dmaihsa, 0);
 		out_be32(&mdma->regs->dmailsa, 0);
-	} else {
-		/* MPC8308 has 16 channels and lacks some registers */
-		out_be32(&mdma->regs->dmacr, MPC_DMA_DMACR_ERCA);
-
-		/* enable snooping */
-		out_be32(&mdma->regs->dmagpor, MPC_DMA_DMAGPOR_SNOOP_ENABLE);
-		/* Disable error interrupts */
-		out_be32(&mdma->regs->dmaeeil, 0);
-
-		/* Clear interrupts status */
-		out_be32(&mdma->regs->dmaintl, 0xFFFF);
-		out_be32(&mdma->regs->dmaerrl, 0xFFFF);
 	}
 
 	/* Register DMA engine */
-- 
1.8.4.2

^ permalink raw reply related

* [PATCH RFC v9 2/6] dma: mpc512x: add support for peripheral transfers
From: Alexander Popov @ 2014-03-12 11:47 UTC (permalink / raw)
  To: Gerhard Sittig, Dan Williams, Vinod Koul, Lars-Peter Clausen,
	Arnd Bergmann, Anatolij Gustschin, Andy Shevchenko,
	Alexander Popov, linuxppc-dev, dmaengine
In-Reply-To: <1394624875-24411-1-git-send-email-a13xp0p0v88@gmail.com>

Introduce support for slave s/g transfer preparation and the associated
device control callback in the MPC512x DMA controller driver, which adds
support for data transfers between memory and peripheral I/O to the
previously supported mem-to-mem transfers.

Signed-off-by: Alexander Popov <a13xp0p0v88@gmail.com>
---
 drivers/dma/mpc512x_dma.c | 236 +++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 231 insertions(+), 5 deletions(-)

diff --git a/drivers/dma/mpc512x_dma.c b/drivers/dma/mpc512x_dma.c
index 2ce248b..b1e430c 100644
--- a/drivers/dma/mpc512x_dma.c
+++ b/drivers/dma/mpc512x_dma.c
@@ -2,6 +2,7 @@
  * Copyright (C) Freescale Semicondutor, Inc. 2007, 2008.
  * Copyright (C) Semihalf 2009
  * Copyright (C) Ilya Yanok, Emcraft Systems 2010
+ * Copyright (C) Alexander Popov, Promcontroller 2013
  *
  * Written by Piotr Ziecik <kosmo@semihalf.com>. Hardware description
  * (defines, structures and comments) was taken from MPC5121 DMA driver
@@ -29,8 +30,17 @@
  */
 
 /*
- * This is initial version of MPC5121 DMA driver. Only memory to memory
- * transfers are supported (tested using dmatest module).
+ * MPC512x and MPC8308 DMA driver. It supports
+ * memory to memory data transfers (tested using dmatest module) and
+ * data transfers between memory and peripheral I/O memory
+ * by means of slave s/g with these limitations:
+ *  - chunked transfers (transfers with more than one part) are refused
+ *     as long as proper support for scatter/gather is missing;
+ *  - transfers on MPC8308 always start from software as this SoC appears
+ *     not to have external request lines for peripheral flow control;
+ *  - minimal memory <-> I/O memory transfer chunk is 4 bytes and consequently
+ *     source and destination addresses must be 4-byte aligned
+ *     and transfer size must be aligned on (4 * maxburst) boundary;
  */
 
 #include <linux/module.h>
@@ -189,6 +199,7 @@ struct mpc_dma_desc {
 	dma_addr_t			tcd_paddr;
 	int				error;
 	struct list_head		node;
+	int				will_access_peripheral;
 };
 
 struct mpc_dma_chan {
@@ -201,6 +212,10 @@ struct mpc_dma_chan {
 	struct mpc_dma_tcd		*tcd;
 	dma_addr_t			tcd_paddr;
 
+	/* Settings for access to peripheral FIFO */
+	dma_addr_t			per_paddr;	/* FIFO address */
+	u32				tcd_nunits;
+
 	/* Lock for this structure */
 	spinlock_t			lock;
 };
@@ -251,8 +266,23 @@ static void mpc_dma_execute(struct mpc_dma_chan *mchan)
 	struct mpc_dma_desc *mdesc;
 	int cid = mchan->chan.chan_id;
 
-	/* Move all queued descriptors to active list */
-	list_splice_tail_init(&mchan->queued, &mchan->active);
+	while (!list_empty(&mchan->queued)) {
+		mdesc = list_first_entry(&mchan->queued,
+						struct mpc_dma_desc, node);
+		/*
+		 * Grab either several mem-to-mem transfer descriptors
+		 * or one peripheral transfer descriptor,
+		 * don't mix mem-to-mem and peripheral transfer descriptors
+		 * within the same 'active' list.
+		 */
+		if (mdesc->will_access_peripheral) {
+			if (list_empty(&mchan->active))
+				list_move_tail(&mdesc->node, &mchan->active);
+			break;
+		} else {
+			list_move_tail(&mdesc->node, &mchan->active);
+		}
+	}
 
 	/* Chain descriptors into one transaction */
 	list_for_each_entry(mdesc, &mchan->active, node) {
@@ -278,7 +308,17 @@ static void mpc_dma_execute(struct mpc_dma_chan *mchan)
 
 	if (first != prev)
 		mdma->tcd[cid].e_sg = 1;
-	out_8(&mdma->regs->dmassrt, cid);
+
+	if (mdma->is_mpc8308) {
+		/* MPC8308, no request lines, software initiated start */
+		out_8(&mdma->regs->dmassrt, cid);
+	} else if (first->will_access_peripheral) {
+		/* Peripherals involved, start by external request signal */
+		out_8(&mdma->regs->dmaserq, cid);
+	} else {
+		/* Memory to memory transfer, software initiated start */
+		out_8(&mdma->regs->dmassrt, cid);
+	}
 }
 
 /* Handle interrupt on one half of DMA controller (32 channels) */
@@ -596,6 +636,7 @@ mpc_dma_prep_memcpy(struct dma_chan *chan, dma_addr_t dst, dma_addr_t src,
 	}
 
 	mdesc->error = 0;
+	mdesc->will_access_peripheral = 0;
 	tcd = mdesc->tcd;
 
 	/* Prepare Transfer Control Descriptor for this transaction */
@@ -643,6 +684,188 @@ mpc_dma_prep_memcpy(struct dma_chan *chan, dma_addr_t dst, dma_addr_t src,
 	return &mdesc->desc;
 }
 
+static struct dma_async_tx_descriptor *
+mpc_dma_prep_slave_sg(struct dma_chan *chan, struct scatterlist *sgl,
+		unsigned int sg_len, enum dma_transfer_direction direction,
+		unsigned long flags, void *context)
+{
+	struct mpc_dma *mdma = dma_chan_to_mpc_dma(chan);
+	struct mpc_dma_chan *mchan = dma_chan_to_mpc_dma_chan(chan);
+	struct mpc_dma_desc *mdesc = NULL;
+	dma_addr_t per_paddr;
+	u32 tcd_nunits;
+	struct mpc_dma_tcd *tcd;
+	unsigned long iflags;
+	struct scatterlist *sg;
+	size_t len;
+	int iter, i;
+
+	/* Currently there is no proper support for scatter/gather */
+	if (sg_len != 1)
+		return NULL;
+
+	if (!is_slave_direction(direction))
+		return NULL;
+
+	for_each_sg(sgl, sg, sg_len, i) {
+		spin_lock_irqsave(&mchan->lock, iflags);
+
+		mdesc = list_first_entry(&mchan->free,
+						struct mpc_dma_desc, node);
+		if (!mdesc) {
+			spin_unlock_irqrestore(&mchan->lock, iflags);
+			/* Try to free completed descriptors */
+			mpc_dma_process_completed(mdma);
+			return NULL;
+		}
+
+		list_del(&mdesc->node);
+
+		per_paddr = mchan->per_paddr;
+		tcd_nunits = mchan->tcd_nunits;
+
+		spin_unlock_irqrestore(&mchan->lock, iflags);
+
+		if (per_paddr == 0 || tcd_nunits == 0)
+			goto err_prep;
+
+		mdesc->error = 0;
+		mdesc->will_access_peripheral = 1;
+
+		/* Prepare Transfer Control Descriptor for this transaction */
+		tcd = mdesc->tcd;
+
+		memset(tcd, 0, sizeof(struct mpc_dma_tcd));
+
+		if (!IS_ALIGNED(sg_dma_address(sg), 4))
+			goto err_prep;
+
+		if (direction == DMA_DEV_TO_MEM) {
+			tcd->saddr = per_paddr;
+			tcd->daddr = sg_dma_address(sg);
+			tcd->soff = 0;
+			tcd->doff = 4;
+		} else {
+			tcd->saddr = sg_dma_address(sg);
+			tcd->daddr = per_paddr;
+			tcd->soff = 4;
+			tcd->doff = 0;
+		}
+
+		tcd->ssize = MPC_DMA_TSIZE_4;
+		tcd->dsize = MPC_DMA_TSIZE_4;
+
+		len = sg_dma_len(sg);
+		tcd->nbytes = tcd_nunits * 4;
+		if (!IS_ALIGNED(len, tcd->nbytes))
+			goto err_prep;
+
+		iter = len / tcd->nbytes;
+		if (iter >= 1 << 15) {
+			/* len is too big */
+			goto err_prep;
+		}
+		/* citer_linkch contains the high bits of iter */
+		tcd->biter = iter & 0x1ff;
+		tcd->biter_linkch = iter >> 9;
+		tcd->citer = tcd->biter;
+		tcd->citer_linkch = tcd->biter_linkch;
+
+		tcd->e_sg = 0;
+		tcd->d_req = 1;
+
+		/* Place descriptor in prepared list */
+		spin_lock_irqsave(&mchan->lock, iflags);
+		list_add_tail(&mdesc->node, &mchan->prepared);
+		spin_unlock_irqrestore(&mchan->lock, iflags);
+	}
+
+	return &mdesc->desc;
+
+err_prep:
+	/* Put the descriptor back */
+	spin_lock_irqsave(&mchan->lock, iflags);
+	list_add_tail(&mdesc->node, &mchan->free);
+	spin_unlock_irqrestore(&mchan->lock, iflags);
+
+	return NULL;
+}
+
+static int mpc_dma_device_control(struct dma_chan *chan, enum dma_ctrl_cmd cmd,
+							unsigned long arg)
+{
+	struct mpc_dma_chan *mchan;
+	struct mpc_dma *mdma;
+	struct dma_slave_config *cfg;
+	unsigned long flags;
+
+	mchan = dma_chan_to_mpc_dma_chan(chan);
+	switch (cmd) {
+	case DMA_TERMINATE_ALL:
+		/* Disable channel requests */
+		mdma = dma_chan_to_mpc_dma(chan);
+
+		spin_lock_irqsave(&mchan->lock, flags);
+
+		out_8(&mdma->regs->dmacerq, chan->chan_id);
+		list_splice_tail_init(&mchan->prepared, &mchan->free);
+		list_splice_tail_init(&mchan->queued, &mchan->free);
+		list_splice_tail_init(&mchan->active, &mchan->free);
+
+		spin_unlock_irqrestore(&mchan->lock, flags);
+
+		return 0;
+	case DMA_SLAVE_CONFIG:
+		/* Constraints:
+		 *  - only transfers between a peripheral device and
+		 *     memory are supported;
+		 *  - minimal transfer chunk is 4 bytes and consequently
+		 *     source and destination addresses must be 4-byte aligned
+		 *     and transfer size must be aligned on (4 * maxburst)
+		 *     boundary;
+		 *  - during the transfer RAM address is being incremented by
+		 *     the size of minimal transfer chunk;
+		 *  - peripheral port's address is constant during the transfer.
+		 */
+
+		cfg = (void *)arg;
+
+		if (!is_slave_direction(cfg->direction))
+			return -EINVAL;
+
+		if (cfg->src_addr_width != DMA_SLAVE_BUSWIDTH_4_BYTES &&
+			cfg->dst_addr_width != DMA_SLAVE_BUSWIDTH_4_BYTES)
+			return -EINVAL;
+
+		spin_lock_irqsave(&mchan->lock, flags);
+
+		if (cfg->direction == DMA_DEV_TO_MEM) {
+			mchan->per_paddr = cfg->src_addr;
+			mchan->tcd_nunits = cfg->src_maxburst;
+		} else {
+			mchan->per_paddr = cfg->dst_addr;
+			mchan->tcd_nunits = cfg->dst_maxburst;
+		}
+
+		if (!IS_ALIGNED(mchan->per_paddr, 4)) {
+			spin_unlock_irqrestore(&mchan->lock, flags);
+			return -EINVAL;
+		}
+
+		if (mchan->tcd_nunits == 0)
+			mchan->tcd_nunits = 1;	/* Apply default */
+
+		spin_unlock_irqrestore(&mchan->lock, flags);
+
+		return 0;
+	default:
+		/* Unknown command */
+		break;
+	}
+
+	return -ENXIO;
+}
+
 static int mpc_dma_probe(struct platform_device *op)
 {
 	struct device_node *dn = op->dev.of_node;
@@ -727,9 +950,12 @@ static int mpc_dma_probe(struct platform_device *op)
 	dma->device_issue_pending = mpc_dma_issue_pending;
 	dma->device_tx_status = mpc_dma_tx_status;
 	dma->device_prep_dma_memcpy = mpc_dma_prep_memcpy;
+	dma->device_prep_slave_sg = mpc_dma_prep_slave_sg;
+	dma->device_control = mpc_dma_device_control;
 
 	INIT_LIST_HEAD(&dma->channels);
 	dma_cap_set(DMA_MEMCPY, dma->cap_mask);
+	dma_cap_set(DMA_SLAVE, dma->cap_mask);
 
 	for (i = 0; i < dma->chancnt; i++) {
 		mchan = &mdma->channels[i];
-- 
1.8.4.2

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox