LinuxPPC-Dev Archive on lore.kernel.org

LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed

* Re: [PATCH v4] powernv, cpufreq: cpufreq driver for powernv platform
From: Viresh Kumar @ 2014-03-27  6:39 UTC (permalink / raw)
  To: Gautham R. Shenoy, Rafael J. Wysocki
  Cc: Linux PM list, linuxppc-dev@ozlabs.org, Anton Blanchard,
	Srivatsa S. Bhat
In-Reply-To: <1395852947-22290-2-git-send-email-ego@linux.vnet.ibm.com>

Cc'ing Rafael.

On 26 March 2014 22:25, Gautham R. Shenoy <ego@linux.vnet.ibm.com> wrote:
> From: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
>
> Backend driver to dynamically set voltage and frequency on
> IBM POWER non-virtualized platforms.  Power management SPRs
> are used to set the required PState.
>
> This driver works in conjunction with cpufreq governors
> like 'ondemand' to provide a demand based frequency and
> voltage setting on IBM POWER non-virtualized platforms.
>
> PState table is obtained from OPAL v3 firmware through device
> tree.
>
> powernv_cpufreq back-end driver would parse the relevant device-tree
> nodes and initialise the cpufreq subsystem on powernv platform.
>
> The code was originally written by svaidy@linux.vnet.ibm.com. Over
> time it was modified to accomodate bug-fixes as well as updates to the
> the cpu-freq core. Relevant portions of the change logs corresponding
> to those modifications are noted below:
>
>  * The policy->cpus needs to be populated in a hotplug-invariant
>    manner instead of using cpu_sibling_mask() which varies with
>    cpu-hotplug. This is because the cpufreq core code copies this
>    content into policy->related_cpus mask which is should not vary on

s/is /

>    cpu-hotplug. [Authored by srivatsa.bhat@linux.vnet.ibm.com]

:)

>  * On POWER systems, the CPU frequency is controlled at a core-level
>    and hence we need to serialize so that only one of the threads in
>    the core switches the core's frequency at a time. Introduce
>    per-core locking to enable finer-grained synchronization and
>    thereby enhance the speed and responsiveness of the cpufreq driver
>    to varying workload demands.
>
>      The design of per-core locking is very simple and
>    straight-forward: we first define a Per-CPU lock and use the ones
>    that belongs to the first thread sibling of the core.
>
>      cpu_first_thread_sibling() macro is used to find the *common*
>    lock for all thread siblings belonging to a core. [Authored by
>    srivatsa.bhat@linux.vnet.ibm.com]

We don't need that after serialization patch of core is accepted. And it
should be accepted soon, in one form or other.

>  * Create a helper routine that can return the cpu-frequency for the
>    corresponding pstate_id. Also, cache the values of the pstate_max,
>    pstate_min and pstate_nominal and nr_pstates in a static structure
>    so that they can be reused in the future to perform any
>    validations. [Authored by ego@linux.vnet.ibm.com]
>
>  * Create a driver attribute named cpuinfo_nominal_freq which creates
>    a sysfs read-only file named cpuinfo_nominal_freq. Export the
>    frequency corresponding to the nominal_pstate through this
>    interface.
>
>      Nominal frequency is the highest non-turbo frequency for the
>    platform.  This is generally used for setting governor policies
>    from user space for optimal energy efficiency. [Authored by
>    ego@linux.vnet.ibm.com]
>
>  * Implement a powernv_cpufreq_get(unsigned int cpu) method which will
>    return the current operating frequency. Export this via the sysfs
>    interface cpuinfo_cur_freq by setting powernv_cpufreq_driver.get to
>    powernv_cpufreq_get(). [Authored by ego@linux.vnet.ibm.com]
>
> [Change log updated by ego@linux.vnet.ibm.com]
>
> Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
> Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
> Signed-off-by: Anton Blanchard <anton@samba.org>
> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> ---
>  arch/powerpc/configs/pseries_defconfig    |   1 +
>  arch/powerpc/configs/pseries_le_defconfig |   1 +
>  arch/powerpc/include/asm/reg.h            |   4 +
>  arch/powerpc/platforms/powernv/Kconfig    |   6 +
>  drivers/cpufreq/Kconfig.powerpc           |   8 +
>  drivers/cpufreq/Makefile                  |   1 +
>  drivers/cpufreq/powernv-cpufreq.c         | 372 ++++++++++++++++++++++++++++++
>  7 files changed, 393 insertions(+)
>  create mode 100644 drivers/cpufreq/powernv-cpufreq.c
>
> diff --git a/arch/powerpc/configs/pseries_defconfig b/arch/powerpc/configs/pseries_defconfig
> index e9a8b4e..a285d44 100644
> --- a/arch/powerpc/configs/pseries_defconfig
> +++ b/arch/powerpc/configs/pseries_defconfig
> @@ -353,3 +353,4 @@ CONFIG_CRYPTO_DEV_NX_ENCRYPT=m
>  CONFIG_VIRTUALIZATION=y
>  CONFIG_KVM_BOOK3S_64=m
>  CONFIG_KVM_BOOK3S_64_HV=y
> +CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND=y
> diff --git a/arch/powerpc/configs/pseries_le_defconfig b/arch/powerpc/configs/pseries_le_defconfig
> index 62771e0..47e6161 100644
> --- a/arch/powerpc/configs/pseries_le_defconfig
> +++ b/arch/powerpc/configs/pseries_le_defconfig
> @@ -350,3 +350,4 @@ CONFIG_CRYPTO_LZO=m
>  # CONFIG_CRYPTO_ANSI_CPRNG is not set
>  CONFIG_CRYPTO_DEV_NX=y
>  CONFIG_CRYPTO_DEV_NX_ENCRYPT=m
> +CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND=y

don't know how Rafael want this, but probably this part could have gone
through ppc tree..

> diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
> index 90c06ec..84f92ca 100644
> --- a/arch/powerpc/include/asm/reg.h
> +++ b/arch/powerpc/include/asm/reg.h
> @@ -271,6 +271,10 @@
>  #define SPRN_HSRR1     0x13B   /* Hypervisor Save/Restore 1 */
>  #define SPRN_IC                0x350   /* Virtual Instruction Count */
>  #define SPRN_VTB       0x351   /* Virtual Time Base */
> +#define SPRN_PMICR     0x354   /* Power Management Idle Control Reg */
> +#define SPRN_PMSR      0x355   /* Power Management Status Reg */
> +#define SPRN_PMCR      0x374   /* Power Management Control Register */
> +
>  /* HFSCR and FSCR bit numbers are the same */
>  #define FSCR_TAR_LG    8       /* Enable Target Address Register */
>  #define FSCR_EBB_LG    7       /* Enable Event Based Branching */
> diff --git a/arch/powerpc/platforms/powernv/Kconfig b/arch/powerpc/platforms/powernv/Kconfig
> index 895e8a2..c252ee9 100644
> --- a/arch/powerpc/platforms/powernv/Kconfig
> +++ b/arch/powerpc/platforms/powernv/Kconfig
> @@ -11,6 +11,12 @@ config PPC_POWERNV
>         select PPC_UDBG_16550
>         select PPC_SCOM
>         select ARCH_RANDOM
> +       select CPU_FREQ
> +       select CPU_FREQ_GOV_PERFORMANCE
> +       select CPU_FREQ_GOV_POWERSAVE
> +       select CPU_FREQ_GOV_USERSPACE
> +       select CPU_FREQ_GOV_ONDEMAND
> +       select CPU_FREQ_GOV_CONSERVATIVE
>         default y
>
>  config PPC_POWERNV_RTAS
> diff --git a/drivers/cpufreq/Kconfig.powerpc b/drivers/cpufreq/Kconfig.powerpc
> index ca0021a..72564b7 100644
> --- a/drivers/cpufreq/Kconfig.powerpc
> +++ b/drivers/cpufreq/Kconfig.powerpc
> @@ -54,3 +54,11 @@ config PPC_PASEMI_CPUFREQ
>         help
>           This adds the support for frequency switching on PA Semi
>           PWRficient processors.
> +
> +config POWERNV_CPUFREQ
> +       tristate "CPU frequency scaling for IBM POWERNV platform"
> +       depends on PPC_POWERNV
> +       default y
> +       help
> +        This adds support for CPU frequency switching on IBM POWERNV
> +        platform
> diff --git a/drivers/cpufreq/Makefile b/drivers/cpufreq/Makefile
> index 7494565..0dbb963 100644
> --- a/drivers/cpufreq/Makefile
> +++ b/drivers/cpufreq/Makefile
> @@ -86,6 +86,7 @@ obj-$(CONFIG_PPC_CORENET_CPUFREQ)   += ppc-corenet-cpufreq.o
>  obj-$(CONFIG_CPU_FREQ_PMAC)            += pmac32-cpufreq.o
>  obj-$(CONFIG_CPU_FREQ_PMAC64)          += pmac64-cpufreq.o
>  obj-$(CONFIG_PPC_PASEMI_CPUFREQ)       += pasemi-cpufreq.o
> +obj-$(CONFIG_POWERNV_CPUFREQ)          += powernv-cpufreq.o
>
>  ##################################################################################
>  # Other platform drivers
> diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
> new file mode 100644
> index 0000000..b35865b
> --- /dev/null
> +++ b/drivers/cpufreq/powernv-cpufreq.c
> @@ -0,0 +1,372 @@
> +/*
> + * POWERNV cpufreq driver for the IBM POWER processors
> + *
> + * (C) Copyright IBM 2014
> + *
> + * Author: Vaidyanathan Srinivasan <svaidy at linux.vnet.ibm.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2, or (at your option)
> + * any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + */
> +
> +#define pr_fmt(fmt)    "powernv-cpufreq: " fmt
> +
> +#include <linux/module.h>
> +#include <linux/cpufreq.h>
> +#include <linux/smp.h>
> +#include <linux/of.h>
> +#include <asm/cputhreads.h>

I thought I gave a comment on missing headers here?

> +/* Per-Core locking for frequency transitions */
> +static DEFINE_PER_CPU(struct mutex, freq_switch_lock);
> +
> +#define lock_core_freq(cpu)                            \
> +                       mutex_lock(&per_cpu(freq_switch_lock,\
> +                               cpu_first_thread_sibling(cpu)));
> +#define unlock_core_freq(cpu)                          \
> +                       mutex_unlock(&per_cpu(freq_switch_lock,\
> +                               cpu_first_thread_sibling(cpu)));
> +
> +#define POWERNV_MAX_PSTATES    256
> +
> +static struct cpufreq_frequency_table powernv_freqs[POWERNV_MAX_PSTATES+1];
> +static int powernv_pstate_ids[POWERNV_MAX_PSTATES+1];

I though we don't need this anymore? Either Rafael will take my patch as
is for the BOOST fixup or we will end up creating .isboost field in
struct cpufreq_frequency_table and so you could have used .driver_data here.

> +struct powernv_pstate_info {
> +       int pstate_min_id;
> +       int pstate_max_id;
> +       int pstate_nominal_id;
> +       int nr_pstates;
> +};
> +static struct powernv_pstate_info powernv_pstate_info;

Maybe write it as this (if you like :), as this is the only instance
of this struct):

Also, because 'powernv_pstate_info' is a local variable we can get rid of
powerenv_ from its name and name it just pstate_info. That will make
code shorter at many places and you may not be required to break
lines into two at some places. If you wish :)

+static struct powernv_pstate_info {
+       int pstate_min_id;
+       int pstate_max_id;
+       int pstate_nominal_id;
+       int nr_pstates;
+} powernv_pstate_info;

> +/*
> + * Initialize the freq table based on data obtained
> + * from the firmware passed via device-tree
> + */
> +static int init_powernv_pstates(void)
> +{
> +       struct device_node *power_mgt;
> +       int nr_pstates = 0;
> +       int pstate_min, pstate_max, pstate_nominal;
> +       const __be32 *pstate_ids, *pstate_freqs;
> +       int i;

Can merge all the int definitions into a single line.

> +       u32 len_ids, len_freqs;
> +
> +       power_mgt = of_find_node_by_path("/ibm,opal/power-mgt");
> +       if (!power_mgt) {
> +               pr_warn("power-mgt node not found\n");
> +               return -ENODEV;
> +       }
> +
> +       if (of_property_read_u32(power_mgt, "ibm,pstate-min", &pstate_min)) {
> +               pr_warn("ibm,pstate-min node not found\n");
> +               return -ENODEV;
> +       }
> +
> +       if (of_property_read_u32(power_mgt, "ibm,pstate-max", &pstate_max)) {
> +               pr_warn("ibm,pstate-max node not found\n");
> +               return -ENODEV;
> +       }

Why do you need to get these from DT? And not find that yourself here instead?

> +       if (of_property_read_u32(power_mgt, "ibm,pstate-nominal",
> +                                &pstate_nominal)) {
> +               pr_warn("ibm,pstate-nominal not found\n");
> +               return -ENODEV;
> +       }
> +       pr_info("cpufreq pstate min %d nominal %d max %d\n", pstate_min,
> +               pstate_nominal, pstate_max);
> +
> +       pstate_ids = of_get_property(power_mgt, "ibm,pstate-ids", &len_ids);
> +       if (!pstate_ids) {
> +               pr_warn("ibm,pstate-ids not found\n");
> +               return -ENODEV;
> +       }
> +
> +       pstate_freqs = of_get_property(power_mgt, "ibm,pstate-frequencies-mhz",
> +                                     &len_freqs);
> +       if (!pstate_freqs) {
> +               pr_warn("ibm,pstate-frequencies-mhz not found\n");
> +               return -ENODEV;
> +       }
> +
> +       WARN_ON(len_ids != len_freqs);
> +       nr_pstates = min(len_ids, len_freqs) / sizeof(u32);
> +       if (!nr_pstates) {
> +               WARN_ON(1);
> +               return -ENODEV;
> +       }

Maybe like this:

+       if (WARN_ON(!nr_pstates))
+               return -ENODEV;

> +       pr_debug("NR PStates %d\n", nr_pstates);
> +       for (i = 0; i < nr_pstates; i++) {
> +               u32 id = be32_to_cpu(pstate_ids[i]);
> +               u32 freq = be32_to_cpu(pstate_freqs[i]);
> +
> +               pr_debug("PState id %d freq %d MHz\n", id, freq);
> +               powernv_freqs[i].driver_data = i;

Looks like more than one comments aren't addressed :(
You can use this field for your id. And even if you couldn't have
done that, you don't need to initialize this field at all..

> +               powernv_freqs[i].frequency = freq * 1000; /* kHz */
> +               powernv_pstate_ids[i] = id;
> +       }
> +       /* End of list marker entry */
> +       powernv_freqs[i].frequency = CPUFREQ_TABLE_END;
> +
> +       powernv_pstate_info.pstate_min_id = pstate_min;
> +       powernv_pstate_info.pstate_max_id = pstate_max;
> +       powernv_pstate_info.pstate_nominal_id = pstate_nominal;
> +       powernv_pstate_info.nr_pstates = nr_pstates;
> +
> +       return 0;
> +}
> +
> +/*
> + * Returns the cpu frequency corresponding to the pstate_id.
> + */

Maybe:

+/* Returns the cpu frequency corresponding to the pstate_id. */

> +static unsigned int pstate_id_to_freq(int pstate_id)
> +{
> +       int i;
> +
> +       i = powernv_pstate_info.pstate_max_id - pstate_id;

It looks like these ids would always be contiguous? In that case
it would be better if you can mention this property at the top of this
file in some comment. So, that new people can understand things
quickly.

> +       BUG_ON(i >= powernv_pstate_info.nr_pstates || i < 0);
> +       WARN_ON(powernv_pstate_ids[i] != pstate_id);

Do you really want it? We have already confirmed that 'i' is
within limits.

> +       return powernv_freqs[i].frequency;
> +}
> +
> +/*
> + * cpuinfo_nominal_freq_show - Show the nominal CPU frequency as indicated by
> + * the firmware
> + */
> +static ssize_t cpuinfo_nominal_freq_show(struct cpufreq_policy *policy,
> +                                       char *buf)
> +{
> +       return sprintf(buf, "%u\n",
> +               pstate_id_to_freq(powernv_pstate_info.pstate_nominal_id));
> +}
> +
> +struct freq_attr cpufreq_freq_attr_cpuinfo_nominal_freq =
> +       __ATTR_RO(cpuinfo_nominal_freq);
> +
> +static struct freq_attr *powernv_cpu_freq_attr[] = {
> +       &cpufreq_freq_attr_scaling_available_freqs,
> +       &cpufreq_freq_attr_cpuinfo_nominal_freq,
> +       NULL,
> +};
> +
> +/* Helper routines */
> +
> +/* Access helpers to power mgt SPR */
> +
> +static inline unsigned long get_pmspr(unsigned long sprn)
> +{
> +       switch (sprn) {
> +       case SPRN_PMCR:
> +               return mfspr(SPRN_PMCR);
> +
> +       case SPRN_PMICR:
> +               return mfspr(SPRN_PMICR);
> +
> +       case SPRN_PMSR:
> +               return mfspr(SPRN_PMSR);
> +       }
> +       BUG();
> +}
> +
> +static inline void set_pmspr(unsigned long sprn, unsigned long val)
> +{
> +       switch (sprn) {
> +       case SPRN_PMCR:
> +               mtspr(SPRN_PMCR, val);
> +               return;
> +
> +       case SPRN_PMICR:
> +               mtspr(SPRN_PMICR, val);
> +               return;
> +
> +       case SPRN_PMSR:
> +               mtspr(SPRN_PMSR, val);
> +               return;
> +       }
> +       BUG();
> +}
> +
> +/*
> + * Use objects of this type to query/update
> + * pstates on a remote cpu via smp_call_function.
> + */
> +struct powernv_smp_call_data {
> +       unsigned int freq;
> +       int pstate_id;
> +};
> +
> +/*
> + * powernv_read_cpu_freq: Reads the current frequency on this cpu.
> + *
> + * Called via smp_call_function.
> + *
> + * Note: The caller of the smp_call_function should pass an argument of
> + * the type 'struct powernv_smp_call_data *' along with this function.
> + *
> + * The current frequency on this cpu will be returned via
> + * ((struct powernv_smp_call_data *)arg)->freq;
> + */
> +static void powernv_read_cpu_freq(void *arg)
> +{
> +       unsigned long pmspr_val;
> +       s8 local_pstate_id;
> +        struct powernv_smp_call_data *freq_data;
> +
> +       freq_data = (struct powernv_smp_call_data *)arg;

don't need casting here ?

> +
> +       pmspr_val = get_pmspr(SPRN_PMSR);
> +
> +       /*
> +         * The local pstate id corresponds bits 48..55 in the PMSR.
> +         * Note: Watch out for the sign!
> +         */
> +       local_pstate_id = (pmspr_val >> 48) & 0xFF;
> +       freq_data->pstate_id = local_pstate_id;
> +       freq_data->freq = pstate_id_to_freq(freq_data->pstate_id);
> +
> +       pr_debug("cpu %d pmsr %lx pstate_id %d frequency %d kHz \n",
> +               smp_processor_id(), pmspr_val, freq_data->pstate_id,

s/smp_processor_id/raw_smp_processor_id ?

> +               freq_data->freq);
> +}
> +
> +/*
> + * powernv_cpufreq_get: Returns the cpu frequency as reported by the
> + * firmware for 'cpu'. This value is reported through the sysfs file
> + * cpuinfo_cur_freq.
> + */
> +unsigned int powernv_cpufreq_get(unsigned int cpu)
> +{
> +       struct powernv_smp_call_data freq_data;
> +
> +       smp_call_function_any(cpu_sibling_mask(cpu), powernv_read_cpu_freq,
> +                       &freq_data, 1);
> +
> +       return freq_data.freq;
> +}
> +
> +/*
> + * set_pstate: Sets the frequency on this cpu.
> + *
> + * This is called via an smp_call_function.
> + *
> + * The caller must ensure that freq_data is of the type
> + * (struct powernv_smp_call_data *) and the pstate_id which needs to be set
> + * on this cpu should be present in freq_data->pstate_id.
> + */
> +static void set_pstate(void *freq_data)
> +{
> +       unsigned long val;
> +       unsigned long pstate_ul =
> +               ((struct powernv_smp_call_data *) freq_data)->pstate_id;
> +
> +       val = get_pmspr(SPRN_PMCR);
> +       val = val & 0x0000ffffffffffffULL;
> +
> +       pstate_ul = pstate_ul & 0xFF;
> +
> +       /* Set both global(bits 56..63) and local(bits 48..55) PStates */
> +       val = val | (pstate_ul << 56) | (pstate_ul << 48);
> +
> +       pr_debug("Setting cpu %d pmcr to %016lX\n", smp_processor_id(), val);

s/smp_processor_id/raw_smp_processor_id ? At other places as well.

> +       set_pmspr(SPRN_PMCR, val);
> +}
> +
> +/*
> + * powernv_set_freq: Sets the frequency corresponding to the cpufreq
> + * table entry indexed by new_index on the cpus in the mask 'cpus'

Rafael doesn't like CPUs to be written as cpus.. I got this comment long
back :) (Applicable only for comments and logs)

> + */
> +static int powernv_set_freq(cpumask_var_t cpus, unsigned int new_index)

Why do you want to keep this routine separately? Why not have a single routine
powernv_cpufreq_target_index() ?

> +{
> +       struct powernv_smp_call_data freq_data;
> +
> +       freq_data.pstate_id = powernv_pstate_ids[new_index];
> +
> +       /*
> +        * Use smp_call_function to send IPI and execute the
> +        * mtspr on target cpu.  We could do that without IPI
> +        * if current CPU is within policy->cpus (core)
> +        */
> +       smp_call_function_any(cpus, set_pstate, &freq_data, 1);

Not sure how smp_call_function_any() behaves but wouldn't it be
a good optimization if you can check if raw_smp_processor_id()
returns one of the CPUs from 'cpus'? And in that case don't
shoot an IPI.

Same for the get part as well.

But then I looked at the implementation of these routines and found
they already have this optimization in :) .. But with overhead of few
locks and disable_preempt() :(

> +       return 0;
> +}
> +
> +static int powernv_cpufreq_cpu_init(struct cpufreq_policy *policy)
> +{
> +       int base, i;
> +
> +       base = cpu_first_thread_sibling(policy->cpu);
> +
> +       for (i = 0; i < threads_per_core; i++)
> +               cpumask_set_cpu(base + i, policy->cpus);
> +
> +       policy->cpuinfo.transition_latency = 25000;
> +       policy->cur = powernv_freqs[0].frequency;

How can you be so sure? Also clock is doing this just after calling init()
and so you can just remove it :)

> +       return cpufreq_table_validate_and_show(policy, powernv_freqs);
> +}
> +
> +static int powernv_cpufreq_verify(struct cpufreq_policy *policy)
> +{
> +       return cpufreq_generic_frequency_table_verify(policy);
> +}
> +
> +static int powernv_cpufreq_target_index(struct cpufreq_policy *policy,
> +                                       unsigned int new_index)
> +{
> +       int rc;
> +
> +       lock_core_freq(policy->cpu);
> +       rc = powernv_set_freq(policy->cpus, new_index);
> +       unlock_core_freq(policy->cpu);
> +
> +       return rc;
> +}
> +
> +static struct cpufreq_driver powernv_cpufreq_driver = {
> +       .name           = "powernv-cpufreq",
> +       .flags          = CPUFREQ_CONST_LOOPS,
> +       .init           = powernv_cpufreq_cpu_init,
> +       .verify         = powernv_cpufreq_verify,

Can do this instead:
+       .verify         = cpufreq_generic_frequency_table_verify,

> +       .target_index   = powernv_cpufreq_target_index,
> +       .get            = powernv_cpufreq_get,
> +       .attr           = powernv_cpu_freq_attr,
> +};
> +
> +static int __init powernv_cpufreq_init(void)
> +{
> +       int cpu, rc = 0;
> +
> +       /* Discover pstates from device tree and init */
> +       rc = init_powernv_pstates();
> +       if (rc) {
> +               pr_info("powernv-cpufreq disabled\n");
> +               return rc;
> +       }
> +
> +       /* Init per-core mutex */
> +       for_each_possible_cpu(cpu)
> +               mutex_init(&per_cpu(freq_switch_lock, cpu));
> +
> +       return cpufreq_register_driver(&powernv_cpufreq_driver);
> +}
> +module_init(powernv_cpufreq_init);
> +
> +static void __exit powernv_cpufreq_exit(void)
> +{
> +       cpufreq_unregister_driver(&powernv_cpufreq_driver);
> +}
> +module_exit(powernv_cpufreq_exit);
> +
> +MODULE_LICENSE("GPL");
> +MODULE_AUTHOR("Vaidyanathan Srinivasan <svaidy at linux.vnet.ibm.com>");

^ permalink raw reply

* Re: [PATCH v4] powernv: Dynamic Frequency Scaling Enablement
From: Viresh Kumar @ 2014-03-27  6:42 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Gautham R. Shenoy, Linux PM list, linuxppc-dev@ozlabs.org,
	Anton Blanchard, Srivatsa S. Bhat
In-Reply-To: <1395901724.5569.83.camel@pasglop>

On 27 March 2014 11:58, Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> Any other comment ? :-)

Just gave on the patch..

> Because it depends on some other stuff in -next, I think it's best if
> this gets merged via Rafael's tree, unless someone can point me at
> the necessary pre-requisites in the form of a topic branch I can
> pull in powerpc along with this.

If you don't have any issues, let it go through Rafael's tree.

^ permalink raw reply

* Re: [PATCH v4] powernv, cpufreq: cpufreq driver for powernv platform
From: Gautham R Shenoy @ 2014-03-27  9:30 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Gautham R. Shenoy, Linux PM list, Rafael J. Wysocki,
	linuxppc-dev@ozlabs.org, Anton Blanchard, Srivatsa S. Bhat
In-Reply-To: <CAKohponPb+b=8Pa=te3CkiN6QPAUAukngqLSdG4vxc9pUkD8OQ@mail.gmail.com>

On Thu, Mar 27, 2014 at 12:09:53PM +0530, Viresh Kumar wrote:
> Cc'ing Rafael.
>

Thanks. It was a lapse on my part by not Cc'ing Rafael in the first
place.
 
> On 26 March 2014 22:25, Gautham R. Shenoy <ego@linux.vnet.ibm.com> wrote:
> > From: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
> >
> > Backend driver to dynamically set voltage and frequency on
> > IBM POWER non-virtualized platforms.  Power management SPRs
> > are used to set the required PState.
> >
> > This driver works in conjunction with cpufreq governors
> > like 'ondemand' to provide a demand based frequency and
> > voltage setting on IBM POWER non-virtualized platforms.
> >
> > PState table is obtained from OPAL v3 firmware through device
> > tree.
> >
> > powernv_cpufreq back-end driver would parse the relevant device-tree
> > nodes and initialise the cpufreq subsystem on powernv platform.
> >
> > The code was originally written by svaidy@linux.vnet.ibm.com. Over
> > time it was modified to accomodate bug-fixes as well as updates to the
> > the cpu-freq core. Relevant portions of the change logs corresponding
> > to those modifications are noted below:
> >
> >  * The policy->cpus needs to be populated in a hotplug-invariant
> >    manner instead of using cpu_sibling_mask() which varies with
> >    cpu-hotplug. This is because the cpufreq core code copies this
> >    content into policy->related_cpus mask which is should not vary on
> 
> s/is /

Good catch! Will fix this

[...]
> 
> >  * On POWER systems, the CPU frequency is controlled at a core-level
> >    and hence we need to serialize so that only one of the threads in
> >    the core switches the core's frequency at a time. Introduce
> >    per-core locking to enable finer-grained synchronization and
> >    thereby enhance the speed and responsiveness of the cpufreq driver
> >    to varying workload demands.
> >
> >      The design of per-core locking is very simple and
> >    straight-forward: we first define a Per-CPU lock and use the ones
> >    that belongs to the first thread sibling of the core.
> >
> >      cpu_first_thread_sibling() macro is used to find the *common*
> >    lock for all thread siblings belonging to a core. [Authored by
> >    srivatsa.bhat@linux.vnet.ibm.com]
> 
> We don't need that after serialization patch of core is accepted. And it
> should be accepted soon, in one form or other.

As of now, I prefer this patch be based on code that is in the -next
tree. I'll get rid of the per-core locking once the serialization
patch of the core is accepted.

[...]
> > --- a/arch/powerpc/configs/pseries_defconfig
> > +++ b/arch/powerpc/configs/pseries_defconfig
> > @@ -353,3 +353,4 @@ CONFIG_CRYPTO_DEV_NX_ENCRYPT=m
> >  CONFIG_VIRTUALIZATION=y
> >  CONFIG_KVM_BOOK3S_64=m
> >  CONFIG_KVM_BOOK3S_64_HV=y
> > +CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND=y
> > diff --git a/arch/powerpc/configs/pseries_le_defconfig b/arch/powerpc/configs/pseries_le_defconfig
> > index 62771e0..47e6161 100644
> > --- a/arch/powerpc/configs/pseries_le_defconfig
> > +++ b/arch/powerpc/configs/pseries_le_defconfig
> > @@ -350,3 +350,4 @@ CONFIG_CRYPTO_LZO=m
> >  # CONFIG_CRYPTO_ANSI_CPRNG is not set
> >  CONFIG_CRYPTO_DEV_NX=y
> >  CONFIG_CRYPTO_DEV_NX_ENCRYPT=m
> > +CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND=y
> 
> don't know how Rafael want this, but probably this part could have gone
> through ppc tree..
> 

That would mean multiple patches :-)

> > +
> > +#define pr_fmt(fmt)    "powernv-cpufreq: " fmt
> > +
> > +#include <linux/module.h>
> > +#include <linux/cpufreq.h>
> > +#include <linux/smp.h>
> > +#include <linux/of.h>
> > +#include <asm/cputhreads.h>
> 
> I thought I gave a comment on missing headers here?

Ok, so these seem to be the missing ones.

#include <linux/kernel.h>
#include <linux/percpu-defs.h>
#include <linux/mutex.h>
#include <linux/sysfs.h>
#include <linux/cpumask.h>

#include <asm/reg.h>

Have I missed anything else ?


> > +static struct cpufreq_frequency_table powernv_freqs[POWERNV_MAX_PSTATES+1];
> > +static int powernv_pstate_ids[POWERNV_MAX_PSTATES+1];
> 
> I though we don't need this anymore? Either Rafael will take my patch as
> is for the BOOST fixup or we will end up creating .isboost field in
> struct cpufreq_frequency_table and so you could have used .driver_data here.
> 

I mentioned in my reply to your BOOST fixup patch that we would like
pstate_ids to be of the type "signed int", while the .driver_data is
an "unsigned int". If we end up using .driver_data, we would have to
be careful and not use it directly but reassign it to a signed int (or
typecast it) before using it.

Not just that, the pr_debugs in the core which are printed during
frequency transitions will then end up printing large positive values
(since it will interpret negative pstate_ids as unsigned int) in the
place of pstate_ids, which would not be particularly useful.

> > +struct powernv_pstate_info {
> > +       int pstate_min_id;
> > +       int pstate_max_id;
> > +       int pstate_nominal_id;
> > +       int nr_pstates;
> > +};
> > +static struct powernv_pstate_info powernv_pstate_info;
> 
> Maybe write it as this (if you like :), as this is the only instance
> of this struct):
> 
> Also, because 'powernv_pstate_info' is a local variable we can get rid of
> powerenv_ from its name and name it just pstate_info. That will make
> code shorter at many places and you may not be required to break
> lines into two at some places. If you wish :)
> 
Ok fair enough!

> +static struct powernv_pstate_info {
> +       int pstate_min_id;
> +       int pstate_max_id;
> +       int pstate_nominal_id;
> +       int nr_pstates;
> +} powernv_pstate_info;
> 
> > +/*
> > + * Initialize the freq table based on data obtained
> > + * from the firmware passed via device-tree
> > + */
> > +static int init_powernv_pstates(void)
> > +{
> > +       struct device_node *power_mgt;
> > +       int nr_pstates = 0;
> > +       int pstate_min, pstate_max, pstate_nominal;
> > +       const __be32 *pstate_ids, *pstate_freqs;
> > +       int i;
> 
> Can merge all the int definitions into a single line.

Too many ints to be incorporated in a single line. But will see if I
can beautify it :-) 

> > +       u32 len_ids, len_freqs;
> > +
> > +       power_mgt = of_find_node_by_path("/ibm,opal/power-mgt");
> > +       if (!power_mgt) {
> > +               pr_warn("power-mgt node not found\n");
> > +               return -ENODEV;
> > +       }
> > +
> > +       if (of_property_read_u32(power_mgt, "ibm,pstate-min", &pstate_min)) {
> > +               pr_warn("ibm,pstate-min node not found\n");
> > +               return -ENODEV;
> > +       }
> > +
> > +       if (of_property_read_u32(power_mgt, "ibm,pstate-max", &pstate_max)) {
> > +               pr_warn("ibm,pstate-max node not found\n");
> > +               return -ENODEV;
> > +       }
> 
> Why do you need to get these from DT? And not find that yourself here instead?
>

Well, I think we can obtain a more accurate value from the DT which
would have already computed it. But I'll let vaidy answer this.

> > +       if (of_property_read_u32(power_mgt, "ibm,pstate-nominal",
> > +                                &pstate_nominal)) {
> > +               pr_warn("ibm,pstate-nominal not found\n");
> > +               return -ENODEV;
> > +       }
> > +       pr_info("cpufreq pstate min %d nominal %d max %d\n", pstate_min,
> > +               pstate_nominal, pstate_max);
> > +
> > +       pstate_ids = of_get_property(power_mgt, "ibm,pstate-ids", &len_ids);
> > +       if (!pstate_ids) {
> > +               pr_warn("ibm,pstate-ids not found\n");
> > +               return -ENODEV;
> > +       }
> > +
> > +       pstate_freqs = of_get_property(power_mgt, "ibm,pstate-frequencies-mhz",
> > +                                     &len_freqs);
> > +       if (!pstate_freqs) {
> > +               pr_warn("ibm,pstate-frequencies-mhz not found\n");
> > +               return -ENODEV;
> > +       }
> > +
> > +       WARN_ON(len_ids != len_freqs);
> > +       nr_pstates = min(len_ids, len_freqs) / sizeof(u32);
> > +       if (!nr_pstates) {
> > +               WARN_ON(1);
> > +               return -ENODEV;
> > +       }
> 
> Maybe like this:
> 
> +       if (WARN_ON(!nr_pstates))
> +               return -ENODEV;
> 

Thanks. This looks better.

> > +       pr_debug("NR PStates %d\n", nr_pstates);
> > +       for (i = 0; i < nr_pstates; i++) {
> > +               u32 id = be32_to_cpu(pstate_ids[i]);
> > +               u32 freq = be32_to_cpu(pstate_freqs[i]);
> > +
> > +               pr_debug("PState id %d freq %d MHz\n", id, freq);
> > +               powernv_freqs[i].driver_data = i;
> 
> Looks like more than one comments aren't addressed :(
> You can use this field for your id. And even if you couldn't have
> done that, you don't need to initialize this field at all..


Ok, since we are better off not using it, we shouldn't be initializing
it.

> 
> > +               powernv_freqs[i].frequency = freq * 1000; /* kHz */
> > +               powernv_pstate_ids[i] = id;
> > +       }
> > +       /* End of list marker entry */
> > +       powernv_freqs[i].frequency = CPUFREQ_TABLE_END;
> > +
> > +       powernv_pstate_info.pstate_min_id = pstate_min;
> > +       powernv_pstate_info.pstate_max_id = pstate_max;
> > +       powernv_pstate_info.pstate_nominal_id = pstate_nominal;
> > +       powernv_pstate_info.nr_pstates = nr_pstates;
> > +
> > +       return 0;
> > +}
> > +
> > +/*
> > + * Returns the cpu frequency corresponding to the pstate_id.
> > + */
> 
> Maybe:
> 
> +/* Returns the cpu frequency corresponding to the pstate_id. */
> 

Right, single line comment! Will fix this.

> > +static unsigned int pstate_id_to_freq(int pstate_id)
> > +{
> > +       int i;
> > +
> > +       i = powernv_pstate_info.pstate_max_id - pstate_id;
> 
> It looks like these ids would always be contiguous? In that case
> it would be better if you can mention this property at the top of this
> file in some comment. So, that new people can understand things
> quickly.
> 

Ok.

> > +       BUG_ON(i >= powernv_pstate_info.nr_pstates || i < 0);
> > +       WARN_ON(powernv_pstate_ids[i] != pstate_id);
> 
> Do you really want it? We have already confirmed that 'i' is
> within limits.

"i" might be within limits. But I want to make sure that the pstate_id
corresponding to "i" is the same as the pstate_id that we
requested. Not that anything would have changed since the
initialization, but I get paranoid about these things from time to
time :-)

> 
> > +       return powernv_freqs[i].frequency;
> > +}
> > +

[...]

> > +
> > +/*
> > + * powernv_read_cpu_freq: Reads the current frequency on this cpu.
> > + *
> > + * Called via smp_call_function.
> > + *
> > + * Note: The caller of the smp_call_function should pass an argument of
> > + * the type 'struct powernv_smp_call_data *' along with this function.
> > + *
> > + * The current frequency on this cpu will be returned via
> > + * ((struct powernv_smp_call_data *)arg)->freq;
> > + */
> > +static void powernv_read_cpu_freq(void *arg)
> > +{
> > +       unsigned long pmspr_val;
> > +       s8 local_pstate_id;
> > +        struct powernv_smp_call_data *freq_data;
> > +
> > +       freq_data = (struct powernv_smp_call_data *)arg;
> 
> don't need casting here ?

Doesn't harm having it there. Just like the #includes :-)

> 
> > +
> > +       pmspr_val = get_pmspr(SPRN_PMSR);
> > +
> > +       /*
> > +         * The local pstate id corresponds bits 48..55 in the PMSR.
> > +         * Note: Watch out for the sign!
> > +         */
> > +       local_pstate_id = (pmspr_val >> 48) & 0xFF;
> > +       freq_data->pstate_id = local_pstate_id;
> > +       freq_data->freq = pstate_id_to_freq(freq_data->pstate_id);
> > +
> > +       pr_debug("cpu %d pmsr %lx pstate_id %d frequency %d kHz \n",
> > +               smp_processor_id(), pmspr_val, freq_data->pstate_id,
> 
> s/smp_processor_id/raw_smp_processor_id ?

No. This function is called via smp_call_function(). So we have
preempt_disable on and it is safe to use smp_processor_id.

> 
> > +               freq_data->freq);
> > +}

[...]
> > +/*
> > + * set_pstate: Sets the frequency on this cpu.
> > + *
> > + * This is called via an smp_call_function.
> > + *
> > + * The caller must ensure that freq_data is of the type
> > + * (struct powernv_smp_call_data *) and the pstate_id which needs to be set
> > + * on this cpu should be present in freq_data->pstate_id.
> > + */
> > +static void set_pstate(void *freq_data)
> > +{
> > +       unsigned long val;
> > +       unsigned long pstate_ul =
> > +               ((struct powernv_smp_call_data *) freq_data)->pstate_id;
> > +
> > +       val = get_pmspr(SPRN_PMCR);
> > +       val = val & 0x0000ffffffffffffULL;
> > +
> > +       pstate_ul = pstate_ul & 0xFF;
> > +
> > +       /* Set both global(bits 56..63) and local(bits 48..55) PStates */
> > +       val = val | (pstate_ul << 56) | (pstate_ul << 48);
> > +
> > +       pr_debug("Setting cpu %d pmcr to %016lX\n", smp_processor_id(), val);
> 
> s/smp_processor_id/raw_smp_processor_id ? At other places as well.

Even this function is called via smp_call_function(). So, we should
have preempt disabled.

> 
> > +       set_pmspr(SPRN_PMCR, val);
> > +}
> > +
> > +/*
> > + * powernv_set_freq: Sets the frequency corresponding to the cpufreq
> > + * table entry indexed by new_index on the cpus in the mask 'cpus'
> 
> Rafael doesn't like CPUs to be written as cpus.. I got this comment long
> back :) (Applicable only for comments and logs)

Ah, ok. Will fix this.

> 
> > + */
> > +static int powernv_set_freq(cpumask_var_t cpus, unsigned int new_index)
> 
> Why do you want to keep this routine separately? Why not have a single routine
> powernv_cpufreq_target_index() ?
> 
> > +{
> > +       struct powernv_smp_call_data freq_data;
> > +
> > +       freq_data.pstate_id = powernv_pstate_ids[new_index];
> > +
> > +       /*
> > +        * Use smp_call_function to send IPI and execute the
> > +        * mtspr on target cpu.  We could do that without IPI
> > +        * if current CPU is within policy->cpus (core)
> > +        */
> > +       smp_call_function_any(cpus, set_pstate, &freq_data, 1);
> 
> Not sure how smp_call_function_any() behaves but wouldn't it be
> a good optimization if you can check if raw_smp_processor_id()
> returns one of the CPUs from 'cpus'? And in that case don't
> shoot an IPI.

That's what smp_call_function_any() does.

> 
> Same for the get part as well.
> 
> But then I looked at the implementation of these routines and found
> they already have this optimization in :) .. But with overhead of few
> locks and disable_preempt() :(
> 

Well, locks are taken when we are not running on this_cpu() so that we
can issue an IPI (or use some other optimized communication mechanism)
for executing set_pstate on one of the cpus in cpus. So this overhead
should be acceptable.

On the other hand if we are running on this_cpu(), we simply go ahead
and execute call which is what you're suggesting we do.

> > +}
> > +
> > +static int powernv_cpufreq_cpu_init(struct cpufreq_policy *policy)
> > +{
> > +       int base, i;
> > +
> > +       base = cpu_first_thread_sibling(policy->cpu);
> > +
> > +       for (i = 0; i < threads_per_core; i++)
> > +               cpumask_set_cpu(base + i, policy->cpus);
> > +
> > +       policy->cpuinfo.transition_latency = 25000;
> > +       policy->cur = powernv_freqs[0].frequency;
> 
> How can you be so sure? Also clock is doing this just after calling init()
> and so you can just remove it :)

You mean s/clock/core ? 

You're right, the core sets policy->cur by invoking driver->get()
which in our case will read it from the PMSR and initialize it with
the correct value. So these lines can be removed.

> 
> > +       return cpufreq_table_validate_and_show(policy, powernv_freqs);
> > +}
> > +
> > +static int powernv_cpufreq_verify(struct cpufreq_policy *policy)
> > +{
> > +       return cpufreq_generic_frequency_table_verify(policy);
> > +}
> > +
> > +static int powernv_cpufreq_target_index(struct cpufreq_policy *policy,
> > +                                       unsigned int new_index)
> > +{
> > +       int rc;
> > +
> > +       lock_core_freq(policy->cpu);
> > +       rc = powernv_set_freq(policy->cpus, new_index);
> > +       unlock_core_freq(policy->cpu);
> > +
> > +       return rc;
> > +}
> > +
> > +static struct cpufreq_driver powernv_cpufreq_driver = {
> > +       .name           = "powernv-cpufreq",
> > +       .flags          = CPUFREQ_CONST_LOOPS,
> > +       .init           = powernv_cpufreq_cpu_init,
> > +       .verify         = powernv_cpufreq_verify,
> 
> Can do this instead:
> +       .verify         = cpufreq_generic_frequency_table_verify,

Ok. 

> 

Thanks for a detailed review again.

I'll send out the updated patch today incorporating the comments. It
shouldn't be hard since most of the comments do not affect the
functionality.

--
Thanks and Regards
gautham.

^ permalink raw reply

* RE: [PATCH] ASoC: fsl_sai: Add isr to deal with error flag
From: David Laight @ 2014-03-27  9:41 UTC (permalink / raw)
  To: 'Mark Brown'
  Cc: alsa-devel@alsa-project.org, Li.Xiubo@freescale.com,
	linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org,
	'Nicolin Chen'
In-Reply-To: <20140327011424.GB30768@sirena.org.uk>

From: Mark Brown
> On Wed, Mar 26, 2014 at 11:59:53AM +0000, David Laight wrote:
> > From: Nicolin Chen
>=20
> > > +	regmap_read(sai->regmap, FSL_SAI_TCSR, &xcsr);
> > > +	regmap_write(sai->regmap, FSL_SAI_TCSR, xcsr);
>=20
> > Assuming these are 'write to clear' bits, you might want
> > to make the write (above) and all the traces (below)
> > conditional on the value being non-zero.
>=20
> The trace is already conditional?  I'd also expect to see the driver
> only acknowledging sources it knows about and only reporting that the
> interrupt was handled if it saw one of them - right now all interrupts
> are unconditionally acknowleged.

The traces are separately conditional on their own bits.
That is a lot of checks that will normally be false.

Also the driver may need to clear all the active interrupt
bits in order to make the IRQ go away.
It should trace that bits it doesn't expect to be set though.

	David

^ permalink raw reply

* Re: [PATCH v4] powernv, cpufreq: cpufreq driver for powernv platform
From: Viresh Kumar @ 2014-03-27  9:59 UTC (permalink / raw)
  To: ego@linux.vnet.ibm.com
  Cc: Linux PM list, Rafael J. Wysocki, linuxppc-dev@ozlabs.org,
	Anton Blanchard, Srivatsa S. Bhat
In-Reply-To: <20140327093050.GA27777@in.ibm.com>

On 27 March 2014 15:00, Gautham R Shenoy <ego@linux.vnet.ibm.com> wrote:
> As of now, I prefer this patch be based on code that is in the -next
> tree. I'll get rid of the per-core locking once the serialization
> patch of the core is accepted.

Okay.. Then divide this patch into two parts, second one doing all
the serialization stuff and I will review only the first one from V5 :)

Maybe mention that in Patch2 as well, that once serialization patches
are accepted, its not required.

>> don't know how Rafael want this, but probably this part could have gone
>> through ppc tree..
>
> That would mean multiple patches :-)

I wasn't opposed to multiple patches at all, but multiple patches for
basic functionality of same driver file. Otherwise separating things
into patches is the best receipe.

>> > +#define pr_fmt(fmt)    "powernv-cpufreq: " fmt
>> > +
>> > +#include <linux/module.h>
>> > +#include <linux/cpufreq.h>
>> > +#include <linux/smp.h>
>> > +#include <linux/of.h>
>> > +#include <asm/cputhreads.h>
>>
>> I thought I gave a comment on missing headers here?
>
> Ok, so these seem to be the missing ones.
>
> #include <linux/kernel.h>
> #include <linux/percpu-defs.h>
> #include <linux/mutex.h>

This may shift to Patch 2 ..

> #include <linux/sysfs.h>
> #include <linux/cpumask.h>
>
> #include <asm/reg.h>
>
> Have I missed anything else ?

Not sure :) .. I will see if I can find anymore..

>> > +static struct cpufreq_frequency_table powernv_freqs[POWERNV_MAX_PSTATES+1];
>> > +static int powernv_pstate_ids[POWERNV_MAX_PSTATES+1];
>>
>> I though we don't need this anymore? Either Rafael will take my patch as
>> is for the BOOST fixup or we will end up creating .isboost field in
>> struct cpufreq_frequency_table and so you could have used .driver_data here.
>>
>
> I mentioned in my reply to your BOOST fixup patch that we would like
> pstate_ids to be of the type "signed int", while the .driver_data is
> an "unsigned int". If we end up using .driver_data, we would have to
> be careful and not use it directly but reassign it to a signed int (or
> typecast it) before using it.

Probably many people would want it to be unsigned it, so that they
can put some strange hex values there..

Which part of your code conflicts with this right now? I can see only
this line in powernv_set_freq()

freq_data.pstate_id = powernv_pstate_ids[new_index];

And I don't think we will have side effects here.

> Not just that, the pr_debugs in the core which are printed during
> frequency transitions will then end up printing large positive values
> (since it will interpret negative pstate_ids as unsigned int) in the
> place of pstate_ids, which would not be particularly useful.

I have checked it again, those prints shouldn't have used that field.
Will fix them.

And so you can use that field in your code and I will get core fixed
for you..

>> Why do you need to get these from DT? And not find that yourself here instead?
>
> Well, I think we can obtain a more accurate value from the DT which
> would have already computed it. But I'll let vaidy answer this.

Can we simply refer to frequency values here for finding min and max
pstates? If yes, then probably this is the better place as there can be
human errors in DT.. Also those binding are just not required then.
And obviously less headache for you as well when you are changing
pstate tables.

>> > +       freq_data = (struct powernv_smp_call_data *)arg;
>>
>> don't need casting here ?
>
> Doesn't harm having it there. Just like the #includes :-)

Do you ever have these typecasts while you do kmalloc? It also
returns void *.. That's the same..

CodingStyle: Chapter 14: Allocating memory:

Casting the return value which is a void pointer is redundant. The conversion
from void pointer to any other pointer type is guaranteed by the C programming
language.

>> > +       pr_debug("cpu %d pmsr %lx pstate_id %d frequency %d kHz \n",
>> > +               smp_processor_id(), pmspr_val, freq_data->pstate_id,
>>
>> s/smp_processor_id/raw_smp_processor_id ?
>
> No. This function is called via smp_call_function(). So we have
> preempt_disable on and it is safe to use smp_processor_id.

My question wasn't about being safe, but avoiding the complexity
of debug_smp_processor_id(). raw_smp_processor_id() can execute
very quickly.

> You mean s/clock/core ?

Oops!! yes.

> Thanks for a detailed review again.

Its my job :)

> I'll send out the updated patch today incorporating the comments. It
> shouldn't be hard since most of the comments do not affect the
> functionality.

Probably they do now ? :)

^ permalink raw reply

* Re: [PATCH v4] powernv, cpufreq: cpufreq driver for powernv platform
From: Vaidyanathan Srinivasan @ 2014-03-27 10:11 UTC (permalink / raw)
  To: Gautham R Shenoy
  Cc: Linux PM list, Viresh Kumar, Rafael J. Wysocki,
	linuxppc-dev@ozlabs.org, Anton Blanchard, Srivatsa S. Bhat
In-Reply-To: <20140327093050.GA27777@in.ibm.com>

* Gautham R Shenoy <ego@linux.vnet.ibm.com> [2014-03-27 15:00:50]:

[snip]
 
> > > +       u32 len_ids, len_freqs;
> > > +
> > > +       power_mgt = of_find_node_by_path("/ibm,opal/power-mgt");
> > > +       if (!power_mgt) {
> > > +               pr_warn("power-mgt node not found\n");
> > > +               return -ENODEV;
> > > +       }
> > > +
> > > +       if (of_property_read_u32(power_mgt, "ibm,pstate-min", &pstate_min)) {
> > > +               pr_warn("ibm,pstate-min node not found\n");
> > > +               return -ENODEV;
> > > +       }
> > > +
> > > +       if (of_property_read_u32(power_mgt, "ibm,pstate-max", &pstate_max)) {
> > > +               pr_warn("ibm,pstate-max node not found\n");
> > > +               return -ENODEV;
> > > +       }
> > 
> > Why do you need to get these from DT? And not find that yourself here instead?
> >
> 
> Well, I think we can obtain a more accurate value from the DT which
> would have already computed it. But I'll let vaidy answer this.

DT provides the values that we should use.  Ideally these are the
upper and lower limits of the PState table,  however as per the
platform spec, we should use the PState IDs between these limits.  We
would like to parse it and keep it in the driver for debug purposes.

The platform exports this info in the DT to be more explicit and not
have a strong dependency on the PState numbering or PState table
itself.  In the driver we parse it for debug/validation purpose.

--Vaidy

^ permalink raw reply

* Re: [PATCH v4] powernv, cpufreq: cpufreq driver for powernv platform
From: Srivatsa S. Bhat @ 2014-03-27 10:21 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: ego@linux.vnet.ibm.com, Linux PM list, Rafael J. Wysocki,
	linuxppc-dev@ozlabs.org, Anton Blanchard
In-Reply-To: <CAKohponOe9Au+pe5u9FpddW3+f5pegZB953+5fmYBDusE9j0HA@mail.gmail.com>

On 03/27/2014 03:29 PM, Viresh Kumar wrote:
> On 27 March 2014 15:00, Gautham R Shenoy <ego@linux.vnet.ibm.com> wrote:
>> As of now, I prefer this patch be based on code that is in the -next
>> tree. I'll get rid of the per-core locking once the serialization
>> patch of the core is accepted.
> 
[...] 
>>>> +       pr_debug("cpu %d pmsr %lx pstate_id %d frequency %d kHz \n",
>>>> +               smp_processor_id(), pmspr_val, freq_data->pstate_id,
>>>
>>> s/smp_processor_id/raw_smp_processor_id ?
>>
>> No. This function is called via smp_call_function(). So we have
>> preempt_disable on and it is safe to use smp_processor_id.
> 
> My question wasn't about being safe, but avoiding the complexity
> of debug_smp_processor_id(). raw_smp_processor_id() can execute
> very quickly.
> 

smp_processor_id() maps to debug_smp_processor_id() only if
CONFIG_DEBUG_PREEMPT is set. Otherwise, it is same as raw_smp_processor_id().
So I think its best to keep it as it is.

Regards,
Srivatsa S. Bhat

^ permalink raw reply

* Re: [PATCH v4] powernv, cpufreq: cpufreq driver for powernv platform
From: Viresh Kumar @ 2014-03-27 10:22 UTC (permalink / raw)
  To: Vaidyanathan Srinivasan
  Cc: Gautham R Shenoy, Linux PM list, Rafael J. Wysocki,
	linuxppc-dev@ozlabs.org, Anton Blanchard, Srivatsa S. Bhat
In-Reply-To: <20140327101137.GA8747@dirshya.in.ibm.com>

On 27 March 2014 15:41, Vaidyanathan Srinivasan
<svaidy@linux.vnet.ibm.com> wrote:
>> > Why do you need to get these from DT? And not find that yourself here instead?

> DT provides the values that we should use.  Ideally these are the
> upper and lower limits of the PState table,  however as per the
> platform spec, we should use the PState IDs between these limits.  We
> would like to parse it and keep it in the driver for debug purposes.
>
> The platform exports this info in the DT to be more explicit and not
> have a strong dependency on the PState numbering or PState table
> itself.  In the driver we parse it for debug/validation purpose.

I couldn't find how they are used for debugging except this:

+ pr_info("cpufreq pstate min %d nominal %d max %d\n", pstate_min,
+ pstate_nominal, pstate_max);

This is the only place where these values are used.

Anyway, you can keep it if you really want to..

^ permalink raw reply

* Re: [PATCH v4] powernv, cpufreq: cpufreq driver for powernv platform
From: Viresh Kumar @ 2014-03-27 10:23 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: ego@linux.vnet.ibm.com, Linux PM list, Rafael J. Wysocki,
	linuxppc-dev@ozlabs.org, Anton Blanchard
In-Reply-To: <5333FBA8.1040107@linux.vnet.ibm.com>

On 27 March 2014 15:51, Srivatsa S. Bhat
<srivatsa.bhat@linux.vnet.ibm.com> wrote:
> smp_processor_id() maps to debug_smp_processor_id() only if
> CONFIG_DEBUG_PREEMPT is set. Otherwise, it is same as raw_smp_processor_id().
> So I think its best to keep it as it is.

That was the case in .configs and so suggested raw_* variant. Keep it
as you want to :)

^ permalink raw reply

* MPC8641 based custom board Kernel stuck at 1000Mhz core clock
From: Ashish @ 2014-03-27 10:34 UTC (permalink / raw)
  To: kernelnewbies; +Cc: scottwood, linuxppc-dev
In-Reply-To: <CAERE9PrX61oNtQoT1tpL_+rXfirFeMa4woD2P2TSUJ_Xj-h4rg@mail.gmail.com>

Hi,

  I am using MPC8641-HPCN based custom board and able to boot linux at 
MPX clock 400Mhz and core clock 800mhz. When I am increasing core 
frequency ie MPX clock at 400Mhz and core at 1Ghz, kernel stuck. Does 
any body have faced this kind of issue ?  Or any idea how to resolve this?

Thanks & Regards

A$hi$h

^ permalink raw reply

* Re: [PATCH] ASoC: fsl_sai: Add isr to deal with error flag
From: Mark Brown @ 2014-03-27 10:54 UTC (permalink / raw)
  To: Nicolin Chen
  Cc: alsa-devel@alsa-project.org, Xiubo Li-B47053,
	linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org
In-Reply-To: <20140327035726.GA16359@MrMyself>

[-- Attachment #1: Type: text/plain, Size: 659 bytes --]

On Thu, Mar 27, 2014 at 11:57:27AM +0800, Nicolin Chen wrote:
> On Thu, Mar 27, 2014 at 12:06:53PM +0800, Xiubo Li-B47053 wrote:

> > I have checked in the Vybrid and LS1 SoC datasheets, and they are all the
> > Same as above, and nothing else.

> > Have I missed ?

> What i.MX IC team told me is SAI ignores what we do to FWF and FRF, so you
> don't need to worry about it at all unless Vybrid makes them writable, in
> which case we may also need to clear these bits and confirm with Vybrid IC
> team if they're also W1C.

And even if it payed attention I'd expect that a lot of the time they'd
just be reasserted immediately as the condition still holds.

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply

* [PATCH v2] ASoC: fsl_sai: Add isr to deal with error flag
From: Nicolin Chen @ 2014-03-27 11:06 UTC (permalink / raw)
  To: broonie, Li.Xiubo
  Cc: alsa-devel, David.Laight, linuxppc-dev, linux-kernel, timur

It's quite cricial to clear error flags because SAI might hang if getting
FIFO underrun during playback (I haven't confirmed the same issue on Rx
overflow though).

So this patch enables those irq and adds isr() to clear the flags so as to
keep playback entirely safe.

Signed-off-by: Nicolin Chen <Guangyu.Chen@freescale.com>
---

Changelog
v2:
 * Mask the active flags only for the following handler.
 * Reset FIFO for FIFO underrun/overflow cases.
 * Enable two error flags only as default.
 * Use dev_warn for two error flags.
 * Only clear those W1C bits.

 sound/soc/fsl/fsl_sai.c | 85 +++++++++++++++++++++++++++++++++++++++++++++++--
 sound/soc/fsl/fsl_sai.h | 15 +++++++++
 2 files changed, 97 insertions(+), 3 deletions(-)

diff --git a/sound/soc/fsl/fsl_sai.c b/sound/soc/fsl/fsl_sai.c
index c4a4231..0bc98bb 100644
--- a/sound/soc/fsl/fsl_sai.c
+++ b/sound/soc/fsl/fsl_sai.c
@@ -23,6 +23,71 @@
 
 #include "fsl_sai.h"
 
+#define FSL_SAI_FLAGS (FSL_SAI_CSR_SEIE |\
+		       FSL_SAI_CSR_FEIE)
+
+static irqreturn_t fsl_sai_isr(int irq, void *devid)
+{
+	struct fsl_sai *sai = (struct fsl_sai *)devid;
+	struct device *dev = &sai->pdev->dev;
+	u32 xcsr, mask;
+
+	/* Only handle those what we enabled */
+	mask = (FSL_SAI_FLAGS >> FSL_SAI_CSR_xIE_SHIFT) << FSL_SAI_CSR_xF_SHIFT;
+
+	/* Tx IRQ */
+	regmap_read(sai->regmap, FSL_SAI_TCSR, &xcsr);
+	xcsr &= mask;
+
+	if (xcsr & FSL_SAI_CSR_WSF)
+		dev_dbg(dev, "isr: Start of Tx word detected\n");
+
+	if (xcsr & FSL_SAI_CSR_SEF)
+		dev_warn(dev, "isr: Tx Frame sync error detected\n");
+
+	if (xcsr & FSL_SAI_CSR_FEF) {
+		dev_warn(dev, "isr: Transmit underrun detected\n");
+		/* FIFO reset for safety */
+		xcsr |= FSL_SAI_CSR_FR;
+	}
+
+	if (xcsr & FSL_SAI_CSR_FWF)
+		dev_dbg(dev, "isr: Enabled transmit FIFO is empty\n");
+
+	if (xcsr & FSL_SAI_CSR_FRF)
+		dev_dbg(dev, "isr: Transmit FIFO watermark has been reached\n");
+
+	regmap_update_bits(sai->regmap, FSL_SAI_TCSR,
+			   FSL_SAI_CSR_xF_W_MASK | FSL_SAI_CSR_FR, xcsr);
+
+	/* Rx IRQ */
+	regmap_read(sai->regmap, FSL_SAI_RCSR, &xcsr);
+	xcsr &= mask;
+
+	if (xcsr & FSL_SAI_CSR_WSF)
+		dev_dbg(dev, "isr: Start of Rx word detected\n");
+
+	if (xcsr & FSL_SAI_CSR_SEF)
+		dev_warn(dev, "isr: Rx Frame sync error detected\n");
+
+	if (xcsr & FSL_SAI_CSR_FEF) {
+		dev_warn(dev, "isr: Receive overflow detected\n");
+		/* FIFO reset for safety */
+		xcsr |= FSL_SAI_CSR_FR;
+	}
+
+	if (xcsr & FSL_SAI_CSR_FWF)
+		dev_dbg(dev, "isr: Enabled receive FIFO is full\n");
+
+	if (xcsr & FSL_SAI_CSR_FRF)
+		dev_dbg(dev, "isr: Receive FIFO watermark has been reached\n");
+
+	regmap_update_bits(sai->regmap, FSL_SAI_RCSR,
+			   FSL_SAI_CSR_xF_W_MASK | FSL_SAI_CSR_FR, xcsr);
+
+	return IRQ_HANDLED;
+}
+
 static int fsl_sai_set_dai_sysclk_tr(struct snd_soc_dai *cpu_dai,
 		int clk_id, unsigned int freq, int fsl_dir)
 {
@@ -373,8 +438,8 @@ static int fsl_sai_dai_probe(struct snd_soc_dai *cpu_dai)
 {
 	struct fsl_sai *sai = dev_get_drvdata(cpu_dai->dev);
 
-	regmap_update_bits(sai->regmap, FSL_SAI_TCSR, 0xffffffff, 0x0);
-	regmap_update_bits(sai->regmap, FSL_SAI_RCSR, 0xffffffff, 0x0);
+	regmap_update_bits(sai->regmap, FSL_SAI_TCSR, 0xffffffff, FSL_SAI_FLAGS);
+	regmap_update_bits(sai->regmap, FSL_SAI_RCSR, 0xffffffff, FSL_SAI_FLAGS);
 	regmap_update_bits(sai->regmap, FSL_SAI_TCR1, FSL_SAI_CR1_RFW_MASK,
 			   FSL_SAI_MAXBURST_TX * 2);
 	regmap_update_bits(sai->regmap, FSL_SAI_RCR1, FSL_SAI_CR1_RFW_MASK,
@@ -490,12 +555,14 @@ static int fsl_sai_probe(struct platform_device *pdev)
 	struct fsl_sai *sai;
 	struct resource *res;
 	void __iomem *base;
-	int ret;
+	int irq, ret;
 
 	sai = devm_kzalloc(&pdev->dev, sizeof(*sai), GFP_KERNEL);
 	if (!sai)
 		return -ENOMEM;
 
+	sai->pdev = pdev;
+
 	sai->big_endian_regs = of_property_read_bool(np, "big-endian-regs");
 	if (sai->big_endian_regs)
 		fsl_sai_regmap_config.val_format_endian = REGMAP_ENDIAN_BIG;
@@ -514,6 +581,18 @@ static int fsl_sai_probe(struct platform_device *pdev)
 		return PTR_ERR(sai->regmap);
 	}
 
+	irq = platform_get_irq(pdev, 0);
+	if (irq < 0) {
+		dev_err(&pdev->dev, "no irq for node %s\n", np->full_name);
+		return irq;
+	}
+
+	ret = devm_request_irq(&pdev->dev, irq, fsl_sai_isr, 0, np->name, sai);
+	if (ret) {
+		dev_err(&pdev->dev, "failed to claim irq %u\n", irq);
+		return ret;
+	}
+
 	sai->dma_params_rx.addr = res->start + FSL_SAI_RDR;
 	sai->dma_params_tx.addr = res->start + FSL_SAI_TDR;
 	sai->dma_params_rx.maxburst = FSL_SAI_MAXBURST_RX;
diff --git a/sound/soc/fsl/fsl_sai.h b/sound/soc/fsl/fsl_sai.h
index e432260..a264185 100644
--- a/sound/soc/fsl/fsl_sai.h
+++ b/sound/soc/fsl/fsl_sai.h
@@ -37,7 +37,21 @@
 
 /* SAI Transmit/Recieve Control Register */
 #define FSL_SAI_CSR_TERE	BIT(31)
+#define FSL_SAI_CSR_FR		BIT(25)
+#define FSL_SAI_CSR_xF_SHIFT	16
+#define FSL_SAI_CSR_xF_W_SHIFT	18
+#define FSL_SAI_CSR_xF_MASK	(0x1f << FSL_SAI_CSR_xF_SHIFT)
+#define FSL_SAI_CSR_xF_W_MASK	(0x7 << FSL_SAI_CSR_xF_W_SHIFT)
+#define FSL_SAI_CSR_WSF		BIT(20)
+#define FSL_SAI_CSR_SEF		BIT(19)
+#define FSL_SAI_CSR_FEF		BIT(18)
 #define FSL_SAI_CSR_FWF		BIT(17)
+#define FSL_SAI_CSR_FRF		BIT(16)
+#define FSL_SAI_CSR_xIE_SHIFT	8
+#define FSL_SAI_CSR_WSIE	BIT(12)
+#define FSL_SAI_CSR_SEIE	BIT(11)
+#define FSL_SAI_CSR_FEIE	BIT(10)
+#define FSL_SAI_CSR_FWIE	BIT(9)
 #define FSL_SAI_CSR_FRIE	BIT(8)
 #define FSL_SAI_CSR_FRDE	BIT(0)
 
@@ -99,6 +113,7 @@
 #define FSL_SAI_MAXBURST_RX 6
 
 struct fsl_sai {
+	struct platform_device *pdev;
 	struct regmap *regmap;
 
 	bool big_endian_regs;
-- 
1.8.4

^ permalink raw reply related

* Re: [PATCH v4] powernv, cpufreq: cpufreq driver for powernv platform
From: Benjamin Herrenschmidt @ 2014-03-27 11:12 UTC (permalink / raw)
  To: ego
  Cc: Linux PM list, Viresh Kumar, Rafael J. Wysocki,
	linuxppc-dev@ozlabs.org, Anton Blanchard, Srivatsa S. Bhat
In-Reply-To: <20140327093050.GA27777@in.ibm.com>

On Thu, 2014-03-27 at 15:00 +0530, Gautham R Shenoy wrote:
> > > --- a/arch/powerpc/configs/pseries_le_defconfig
> > > +++ b/arch/powerpc/configs/pseries_le_defconfig
> > > @@ -350,3 +350,4 @@ CONFIG_CRYPTO_LZO=m
> > >  # CONFIG_CRYPTO_ANSI_CPRNG is not set
> > >  CONFIG_CRYPTO_DEV_NX=y
> > >  CONFIG_CRYPTO_DEV_NX_ENCRYPT=m
> > > +CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND=y
> > 
> > don't know how Rafael want this, but probably this part could have
> gone
> > through ppc tree..
> > 
> 
> That would mean multiple patches :-)

We can always fix the defconfig later but I am also find if Rafael
carries that bit.

Cheers,
Ben.

^ permalink raw reply

* Re: [PATCH v4] powernv, cpufreq: cpufreq driver for powernv platform
From: Gautham R Shenoy @ 2014-03-27 11:20 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: ego@linux.vnet.ibm.com, Linux PM list, Rafael J. Wysocki,
	linuxppc-dev@ozlabs.org, Anton Blanchard, Srivatsa S. Bhat
In-Reply-To: <CAKohponOe9Au+pe5u9FpddW3+f5pegZB953+5fmYBDusE9j0HA@mail.gmail.com>

On Thu, Mar 27, 2014 at 03:29:36PM +0530, Viresh Kumar wrote:
> On 27 March 2014 15:00, Gautham R Shenoy <ego@linux.vnet.ibm.com> wrote:
> > As of now, I prefer this patch be based on code that is in the -next
> > tree. I'll get rid of the per-core locking once the serialization
> > patch of the core is accepted.
> 
> Okay.. Then divide this patch into two parts, second one doing all
> the serialization stuff and I will review only the first one from V5 :)
> 
> Maybe mention that in Patch2 as well, that once serialization patches
> are accepted, its not required.

Sure. This would be a good idea.

> 
> >> don't know how Rafael want this, but probably this part could have gone
> >> through ppc tree..
> >
> > That would mean multiple patches :-)
> 
> I wasn't opposed to multiple patches at all, but multiple patches for
> basic functionality of same driver file. Otherwise separating things
> into patches is the best receipe.

Fair enough!

> 
> >> > +#define pr_fmt(fmt)    "powernv-cpufreq: " fmt
> >> > +
> >> > +#include <linux/module.h>
> >> > +#include <linux/cpufreq.h>
> >> > +#include <linux/smp.h>
> >> > +#include <linux/of.h>
> >> > +#include <asm/cputhreads.h>
> >>
> >> I thought I gave a comment on missing headers here?
> >
> > Ok, so these seem to be the missing ones.
> >
> > #include <linux/kernel.h>
> > #include <linux/percpu-defs.h>
> > #include <linux/mutex.h>
> 
> This may shift to Patch 2 ..

Ok.

> 
> > #include <linux/sysfs.h>
> > #include <linux/cpumask.h>
> >
> > #include <asm/reg.h>
> >
> > Have I missed anything else ?
> 
> Not sure :) .. I will see if I can find anymore..
>

Cool.
 
> >> > +static struct cpufreq_frequency_table powernv_freqs[POWERNV_MAX_PSTATES+1];
> >> > +static int powernv_pstate_ids[POWERNV_MAX_PSTATES+1];
> >>
> >> I though we don't need this anymore? Either Rafael will take my patch as
> >> is for the BOOST fixup or we will end up creating .isboost field in
> >> struct cpufreq_frequency_table and so you could have used .driver_data here.
> >>
> >
> > I mentioned in my reply to your BOOST fixup patch that we would like
> > pstate_ids to be of the type "signed int", while the .driver_data is
> > an "unsigned int". If we end up using .driver_data, we would have to
> > be careful and not use it directly but reassign it to a signed int (or
> > typecast it) before using it.
> 
> Probably many people would want it to be unsigned it, so that they
> can put some strange hex values there..
> 
> Which part of your code conflicts with this right now? I can see only
> this line in powernv_set_freq()

I am not denying the advantage of storing pstate_ids in .driver_data
since we will then have all the information in one place.
(That said, in the future if we want to export additional
information, say pertaining to the voltage, we will have to keep a
separate array anyway :-) )

However, as of now, I am wary about reusing a variable provided by
the core, whose type is fixed (which is not the type I am interested
in) and whose interpretation is ambiguous (since it is also being used
as a BOOST indicator).

When we have a .is_boost field, or a flags field in the cpufreq_table,
and once the .driver_data field is completely opaque, I would be
comfortable making use of .driver_data to store the pstate_ids. So
I'll send out those patches once the fixes are present in the
core. For now, let this code stay as it is. I think we should be ok
with the additional overhead of 1kb as of now.

> 
> freq_data.pstate_id = powernv_pstate_ids[new_index];
> 
> And I don't think we will have side effects here.
> 
> > Not just that, the pr_debugs in the core which are printed during
> > frequency transitions will then end up printing large positive values
> > (since it will interpret negative pstate_ids as unsigned int) in the
> > place of pstate_ids, which would not be particularly useful.
> 
> I have checked it again, those prints shouldn't have used that field.
> Will fix them.
> 
> And so you can use that field in your code and I will get core fixed
> for you..

Like I said earlier, once the core is fixed, I'll be comfortable using
the .driver_data field for the purpose I need. Not until then :-)

> 
> >> Why do you need to get these from DT? And not find that yourself here instead?
> >
> > Well, I think we can obtain a more accurate value from the DT which
> > would have already computed it. But I'll let vaidy answer this.
> 
> Can we simply refer to frequency values here for finding min and max
> pstates? If yes, then probably this is the better place as there can be
> human errors in DT.. Also those binding are just not required then.
> And obviously less headache for you as well when you are changing
> pstate tables.
> 
> >> > +       freq_data = (struct powernv_smp_call_data *)arg;
> >>
> >> don't need casting here ?
> >
> > Doesn't harm having it there. Just like the #includes :-)
> 
> Do you ever have these typecasts while you do kmalloc? It also
> returns void *.. That's the same..
> 
> CodingStyle: Chapter 14: Allocating memory:
> 
> Casting the return value which is a void pointer is redundant. The conversion
> from void pointer to any other pointer type is guaranteed by the C programming
> language.

Thanks. I shall keep this in mind in the future.

> 
> >> > +       pr_debug("cpu %d pmsr %lx pstate_id %d frequency %d kHz \n",
> >> > +               smp_processor_id(), pmspr_val, freq_data->pstate_id,
> >>
> >> s/smp_processor_id/raw_smp_processor_id ?
> >
> > No. This function is called via smp_call_function(). So we have
> > preempt_disable on and it is safe to use smp_processor_id.
> 
> My question wasn't about being safe, but avoiding the complexity
> of debug_smp_processor_id(). raw_smp_processor_id() can execute
> very quickly.

Well, in the scenarios that we're interested in, it is highly unlikely
that CONFIG_PREMPT is set. Hence we'll default to
raw_smp_processor_id() anyway. So, I think we can retain
smp_processor_id(). 

> 
> > You mean s/clock/core ?
> 
> Oops!! yes.
> 
> > Thanks for a detailed review again.
> 
> Its my job :)
> 
> > I'll send out the updated patch today incorporating the comments. It
> > shouldn't be hard since most of the comments do not affect the
> > functionality.
> 
> Probably they do now ? :)

Not really.

> 
--
Thanks and Regards
gautham.

^ permalink raw reply

* Re: [PATCH v4] powernv, cpufreq: cpufreq driver for powernv platform
From: Viresh Kumar @ 2014-03-27 11:29 UTC (permalink / raw)
  To: ego@linux.vnet.ibm.com
  Cc: Linux PM list, Rafael J. Wysocki, linuxppc-dev@ozlabs.org,
	Anton Blanchard, Srivatsa S. Bhat
In-Reply-To: <20140327112031.GC27777@in.ibm.com>

On 27 March 2014 16:50, Gautham R Shenoy <ego@linux.vnet.ibm.com> wrote:
> (That said, in the future if we want to export additional
> information, say pertaining to the voltage, we will have to keep a
> separate array anyway :-) )

Lets handle that once it comes.

> However, as of now, I am wary about reusing a variable provided by
> the core, whose type is fixed (which is not the type I am interested
> in) and whose interpretation is ambiguous (since it is also being used
> as a BOOST indicator).

I understand that..

> When we have a .is_boost field, or a flags field in the cpufreq_table,

Probably we might keep it as is and take the patch which I already
sent. In that case as well your stuff will continue to work..

> and once the .driver_data field is completely opaque, I would be
> comfortable making use of .driver_data to store the pstate_ids. So
> I'll send out those patches once the fixes are present in the
> core. For now, let this code stay as it is. I think we should be ok
> with the additional overhead of 1kb as of now.

Chances are more that all my changes would make it before this
driver and then you would be required to update this driver
before submission.

^ permalink raw reply

* Re: [PATCH] phy/at8031: enable at8031 to work on interrupt mode
From: Sergei Shtylyov @ 2014-03-27 11:52 UTC (permalink / raw)
  To: Zhao Qiang, linuxppc-dev, netdev, B07421
  Cc: mugunthanvnm, linux-kernel, helmut.schaa, zonque, R63061, davem
In-Reply-To: <1395901116-16034-1-git-send-email-B45475@freescale.com>

Hello.

On 27-03-2014 10:18, Zhao Qiang wrote:

> The at8031 can work on polling mode and interrupt mode.
> Add ack_interrupt and config intr funcs to enable
> interrupt mode for it.

> Signed-off-by: Zhao Qiang <B45475@freescale.com>
> ---
>   drivers/net/phy/at803x.c | 30 ++++++++++++++++++++++++++++++
>   1 file changed, 30 insertions(+)

> diff --git a/drivers/net/phy/at803x.c b/drivers/net/phy/at803x.c
> index bc71947..d034ef5 100644
> --- a/drivers/net/phy/at803x.c
> +++ b/drivers/net/phy/at803x.c
[...]
> @@ -191,6 +194,31 @@ static int at803x_config_init(struct phy_device *phydev)
>   	return 0;
>   }
>
> +static int at803x_ack_interrupt(struct phy_device *phydev)
> +{
> +	int err;
> +
> +	err = phy_read(phydev, AT803X_INSR);

    Could make this an initializer...

> +
> +	return (err < 0) ? err : 0;
> +}
> +
> +static int at803x_config_intr(struct phy_device *phydev)
> +{
> +	int err;
> +	int value;
> +
> +	value = phy_read(phydev, AT803X_INER);
> +
> +	if (phydev->interrupts == PHY_INTERRUPT_ENABLED)
> +		err = phy_write(phydev, AT803X_INER,
> +				(value | AT803X_INER_INIT));

    Inner parens not needed.

> +	else
> +		err = phy_write(phydev, AT803X_INER, value);

    Why are you not clearing the bits here? Why write back what has been read 
at all?

WBR, Sergei

^ permalink raw reply

* [PATCH] ALSA: snd-aoa: Add another layout entry for PowerBook6,5
From: Adam Smith @ 2014-03-27 11:52 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: stefang

Either one or a combination of commits
3a3dd0186f619b74e61e6f29dddcaf59af7d3cac "Improve detection of devices
from device-tree" and 26b0d14106954ae46d2f4f7eec3481828a210f7d "Adapt
to new i2c probing scheme" broke the snd-powermac module for my
PowerBook6,5 machine (12" iBook late 2004).  As I understand things,
these machines should be moving over to use snd-aoa.

The attached patch (against a 3.13.0 kernel) creates a new snd-aoa
layout entry for PowerBook6,5 machines.  I've tested it on my iBook and
sound now appears to be fully working.  The module snd-aoa-i2sbus needs
to be added to /etc/modules to ensure all the necessary snd-aoa modules
are loaded. 

Signed-off-by: Adam Smith <adogcalledsummer@gmail.com>

diff -uprN a/sound/aoa/fabrics/layout.c b/sound/aoa/fabrics/layout.c
--- a/sound/aoa/fabrics/layout.c	2014-01-20 02:40:07.000000000 +0000
+++ b/sound/aoa/fabrics/layout.c	2014-03-05 18:31:11.748445177 +0000
@@ -113,6 +113,7 @@ MODULE_ALIAS("sound-layout-100");
 MODULE_ALIAS("aoa-device-id-14");
 MODULE_ALIAS("aoa-device-id-22");
 MODULE_ALIAS("aoa-device-id-35");
+MODULE_ALIAS("aoa-device-id-38");
 MODULE_ALIAS("aoa-device-id-44");

 /* onyx with all but microphone connected */
@@ -363,6 +364,12 @@ static struct layout layouts[] = {
 	  },
 	},
 	/* PowerBook6,5 */
+	{ .device_id = 38,
+	  .codecs[0] = {
+		.name = "tas",
+		.connections = tas_connections_noline,
+	  },
+	},
 	{ .device_id = 44,
 	  .codecs[0] = {
 		.name = "tas",
diff -uprN a/sound/aoa/soundbus/i2sbus/core.c b/sound/aoa/soundbus/i2sbus/core.c
--- a/sound/aoa/soundbus/i2sbus/core.c	2014-01-20 02:40:07.000000000 +0000
+++ b/sound/aoa/soundbus/i2sbus/core.c	2014-03-05 18:32:41.184445303 +0000
@@ -203,7 +203,7 @@ static int i2sbus_add_dev(struct macio_d
 			 * so restrict to those we do handle for now.
 			 */
 			if (id && (*id == 22 || *id == 14 || *id == 35 ||
-				   *id == 44)) {
+				   *id == 38 || *id == 44)) {
 				snprintf(dev->sound.modalias, 32,
 					 "aoa-device-id-%d", *id);
 				ok = 1;

^ permalink raw reply

* [PATCH 1/2] video/fsl: Fix the sleep function for FSL DIU module
From: Jason Jin @ 2014-03-27 11:37 UTC (permalink / raw)
  To: b07421, timur
  Cc: linux-fbdev, linuxppc-dev, r58472, jason.jin, Wang Dongsheng

For deep sleep, the diu module will power off, when wake up from the deep
sleep, more registers need to be reinitialized.

Signed-off-by: Wang Dongsheng <dongsheng.wang@freescale.com>
Signed-off-by: Jason Jin <Jason.Jin@freescale.com>
---
V2: Coding style clean up based on Timur's comments.

 drivers/video/fsl-diu-fb.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/video/fsl-diu-fb.c b/drivers/video/fsl-diu-fb.c
index e8758b9..4bc4730 100644
--- a/drivers/video/fsl-diu-fb.c
+++ b/drivers/video/fsl-diu-fb.c
@@ -1628,9 +1628,15 @@ static int fsl_diu_suspend(struct platform_device *ofdev, pm_message_t state)
 static int fsl_diu_resume(struct platform_device *ofdev)
 {
 	struct fsl_diu_data *data;
+	unsigned int i;
 
 	data = dev_get_drvdata(&ofdev->dev);
-	enable_lcdc(data->fsl_diu_info);
+	fsl_diu_enable_interrupts(data);
+	update_lcdc(data->fsl_diu_info);
+	for (i = 0; i < NUM_AOIS; i++) {
+		if (data->mfb[i].count)
+			fsl_diu_enable_panel(&data->fsl_diu_info[i]);
+	}
 
 	return 0;
 }
-- 
1.8.0

^ permalink raw reply related

* [PATCH 2/2] Make the diu driver work without board level initilization
From: Jason Jin @ 2014-03-27 11:38 UTC (permalink / raw)
  To: b07421, timur; +Cc: linux-fbdev, linuxppc-dev, r58472, jason.jin

So far the DIU driver does not have a mechanism to do the
board specific initialization. So on some platforms,
such as P1022, 8610 and 5121, The board specific initialization
is implmented in the platform file such p10222_ds.

Actually, the DIU is already intialized in the u-boot, the pin sharing
and the signal routing are also set in u-boot. So we can leverage that
in kernel driver to avoid board sepecific initialization, especially
for the corenet platform, which is the abstraction for serveral
platfroms.

The potential problem is that when the system wakeup from the deep
sleep, some platform settings may need to be re-initialized. The CPLD
and FPGA settings will be kept, but the pixel clock register which
usually locate at the global utility space need to be reinitialized.

Generally, the pixel clock setting was implemented in the platform
file, But the pixel clock register itself should be part of the DIU
module, And for P1022,8610 and T1040, the pixel clock register have the
same structure, So we can consider to move the pixel clock setting
from the platform to the diu driver. This patch provide the options
set the pixel clock in the diu driver. But the original platform pixel
clock setting stil can be used for P1022,8610 and 512x without any
update. To implement the pixel clock setting in the diu driver. the
following update in the diu dts node was needed.
display:display@180000 {
	compatible = "fsl,t1040-diu", "fsl,diu";
-	reg = <0x180000 1000>;
+	reg = <0x180000 1000 0xfc028 4>;
+	pixclk = <0 255 0>;
 	interrupts = <74 2 0 0>;
}
The 0xfc028 is the offset for pixel clock register. the 3 segment of
the pixclk stand for the PXCKDLYDIR, the max of PXCK and the PXCKDLY
which will be used by the pixel clock register setting.

This was tested on T1040 platform. For other platform, the original
node together with the platform settings still can work.

Signed-off-by: Jason Jin <Jason.Jin@freescale.com>
---
V2: Remove the pixel clock register saving for suspend.
add the pixel clock setting in driver.

 drivers/video/fsl-diu-fb.c | 61 ++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 59 insertions(+), 2 deletions(-)

diff --git a/drivers/video/fsl-diu-fb.c b/drivers/video/fsl-diu-fb.c
index 4bc4730..792038f 100644
--- a/drivers/video/fsl-diu-fb.c
+++ b/drivers/video/fsl-diu-fb.c
@@ -50,6 +50,7 @@
 #define INT_PARERR	0x08	/* Display parameters error interrupt */
 #define INT_LS_BF_VS	0x10	/* Lines before vsync. interrupt */
 
+#define PIXCLKCR_PXCKEN 0x80000000
 /*
  * List of supported video modes
  *
@@ -372,6 +373,8 @@ struct fsl_diu_data {
 	unsigned int irq;
 	enum fsl_diu_monitor_port monitor_port;
 	struct diu __iomem *diu_reg;
+	void __iomem *pixelclk_reg;
+	u32 pixclkcr[3];
 	spinlock_t reg_lock;
 	u8 dummy_aoi[4 * 4 * 4];
 	struct diu_ad dummy_ad __aligned(8);
@@ -479,7 +482,10 @@ static enum fsl_diu_monitor_port fsl_diu_name_to_port(const char *s)
 			port = FSL_DIU_PORT_DLVDS;
 	}
 
-	return diu_ops.valid_monitor_port(port);
+	if (diu_ops.valid_monitor_port)
+		return diu_ops.valid_monitor_port(port);
+	else
+		return port;
 }
 
 /*
@@ -798,6 +804,35 @@ static void set_fix(struct fb_info *info)
 	fix->ypanstep = 1;
 }
 
+static void set_pixel_clock(struct fsl_diu_data *data, unsigned int pixclock)
+{
+	unsigned long freq;
+	u64 temp;
+	u32 pxclk;
+	u32 pxclkdl_dir, pxckmax, pxclk_delay;
+
+	/* Convert pixclock from a wavelength to a frequency */
+	temp = 1000000000000ULL;
+	do_div(temp, pixclock);
+	freq = temp;
+
+	pxclkdl_dir = data->pixclkcr[0] << 30;
+	pxckmax =  data->pixclkcr[1];
+	pxclk_delay = data->pixclkcr[2] << 8;
+
+	/*
+	 * 'pxclk' is the ratio of the platform clock to the pixel clock.
+	 * This number is programmed into the PIXCLKCR register, and the valid
+	 * range of values is 2- pxckmax.
+	 */
+	pxclk = DIV_ROUND_CLOSEST(fsl_get_sys_freq(), freq);
+	pxclk = clamp_t(u32, pxclk, 2, pxckmax);
+
+	out_be32(data->pixelclk_reg, 0);
+	out_be32(data->pixelclk_reg, PIXCLKCR_PXCKEN
+			| pxclkdl_dir | (pxclk << 16) | pxclk_delay);
+}
+
 static void update_lcdc(struct fb_info *info)
 {
 	struct fb_var_screeninfo *var = &info->var;
@@ -846,7 +881,13 @@ static void update_lcdc(struct fb_info *info)
 
 	out_be32(&hw->vsyn_para, temp);
 
-	diu_ops.set_pixel_clock(var->pixclock);
+	/* If the pixel clock setting function can not be used on the platform,
+	 * then use the platform one.
+	 */
+	if (diu_ops.set_pixel_clock)
+		diu_ops.set_pixel_clock(var->pixclock);
+	else
+		set_pixel_clock(data, var->pixclock);
 
 #ifndef CONFIG_PPC_MPC512x
 	/*
@@ -1752,6 +1793,22 @@ static int fsl_diu_probe(struct platform_device *pdev)
 		goto error;
 	}
 
+	if (!diu_ops.set_pixel_clock) {
+		data->pixelclk_reg = of_iomap(np, 1);
+		if (!data->pixelclk_reg) {
+			dev_err(&pdev->dev, "Cannot map pixelclk registers, please \
+				provide the diu_ops for pixclk setting instead.\n");
+			ret = -EFAULT;
+			goto error;
+		}
+		/*Get the pixclkcr settings: PXCKDLYDIR; MAXPXCK, PXCKDLY*/
+		ret = of_property_read_u32_array(np, "pixclk", data->pixclkcr, 3);
+		if (ret) {
+			dev_err(&pdev->dev, "Cannot get pixelclk registers information.\n");
+			goto error;
+		}
+	}
+
 	/* Get the IRQ of the DIU */
 	data->irq = irq_of_parse_and_map(np, 0);
 
-- 
1.8.0

^ permalink raw reply related

* Re: [PATCH v4] powernv, cpufreq: cpufreq driver for powernv platform
From: Benjamin Herrenschmidt @ 2014-03-27 12:56 UTC (permalink / raw)
  To: ego
  Cc: Linux PM list, Viresh Kumar, Rafael J. Wysocki,
	linuxppc-dev@ozlabs.org, Anton Blanchard, Srivatsa S. Bhat
In-Reply-To: <20140327112031.GC27777@in.ibm.com>

On Thu, 2014-03-27 at 16:50 +0530, Gautham R Shenoy wrote:
> 
> Well, in the scenarios that we're interested in, it is highly unlikely
> that CONFIG_PREMPT is set. Hence we'll default to
> raw_smp_processor_id() anyway. So, I think we can retain
> smp_processor_id(). 

We don't know that. Some people are interested in running preempt
on these things. If raw is ok to call here, please use it.

Cheers,
Ben.

^ permalink raw reply

* Re: [PATCH] ASoC: fsl_sai: Add isr to deal with error flag
From: Mark Brown @ 2014-03-27 13:05 UTC (permalink / raw)
  To: David Laight
  Cc: alsa-devel@alsa-project.org, Li.Xiubo@freescale.com,
	linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org,
	'Nicolin Chen'
In-Reply-To: <063D6719AE5E284EB5DD2968C1650D6D0F6EA538@AcuExch.aculab.com>

[-- Attachment #1: Type: text/plain, Size: 1155 bytes --]

On Thu, Mar 27, 2014 at 09:41:20AM +0000, David Laight wrote:
> From: Mark Brown

> > The trace is already conditional?  I'd also expect to see the driver
> > only acknowledging sources it knows about and only reporting that the
> > interrupt was handled if it saw one of them - right now all interrupts
> > are unconditionally acknowleged.

> The traces are separately conditional on their own bits.
> That is a lot of checks that will normally be false.

Oh, I see what you mean.  I'm not sure that level of optimisation is
worth it, the overhead of taking an interrupt in the first place is
going to dwarf the optimisation and the indentation probably won't help
writing clear messages.

> Also the driver may need to clear all the active interrupt
> bits in order to make the IRQ go away.
> It should trace that bits it doesn't expect to be set though.

The interrupt core will eventually take care of it if the interrupt is
left screaming with nothing handling it (and provide diagnostics).  I
would *hope* that this is only happening for unmasked interrupt sources
we explicitly unmask anyway, we really don't want interrupts on FIFO
level changes.

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply

* Re: [PATCH v2] ASoC: fsl_sai: Add isr to deal with error flag
From: Mark Brown @ 2014-03-27 13:26 UTC (permalink / raw)
  To: Nicolin Chen
  Cc: alsa-devel, linux-kernel, timur, Li.Xiubo, David.Laight,
	linuxppc-dev
In-Reply-To: <1395918419-20739-1-git-send-email-Guangyu.Chen@freescale.com>

[-- Attachment #1: Type: text/plain, Size: 1157 bytes --]

On Thu, Mar 27, 2014 at 07:06:59PM +0800, Nicolin Chen wrote:
> It's quite cricial to clear error flags because SAI might hang if getting
> FIFO underrun during playback (I haven't confirmed the same issue on Rx
> overflow though).
> 
> So this patch enables those irq and adds isr() to clear the flags so as to
> keep playback entirely safe.

So, I've applied this since we're (hopefully!) very near the merge
window opening and it seems like it should be an improvement overall.
However a few things below:

> +	/* Only handle those what we enabled */
> +	mask = (FSL_SAI_FLAGS >> FSL_SAI_CSR_xIE_SHIFT) << FSL_SAI_CSR_xF_SHIFT;

The shifting here could use a comment.

> +	regmap_update_bits(sai->regmap, FSL_SAI_RCSR,
> +			   FSL_SAI_CSR_xF_W_MASK | FSL_SAI_CSR_FR, xcsr);

Using update_bits() is going to do an extra read, better to do this as:

	if (xcsr)
		regmap_write(sai->regmap, FSL_SAI_RCSR, xcsr);

otherwise we might be ignoring any of the bits that are actually clear
on read (it seems like there are some?).

> +	return IRQ_HANDLED;

I'd expect to see IRQ_NONE if we didn't actually see an interrupt
source.

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply

* Re: Bug in reclaim logic with exhausted nodes?
From: Nishanth Aravamudan @ 2014-03-27 20:33 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: linux-mm, mgorman, linuxppc-dev, anton, rientjes
In-Reply-To: <alpine.DEB.2.10.1403251323030.26744@nuc>

Hi Christoph,

On 25.03.2014 [13:25:30 -0500], Christoph Lameter wrote:
> On Tue, 25 Mar 2014, Nishanth Aravamudan wrote:
> 
> > On power, very early, we find the 16G pages (gpages in the powerpc arch
> > code) in the device-tree:
> >
> > early_setup ->
> > 	early_init_mmu ->
> > 		htab_initialize ->
> > 			htab_init_page_sizes ->
> > 				htab_dt_scan_hugepage_blocks ->
> > 					memblock_reserve
> > 						which marks the memory
> > 						as reserved
> > 					add_gpage
> > 						which saves the address
> > 						off so future calls for
> > 						alloc_bootmem_huge_page()
> >
> > hugetlb_init ->
> > 		hugetlb_init_hstates ->
> > 			hugetlb_hstate_alloc_pages ->
> > 				alloc_bootmem_huge_page
> >
> > > Not sure if I understand that correctly.
> >
> > Basically this is present memory that is "reserved" for the 16GB usage
> > per the LPAR configuration. We honor that configuration in Linux based
> > upon the contents of the device-tree. It just so happens in the
> > configuration from my original e-mail that a consequence of this is that
> > a NUMA node has memory (topologically), but none of that memory is free,
> > nor will it ever be free.
> 
> Well dont do that
> 
> > Perhaps, in this case, we could just remove that node from the N_MEMORY
> > mask? Memory allocations will never succeed from the node, and we can
> > never free these 16GB pages. It is really not any different than a
> > memoryless node *except* when you are using the 16GB pages.
> 
> That looks to be the correct way to handle things. Maybe mark the node as
> offline or somehow not present so that the kernel ignores it.

This is a SLUB condition:

mm/slub.c::early_kmem_cache_node_alloc():
...
        page = new_slab(kmem_cache_node, GFP_NOWAIT, node);
...
        if (page_to_nid(page) != node) {
                printk(KERN_ERR "SLUB: Unable to allocate memory from "
                                "node %d\n", node);
                printk(KERN_ERR "SLUB: Allocating a useless per node structure "
                                "in order to be able to continue\n");
        }
...

Since this is quite early, and we have not set up the nodemasks yet,
does it make sense to perhaps have a temporary init-time nodemask that
we set bits in here, and "fix-up" those nodes when we setup the
nodemasks?

Thanks,
Nish

^ permalink raw reply

* Re: [PATCH 4/4] powerpc/powernv: Fix little endian issues OPAL error log code
From: Anton Blanchard @ 2014-03-27 22:04 UTC (permalink / raw)
  To: Stewart Smith; +Cc: paulus, linuxppc-dev
In-Reply-To: <87ob0supw1.fsf@river.i-did-not-set--mail-host-address--so-tickle-me>


Hi Stewart,

> Do we also need any magic for the getting of error logs themselves?
> 
> You may want to check the md5 of the logs themselves on be and le.

Good idea. An md5 of the logs in LE and BE on this box checks out, so I
think we are good.

Anton

^ permalink raw reply

* [PATCH 1/2] powerpc/powernv: Fix endian issues with OPAL async code
From: Anton Blanchard @ 2014-03-27 22:17 UTC (permalink / raw)
  To: benh, paulus, neelegup, sbhat; +Cc: linuxppc-dev


Byteswap struct opal_msg in one place instead of requiring all the
callers to do it.

Signed-off-by: Anton Blanchard <anton@samba.org>
---

Index: b/arch/powerpc/include/asm/opal.h
===================================================================
--- a/arch/powerpc/include/asm/opal.h
+++ b/arch/powerpc/include/asm/opal.h
@@ -421,6 +421,12 @@ enum OpalSysparamPerm {
 	OPAL_SYSPARAM_RW        = (OPAL_SYSPARAM_READ | OPAL_SYSPARAM_WRITE),
 };
 
+struct raw_opal_msg {
+	__be32 msg_type;
+	__be32 reserved;
+	__be64 params[8];
+};
+
 struct opal_msg {
 	uint32_t msg_type;
 	uint32_t reserved;
Index: b/arch/powerpc/platforms/powernv/opal.c
===================================================================
--- a/arch/powerpc/platforms/powernv/opal.c
+++ b/arch/powerpc/platforms/powernv/opal.c
@@ -288,9 +288,11 @@ static void opal_handle_message(void)
 	 * TODO: pre-allocate a message buffer depending on opal-msg-size
 	 * value in /proc/device-tree.
 	 */
+	static struct raw_opal_msg raw_msg;
 	static struct opal_msg msg;
+	int i;
 
-	ret = opal_get_msg(__pa(&msg), sizeof(msg));
+	ret = opal_get_msg(__pa(&raw_msg), sizeof(raw_msg));
 	/* No opal message pending. */
 	if (ret == OPAL_RESOURCE)
 		return;
@@ -302,6 +304,11 @@ static void opal_handle_message(void)
 		return;
 	}
 
+	msg.msg_type = be32_to_cpu(raw_msg.msg_type);
+	msg.reserved = be32_to_cpu(raw_msg.reserved);
+	for (i = 0; i < ARRAY_SIZE(raw_msg.params); i++)
+		msg.params[i] = be64_to_cpu(raw_msg.params[i]);
+
 	/* Sanity check */
 	if (msg.msg_type > OPAL_MSG_TYPE_MAX) {
 		pr_warning("%s: Unknown message type: %u\n",

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox