LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* RE: [PATCH] powerpc: add Book E support to 64-bit hibernation
From: Wang Dongsheng-B40534 @ 2013-04-02  5:28 UTC (permalink / raw)
  To: Wood Scott-B07421, Johannes Berg; +Cc: linuxppc-dev@lists.ozlabs.org
In-Reply-To: <1363990490.24790.12@snotra>

Hi scott & Johannes,

Thanks for reviewing.

@scott, About this patch, could you please help ack this patch?

> -----Original Message-----
> From: Wood Scott-B07421
> Sent: Saturday, March 23, 2013 6:15 AM
> To: Johannes Berg
> Cc: Wang Dongsheng-B40534; linuxppc-dev@lists.ozlabs.org
> Subject: Re: [PATCH] powerpc: add Book E support to 64-bit hibernation
>=20
> On 03/22/2013 05:58:37 AM, Johannes Berg wrote:
> > On Tue, 2013-03-19 at 17:16 -0500, Scott Wood wrote:
> >
> > > > > I wonder about kernel modules, though flushing 32 MiB wouldn't
> > be
> > > > > adequate there.
> > > >
> > > > Good question, but would they be running? You have to have
> > everything
> > > > built in that you need to load the image? Or maybe not, with the
> > > > userspace image restoration that became possible at some point...
> > >
> > > Is that all that's being restored in this step, or would we be
> > loading
> > > all modules that were loaded before suspend (as they're normally not
> > > swappable)?  I'm not too familiar with what gets saved where.
> >
> > Yes, they would be restored since full memory is restored, but the
> > kernel doing the restore hasn't typically had a chance to load
> > modules.
> > Although with uswsusp there probably are ways it can already have.
>=20
> If we're loading executable code that wasn't there before[1], we need to
> flush -- but there's no reason why modules would only be in the lower 32
> MiB.
>=20
> -Scott
>=20
> [1] within the resume process, not before suspend

^ permalink raw reply

* Re: [linuxppc-release][PATCH] powerpc/pci-hotplug: fix init issue of rescanned pci device
From: Bjorn Helgaas @ 2013-04-01 16:29 UTC (permalink / raw)
  To: Chen Yuanquan-B41889
  Cc: 松本博郎, linux-pci@vger.kernel.org,
	Zang Roy-R61911, linuxppc-dev
In-Reply-To: <51596196.5080506@freescale.com>

On Mon, Apr 1, 2013 at 4:29 AM, Chen Yuanquan-B41889
<B41889@freescale.com> wrote:
> On 12/08/2012 05:15 AM, Bjorn Helgaas wrote:
>>
>> On Thu, Dec 6, 2012 at 4:23 AM, Chen Yuanquan-B41889
>> <B41889@freescale.com> wrote:
>>>
>>> On 12/06/2012 05:30 AM, Bjorn Helgaas wrote:
>>>>
>>>> On Wed, Dec 5, 2012 at 2:29 AM, Chen Yuanquan-B41889
>>>> <B41889@freescale.com> wrote:
>>>>>
>>>>> On 12/05/2012 04:26 PM, Benjamin Herrenschmidt wrote:
>>>>>>
>>>>>> On Wed, 2012-12-05 at 16:20 +0800, Chen Yuanquan-B41889 wrote:
>>>>>>>
>>>>>>> On 12/05/2012 03:17 PM, Benjamin Herrenschmidt wrote:
>>>>>>>>
>>>>>>>> On Wed, 2012-12-05 at 10:31 +0800, Yuanquan Chen wrote:
>>>>>>>>>
>>>>>>>>> On powerpc arch, some fixup work of PCI/PCI-e device is just done
>>>>>>>>> during the
>>>>>>>>> first scan at booting time. For the PCI/PCI-e device rescanned
>>>>>>>>> after
>>>>>>>>> linux OS
>>>>>>>>> booting up, the fixup work won't be done, which leads to
>>>>>>>>> dma_set_mask
>>>>>>>>> error or
>>>>>>>>> irq related issue in rescanned PCI/PCI-e device's driver. So, it
>>>>>>>>> does
>>>>>>>>> the same
>>>>>>>>> fixup work for the rescanned device to avoid this issue.
>>>>>>>>
>>>>>>>> Hrm, the patch is a bit gross. First the code shouldn't be
>>>>>>>> copy/pasted
>>>>>>>> that way but factored out.
>>>>>>
>>>>>> Please, at least format your email properly so I can try to undertand
>>>>>> without needing aspirin.
>>>>>>
>>>>>>> There's a judgement "if (!bus->is_added)" before calling of
>>>>>>> pcibios_fixup_bus in pci_scan_child_bus, so for the rescanned device,
>>>>>>> the fixup won't execute, which leads to fatal error in driver of
>>>>>>> rescanned
>>>>>>> device on freescale  powerpc, no this issues on x86 arch.
>>>>>>
>>>>>> First, none of that invalidates my statement that you shouldn't
>>>>>> duplicate a whole block of code like this. Even if your approach is
>>>>>> correct (which is debated separately), at the very least you should
>>>>>> factor the code out into a common function between the two copies.
>>>>>>
>>>>>>> Remove the judgement, let it to do the pcibios_fixup_bus
>>>>>>> directly, the error won't occur for the rescanned device. But it's
>>>>>>> general code, not proper to change here, so copy the
>>>>>>> pcibios_fixup_bus
>>>>>>> work to  pcibios_enable_device.
>>>>>>>
>>>>>>>> I'm surprised also that is_added is false when
>>>>>>>> pcibios_enable_device()
>>>>>>>> gets called ... that looks strange to me. At what point is that
>>>>>>>> enable
>>>>>>>> happening in the hotplug sequence ?
>>>>>>>
>>>>>>> All devices are rescanned and then call the pci_enable_devices and
>>>>>>> pci_bus_add_devices.
>>>>>>
>>>>>> Where ? How ? What is the sequence happening ? In any case, I think if
>>>>>> we need a proper fixup done per-device like that after scan we ought
>>>>>> to
>>>>>> create a new hook at the generic level rather than that sort of hack.
>>>>>>
>>>>> echo 1 > rescan to trigger dev_rescan_store:
>>>>>
>>>>> dev_rescan_store->pci_rescan_bus->pci_scan_child_bus,
>>>>> pci_assign_unassigned_bus_resources,
>>>>> pci_enable_bridges, pci_bus_add_devices
>>>>>
>>>>>
>>>>>
>>>>> pci_enable_bridges->pci_enable_device->__pci_enable_device_flags->do_pci_enable_device->
>>>>> pcibios_enable_device
>>>>>
>>>>> pci_bus_add_devices->pci_bus_add_device->"dev->is_added = 1"
>>>>>
>>>>> Yeah, it's general fixup code for every rescanned PCI/PCI-e device on
>>>>> powerpc at runtime. So if
>>>>> we want to call it in a ppc_md member, we need to wrap it as a function
>>>>> and
>>>>> assign it in every ppc_md,
>>>>> it isn't proper for the general code.
>>>>>
>>>>> Regards,
>>>>> yuanquan
>>>>>
>>>>>
>>>>>>> The patch code will be called by pci_enable_devices. The
>>>>>>> "dev->is_added"
>>>>>>> is set in pci_bus_add_device
>>>>>>> which is called by pci_bus_add_devices. So "dev->is_added" is false
>>>>>>> when
>>>>>>> checking it in pcibios_enable_device
>>>>>>> for the rescanned device.
>>>>>>
>>>>>> Who calls pci_enable_device() in the rescan case ? Why isn't it left
>>>>>> to
>>>>>> the driver ? I don't think we can rely on that behaviour not to
>>>>>> change.
>>>>>>
>>>>>>>> How do you trigger the rescan anyway ?
>>>>>>>
>>>>>>> Use the interface under /sys :
>>>>>>> echo 1 > /sys/bus/pci/devices/xxx/remove
>>>>>>>
>>>>>>> then echo 1 to the pci device which is the bus of the removed device
>>>>>>> echo 1 > /sys/bus/pci/devices/xxxx/rescan
>>>>>>> the removed device will be scanned and it's driver module will be
>>>>>>> loaded
>>>>>>> automatically.
>>>>>>
>>>>>> Yeah this code path are known to be fishy. I think the problem is at
>>>>>> the
>>>>>> generic abstraction level and that's where it needs to be fixed.
>>>>>>
>>>>>> Cheers,
>>>>>> Ben.
>>>>>>
>>>>>>> Regards,
>>>>>>> yuanquan
>>>>>>>>
>>>>>>>> I think the problem needs to be solve at a higher level, I'm adding
>>>>>>>> linux-pci & Bjorn to the CC list.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Ben.
>>>>>>>>
>>>>>>>>> Signed-off-by: Yuanquan Chen <B41889@freescale.com>
>>>>>>>>> ---
>>>>>>>>>      arch/powerpc/kernel/pci-common.c |   20 ++++++++++++++++++++
>>>>>>>>>      1 file changed, 20 insertions(+)
>>>>>>>>>
>>>>>>>>> diff --git a/arch/powerpc/kernel/pci-common.c
>>>>>>>>> b/arch/powerpc/kernel/pci-common.c
>>>>>>>>> index 7f94f76..f0fb070 100644
>>>>>>>>> --- a/arch/powerpc/kernel/pci-common.c
>>>>>>>>> +++ b/arch/powerpc/kernel/pci-common.c
>>>>>>>>> @@ -1496,6 +1496,26 @@ int pcibios_enable_device(struct pci_dev
>>>>>>>>> *dev,
>>>>>>>>> int mask)
>>>>>>>>>                   if (ppc_md.pcibios_enable_device_hook(dev))
>>>>>>>>>                           return -EINVAL;
>>>>>>>>>      +    if (!dev->is_added) {
>>>>>>>>> +               /*
>>>>>>>>> +                * Fixup NUMA node as it may not be setup yet by
>>>>>>>>> the
>>>>>>>>> generic
>>>>>>>>> +                * code and is needed by the DMA init
>>>>>>>>> +                */
>>>>>>>>> +               set_dev_node(&dev->dev, pcibus_to_node(dev->bus));
>>>>>>>>> +
>>>>>>>>> +               /* Hook up default DMA ops */
>>>>>>>>> +               set_dma_ops(&dev->dev, pci_dma_ops);
>>>>>>>>> +               set_dma_offset(&dev->dev, PCI_DRAM_OFFSET);
>>>>>>>>> +
>>>>>>>>> +               /* Additional platform DMA/iommu setup */
>>>>>>>>> +               if (ppc_md.pci_dma_dev_setup)
>>>>>>>>> +                       ppc_md.pci_dma_dev_setup(dev);
>>>>>>>>> +
>>>>>>>>> +               /* Read default IRQs and fixup if necessary */
>>>>>>>>> +               pci_read_irq_line(dev);
>>>>>>>>> +               if (ppc_md.pci_irq_fixup)
>>>>>>>>> +                       ppc_md.pci_irq_fixup(dev);
>>>>>>>>> +       }
>>>>>>>>>           return pci_enable_resources(dev, mask);
>>>>>>>>>      }
>>>>
>>>> Is this the same issue Hiroo MATSUMOTO was working on earlier?
>>>> (http://comments.gmane.org/gmane.linux.ports.ppc.embedded/50080)
>>>
>>>
>>> Yeah, that's the exact problem I encountered. Please push it forward.
>>
>> Well, as I mentioned, there are unresolved issues, so it's not just a
>> matter of applying the most recent patch.  If you're interested in
>> this problem and have some hardware to test, you can help by looking
>> into some of the things I mentioned in the message at the URL below.
>
>
> Hi Helgaas , Matsumoto,
>
> Do you have new update about this issue? I will go on to investigate this
> issue in the next
> days.

No update from me.  If you can help resolve this, please do.

Bjorn

^ permalink raw reply

* Re: weird elf header issues, is it binutils or my linker script?
From: Chris Friesen @ 2013-04-01 15:02 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Paul Mackerras, linuxppc-dev
In-Reply-To: <1AEA14BF-7AFA-43F4-8082-F0D3031C60A8@kernel.crashing.org>

On 03/29/2013 06:01 AM, Segher Boessenkool wrote:
>> PHDRS
>> {
>> headers PT_PHDR PHDRS ;
>> interp PT_INTERP ;
>> <snip>
>> }
>>
>> SECTIONS
>> {
>> /* Read-only sections, merged into text segment: */
>> PROVIDE (__executable_start = 0xf2000000); . = 0xf2000000 +
>> SIZEOF_HEADERS;
>> .interp : { *(.interp) } :text :interp
>> <snip>
>> }
>>
>> So I'm wondering...is this something wrong with our linker script,
>> or is there a bug in our binutils? I'm no linker expert, but the
>> interpreter sections in the script seem to match the binutils
>> documentation that I found and I don't see anything that would be
>> messing with the length.
>>
>> Any suggestions on where to look?
>
> It looks like your .interp input section lacks the required
> zero-termination.

That's the weird thing....the actual interpreter string "/lib/ld.so.1" 
is in fact null-terminated, but the length in the elf headers is 
incorrect (0x30 instead of 0xd) and so when the kernel checks the last 
character in the array it sees a nonzero value.

What I don't understand is where the "/lib/ld.so.1" string is coming 
from and how the length gets set to the invalid value.

Chris

^ permalink raw reply

* [PATCH 5/9] cpufreq: Notify all policy->cpus in cpufreq_notify_transition()
From: Viresh Kumar @ 2013-04-01 12:57 UTC (permalink / raw)
  To: rjw
  Cc: linux-mips, linux-ia64, linux-sh, Viresh Kumar, Sekhar Nori,
	sparclinux, Guan Xuetao, Hans-Christian Egtvedt, Jesper Nilsson,
	Kukjin Kim, Stephen Warren, cpufreq, Haavard Skinnemoen,
	cbe-oss-dev, Fenghua Yu, Mike Frysinger, Arnd Bergmann, linux-pm,
	Haojian Zhuang, Mikael Starvik, Borislav Petkov, Ben Dooks,
	Thomas Renninger, linux-arm-kernel, Tony Luck, Eric Miao,
	linux-cris-kernel, linux-kernel, Ralf Baechle, Paul Mundt,
	Sascha Hauer, linuxppc-dev, David S. Miller
In-Reply-To: <cover.1364820620.git.viresh.kumar@linaro.org>

policy->cpus contains all online cpus that have single shared clock line. And
their frequencies are always updated together.

Many SMP system's cpufreq drivers take care of this in individual drivers but
the best place for this code is in cpufreq core.

This patch modifies cpufreq_notify_transition() to notify frequency change for
all cpus in policy->cpus and hence updates all users of this API.

Cc: Sekhar Nori <nsekhar@ti.com>
Cc: Sascha Hauer <kernel@pengutronix.de>
Cc: Eric Miao <eric.y.miao@gmail.com>
Cc: Haojian Zhuang <haojian.zhuang@gmail.com>
Cc: Ben Dooks <ben-linux@fluff.org>
Cc: Kukjin Kim <kgene.kim@samsung.com>
Cc: Stephen Warren <swarren@wwwdotorg.org>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Thomas Renninger <trenn@suse.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-cris-kernel@axis.com
Cc: linux-ia64@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: cbe-oss-dev@lists.ozlabs.org
Cc: linux-mips@linux-mips.org
Cc: linux-sh@vger.kernel.org
Cc: sparclinux@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Stephen Warren <swarren@nvidia.com>
Tested-by: Stephen Warren <swarren@nvidia.com>
---
 arch/arm/mach-davinci/cpufreq.c              |  5 +-
 arch/arm/mach-imx/cpufreq.c                  |  5 +-
 arch/arm/mach-integrator/cpu.c               |  6 +--
 arch/arm/mach-pxa/cpufreq-pxa2xx.c           |  5 +-
 arch/arm/mach-pxa/cpufreq-pxa3xx.c           |  5 +-
 arch/arm/mach-s3c24xx/cpufreq.c              |  8 +--
 arch/arm/mach-sa1100/cpu-sa1100.c            |  5 +-
 arch/arm/mach-sa1100/cpu-sa1110.c            |  5 +-
 arch/arm/mach-tegra/cpu-tegra.c              | 15 +++---
 arch/avr32/mach-at32ap/cpufreq.c             |  5 +-
 arch/blackfin/mach-common/cpufreq.c          | 79 ++++++++++++----------------
 arch/cris/arch-v32/mach-a3/cpufreq.c         | 20 +++----
 arch/cris/arch-v32/mach-fs/cpufreq.c         | 17 +++---
 arch/ia64/kernel/cpufreq/acpi-cpufreq.c      | 22 ++++----
 arch/mips/kernel/cpufreq/loongson2_cpufreq.c |  5 +-
 arch/powerpc/platforms/cell/cbe_cpufreq.c    |  5 +-
 arch/powerpc/platforms/pasemi/cpufreq.c      |  5 +-
 arch/powerpc/platforms/powermac/cpufreq_32.c | 14 ++---
 arch/powerpc/platforms/powermac/cpufreq_64.c |  5 +-
 arch/sh/kernel/cpufreq.c                     |  5 +-
 arch/sparc/kernel/us2e_cpufreq.c             | 13 ++---
 arch/sparc/kernel/us3_cpufreq.c              | 13 ++---
 arch/unicore32/kernel/cpu-ucv2.c             |  5 +-
 drivers/cpufreq/acpi-cpufreq.c               | 11 +---
 drivers/cpufreq/cpufreq-cpu0.c               | 12 ++---
 drivers/cpufreq/cpufreq-nforce2.c            |  5 +-
 drivers/cpufreq/cpufreq.c                    | 45 +++++++++-------
 drivers/cpufreq/dbx500-cpufreq.c             |  6 +--
 drivers/cpufreq/e_powersaver.c               | 11 ++--
 drivers/cpufreq/elanfreq.c                   | 10 ++--
 drivers/cpufreq/exynos-cpufreq.c             |  7 +--
 drivers/cpufreq/gx-suspmod.c                 | 11 ++--
 drivers/cpufreq/imx6q-cpufreq.c              | 12 ++---
 drivers/cpufreq/kirkwood-cpufreq.c           | 10 ++--
 drivers/cpufreq/longhaul.c                   | 18 ++++---
 drivers/cpufreq/maple-cpufreq.c              |  5 +-
 drivers/cpufreq/omap-cpufreq.c               | 11 +---
 drivers/cpufreq/p4-clockmod.c                | 10 +---
 drivers/cpufreq/pcc-cpufreq.c                |  5 +-
 drivers/cpufreq/powernow-k6.c                | 12 ++---
 drivers/cpufreq/powernow-k7.c                | 10 ++--
 drivers/cpufreq/powernow-k8.c                | 16 +++---
 drivers/cpufreq/s3c2416-cpufreq.c            |  5 +-
 drivers/cpufreq/s3c64xx-cpufreq.c            |  7 ++-
 drivers/cpufreq/s5pv210-cpufreq.c            |  5 +-
 drivers/cpufreq/sc520_freq.c                 | 10 ++--
 drivers/cpufreq/spear-cpufreq.c              |  7 +--
 drivers/cpufreq/speedstep-centrino.c         | 24 ++-------
 drivers/cpufreq/speedstep-ich.c              | 12 +----
 drivers/cpufreq/speedstep-smi.c              |  5 +-
 include/linux/cpufreq.h                      |  4 +-
 51 files changed, 238 insertions(+), 340 deletions(-)

diff --git a/arch/arm/mach-davinci/cpufreq.c b/arch/arm/mach-davinci/cpufreq.c
index 4729eaa..55eb870 100644
--- a/arch/arm/mach-davinci/cpufreq.c
+++ b/arch/arm/mach-davinci/cpufreq.c
@@ -90,7 +90,6 @@ static int davinci_target(struct cpufreq_policy *policy,
 
 	freqs.old = davinci_getspeed(0);
 	freqs.new = clk_round_rate(armclk, target_freq * 1000) / 1000;
-	freqs.cpu = 0;
 
 	if (freqs.old == freqs.new)
 		return ret;
@@ -102,7 +101,7 @@ static int davinci_target(struct cpufreq_policy *policy,
 	if (ret)
 		return -EINVAL;
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 
 	/* if moving to higher frequency, up the voltage beforehand */
 	if (pdata->set_voltage && freqs.new > freqs.old) {
@@ -126,7 +125,7 @@ static int davinci_target(struct cpufreq_policy *policy,
 		pdata->set_voltage(idx);
 
 out:
-	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 
 	return ret;
 }
diff --git a/arch/arm/mach-imx/cpufreq.c b/arch/arm/mach-imx/cpufreq.c
index d8c75c3..cfce5e3 100644
--- a/arch/arm/mach-imx/cpufreq.c
+++ b/arch/arm/mach-imx/cpufreq.c
@@ -87,13 +87,12 @@ static int mxc_set_target(struct cpufreq_policy *policy,
 
 	freqs.old = clk_get_rate(cpu_clk) / 1000;
 	freqs.new = freq_Hz / 1000;
-	freqs.cpu = 0;
 	freqs.flags = 0;
-	cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 
 	ret = set_cpu_freq(freq_Hz);
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 
 	return ret;
 }
diff --git a/arch/arm/mach-integrator/cpu.c b/arch/arm/mach-integrator/cpu.c
index 590c192..df863c3 100644
--- a/arch/arm/mach-integrator/cpu.c
+++ b/arch/arm/mach-integrator/cpu.c
@@ -123,14 +123,12 @@ static int integrator_set_target(struct cpufreq_policy *policy,
 	vco = icst_hz_to_vco(&cclk_params, target_freq * 1000);
 	freqs.new = icst_hz(&cclk_params, vco) / 1000;
 
-	freqs.cpu = policy->cpu;
-
 	if (freqs.old == freqs.new) {
 		set_cpus_allowed(current, cpus_allowed);
 		return 0;
 	}
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 
 	cm_osc = __raw_readl(CM_OSC);
 
@@ -151,7 +149,7 @@ static int integrator_set_target(struct cpufreq_policy *policy,
 	 */
 	set_cpus_allowed(current, cpus_allowed);
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 
 	return 0;
 }
diff --git a/arch/arm/mach-pxa/cpufreq-pxa2xx.c b/arch/arm/mach-pxa/cpufreq-pxa2xx.c
index 6a7aeab..f1ca4da 100644
--- a/arch/arm/mach-pxa/cpufreq-pxa2xx.c
+++ b/arch/arm/mach-pxa/cpufreq-pxa2xx.c
@@ -311,7 +311,6 @@ static int pxa_set_target(struct cpufreq_policy *policy,
 	new_freq_mem = pxa_freq_settings[idx].membus;
 	freqs.old = policy->cur;
 	freqs.new = new_freq_cpu;
-	freqs.cpu = policy->cpu;
 
 	if (freq_debug)
 		pr_debug("Changing CPU frequency to %d Mhz, (SDRAM %d Mhz)\n",
@@ -327,7 +326,7 @@ static int pxa_set_target(struct cpufreq_policy *policy,
 	 * you should add a notify client with any platform specific
 	 * Vcc changing capability
 	 */
-	cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 
 	/* Calculate the next MDREFR.  If we're slowing down the SDRAM clock
 	 * we need to preset the smaller DRI before the change.	 If we're
@@ -382,7 +381,7 @@ static int pxa_set_target(struct cpufreq_policy *policy,
 	 * you should add a notify client with any platform specific
 	 * SDRAM refresh timer adjustments
 	 */
-	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 
 	/*
 	 * Even if voltage setting fails, we don't report it, as the frequency
diff --git a/arch/arm/mach-pxa/cpufreq-pxa3xx.c b/arch/arm/mach-pxa/cpufreq-pxa3xx.c
index b85b4ab..8c45b2b 100644
--- a/arch/arm/mach-pxa/cpufreq-pxa3xx.c
+++ b/arch/arm/mach-pxa/cpufreq-pxa3xx.c
@@ -184,7 +184,6 @@ static int pxa3xx_cpufreq_set(struct cpufreq_policy *policy,
 
 	freqs.old = policy->cur;
 	freqs.new = next->cpufreq_mhz * 1000;
-	freqs.cpu = policy->cpu;
 
 	pr_debug("CPU frequency from %d MHz to %d MHz%s\n",
 			freqs.old / 1000, freqs.new / 1000,
@@ -193,14 +192,14 @@ static int pxa3xx_cpufreq_set(struct cpufreq_policy *policy,
 	if (freqs.old == target_freq)
 		return 0;
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 
 	local_irq_save(flags);
 	__update_core_freq(next);
 	__update_bus_freq(next);
 	local_irq_restore(flags);
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 
 	return 0;
 }
diff --git a/arch/arm/mach-s3c24xx/cpufreq.c b/arch/arm/mach-s3c24xx/cpufreq.c
index 5f181e7..3c0e78e 100644
--- a/arch/arm/mach-s3c24xx/cpufreq.c
+++ b/arch/arm/mach-s3c24xx/cpufreq.c
@@ -204,7 +204,6 @@ static int s3c_cpufreq_settarget(struct cpufreq_policy *policy,
 	freqs.old = cpu_cur.freq;
 	freqs.new = cpu_new.freq;
 
-	freqs.freqs.cpu = 0;
 	freqs.freqs.old = cpu_cur.freq.armclk / 1000;
 	freqs.freqs.new = cpu_new.freq.armclk / 1000;
 
@@ -218,9 +217,7 @@ static int s3c_cpufreq_settarget(struct cpufreq_policy *policy,
 	s3c_cpufreq_updateclk(clk_pclk, cpu_new.freq.pclk);
 
 	/* start the frequency change */
-
-	if (policy)
-		cpufreq_notify_transition(&freqs.freqs, CPUFREQ_PRECHANGE);
+	cpufreq_notify_transition(policy, &freqs.freqs, CPUFREQ_PRECHANGE);
 
 	/* If hclk is staying the same, then we do not need to
 	 * re-write the IO or the refresh timings whilst we are changing
@@ -264,8 +261,7 @@ static int s3c_cpufreq_settarget(struct cpufreq_policy *policy,
 	local_irq_restore(flags);
 
 	/* notify everyone we've done this */
-	if (policy)
-		cpufreq_notify_transition(&freqs.freqs, CPUFREQ_POSTCHANGE);
+	cpufreq_notify_transition(policy, &freqs.freqs, CPUFREQ_POSTCHANGE);
 
 	s3c_freq_dbg("%s: finished\n", __func__);
 	return 0;
diff --git a/arch/arm/mach-sa1100/cpu-sa1100.c b/arch/arm/mach-sa1100/cpu-sa1100.c
index e8f4d1e..3268761 100644
--- a/arch/arm/mach-sa1100/cpu-sa1100.c
+++ b/arch/arm/mach-sa1100/cpu-sa1100.c
@@ -201,9 +201,8 @@ static int sa1100_target(struct cpufreq_policy *policy,
 
 	freqs.old = cur;
 	freqs.new = sa11x0_ppcr_to_freq(new_ppcr);
-	freqs.cpu = 0;
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 
 	if (freqs.new > cur)
 		sa1100_update_dram_timings(cur, freqs.new);
@@ -213,7 +212,7 @@ static int sa1100_target(struct cpufreq_policy *policy,
 	if (freqs.new < cur)
 		sa1100_update_dram_timings(cur, freqs.new);
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 
 	return 0;
 }
diff --git a/arch/arm/mach-sa1100/cpu-sa1110.c b/arch/arm/mach-sa1100/cpu-sa1110.c
index 48c45b0..38a7733 100644
--- a/arch/arm/mach-sa1100/cpu-sa1110.c
+++ b/arch/arm/mach-sa1100/cpu-sa1110.c
@@ -258,7 +258,6 @@ static int sa1110_target(struct cpufreq_policy *policy,
 
 	freqs.old = sa11x0_getspeed(0);
 	freqs.new = sa11x0_ppcr_to_freq(ppcr);
-	freqs.cpu = 0;
 
 	sdram_calculate_timing(&sd, freqs.new, sdram);
 
@@ -279,7 +278,7 @@ static int sa1110_target(struct cpufreq_policy *policy,
 	sd.mdcas[2] = 0xaaaaaaaa;
 #endif
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 
 	/*
 	 * The clock could be going away for some time.  Set the SDRAMs
@@ -327,7 +326,7 @@ static int sa1110_target(struct cpufreq_policy *policy,
 	 */
 	sdram_update_refresh(freqs.new, sdram);
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 
 	return 0;
 }
diff --git a/arch/arm/mach-tegra/cpu-tegra.c b/arch/arm/mach-tegra/cpu-tegra.c
index e3d6e15..11ca730 100644
--- a/arch/arm/mach-tegra/cpu-tegra.c
+++ b/arch/arm/mach-tegra/cpu-tegra.c
@@ -106,7 +106,8 @@ out:
 	return ret;
 }
 
-static int tegra_update_cpu_speed(unsigned long rate)
+static int tegra_update_cpu_speed(struct cpufreq_policy *policy,
+		unsigned long rate)
 {
 	int ret = 0;
 	struct cpufreq_freqs freqs;
@@ -128,8 +129,7 @@ static int tegra_update_cpu_speed(unsigned long rate)
 	else
 		clk_set_rate(emc_clk, 100000000);  /* emc 50Mhz */
 
-	for_each_online_cpu(freqs.cpu)
-		cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 
 #ifdef CONFIG_CPU_FREQ_DEBUG
 	printk(KERN_DEBUG "cpufreq-tegra: transition: %u --> %u\n",
@@ -143,8 +143,7 @@ static int tegra_update_cpu_speed(unsigned long rate)
 		return ret;
 	}
 
-	for_each_online_cpu(freqs.cpu)
-		cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 
 	return 0;
 }
@@ -181,7 +180,7 @@ static int tegra_target(struct cpufreq_policy *policy,
 
 	target_cpu_speed[policy->cpu] = freq;
 
-	ret = tegra_update_cpu_speed(tegra_cpu_highest_speed());
+	ret = tegra_update_cpu_speed(policy, tegra_cpu_highest_speed());
 
 out:
 	mutex_unlock(&tegra_cpu_lock);
@@ -193,10 +192,12 @@ static int tegra_pm_notify(struct notifier_block *nb, unsigned long event,
 {
 	mutex_lock(&tegra_cpu_lock);
 	if (event == PM_SUSPEND_PREPARE) {
+		struct cpufreq_policy *policy = cpufreq_cpu_get(0);
 		is_suspended = true;
 		pr_info("Tegra cpufreq suspend: setting frequency to %d kHz\n",
 			freq_table[0].frequency);
-		tegra_update_cpu_speed(freq_table[0].frequency);
+		tegra_update_cpu_speed(policy, freq_table[0].frequency);
+		cpufreq_cpu_put(policy);
 	} else if (event == PM_POST_SUSPEND) {
 		is_suspended = false;
 	}
diff --git a/arch/avr32/mach-at32ap/cpufreq.c b/arch/avr32/mach-at32ap/cpufreq.c
index 18b7656..6544887 100644
--- a/arch/avr32/mach-at32ap/cpufreq.c
+++ b/arch/avr32/mach-at32ap/cpufreq.c
@@ -61,7 +61,6 @@ static int at32_set_target(struct cpufreq_policy *policy,
 
 	freqs.old = at32_get_speed(0);
 	freqs.new = (freq + 500) / 1000;
-	freqs.cpu = 0;
 	freqs.flags = 0;
 
 	if (!ref_freq) {
@@ -69,7 +68,7 @@ static int at32_set_target(struct cpufreq_policy *policy,
 		loops_per_jiffy_ref = boot_cpu_data.loops_per_jiffy;
 	}
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 	if (freqs.old < freqs.new)
 		boot_cpu_data.loops_per_jiffy = cpufreq_scale(
 				loops_per_jiffy_ref, ref_freq, freqs.new);
@@ -77,7 +76,7 @@ static int at32_set_target(struct cpufreq_policy *policy,
 	if (freqs.new < freqs.old)
 		boot_cpu_data.loops_per_jiffy = cpufreq_scale(
 				loops_per_jiffy_ref, ref_freq, freqs.new);
-	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 
 	pr_debug("cpufreq: set frequency %lu Hz\n", freq);
 
diff --git a/arch/blackfin/mach-common/cpufreq.c b/arch/blackfin/mach-common/cpufreq.c
index d88bd31..995511e80 100644
--- a/arch/blackfin/mach-common/cpufreq.c
+++ b/arch/blackfin/mach-common/cpufreq.c
@@ -127,13 +127,13 @@ unsigned long cpu_set_cclk(int cpu, unsigned long new)
 }
 #endif
 
-static int bfin_target(struct cpufreq_policy *poli,
+static int bfin_target(struct cpufreq_policy *policy,
 			unsigned int target_freq, unsigned int relation)
 {
 #ifndef CONFIG_BF60x
 	unsigned int plldiv;
 #endif
-	unsigned int index, cpu;
+	unsigned int index;
 	unsigned long cclk_hz;
 	struct cpufreq_freqs freqs;
 	static unsigned long lpj_ref;
@@ -144,59 +144,48 @@ static int bfin_target(struct cpufreq_policy *poli,
 	cycles_t cycles;
 #endif
 
-	for_each_online_cpu(cpu) {
-		struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
+	if (cpufreq_frequency_table_target(policy, bfin_freq_table, target_freq,
+				relation, &index))
+		return -EINVAL;
 
-		if (!policy)
-			continue;
+	cclk_hz = bfin_freq_table[index].frequency;
 
-		if (cpufreq_frequency_table_target(policy, bfin_freq_table,
-				 target_freq, relation, &index))
-			return -EINVAL;
+	freqs.old = bfin_getfreq_khz(0);
+	freqs.new = cclk_hz;
 
-		cclk_hz = bfin_freq_table[index].frequency;
+	pr_debug("cpufreq: changing cclk to %lu; target = %u, oldfreq = %u\n",
+			cclk_hz, target_freq, freqs.old);
 
-		freqs.old = bfin_getfreq_khz(0);
-		freqs.new = cclk_hz;
-		freqs.cpu = cpu;
-
-		pr_debug("cpufreq: changing cclk to %lu; target = %u, oldfreq = %u\n",
-			 cclk_hz, target_freq, freqs.old);
-
-		cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
-		if (cpu == CPUFREQ_CPU) {
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 #ifndef CONFIG_BF60x
-			plldiv = (bfin_read_PLL_DIV() & SSEL) |
-						dpm_state_table[index].csel;
-			bfin_write_PLL_DIV(plldiv);
+	plldiv = (bfin_read_PLL_DIV() & SSEL) | dpm_state_table[index].csel;
+	bfin_write_PLL_DIV(plldiv);
 #else
-			ret = cpu_set_cclk(cpu, freqs.new * 1000);
-			if (ret != 0) {
-				WARN_ONCE(ret, "cpufreq set freq failed %d\n", ret);
-				break;
-			}
+	ret = cpu_set_cclk(policy->cpu, freqs.new * 1000);
+	if (ret != 0) {
+		WARN_ONCE(ret, "cpufreq set freq failed %d\n", ret);
+		return ret;
+	}
 #endif
-			on_each_cpu(bfin_adjust_core_timer, &index, 1);
+	on_each_cpu(bfin_adjust_core_timer, &index, 1);
 #if defined(CONFIG_CYCLES_CLOCKSOURCE)
-			cycles = get_cycles();
-			SSYNC();
-			cycles += 10; /* ~10 cycles we lose after get_cycles() */
-			__bfin_cycles_off +=
-			    (cycles << __bfin_cycles_mod) - (cycles << index);
-			__bfin_cycles_mod = index;
+	cycles = get_cycles();
+	SSYNC();
+	cycles += 10; /* ~10 cycles we lose after get_cycles() */
+	__bfin_cycles_off += (cycles << __bfin_cycles_mod) - (cycles << index);
+	__bfin_cycles_mod = index;
 #endif
-			if (!lpj_ref_freq) {
-				lpj_ref = loops_per_jiffy;
-				lpj_ref_freq = freqs.old;
-			}
-			if (freqs.new != freqs.old) {
-				loops_per_jiffy = cpufreq_scale(lpj_ref,
-						lpj_ref_freq, freqs.new);
-			}
-		}
-		/* TODO: just test case for cycles clock source, remove later */
-		cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
+	if (!lpj_ref_freq) {
+		lpj_ref = loops_per_jiffy;
+		lpj_ref_freq = freqs.old;
 	}
+	if (freqs.new != freqs.old) {
+		loops_per_jiffy = cpufreq_scale(lpj_ref,
+				lpj_ref_freq, freqs.new);
+	}
+
+	/* TODO: just test case for cycles clock source, remove later */
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 
 	pr_debug("cpufreq: done\n");
 	return ret;
diff --git a/arch/cris/arch-v32/mach-a3/cpufreq.c b/arch/cris/arch-v32/mach-a3/cpufreq.c
index ee391ec..ee142c4 100644
--- a/arch/cris/arch-v32/mach-a3/cpufreq.c
+++ b/arch/cris/arch-v32/mach-a3/cpufreq.c
@@ -27,23 +27,17 @@ static unsigned int cris_freq_get_cpu_frequency(unsigned int cpu)
 	return clk_ctrl.pll ? 200000 : 6000;
 }
 
-static void cris_freq_set_cpu_state(unsigned int state)
+static void cris_freq_set_cpu_state(struct cpufreq_policy *policy,
+		unsigned int state)
 {
-	int i = 0;
 	struct cpufreq_freqs freqs;
 	reg_clkgen_rw_clk_ctrl clk_ctrl;
 	clk_ctrl = REG_RD(clkgen, regi_clkgen, rw_clk_ctrl);
 
-#ifdef CONFIG_SMP
-	for_each_present_cpu(i)
-#endif
-	{
-		freqs.old = cris_freq_get_cpu_frequency(i);
-		freqs.new = cris_freq_table[state].frequency;
-		freqs.cpu = i;
-	}
+	freqs.old = cris_freq_get_cpu_frequency(policy->cpu);
+	freqs.new = cris_freq_table[state].frequency;
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 
 	local_irq_disable();
 
@@ -57,7 +51,7 @@ static void cris_freq_set_cpu_state(unsigned int state)
 
 	local_irq_enable();
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 };
 
 static int cris_freq_verify(struct cpufreq_policy *policy)
@@ -75,7 +69,7 @@ static int cris_freq_target(struct cpufreq_policy *policy,
 			target_freq, relation, &newstate))
 		return -EINVAL;
 
-	cris_freq_set_cpu_state(newstate);
+	cris_freq_set_cpu_state(policy, newstate);
 
 	return 0;
 }
diff --git a/arch/cris/arch-v32/mach-fs/cpufreq.c b/arch/cris/arch-v32/mach-fs/cpufreq.c
index d92cf70..1295223 100644
--- a/arch/cris/arch-v32/mach-fs/cpufreq.c
+++ b/arch/cris/arch-v32/mach-fs/cpufreq.c
@@ -27,20 +27,17 @@ static unsigned int cris_freq_get_cpu_frequency(unsigned int cpu)
 	return clk_ctrl.pll ? 200000 : 6000;
 }
 
-static void cris_freq_set_cpu_state(unsigned int state)
+static void cris_freq_set_cpu_state(struct cpufreq_policy *policy,
+		unsigned int state)
 {
-	int i;
 	struct cpufreq_freqs freqs;
 	reg_config_rw_clk_ctrl clk_ctrl;
 	clk_ctrl = REG_RD(config, regi_config, rw_clk_ctrl);
 
-	for_each_possible_cpu(i) {
-		freqs.old = cris_freq_get_cpu_frequency(i);
-		freqs.new = cris_freq_table[state].frequency;
-		freqs.cpu = i;
-	}
+	freqs.old = cris_freq_get_cpu_frequency(policy->cpu);
+	freqs.new = cris_freq_table[state].frequency;
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 
 	local_irq_disable();
 
@@ -54,7 +51,7 @@ static void cris_freq_set_cpu_state(unsigned int state)
 
 	local_irq_enable();
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 };
 
 static int cris_freq_verify(struct cpufreq_policy *policy)
@@ -71,7 +68,7 @@ static int cris_freq_target(struct cpufreq_policy *policy,
 	    (policy, cris_freq_table, target_freq, relation, &newstate))
 		return -EINVAL;
 
-	cris_freq_set_cpu_state(newstate);
+	cris_freq_set_cpu_state(policy, newstate);
 
 	return 0;
 }
diff --git a/arch/ia64/kernel/cpufreq/acpi-cpufreq.c b/arch/ia64/kernel/cpufreq/acpi-cpufreq.c
index f09b174..4700fef 100644
--- a/arch/ia64/kernel/cpufreq/acpi-cpufreq.c
+++ b/arch/ia64/kernel/cpufreq/acpi-cpufreq.c
@@ -137,7 +137,7 @@ migrate_end:
 static int
 processor_set_freq (
 	struct cpufreq_acpi_io	*data,
-	unsigned int		cpu,
+	struct cpufreq_policy   *policy,
 	int			state)
 {
 	int			ret = 0;
@@ -149,8 +149,8 @@ processor_set_freq (
 	pr_debug("processor_set_freq\n");
 
 	saved_mask = current->cpus_allowed;
-	set_cpus_allowed_ptr(current, cpumask_of(cpu));
-	if (smp_processor_id() != cpu) {
+	set_cpus_allowed_ptr(current, cpumask_of(policy->cpu));
+	if (smp_processor_id() != policy->cpu) {
 		retval = -EAGAIN;
 		goto migrate_end;
 	}
@@ -170,12 +170,11 @@ processor_set_freq (
 		data->acpi_data.state, state);
 
 	/* cpufreq frequency struct */
-	cpufreq_freqs.cpu = cpu;
 	cpufreq_freqs.old = data->freq_table[data->acpi_data.state].frequency;
 	cpufreq_freqs.new = data->freq_table[state].frequency;
 
 	/* notify cpufreq */
-	cpufreq_notify_transition(&cpufreq_freqs, CPUFREQ_PRECHANGE);
+	cpufreq_notify_transition(policy, &cpufreq_freqs, CPUFREQ_PRECHANGE);
 
 	/*
 	 * First we write the target state's 'control' value to the
@@ -189,17 +188,20 @@ processor_set_freq (
 	ret = processor_set_pstate(value);
 	if (ret) {
 		unsigned int tmp = cpufreq_freqs.new;
-		cpufreq_notify_transition(&cpufreq_freqs, CPUFREQ_POSTCHANGE);
+		cpufreq_notify_transition(policy, &cpufreq_freqs,
+				CPUFREQ_POSTCHANGE);
 		cpufreq_freqs.new = cpufreq_freqs.old;
 		cpufreq_freqs.old = tmp;
-		cpufreq_notify_transition(&cpufreq_freqs, CPUFREQ_PRECHANGE);
-		cpufreq_notify_transition(&cpufreq_freqs, CPUFREQ_POSTCHANGE);
+		cpufreq_notify_transition(policy, &cpufreq_freqs,
+				CPUFREQ_PRECHANGE);
+		cpufreq_notify_transition(policy, &cpufreq_freqs,
+				CPUFREQ_POSTCHANGE);
 		printk(KERN_WARNING "Transition failed with error %d\n", ret);
 		retval = -ENODEV;
 		goto migrate_end;
 	}
 
-	cpufreq_notify_transition(&cpufreq_freqs, CPUFREQ_POSTCHANGE);
+	cpufreq_notify_transition(policy, &cpufreq_freqs, CPUFREQ_POSTCHANGE);
 
 	data->acpi_data.state = state;
 
@@ -240,7 +242,7 @@ acpi_cpufreq_target (
 	if (result)
 		return (result);
 
-	result = processor_set_freq(data, policy->cpu, next_state);
+	result = processor_set_freq(data, policy, next_state);
 
 	return (result);
 }
diff --git a/arch/mips/kernel/cpufreq/loongson2_cpufreq.c b/arch/mips/kernel/cpufreq/loongson2_cpufreq.c
index 3237c52..bafda70 100644
--- a/arch/mips/kernel/cpufreq/loongson2_cpufreq.c
+++ b/arch/mips/kernel/cpufreq/loongson2_cpufreq.c
@@ -80,7 +80,6 @@ static int loongson2_cpufreq_target(struct cpufreq_policy *policy,
 
 	pr_debug("cpufreq: requested frequency %u Hz\n", target_freq * 1000);
 
-	freqs.cpu = cpu;
 	freqs.old = loongson2_cpufreq_get(cpu);
 	freqs.new = freq;
 	freqs.flags = 0;
@@ -89,7 +88,7 @@ static int loongson2_cpufreq_target(struct cpufreq_policy *policy,
 		return 0;
 
 	/* notifiers */
-	cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 
 	set_cpus_allowed_ptr(current, &cpus_allowed);
 
@@ -97,7 +96,7 @@ static int loongson2_cpufreq_target(struct cpufreq_policy *policy,
 	clk_set_rate(cpuclk, freq);
 
 	/* notifiers */
-	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 
 	pr_debug("cpufreq: set frequency %u kHz\n", freq);
 
diff --git a/arch/powerpc/platforms/cell/cbe_cpufreq.c b/arch/powerpc/platforms/cell/cbe_cpufreq.c
index d4c39e3..718c6a3 100644
--- a/arch/powerpc/platforms/cell/cbe_cpufreq.c
+++ b/arch/powerpc/platforms/cell/cbe_cpufreq.c
@@ -156,10 +156,9 @@ static int cbe_cpufreq_target(struct cpufreq_policy *policy,
 
 	freqs.old = policy->cur;
 	freqs.new = cbe_freqs[cbe_pmode_new].frequency;
-	freqs.cpu = policy->cpu;
 
 	mutex_lock(&cbe_switch_mutex);
-	cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 
 	pr_debug("setting frequency for cpu %d to %d kHz, " \
 		 "1/%d of max frequency\n",
@@ -169,7 +168,7 @@ static int cbe_cpufreq_target(struct cpufreq_policy *policy,
 
 	rc = set_pmode(policy->cpu, cbe_pmode_new);
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 	mutex_unlock(&cbe_switch_mutex);
 
 	return rc;
diff --git a/arch/powerpc/platforms/pasemi/cpufreq.c b/arch/powerpc/platforms/pasemi/cpufreq.c
index 890f30e..be1e795 100644
--- a/arch/powerpc/platforms/pasemi/cpufreq.c
+++ b/arch/powerpc/platforms/pasemi/cpufreq.c
@@ -273,10 +273,9 @@ static int pas_cpufreq_target(struct cpufreq_policy *policy,
 
 	freqs.old = policy->cur;
 	freqs.new = pas_freqs[pas_astate_new].frequency;
-	freqs.cpu = policy->cpu;
 
 	mutex_lock(&pas_switch_mutex);
-	cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 
 	pr_debug("setting frequency for cpu %d to %d kHz, 1/%d of max frequency\n",
 		 policy->cpu,
@@ -288,7 +287,7 @@ static int pas_cpufreq_target(struct cpufreq_policy *policy,
 	for_each_online_cpu(i)
 		set_astate(i, pas_astate_new);
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 	mutex_unlock(&pas_switch_mutex);
 
 	ppc_proc_freq = freqs.new * 1000ul;
diff --git a/arch/powerpc/platforms/powermac/cpufreq_32.c b/arch/powerpc/platforms/powermac/cpufreq_32.c
index 311b804..3104fad 100644
--- a/arch/powerpc/platforms/powermac/cpufreq_32.c
+++ b/arch/powerpc/platforms/powermac/cpufreq_32.c
@@ -335,7 +335,8 @@ static int pmu_set_cpu_speed(int low_speed)
 	return 0;
 }
 
-static int do_set_cpu_speed(int speed_mode, int notify)
+static int do_set_cpu_speed(struct cpufreq_policy *policy, int speed_mode,
+		int notify)
 {
 	struct cpufreq_freqs freqs;
 	unsigned long l3cr;
@@ -343,13 +344,12 @@ static int do_set_cpu_speed(int speed_mode, int notify)
 
 	freqs.old = cur_freq;
 	freqs.new = (speed_mode == CPUFREQ_HIGH) ? hi_freq : low_freq;
-	freqs.cpu = smp_processor_id();
 
 	if (freqs.old == freqs.new)
 		return 0;
 
 	if (notify)
-		cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
+		cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 	if (speed_mode == CPUFREQ_LOW &&
 	    cpu_has_feature(CPU_FTR_L3CR)) {
 		l3cr = _get_L3CR();
@@ -366,7 +366,7 @@ static int do_set_cpu_speed(int speed_mode, int notify)
 			_set_L3CR(prev_l3cr);
 	}
 	if (notify)
-		cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
+		cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 	cur_freq = (speed_mode == CPUFREQ_HIGH) ? hi_freq : low_freq;
 
 	return 0;
@@ -393,7 +393,7 @@ static int pmac_cpufreq_target(	struct cpufreq_policy *policy,
 			target_freq, relation, &newstate))
 		return -EINVAL;
 
-	rc = do_set_cpu_speed(newstate, 1);
+	rc = do_set_cpu_speed(policy, newstate, 1);
 
 	ppc_proc_freq = cur_freq * 1000ul;
 	return rc;
@@ -442,7 +442,7 @@ static int pmac_cpufreq_suspend(struct cpufreq_policy *policy)
 	no_schedule = 1;
 	sleep_freq = cur_freq;
 	if (cur_freq == low_freq && !is_pmu_based)
-		do_set_cpu_speed(CPUFREQ_HIGH, 0);
+		do_set_cpu_speed(policy, CPUFREQ_HIGH, 0);
 	return 0;
 }
 
@@ -458,7 +458,7 @@ static int pmac_cpufreq_resume(struct cpufreq_policy *policy)
 	 * is that we force a switch to whatever it was, which is
 	 * probably high speed due to our suspend() routine
 	 */
-	do_set_cpu_speed(sleep_freq == low_freq ?
+	do_set_cpu_speed(policy, sleep_freq == low_freq ?
 			 CPUFREQ_LOW : CPUFREQ_HIGH, 0);
 
 	ppc_proc_freq = cur_freq * 1000ul;
diff --git a/arch/powerpc/platforms/powermac/cpufreq_64.c b/arch/powerpc/platforms/powermac/cpufreq_64.c
index 9650c60..7ba4234 100644
--- a/arch/powerpc/platforms/powermac/cpufreq_64.c
+++ b/arch/powerpc/platforms/powermac/cpufreq_64.c
@@ -339,11 +339,10 @@ static int g5_cpufreq_target(struct cpufreq_policy *policy,
 
 	freqs.old = g5_cpu_freqs[g5_pmode_cur].frequency;
 	freqs.new = g5_cpu_freqs[newstate].frequency;
-	freqs.cpu = 0;
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 	rc = g5_switch_freq(newstate);
-	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 
 	mutex_unlock(&g5_switch_mutex);
 
diff --git a/arch/sh/kernel/cpufreq.c b/arch/sh/kernel/cpufreq.c
index e68b45b..2c7bd94 100644
--- a/arch/sh/kernel/cpufreq.c
+++ b/arch/sh/kernel/cpufreq.c
@@ -69,15 +69,14 @@ static int sh_cpufreq_target(struct cpufreq_policy *policy,
 
 	dev_dbg(dev, "requested frequency %u Hz\n", target_freq * 1000);
 
-	freqs.cpu	= cpu;
 	freqs.old	= sh_cpufreq_get(cpu);
 	freqs.new	= (freq + 500) / 1000;
 	freqs.flags	= 0;
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 	set_cpus_allowed_ptr(current, &cpus_allowed);
 	clk_set_rate(cpuclk, freq);
-	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 
 	dev_dbg(dev, "set frequency %lu Hz\n", freq);
 
diff --git a/arch/sparc/kernel/us2e_cpufreq.c b/arch/sparc/kernel/us2e_cpufreq.c
index 489fc15..abe963d 100644
--- a/arch/sparc/kernel/us2e_cpufreq.c
+++ b/arch/sparc/kernel/us2e_cpufreq.c
@@ -248,8 +248,10 @@ static unsigned int us2e_freq_get(unsigned int cpu)
 	return clock_tick / estar_to_divisor(estar);
 }
 
-static void us2e_set_cpu_divider_index(unsigned int cpu, unsigned int index)
+static void us2e_set_cpu_divider_index(struct cpufreq_policy *policy,
+		unsigned int index)
 {
+	unsigned int cpu = policy->cpu;
 	unsigned long new_bits, new_freq;
 	unsigned long clock_tick, divisor, old_divisor, estar;
 	cpumask_t cpus_allowed;
@@ -272,14 +274,13 @@ static void us2e_set_cpu_divider_index(unsigned int cpu, unsigned int index)
 
 	freqs.old = clock_tick / old_divisor;
 	freqs.new = new_freq;
-	freqs.cpu = cpu;
-	cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 
 	if (old_divisor != divisor)
 		us2e_transition(estar, new_bits, clock_tick * 1000,
 				old_divisor, divisor);
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 
 	set_cpus_allowed_ptr(current, &cpus_allowed);
 }
@@ -295,7 +296,7 @@ static int us2e_freq_target(struct cpufreq_policy *policy,
 					   target_freq, relation, &new_index))
 		return -EINVAL;
 
-	us2e_set_cpu_divider_index(policy->cpu, new_index);
+	us2e_set_cpu_divider_index(policy, new_index);
 
 	return 0;
 }
@@ -335,7 +336,7 @@ static int __init us2e_freq_cpu_init(struct cpufreq_policy *policy)
 static int us2e_freq_cpu_exit(struct cpufreq_policy *policy)
 {
 	if (cpufreq_us2e_driver)
-		us2e_set_cpu_divider_index(policy->cpu, 0);
+		us2e_set_cpu_divider_index(policy, 0);
 
 	return 0;
 }
diff --git a/arch/sparc/kernel/us3_cpufreq.c b/arch/sparc/kernel/us3_cpufreq.c
index eb1624b..7ceb9c8 100644
--- a/arch/sparc/kernel/us3_cpufreq.c
+++ b/arch/sparc/kernel/us3_cpufreq.c
@@ -96,8 +96,10 @@ static unsigned int us3_freq_get(unsigned int cpu)
 	return ret;
 }
 
-static void us3_set_cpu_divider_index(unsigned int cpu, unsigned int index)
+static void us3_set_cpu_divider_index(struct cpufreq_policy *policy,
+		unsigned int index)
 {
+	unsigned int cpu = policy->cpu;
 	unsigned long new_bits, new_freq, reg;
 	cpumask_t cpus_allowed;
 	struct cpufreq_freqs freqs;
@@ -131,14 +133,13 @@ static void us3_set_cpu_divider_index(unsigned int cpu, unsigned int index)
 
 	freqs.old = get_current_freq(cpu, reg);
 	freqs.new = new_freq;
-	freqs.cpu = cpu;
-	cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 
 	reg &= ~SAFARI_CFG_DIV_MASK;
 	reg |= new_bits;
 	write_safari_cfg(reg);
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 
 	set_cpus_allowed_ptr(current, &cpus_allowed);
 }
@@ -156,7 +157,7 @@ static int us3_freq_target(struct cpufreq_policy *policy,
 					   &new_index))
 		return -EINVAL;
 
-	us3_set_cpu_divider_index(policy->cpu, new_index);
+	us3_set_cpu_divider_index(policy, new_index);
 
 	return 0;
 }
@@ -192,7 +193,7 @@ static int __init us3_freq_cpu_init(struct cpufreq_policy *policy)
 static int us3_freq_cpu_exit(struct cpufreq_policy *policy)
 {
 	if (cpufreq_us3_driver)
-		us3_set_cpu_divider_index(policy->cpu, 0);
+		us3_set_cpu_divider_index(policy, 0);
 
 	return 0;
 }
diff --git a/arch/unicore32/kernel/cpu-ucv2.c b/arch/unicore32/kernel/cpu-ucv2.c
index 4a99f62..ba5a71c 100644
--- a/arch/unicore32/kernel/cpu-ucv2.c
+++ b/arch/unicore32/kernel/cpu-ucv2.c
@@ -52,15 +52,14 @@ static int ucv2_target(struct cpufreq_policy *policy,
 	struct cpufreq_freqs freqs;
 	struct clk *mclk = clk_get(NULL, "MAIN_CLK");
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 
 	if (!clk_set_rate(mclk, target_freq * 1000)) {
 		freqs.old = cur;
 		freqs.new = target_freq;
-		freqs.cpu = 0;
 	}
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 
 	return 0;
 }
diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c
index 57a8774..11b8b4b 100644
--- a/drivers/cpufreq/acpi-cpufreq.c
+++ b/drivers/cpufreq/acpi-cpufreq.c
@@ -423,7 +423,6 @@ static int acpi_cpufreq_target(struct cpufreq_policy *policy,
 	struct drv_cmd cmd;
 	unsigned int next_state = 0; /* Index into freq_table */
 	unsigned int next_perf_state = 0; /* Index into perf table */
-	unsigned int i;
 	int result = 0;
 
 	pr_debug("acpi_cpufreq_target %d (%d)\n", target_freq, policy->cpu);
@@ -486,10 +485,7 @@ static int acpi_cpufreq_target(struct cpufreq_policy *policy,
 
 	freqs.old = perf->states[perf->state].core_frequency * 1000;
 	freqs.new = data->freq_table[next_state].frequency;
-	for_each_cpu(i, policy->cpus) {
-		freqs.cpu = i;
-		cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
-	}
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 
 	drv_write(&cmd);
 
@@ -502,10 +498,7 @@ static int acpi_cpufreq_target(struct cpufreq_policy *policy,
 		}
 	}
 
-	for_each_cpu(i, policy->cpus) {
-		freqs.cpu = i;
-		cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
-	}
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 	perf->state = next_perf_state;
 
 out:
diff --git a/drivers/cpufreq/cpufreq-cpu0.c b/drivers/cpufreq/cpufreq-cpu0.c
index a7e51bd..6561853 100644
--- a/drivers/cpufreq/cpufreq-cpu0.c
+++ b/drivers/cpufreq/cpufreq-cpu0.c
@@ -46,7 +46,7 @@ static int cpu0_set_target(struct cpufreq_policy *policy,
 	struct opp *opp;
 	unsigned long volt = 0, volt_old = 0, tol = 0;
 	long freq_Hz;
-	unsigned int index, cpu;
+	unsigned int index;
 	int ret;
 
 	ret = cpufreq_frequency_table_target(policy, freq_table, target_freq,
@@ -66,10 +66,7 @@ static int cpu0_set_target(struct cpufreq_policy *policy,
 	if (freqs.old == freqs.new)
 		return 0;
 
-	for_each_online_cpu(cpu) {
-		freqs.cpu = cpu;
-		cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
-	}
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 
 	if (cpu_reg) {
 		rcu_read_lock();
@@ -121,10 +118,7 @@ static int cpu0_set_target(struct cpufreq_policy *policy,
 	}
 
 post_notify:
-	for_each_online_cpu(cpu) {
-		freqs.cpu = cpu;
-		cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
-	}
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 
 	return ret;
 }
diff --git a/drivers/cpufreq/cpufreq-nforce2.c b/drivers/cpufreq/cpufreq-nforce2.c
index 13d311e..224a478 100644
--- a/drivers/cpufreq/cpufreq-nforce2.c
+++ b/drivers/cpufreq/cpufreq-nforce2.c
@@ -263,7 +263,6 @@ static int nforce2_target(struct cpufreq_policy *policy,
 
 	freqs.old = nforce2_get(policy->cpu);
 	freqs.new = target_fsb * fid * 100;
-	freqs.cpu = 0;		/* Only one CPU on nForce2 platforms */
 
 	if (freqs.old == freqs.new)
 		return 0;
@@ -271,7 +270,7 @@ static int nforce2_target(struct cpufreq_policy *policy,
 	pr_debug("Old CPU frequency %d kHz, new %d kHz\n",
 	       freqs.old, freqs.new);
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 
 	/* Disable IRQs */
 	/* local_irq_save(flags); */
@@ -286,7 +285,7 @@ static int nforce2_target(struct cpufreq_policy *policy,
 	/* Enable IRQs */
 	/* local_irq_restore(flags); */
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 
 	return 0;
 }
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 85963fc..0198cd0 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -249,19 +249,9 @@ static inline void adjust_jiffies(unsigned long val, struct cpufreq_freqs *ci)
 #endif
 
 
-/**
- * cpufreq_notify_transition - call notifier chain and adjust_jiffies
- * on frequency transition.
- *
- * This function calls the transition notifiers and the "adjust_jiffies"
- * function. It is called twice on all CPU frequency changes that have
- * external effects.
- */
-void cpufreq_notify_transition(struct cpufreq_freqs *freqs, unsigned int state)
+void __cpufreq_notify_transition(struct cpufreq_policy *policy,
+		struct cpufreq_freqs *freqs, unsigned int state)
 {
-	struct cpufreq_policy *policy;
-	unsigned long flags;
-
 	BUG_ON(irqs_disabled());
 
 	if (cpufreq_disabled())
@@ -271,10 +261,6 @@ void cpufreq_notify_transition(struct cpufreq_freqs *freqs, unsigned int state)
 	pr_debug("notification %u of frequency transition to %u kHz\n",
 		state, freqs->new);
 
-	read_lock_irqsave(&cpufreq_driver_lock, flags);
-	policy = per_cpu(cpufreq_cpu_data, freqs->cpu);
-	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
-
 	switch (state) {
 
 	case CPUFREQ_PRECHANGE:
@@ -308,6 +294,20 @@ void cpufreq_notify_transition(struct cpufreq_freqs *freqs, unsigned int state)
 		break;
 	}
 }
+/**
+ * cpufreq_notify_transition - call notifier chain and adjust_jiffies
+ * on frequency transition.
+ *
+ * This function calls the transition notifiers and the "adjust_jiffies"
+ * function. It is called twice on all CPU frequency changes that have
+ * external effects.
+ */
+void cpufreq_notify_transition(struct cpufreq_policy *policy,
+		struct cpufreq_freqs *freqs, unsigned int state)
+{
+	for_each_cpu(freqs->cpu, policy->cpus)
+		__cpufreq_notify_transition(policy, freqs, state);
+}
 EXPORT_SYMBOL_GPL(cpufreq_notify_transition);
 
 
@@ -1141,16 +1141,23 @@ static void handle_update(struct work_struct *work)
 static void cpufreq_out_of_sync(unsigned int cpu, unsigned int old_freq,
 				unsigned int new_freq)
 {
+	struct cpufreq_policy *policy;
 	struct cpufreq_freqs freqs;
+	unsigned long flags;
+
 
 	pr_debug("Warning: CPU frequency out of sync: cpufreq and timing "
 	       "core thinks of %u, is %u kHz.\n", old_freq, new_freq);
 
-	freqs.cpu = cpu;
 	freqs.old = old_freq;
 	freqs.new = new_freq;
-	cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
-	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
+
+	read_lock_irqsave(&cpufreq_driver_lock, flags);
+	policy = per_cpu(cpufreq_cpu_data, cpu);
+	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
+
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 }
 
 
diff --git a/drivers/cpufreq/dbx500-cpufreq.c b/drivers/cpufreq/dbx500-cpufreq.c
index 72f0c3e..7192a6d 100644
--- a/drivers/cpufreq/dbx500-cpufreq.c
+++ b/drivers/cpufreq/dbx500-cpufreq.c
@@ -55,8 +55,7 @@ static int dbx500_cpufreq_target(struct cpufreq_policy *policy,
 		return 0;
 
 	/* pre-change notification */
-	for_each_cpu(freqs.cpu, policy->cpus)
-		cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 
 	/* update armss clk frequency */
 	ret = clk_set_rate(armss_clk, freqs.new * 1000);
@@ -68,8 +67,7 @@ static int dbx500_cpufreq_target(struct cpufreq_policy *policy,
 	}
 
 	/* post change notification */
-	for_each_cpu(freqs.cpu, policy->cpus)
-		cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 
 	return 0;
 }
diff --git a/drivers/cpufreq/e_powersaver.c b/drivers/cpufreq/e_powersaver.c
index 3fffbe6..37380fb 100644
--- a/drivers/cpufreq/e_powersaver.c
+++ b/drivers/cpufreq/e_powersaver.c
@@ -104,7 +104,7 @@ static unsigned int eps_get(unsigned int cpu)
 }
 
 static int eps_set_state(struct eps_cpu_data *centaur,
-			 unsigned int cpu,
+			 struct cpufreq_policy *policy,
 			 u32 dest_state)
 {
 	struct cpufreq_freqs freqs;
@@ -112,10 +112,9 @@ static int eps_set_state(struct eps_cpu_data *centaur,
 	int err = 0;
 	int i;
 
-	freqs.old = eps_get(cpu);
+	freqs.old = eps_get(policy->cpu);
 	freqs.new = centaur->fsb * ((dest_state >> 8) & 0xff);
-	freqs.cpu = cpu;
-	cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 
 	/* Wait while CPU is busy */
 	rdmsr(MSR_IA32_PERF_STATUS, lo, hi);
@@ -162,7 +161,7 @@ postchange:
 		current_multiplier);
 	}
 #endif
-	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 	return err;
 }
 
@@ -190,7 +189,7 @@ static int eps_target(struct cpufreq_policy *policy,
 
 	/* Make frequency transition */
 	dest_state = centaur->freq_table[newstate].index & 0xffff;
-	ret = eps_set_state(centaur, cpu, dest_state);
+	ret = eps_set_state(centaur, policy, dest_state);
 	if (ret)
 		printk(KERN_ERR "eps: Timeout!\n");
 	return ret;
diff --git a/drivers/cpufreq/elanfreq.c b/drivers/cpufreq/elanfreq.c
index 960671f..658d860 100644
--- a/drivers/cpufreq/elanfreq.c
+++ b/drivers/cpufreq/elanfreq.c
@@ -117,15 +117,15 @@ static unsigned int elanfreq_get_cpu_frequency(unsigned int cpu)
  *	There is no return value.
  */
 
-static void elanfreq_set_cpu_state(unsigned int state)
+static void elanfreq_set_cpu_state(struct cpufreq_policy *policy,
+		unsigned int state)
 {
 	struct cpufreq_freqs    freqs;
 
 	freqs.old = elanfreq_get_cpu_frequency(0);
 	freqs.new = elan_multiplier[state].clock;
-	freqs.cpu = 0; /* elanfreq.c is UP only driver */
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 
 	printk(KERN_INFO "elanfreq: attempting to set frequency to %i kHz\n",
 			elan_multiplier[state].clock);
@@ -161,7 +161,7 @@ static void elanfreq_set_cpu_state(unsigned int state)
 	udelay(10000);
 	local_irq_enable();
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 };
 
 
@@ -188,7 +188,7 @@ static int elanfreq_target(struct cpufreq_policy *policy,
 				target_freq, relation, &newstate))
 		return -EINVAL;
 
-	elanfreq_set_cpu_state(newstate);
+	elanfreq_set_cpu_state(policy, newstate);
 
 	return 0;
 }
diff --git a/drivers/cpufreq/exynos-cpufreq.c b/drivers/cpufreq/exynos-cpufreq.c
index 78057a3..c0c4ce5 100644
--- a/drivers/cpufreq/exynos-cpufreq.c
+++ b/drivers/cpufreq/exynos-cpufreq.c
@@ -70,7 +70,6 @@ static int exynos_cpufreq_scale(unsigned int target_freq)
 
 	freqs.old = policy->cur;
 	freqs.new = target_freq;
-	freqs.cpu = policy->cpu;
 
 	if (freqs.new == freqs.old)
 		goto out;
@@ -105,8 +104,7 @@ static int exynos_cpufreq_scale(unsigned int target_freq)
 	}
 	arm_volt = volt_table[index];
 
-	for_each_cpu(freqs.cpu, policy->cpus)
-		cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 
 	/* When the new frequency is higher than current frequency */
 	if ((freqs.new > freqs.old) && !safe_arm_volt) {
@@ -131,8 +129,7 @@ static int exynos_cpufreq_scale(unsigned int target_freq)
 
 	exynos_info->set_freq(old_index, index);
 
-	for_each_cpu(freqs.cpu, policy->cpus)
-		cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 
 	/* When the new frequency is lower than current frequency */
 	if ((freqs.new < freqs.old) ||
diff --git a/drivers/cpufreq/gx-suspmod.c b/drivers/cpufreq/gx-suspmod.c
index 456bee0..3dfc99b 100644
--- a/drivers/cpufreq/gx-suspmod.c
+++ b/drivers/cpufreq/gx-suspmod.c
@@ -251,14 +251,13 @@ static unsigned int gx_validate_speed(unsigned int khz, u8 *on_duration,
  * set cpu speed in khz.
  **/
 
-static void gx_set_cpuspeed(unsigned int khz)
+static void gx_set_cpuspeed(struct cpufreq_policy *policy, unsigned int khz)
 {
 	u8 suscfg, pmer1;
 	unsigned int new_khz;
 	unsigned long flags;
 	struct cpufreq_freqs freqs;
 
-	freqs.cpu = 0;
 	freqs.old = gx_get_cpuspeed(0);
 
 	new_khz = gx_validate_speed(khz, &gx_params->on_duration,
@@ -266,11 +265,9 @@ static void gx_set_cpuspeed(unsigned int khz)
 
 	freqs.new = new_khz;
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 	local_irq_save(flags);
 
-
-
 	if (new_khz != stock_freq) {
 		/* if new khz == 100% of CPU speed, it is special case */
 		switch (gx_params->cs55x0->device) {
@@ -317,7 +314,7 @@ static void gx_set_cpuspeed(unsigned int khz)
 
 	gx_params->pci_suscfg = suscfg;
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 
 	pr_debug("suspend modulation w/ duration of ON:%d us, OFF:%d us\n",
 		gx_params->on_duration * 32, gx_params->off_duration * 32);
@@ -397,7 +394,7 @@ static int cpufreq_gx_target(struct cpufreq_policy *policy,
 		tmp_freq = gx_validate_speed(tmp_freq, &tmp1, &tmp2);
 	}
 
-	gx_set_cpuspeed(tmp_freq);
+	gx_set_cpuspeed(policy, tmp_freq);
 
 	return 0;
 }
diff --git a/drivers/cpufreq/imx6q-cpufreq.c b/drivers/cpufreq/imx6q-cpufreq.c
index 54e336d..b78bc35 100644
--- a/drivers/cpufreq/imx6q-cpufreq.c
+++ b/drivers/cpufreq/imx6q-cpufreq.c
@@ -50,7 +50,7 @@ static int imx6q_set_target(struct cpufreq_policy *policy,
 	struct cpufreq_freqs freqs;
 	struct opp *opp;
 	unsigned long freq_hz, volt, volt_old;
-	unsigned int index, cpu;
+	unsigned int index;
 	int ret;
 
 	ret = cpufreq_frequency_table_target(policy, freq_table, target_freq,
@@ -68,10 +68,7 @@ static int imx6q_set_target(struct cpufreq_policy *policy,
 	if (freqs.old == freqs.new)
 		return 0;
 
-	for_each_online_cpu(cpu) {
-		freqs.cpu = cpu;
-		cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
-	}
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 
 	rcu_read_lock();
 	opp = opp_find_freq_ceil(cpu_dev, &freq_hz);
@@ -166,10 +163,7 @@ static int imx6q_set_target(struct cpufreq_policy *policy,
 		}
 	}
 
-	for_each_online_cpu(cpu) {
-		freqs.cpu = cpu;
-		cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
-	}
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 
 	return 0;
 }
diff --git a/drivers/cpufreq/kirkwood-cpufreq.c b/drivers/cpufreq/kirkwood-cpufreq.c
index 6052476..d36ea8d 100644
--- a/drivers/cpufreq/kirkwood-cpufreq.c
+++ b/drivers/cpufreq/kirkwood-cpufreq.c
@@ -55,7 +55,8 @@ static unsigned int kirkwood_cpufreq_get_cpu_frequency(unsigned int cpu)
 	return kirkwood_freq_table[0].frequency;
 }
 
-static void kirkwood_cpufreq_set_cpu_state(unsigned int index)
+static void kirkwood_cpufreq_set_cpu_state(struct cpufreq_policy *policy,
+		unsigned int index)
 {
 	struct cpufreq_freqs freqs;
 	unsigned int state = kirkwood_freq_table[index].index;
@@ -63,9 +64,8 @@ static void kirkwood_cpufreq_set_cpu_state(unsigned int index)
 
 	freqs.old = kirkwood_cpufreq_get_cpu_frequency(0);
 	freqs.new = kirkwood_freq_table[index].frequency;
-	freqs.cpu = 0; /* Kirkwood is UP */
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 
 	dev_dbg(priv.dev, "Attempting to set frequency to %i KHz\n",
 		kirkwood_freq_table[index].frequency);
@@ -99,7 +99,7 @@ static void kirkwood_cpufreq_set_cpu_state(unsigned int index)
 
 		local_irq_enable();
 	}
-	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 };
 
 static int kirkwood_cpufreq_verify(struct cpufreq_policy *policy)
@@ -117,7 +117,7 @@ static int kirkwood_cpufreq_target(struct cpufreq_policy *policy,
 				target_freq, relation, &index))
 		return -EINVAL;
 
-	kirkwood_cpufreq_set_cpu_state(index);
+	kirkwood_cpufreq_set_cpu_state(policy, index);
 
 	return 0;
 }
diff --git a/drivers/cpufreq/longhaul.c b/drivers/cpufreq/longhaul.c
index 1180d53..b448638 100644
--- a/drivers/cpufreq/longhaul.c
+++ b/drivers/cpufreq/longhaul.c
@@ -242,7 +242,8 @@ static void do_powersaver(int cx_address, unsigned int mults_index,
  * Sets a new clock ratio.
  */
 
-static void longhaul_setstate(unsigned int table_index)
+static void longhaul_setstate(struct cpufreq_policy *policy,
+		unsigned int table_index)
 {
 	unsigned int mults_index;
 	int speed, mult;
@@ -267,9 +268,8 @@ static void longhaul_setstate(unsigned int table_index)
 
 	freqs.old = calc_speed(longhaul_get_cpu_mult());
 	freqs.new = speed;
-	freqs.cpu = 0; /* longhaul.c is UP only driver */
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 
 	pr_debug("Setting to FSB:%dMHz Mult:%d.%dx (%s)\n",
 			fsb, mult/10, mult%10, print_speed(speed/1000));
@@ -386,7 +386,7 @@ retry_loop:
 		}
 	}
 	/* Report true CPU frequency */
-	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 
 	if (!bm_timeout)
 		printk(KERN_INFO PFX "Warning: Timeout while waiting for "
@@ -648,7 +648,7 @@ static int longhaul_target(struct cpufreq_policy *policy,
 		return 0;
 
 	if (!can_scale_voltage)
-		longhaul_setstate(table_index);
+		longhaul_setstate(policy, table_index);
 	else {
 		/* On test system voltage transitions exceeding single
 		 * step up or down were turning motherboard off. Both
@@ -663,7 +663,7 @@ static int longhaul_target(struct cpufreq_policy *policy,
 		while (i != table_index) {
 			vid = (longhaul_table[i].index >> 8) & 0x1f;
 			if (vid != current_vid) {
-				longhaul_setstate(i);
+				longhaul_setstate(policy, i);
 				current_vid = vid;
 				msleep(200);
 			}
@@ -672,7 +672,7 @@ static int longhaul_target(struct cpufreq_policy *policy,
 			else
 				i--;
 		}
-		longhaul_setstate(table_index);
+		longhaul_setstate(policy, table_index);
 	}
 	longhaul_index = table_index;
 	return 0;
@@ -998,15 +998,17 @@ static int __init longhaul_init(void)
 
 static void __exit longhaul_exit(void)
 {
+	struct cpufreq_policy *policy = cpufreq_cpu_get(0);
 	int i;
 
 	for (i = 0; i < numscales; i++) {
 		if (mults[i] == maxmult) {
-			longhaul_setstate(i);
+			longhaul_setstate(policy, i);
 			break;
 		}
 	}
 
+	cpufreq_cpu_put(policy);
 	cpufreq_unregister_driver(&longhaul_driver);
 	kfree(longhaul_table);
 }
diff --git a/drivers/cpufreq/maple-cpufreq.c b/drivers/cpufreq/maple-cpufreq.c
index d4c4989..cdd6291 100644
--- a/drivers/cpufreq/maple-cpufreq.c
+++ b/drivers/cpufreq/maple-cpufreq.c
@@ -158,11 +158,10 @@ static int maple_cpufreq_target(struct cpufreq_policy *policy,
 
 	freqs.old = maple_cpu_freqs[maple_pmode_cur].frequency;
 	freqs.new = maple_cpu_freqs[newstate].frequency;
-	freqs.cpu = 0;
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 	rc = maple_scom_switch_freq(newstate);
-	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 
 	mutex_unlock(&maple_switch_mutex);
 
diff --git a/drivers/cpufreq/omap-cpufreq.c b/drivers/cpufreq/omap-cpufreq.c
index 9128c07..b610edd 100644
--- a/drivers/cpufreq/omap-cpufreq.c
+++ b/drivers/cpufreq/omap-cpufreq.c
@@ -88,16 +88,12 @@ static int omap_target(struct cpufreq_policy *policy,
 	}
 
 	freqs.old = omap_getspeed(policy->cpu);
-	freqs.cpu = policy->cpu;
 
 	if (freqs.old == freqs.new && policy->cur == freqs.new)
 		return ret;
 
 	/* notifiers */
-	for_each_cpu(i, policy->cpus) {
-		freqs.cpu = i;
-		cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
-	}
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 
 	freq = freqs.new * 1000;
 	ret = clk_round_rate(mpu_clk, freq);
@@ -157,10 +153,7 @@ static int omap_target(struct cpufreq_policy *policy,
 
 done:
 	/* notifiers */
-	for_each_cpu(i, policy->cpus) {
-		freqs.cpu = i;
-		cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
-	}
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 
 	return ret;
 }
diff --git a/drivers/cpufreq/p4-clockmod.c b/drivers/cpufreq/p4-clockmod.c
index 827629c9..4b2e773 100644
--- a/drivers/cpufreq/p4-clockmod.c
+++ b/drivers/cpufreq/p4-clockmod.c
@@ -125,10 +125,7 @@ static int cpufreq_p4_target(struct cpufreq_policy *policy,
 		return 0;
 
 	/* notifiers */
-	for_each_cpu(i, policy->cpus) {
-		freqs.cpu = i;
-		cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
-	}
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 
 	/* run on each logical CPU,
 	 * see section 13.15.3 of IA32 Intel Architecture Software
@@ -138,10 +135,7 @@ static int cpufreq_p4_target(struct cpufreq_policy *policy,
 		cpufreq_p4_setdc(i, p4clockmod_table[newstate].index);
 
 	/* notifiers */
-	for_each_cpu(i, policy->cpus) {
-		freqs.cpu = i;
-		cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
-	}
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 
 	return 0;
 }
diff --git a/drivers/cpufreq/pcc-cpufreq.c b/drivers/cpufreq/pcc-cpufreq.c
index 503996a..0de0008 100644
--- a/drivers/cpufreq/pcc-cpufreq.c
+++ b/drivers/cpufreq/pcc-cpufreq.c
@@ -215,8 +215,7 @@ static int pcc_cpufreq_target(struct cpufreq_policy *policy,
 		(pcch_virt_addr + pcc_cpu_data->input_offset));
 
 	freqs.new = target_freq;
-	freqs.cpu = cpu;
-	cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 
 	input_buffer = 0x1 | (((target_freq * 100)
 			       / (ioread32(&pcch_hdr->nominal) * 1000)) << 8);
@@ -237,7 +236,7 @@ static int pcc_cpufreq_target(struct cpufreq_policy *policy,
 	}
 	iowrite16(0, &pcch_hdr->status);
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 	pr_debug("target: was SUCCESSFUL for cpu %d\n", cpu);
 	spin_unlock(&pcc_lock);
 
diff --git a/drivers/cpufreq/powernow-k6.c b/drivers/cpufreq/powernow-k6.c
index af23e0b..ea0222a 100644
--- a/drivers/cpufreq/powernow-k6.c
+++ b/drivers/cpufreq/powernow-k6.c
@@ -68,7 +68,8 @@ static int powernow_k6_get_cpu_multiplier(void)
  *
  *   Tries to change the PowerNow! multiplier
  */
-static void powernow_k6_set_state(unsigned int best_i)
+static void powernow_k6_set_state(struct cpufreq_policy *policy,
+		unsigned int best_i)
 {
 	unsigned long outvalue = 0, invalue = 0;
 	unsigned long msrval;
@@ -81,9 +82,8 @@ static void powernow_k6_set_state(unsigned int best_i)
 
 	freqs.old = busfreq * powernow_k6_get_cpu_multiplier();
 	freqs.new = busfreq * clock_ratio[best_i].index;
-	freqs.cpu = 0; /* powernow-k6.c is UP only driver */
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 
 	/* we now need to transform best_i to the BVC format, see AMD#23446 */
 
@@ -98,7 +98,7 @@ static void powernow_k6_set_state(unsigned int best_i)
 	msrval = POWERNOW_IOPORT + 0x0;
 	wrmsr(MSR_K6_EPMR, msrval, 0); /* disable it again */
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 
 	return;
 }
@@ -136,7 +136,7 @@ static int powernow_k6_target(struct cpufreq_policy *policy,
 				target_freq, relation, &newstate))
 		return -EINVAL;
 
-	powernow_k6_set_state(newstate);
+	powernow_k6_set_state(policy, newstate);
 
 	return 0;
 }
@@ -182,7 +182,7 @@ static int powernow_k6_cpu_exit(struct cpufreq_policy *policy)
 	unsigned int i;
 	for (i = 0; i < 8; i++) {
 		if (i == max_multiplier)
-			powernow_k6_set_state(i);
+			powernow_k6_set_state(policy, i);
 	}
 	cpufreq_frequency_table_put_attr(policy->cpu);
 	return 0;
diff --git a/drivers/cpufreq/powernow-k7.c b/drivers/cpufreq/powernow-k7.c
index 334cc2f..53888da 100644
--- a/drivers/cpufreq/powernow-k7.c
+++ b/drivers/cpufreq/powernow-k7.c
@@ -248,7 +248,7 @@ static void change_VID(int vid)
 }
 
 
-static void change_speed(unsigned int index)
+static void change_speed(struct cpufreq_policy *policy, unsigned int index)
 {
 	u8 fid, vid;
 	struct cpufreq_freqs freqs;
@@ -263,15 +263,13 @@ static void change_speed(unsigned int index)
 	fid = powernow_table[index].index & 0xFF;
 	vid = (powernow_table[index].index & 0xFF00) >> 8;
 
-	freqs.cpu = 0;
-
 	rdmsrl(MSR_K7_FID_VID_STATUS, fidvidstatus.val);
 	cfid = fidvidstatus.bits.CFID;
 	freqs.old = fsb * fid_codes[cfid] / 10;
 
 	freqs.new = powernow_table[index].frequency;
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 
 	/* Now do the magic poking into the MSRs.  */
 
@@ -292,7 +290,7 @@ static void change_speed(unsigned int index)
 	if (have_a0 == 1)
 		local_irq_enable();
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 }
 
 
@@ -546,7 +544,7 @@ static int powernow_target(struct cpufreq_policy *policy,
 				relation, &newstate))
 		return -EINVAL;
 
-	change_speed(newstate);
+	change_speed(policy, newstate);
 
 	return 0;
 }
diff --git a/drivers/cpufreq/powernow-k8.c b/drivers/cpufreq/powernow-k8.c
index d13a136..52137a3 100644
--- a/drivers/cpufreq/powernow-k8.c
+++ b/drivers/cpufreq/powernow-k8.c
@@ -928,9 +928,10 @@ static int get_transition_latency(struct powernow_k8_data *data)
 static int transition_frequency_fidvid(struct powernow_k8_data *data,
 		unsigned int index)
 {
+	struct cpufreq_policy *policy;
 	u32 fid = 0;
 	u32 vid = 0;
-	int res, i;
+	int res;
 	struct cpufreq_freqs freqs;
 
 	pr_debug("cpu %d transition to index %u\n", smp_processor_id(), index);
@@ -959,10 +960,10 @@ static int transition_frequency_fidvid(struct powernow_k8_data *data,
 	freqs.old = find_khz_freq_from_fid(data->currfid);
 	freqs.new = find_khz_freq_from_fid(fid);
 
-	for_each_cpu(i, data->available_cores) {
-		freqs.cpu = i;
-		cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
-	}
+	policy = cpufreq_cpu_get(smp_processor_id());
+	cpufreq_cpu_put(policy);
+
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 
 	res = transition_fid_vid(data, fid, vid);
 	if (res)
@@ -970,10 +971,7 @@ static int transition_frequency_fidvid(struct powernow_k8_data *data,
 
 	freqs.new = find_khz_freq_from_fid(data->currfid);
 
-	for_each_cpu(i, data->available_cores) {
-		freqs.cpu = i;
-		cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
-	}
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 	return res;
 }
 
diff --git a/drivers/cpufreq/s3c2416-cpufreq.c b/drivers/cpufreq/s3c2416-cpufreq.c
index bcc053b..4f1881e 100644
--- a/drivers/cpufreq/s3c2416-cpufreq.c
+++ b/drivers/cpufreq/s3c2416-cpufreq.c
@@ -256,7 +256,6 @@ static int s3c2416_cpufreq_set_target(struct cpufreq_policy *policy,
 		goto out;
 	}
 
-	freqs.cpu = 0;
 	freqs.flags = 0;
 	freqs.old = s3c_freq->is_dvs ? FREQ_DVS
 				     : clk_get_rate(s3c_freq->armclk) / 1000;
@@ -274,7 +273,7 @@ static int s3c2416_cpufreq_set_target(struct cpufreq_policy *policy,
 	if (!to_dvs && freqs.old == freqs.new)
 		goto out;
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 
 	if (to_dvs) {
 		pr_debug("cpufreq: enter dvs\n");
@@ -287,7 +286,7 @@ static int s3c2416_cpufreq_set_target(struct cpufreq_policy *policy,
 		ret = s3c2416_cpufreq_set_armdiv(s3c_freq, freqs.new);
 	}
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 
 out:
 	mutex_unlock(&cpufreq_lock);
diff --git a/drivers/cpufreq/s3c64xx-cpufreq.c b/drivers/cpufreq/s3c64xx-cpufreq.c
index 6f9490b..27cacb5 100644
--- a/drivers/cpufreq/s3c64xx-cpufreq.c
+++ b/drivers/cpufreq/s3c64xx-cpufreq.c
@@ -84,7 +84,6 @@ static int s3c64xx_cpufreq_set_target(struct cpufreq_policy *policy,
 	if (ret != 0)
 		return ret;
 
-	freqs.cpu = 0;
 	freqs.old = clk_get_rate(armclk) / 1000;
 	freqs.new = s3c64xx_freq_table[i].frequency;
 	freqs.flags = 0;
@@ -95,7 +94,7 @@ static int s3c64xx_cpufreq_set_target(struct cpufreq_policy *policy,
 
 	pr_debug("Transition %d-%dkHz\n", freqs.old, freqs.new);
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 
 #ifdef CONFIG_REGULATOR
 	if (vddarm && freqs.new > freqs.old) {
@@ -117,7 +116,7 @@ static int s3c64xx_cpufreq_set_target(struct cpufreq_policy *policy,
 		goto err;
 	}
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 
 #ifdef CONFIG_REGULATOR
 	if (vddarm && freqs.new < freqs.old) {
@@ -141,7 +140,7 @@ err_clk:
 	if (clk_set_rate(armclk, freqs.old * 1000) < 0)
 		pr_err("Failed to restore original clock rate\n");
 err:
-	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 
 	return ret;
 }
diff --git a/drivers/cpufreq/s5pv210-cpufreq.c b/drivers/cpufreq/s5pv210-cpufreq.c
index a484aae..5c77570 100644
--- a/drivers/cpufreq/s5pv210-cpufreq.c
+++ b/drivers/cpufreq/s5pv210-cpufreq.c
@@ -229,7 +229,6 @@ static int s5pv210_target(struct cpufreq_policy *policy,
 	}
 
 	freqs.new = s5pv210_freq_table[index].frequency;
-	freqs.cpu = 0;
 
 	if (freqs.new == freqs.old)
 		goto exit;
@@ -256,7 +255,7 @@ static int s5pv210_target(struct cpufreq_policy *policy,
 			goto exit;
 	}
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 
 	/* Check if there need to change PLL */
 	if ((index == L0) || (priv_index == L0))
@@ -468,7 +467,7 @@ static int s5pv210_target(struct cpufreq_policy *policy,
 		}
 	}
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 
 	if (freqs.new < freqs.old) {
 		regulator_set_voltage(int_regulator,
diff --git a/drivers/cpufreq/sc520_freq.c b/drivers/cpufreq/sc520_freq.c
index e42e073..f740b13 100644
--- a/drivers/cpufreq/sc520_freq.c
+++ b/drivers/cpufreq/sc520_freq.c
@@ -53,7 +53,8 @@ static unsigned int sc520_freq_get_cpu_frequency(unsigned int cpu)
 	}
 }
 
-static void sc520_freq_set_cpu_state(unsigned int state)
+static void sc520_freq_set_cpu_state(struct cpufreq_policy *policy,
+		unsigned int state)
 {
 
 	struct cpufreq_freqs	freqs;
@@ -61,9 +62,8 @@ static void sc520_freq_set_cpu_state(unsigned int state)
 
 	freqs.old = sc520_freq_get_cpu_frequency(0);
 	freqs.new = sc520_freq_table[state].frequency;
-	freqs.cpu = 0; /* AMD Elan is UP */
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 
 	pr_debug("attempting to set frequency to %i kHz\n",
 			sc520_freq_table[state].frequency);
@@ -75,7 +75,7 @@ static void sc520_freq_set_cpu_state(unsigned int state)
 
 	local_irq_enable();
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 };
 
 static int sc520_freq_verify(struct cpufreq_policy *policy)
@@ -93,7 +93,7 @@ static int sc520_freq_target(struct cpufreq_policy *policy,
 				target_freq, relation, &newstate))
 		return -EINVAL;
 
-	sc520_freq_set_cpu_state(newstate);
+	sc520_freq_set_cpu_state(policy, newstate);
 
 	return 0;
 }
diff --git a/drivers/cpufreq/spear-cpufreq.c b/drivers/cpufreq/spear-cpufreq.c
index 7e4d773..156829f 100644
--- a/drivers/cpufreq/spear-cpufreq.c
+++ b/drivers/cpufreq/spear-cpufreq.c
@@ -121,7 +121,6 @@ static int spear_cpufreq_target(struct cpufreq_policy *policy,
 				target_freq, relation, &index))
 		return -EINVAL;
 
-	freqs.cpu = policy->cpu;
 	freqs.old = spear_cpufreq_get(0);
 
 	newfreq = spear_cpufreq.freq_tbl[index].frequency * 1000;
@@ -158,8 +157,7 @@ static int spear_cpufreq_target(struct cpufreq_policy *policy,
 	freqs.new = newfreq / 1000;
 	freqs.new /= mult;
 
-	for_each_cpu(freqs.cpu, policy->cpus)
-		cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 
 	if (mult == 2)
 		ret = spear1340_set_cpu_rate(srcclk, newfreq);
@@ -172,8 +170,7 @@ static int spear_cpufreq_target(struct cpufreq_policy *policy,
 		freqs.new = clk_get_rate(spear_cpufreq.clk) / 1000;
 	}
 
-	for_each_cpu(freqs.cpu, policy->cpus)
-		cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 	return ret;
 }
 
diff --git a/drivers/cpufreq/speedstep-centrino.c b/drivers/cpufreq/speedstep-centrino.c
index 3a953d5..3dbbcc3 100644
--- a/drivers/cpufreq/speedstep-centrino.c
+++ b/drivers/cpufreq/speedstep-centrino.c
@@ -457,7 +457,7 @@ static int centrino_target (struct cpufreq_policy *policy,
 	unsigned int	msr, oldmsr = 0, h = 0, cpu = policy->cpu;
 	struct cpufreq_freqs	freqs;
 	int			retval = 0;
-	unsigned int		j, k, first_cpu, tmp;
+	unsigned int		j, first_cpu, tmp;
 	cpumask_var_t covered_cpus;
 
 	if (unlikely(!zalloc_cpumask_var(&covered_cpus, GFP_KERNEL)))
@@ -522,13 +522,8 @@ static int centrino_target (struct cpufreq_policy *policy,
 			pr_debug("target=%dkHz old=%d new=%d msr=%04x\n",
 				target_freq, freqs.old, freqs.new, msr);
 
-			for_each_cpu(k, policy->cpus) {
-				if (!cpu_online(k))
-					continue;
-				freqs.cpu = k;
-				cpufreq_notify_transition(&freqs,
+			cpufreq_notify_transition(policy, &freqs,
 					CPUFREQ_PRECHANGE);
-			}
 
 			first_cpu = 0;
 			/* all but 16 LSB are reserved, treat them with care */
@@ -544,12 +539,7 @@ static int centrino_target (struct cpufreq_policy *policy,
 		cpumask_set_cpu(j, covered_cpus);
 	}
 
-	for_each_cpu(k, policy->cpus) {
-		if (!cpu_online(k))
-			continue;
-		freqs.cpu = k;
-		cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
-	}
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 
 	if (unlikely(retval)) {
 		/*
@@ -565,12 +555,8 @@ static int centrino_target (struct cpufreq_policy *policy,
 		tmp = freqs.new;
 		freqs.new = freqs.old;
 		freqs.old = tmp;
-		for_each_cpu(j, policy->cpus) {
-			if (!cpu_online(j))
-				continue;
-			cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
-			cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
-		}
+		cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
+		cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 	}
 	retval = 0;
 
diff --git a/drivers/cpufreq/speedstep-ich.c b/drivers/cpufreq/speedstep-ich.c
index e29b59a..e2e5aa9 100644
--- a/drivers/cpufreq/speedstep-ich.c
+++ b/drivers/cpufreq/speedstep-ich.c
@@ -263,7 +263,6 @@ static int speedstep_target(struct cpufreq_policy *policy,
 {
 	unsigned int newstate = 0, policy_cpu;
 	struct cpufreq_freqs freqs;
-	int i;
 
 	if (cpufreq_frequency_table_target(policy, &speedstep_freqs[0],
 				target_freq, relation, &newstate))
@@ -272,7 +271,6 @@ static int speedstep_target(struct cpufreq_policy *policy,
 	policy_cpu = cpumask_any_and(policy->cpus, cpu_online_mask);
 	freqs.old = speedstep_get(policy_cpu);
 	freqs.new = speedstep_freqs[newstate].frequency;
-	freqs.cpu = policy->cpu;
 
 	pr_debug("transiting from %u to %u kHz\n", freqs.old, freqs.new);
 
@@ -280,18 +278,12 @@ static int speedstep_target(struct cpufreq_policy *policy,
 	if (freqs.old == freqs.new)
 		return 0;
 
-	for_each_cpu(i, policy->cpus) {
-		freqs.cpu = i;
-		cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
-	}
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 
 	smp_call_function_single(policy_cpu, _speedstep_set_state, &newstate,
 				 true);
 
-	for_each_cpu(i, policy->cpus) {
-		freqs.cpu = i;
-		cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
-	}
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 
 	return 0;
 }
diff --git a/drivers/cpufreq/speedstep-smi.c b/drivers/cpufreq/speedstep-smi.c
index 6a457fc..f5a6b70 100644
--- a/drivers/cpufreq/speedstep-smi.c
+++ b/drivers/cpufreq/speedstep-smi.c
@@ -252,14 +252,13 @@ static int speedstep_target(struct cpufreq_policy *policy,
 
 	freqs.old = speedstep_freqs[speedstep_get_state()].frequency;
 	freqs.new = speedstep_freqs[newstate].frequency;
-	freqs.cpu = 0; /* speedstep.c is UP only driver */
 
 	if (freqs.old == freqs.new)
 		return 0;
 
-	cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 	speedstep_set_state(newstate);
-	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
 
 	return 0;
 }
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 4bbc572..037d36a 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -278,8 +278,8 @@ int cpufreq_register_driver(struct cpufreq_driver *driver_data);
 int cpufreq_unregister_driver(struct cpufreq_driver *driver_data);
 
 
-void cpufreq_notify_transition(struct cpufreq_freqs *freqs, unsigned int state);
-
+void cpufreq_notify_transition(struct cpufreq_policy *policy,
+		struct cpufreq_freqs *freqs, unsigned int state);
 
 static inline void cpufreq_verify_within_limits(struct cpufreq_policy *policy, unsigned int min, unsigned int max)
 {
-- 
1.7.12.rc2.18.g61b472e

^ permalink raw reply related

* Re: [linuxppc-release][PATCH] powerpc/pci-hotplug: fix init issue of rescanned pci device
From: Chen Yuanquan-B41889 @ 2013-04-01 10:29 UTC (permalink / raw)
  To: Bjorn Helgaas, 松本博郎
  Cc: linux-pci, linuxppc-dev, r61911
In-Reply-To: <CAErSpo4skBkgOLRMhZQboP5aD+7HPyARmfFhoxEvnjiVb7Qy1Q@mail.gmail.com>

On 12/08/2012 05:15 AM, Bjorn Helgaas wrote:
> On Thu, Dec 6, 2012 at 4:23 AM, Chen Yuanquan-B41889
> <B41889@freescale.com> wrote:
>> On 12/06/2012 05:30 AM, Bjorn Helgaas wrote:
>>> On Wed, Dec 5, 2012 at 2:29 AM, Chen Yuanquan-B41889
>>> <B41889@freescale.com> wrote:
>>>> On 12/05/2012 04:26 PM, Benjamin Herrenschmidt wrote:
>>>>> On Wed, 2012-12-05 at 16:20 +0800, Chen Yuanquan-B41889 wrote:
>>>>>> On 12/05/2012 03:17 PM, Benjamin Herrenschmidt wrote:
>>>>>>> On Wed, 2012-12-05 at 10:31 +0800, Yuanquan Chen wrote:
>>>>>>>> On powerpc arch, some fixup work of PCI/PCI-e device is just done
>>>>>>>> during the
>>>>>>>> first scan at booting time. For the PCI/PCI-e device rescanned after
>>>>>>>> linux OS
>>>>>>>> booting up, the fixup work won't be done, which leads to dma_set_mask
>>>>>>>> error or
>>>>>>>> irq related issue in rescanned PCI/PCI-e device's driver. So, it does
>>>>>>>> the same
>>>>>>>> fixup work for the rescanned device to avoid this issue.
>>>>>>> Hrm, the patch is a bit gross. First the code shouldn't be copy/pasted
>>>>>>> that way but factored out.
>>>>> Please, at least format your email properly so I can try to undertand
>>>>> without needing aspirin.
>>>>>
>>>>>> There's a judgement "if (!bus->is_added)" before calling of
>>>>>> pcibios_fixup_bus in pci_scan_child_bus, so for the rescanned device,
>>>>>> the fixup won't execute, which leads to fatal error in driver of
>>>>>> rescanned
>>>>>> device on freescale  powerpc, no this issues on x86 arch.
>>>>> First, none of that invalidates my statement that you shouldn't
>>>>> duplicate a whole block of code like this. Even if your approach is
>>>>> correct (which is debated separately), at the very least you should
>>>>> factor the code out into a common function between the two copies.
>>>>>
>>>>>> Remove the judgement, let it to do the pcibios_fixup_bus
>>>>>> directly, the error won't occur for the rescanned device. But it's
>>>>>> general code, not proper to change here, so copy the pcibios_fixup_bus
>>>>>> work to  pcibios_enable_device.
>>>>>>
>>>>>>> I'm surprised also that is_added is false when pcibios_enable_device()
>>>>>>> gets called ... that looks strange to me. At what point is that enable
>>>>>>> happening in the hotplug sequence ?
>>>>>> All devices are rescanned and then call the pci_enable_devices and
>>>>>> pci_bus_add_devices.
>>>>> Where ? How ? What is the sequence happening ? In any case, I think if
>>>>> we need a proper fixup done per-device like that after scan we ought to
>>>>> create a new hook at the generic level rather than that sort of hack.
>>>>>
>>>> echo 1 > rescan to trigger dev_rescan_store:
>>>>
>>>> dev_rescan_store->pci_rescan_bus->pci_scan_child_bus,
>>>> pci_assign_unassigned_bus_resources,
>>>> pci_enable_bridges, pci_bus_add_devices
>>>>
>>>>
>>>> pci_enable_bridges->pci_enable_device->__pci_enable_device_flags->do_pci_enable_device->
>>>> pcibios_enable_device
>>>>
>>>> pci_bus_add_devices->pci_bus_add_device->"dev->is_added = 1"
>>>>
>>>> Yeah, it's general fixup code for every rescanned PCI/PCI-e device on
>>>> powerpc at runtime. So if
>>>> we want to call it in a ppc_md member, we need to wrap it as a function
>>>> and
>>>> assign it in every ppc_md,
>>>> it isn't proper for the general code.
>>>>
>>>> Regards,
>>>> yuanquan
>>>>
>>>>
>>>>>> The patch code will be called by pci_enable_devices. The
>>>>>> "dev->is_added"
>>>>>> is set in pci_bus_add_device
>>>>>> which is called by pci_bus_add_devices. So "dev->is_added" is false
>>>>>> when
>>>>>> checking it in pcibios_enable_device
>>>>>> for the rescanned device.
>>>>> Who calls pci_enable_device() in the rescan case ? Why isn't it left to
>>>>> the driver ? I don't think we can rely on that behaviour not to change.
>>>>>
>>>>>>> How do you trigger the rescan anyway ?
>>>>>> Use the interface under /sys :
>>>>>> echo 1 > /sys/bus/pci/devices/xxx/remove
>>>>>>
>>>>>> then echo 1 to the pci device which is the bus of the removed device
>>>>>> echo 1 > /sys/bus/pci/devices/xxxx/rescan
>>>>>> the removed device will be scanned and it's driver module will be
>>>>>> loaded
>>>>>> automatically.
>>>>> Yeah this code path are known to be fishy. I think the problem is at the
>>>>> generic abstraction level and that's where it needs to be fixed.
>>>>>
>>>>> Cheers,
>>>>> Ben.
>>>>>
>>>>>> Regards,
>>>>>> yuanquan
>>>>>>> I think the problem needs to be solve at a higher level, I'm adding
>>>>>>> linux-pci & Bjorn to the CC list.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Ben.
>>>>>>>
>>>>>>>> Signed-off-by: Yuanquan Chen <B41889@freescale.com>
>>>>>>>> ---
>>>>>>>>      arch/powerpc/kernel/pci-common.c |   20 ++++++++++++++++++++
>>>>>>>>      1 file changed, 20 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/arch/powerpc/kernel/pci-common.c
>>>>>>>> b/arch/powerpc/kernel/pci-common.c
>>>>>>>> index 7f94f76..f0fb070 100644
>>>>>>>> --- a/arch/powerpc/kernel/pci-common.c
>>>>>>>> +++ b/arch/powerpc/kernel/pci-common.c
>>>>>>>> @@ -1496,6 +1496,26 @@ int pcibios_enable_device(struct pci_dev *dev,
>>>>>>>> int mask)
>>>>>>>>                   if (ppc_md.pcibios_enable_device_hook(dev))
>>>>>>>>                           return -EINVAL;
>>>>>>>>      +    if (!dev->is_added) {
>>>>>>>> +               /*
>>>>>>>> +                * Fixup NUMA node as it may not be setup yet by the
>>>>>>>> generic
>>>>>>>> +                * code and is needed by the DMA init
>>>>>>>> +                */
>>>>>>>> +               set_dev_node(&dev->dev, pcibus_to_node(dev->bus));
>>>>>>>> +
>>>>>>>> +               /* Hook up default DMA ops */
>>>>>>>> +               set_dma_ops(&dev->dev, pci_dma_ops);
>>>>>>>> +               set_dma_offset(&dev->dev, PCI_DRAM_OFFSET);
>>>>>>>> +
>>>>>>>> +               /* Additional platform DMA/iommu setup */
>>>>>>>> +               if (ppc_md.pci_dma_dev_setup)
>>>>>>>> +                       ppc_md.pci_dma_dev_setup(dev);
>>>>>>>> +
>>>>>>>> +               /* Read default IRQs and fixup if necessary */
>>>>>>>> +               pci_read_irq_line(dev);
>>>>>>>> +               if (ppc_md.pci_irq_fixup)
>>>>>>>> +                       ppc_md.pci_irq_fixup(dev);
>>>>>>>> +       }
>>>>>>>>           return pci_enable_resources(dev, mask);
>>>>>>>>      }
>>> Is this the same issue Hiroo MATSUMOTO was working on earlier?
>>> (http://comments.gmane.org/gmane.linux.ports.ppc.embedded/50080)
>>
>> Yeah, that's the exact problem I encountered. Please push it forward.
> Well, as I mentioned, there are unresolved issues, so it's not just a
> matter of applying the most recent patch.  If you're interested in
> this problem and have some hardware to test, you can help by looking
> into some of the things I mentioned in the message at the URL below.

Hi Helgaas , Matsumoto,

Do you have new update about this issue? I will go on to investigate 
this issue in the next
days.

Thanks,
Yuanquan

>>> We went round and round on those patches (partly my fault for
>>> excessive bike-shedding), and then we stalled out because of an
>>> ordering issue with CardBus init and an IRQ quirk.
>>>
>>> Here's the last status I remember:
>>> http://marc.info/?l=linux-pci&m=135006501620378&w=2
>>>
>>> Bjorn
>>>
>>>
>>>
>>
>
>

^ permalink raw reply

* RE: [PATCH v3] cpufreq: Add cpufreq driver for Freescale e500mc SoCs
From: Tang Yuantian-B29983 @ 2013-04-01  2:16 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael J. Wysocki, linuxppc-dev@lists.ozlabs.org, Li Yang-R58472,
	cpufreq@vger.kernel.org, Linux PM list
In-Reply-To: <CAOh2x=k2cchYWnX0AitOjCOfP_mC-=cm2sm4qhrGv0S9ywzP2g@mail.gmail.com>

DQo+IC0tLS0tT3JpZ2luYWwgTWVzc2FnZS0tLS0tDQo+IEZyb206IGNwdWZyZXEtb3duZXJAdmdl
ci5rZXJuZWwub3JnIFttYWlsdG86Y3B1ZnJlcS1vd25lckB2Z2VyLmtlcm5lbC5vcmddDQo+IE9u
IEJlaGFsZiBPZiBWaXJlc2ggS3VtYXINCj4gU2VudDogMjAxM8TqM9TCMzDI1SAyMTo1Mg0KPiBU
bzogVGFuZyBZdWFudGlhbi1CMjk5ODMNCj4gQ2M6IFJhZmFlbCBKLiBXeXNvY2tpOyBjcHVmcmVx
QHZnZXIua2VybmVsLm9yZzsgTGludXggUE0gbGlzdDsgbGludXhwcGMtDQo+IGRldkBsaXN0cy5v
emxhYnMub3JnOyBMaSBZYW5nLVI1ODQ3Mg0KPiBTdWJqZWN0OiBSZTogW1BBVENIIHYzXSBjcHVm
cmVxOiBBZGQgY3B1ZnJlcSBkcml2ZXIgZm9yIEZyZWVzY2FsZSBlNTAwbWMNCj4gU29Dcw0KPiAN
Cj4gT24gRnJpLCBNYXIgMjksIDIwMTMgYXQgMTE6MjIgQU0sICA8WXVhbnRpYW4uVGFuZ0BmcmVl
c2NhbGUuY29tPiB3cm90ZToNCj4gPiBkaWZmIC0tZ2l0IGEvZHJpdmVycy9jcHVmcmVxL3BwYy1j
b3JlbmV0LWNwdWZyZXEuYw0KPiA+IGIvZHJpdmVycy9jcHVmcmVxL3BwYy1jb3JlbmV0LWNwdWZy
ZXEuYw0KPiANCj4gPiArc3RhdGljIGludCBjb3JlbmV0X2NwdWZyZXFfY3B1X2luaXQoc3RydWN0
IGNwdWZyZXFfcG9saWN5ICpwb2xpY3kpIHsNCj4gDQo+ID4gKyAgICAgICBmb3IgKGkgPSAwOyBp
IDwgY291bnQ7IGkrKykgew0KPiA+ICsgICAgICAgICAgICAgICB0YWJsZVtpXS5pbmRleCA9IGk7
DQo+IA0KPiBPbmUgbW9yZSB0aGluZywgeW91IGRvbid0IG5lZWQgdG8gc2V0IGluZGV4IGF0IGFs
bCBhcyB5b3UgYXJlbid0IHVzaW5nIGl0Lg0KPiBBbmQgY3B1ZnJlcSBjb3JlIG5ldmVyIHVzZXMg
aXQuDQo+IA0KT0ssIG5vIHByb2JsZW0uDQoNClRoYW5rcywNCll1YW50aWFuDQoNCj4gPiArICAg
ICAgICAgICAgICAgY2xrID0gb2ZfY2xrX2dldChkYXRhLT5wYXJlbnQsIGkpOw0KPiA+ICsgICAg
ICAgICAgICAgICB0YWJsZVtpXS5mcmVxdWVuY3kgPSBjbGtfZ2V0X3JhdGUoY2xrKSAvIDEwMDA7
DQo+ID4gKyAgICAgICB9DQo+IC0tDQo+IFRvIHVuc3Vic2NyaWJlIGZyb20gdGhpcyBsaXN0OiBz
ZW5kIHRoZSBsaW5lICJ1bnN1YnNjcmliZSBjcHVmcmVxIiBpbiB0aGUNCj4gYm9keSBvZiBhIG1l
c3NhZ2UgdG8gbWFqb3Jkb21vQHZnZXIua2VybmVsLm9yZyBNb3JlIG1ham9yZG9tbyBpbmZvIGF0
DQo+IGh0dHA6Ly92Z2VyLmtlcm5lbC5vcmcvbWFqb3Jkb21vLWluZm8uaHRtbA0KDQo=

^ permalink raw reply

* Re: MPC5121e, Linux, simple IO ports
From: Anatolij Gustschin @ 2013-03-31  9:51 UTC (permalink / raw)
  To: CF Studelec; +Cc: linuxppc-dev
In-Reply-To: <514ACAD6.6010909@studelec-sa.com>

Hello,

On Thu, 21 Mar 2013 09:54:46 +0100
CF Studelec <cfoissac@studelec-sa.com> wrote:

> Hello,
> 
> This is a simple board base on mpc5121e MCU.
> 
> 
> Gpio is detected: kernel is compiled with its support - i got
> gpiochip_find_base: found new base @224 in dmesg - on kernel 3.0.4.
> 
>  
> But i'm unable to access it through /sys/class/gpio. I can successfully
> export a pin (ie, if i type cat 224 > export, gpio224 is created), but i
> can't successfully control it:
> 
>  
> 
> echo "out" > /sys/class/gpio/gpio224/direction
> 
> cat /sys/class/gpio/gpio224/value
> 
> 0
> 
> echo 1 > /sys/class/gpio/gpio224/value
> 
> cat /sys/class/gpio/gpio224/value
> 
> 0
>
> My need is a simple chipselect (well, 3 chipselect exactly ), for a
> custom design. Running linux is mandatory. I suspect the MPC5121e to be
> in the bad function mode, so:
> 
> - does anyone knows how to change mode ? What register can i acces and
> how ? does it worths a try ?

For GPIO0 - GPIO7 pins you additionally need to configure the pin mux
for GPIO mode, see IO_CONTROL_GP register description in the manual.

> - does anyone successfully performed simple IO control ?

We use it all the time, i.e. as SD-Card detect and/or
write-protect GPIOs, as SPI Chips-Select GPIOs, etc. It works.

> Last thing, i have tryed to access internal registers through /dev/mem,
> but no success. There are very few ressources available for this
> microcontroler, but i'm stick to it. Perhaps anybody knows how to access
> (read) internal registers with /proc or sys-fs ?

Accessing registers using /dev/mem should work (tested using
devmem2.c tool with recent v3.9-rc4 kernel. Should work with
older kernels too).

Alternatively (as a quick test) you could configure the registers
under U-Boot using "mw" command, assuming you are using U-Boot as
a bootloader.

HTH,

Anatolij

--
DENX Software Engineering GmbH,     MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: +49-8142-66989-0  Fax: +49-8142-66989-80 Email: office@denx.de

^ permalink raw reply

* Re: [PATCH 9/9] powerpc: cpufreq: move cpufreq driver to drivers/cpufreq
From: Viresh Kumar @ 2013-03-31  4:03 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Olof Johansson,
	linuxppc-dev
  Cc: robin.randhawa, linux-pm, Viresh Kumar, Liviu.Dudau, linux-kernel,
	cpufreq, Rafael J. Wysocki, Steve.Bannister, arnd.bergmann,
	arvind.chauhan, linaro-kernel, charles.garcia-tobin
In-Reply-To: <4180fd5b43906421fc155126e8b7bea5fb3ff375.1364229828.git.viresh.kumar@linaro.org>

On 25 March 2013 22:24, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> This patch moves cpufreq driver of powerpc platform to drivers/cpufreq.
>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Olof Johansson <olof@lixom.net>
> Cc: linuxppc-dev@lists.ozlabs.org
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---
>  arch/powerpc/platforms/Kconfig                     | 31 ----------------------
>  arch/powerpc/platforms/pasemi/Makefile             |  1 -
>  arch/powerpc/platforms/powermac/Makefile           |  2 --
>  drivers/cpufreq/Kconfig.powerpc                    | 26 ++++++++++++++++++
>  drivers/cpufreq/Makefile                           |  3 +++
>  .../cpufreq.c => drivers/cpufreq/pasemi-cpufreq.c  |  0
>  .../cpufreq/pmac32-cpufreq.c                       |  0
>  .../cpufreq/pmac64-cpufreq.c                       |  0
>  8 files changed, 29 insertions(+), 34 deletions(-)
>  rename arch/powerpc/platforms/pasemi/cpufreq.c => drivers/cpufreq/pasemi-cpufreq.c (100%)
>  rename arch/powerpc/platforms/powermac/cpufreq_32.c => drivers/cpufreq/pmac32-cpufreq.c (100%)
>  rename arch/powerpc/platforms/powermac/cpufreq_64.c => drivers/cpufreq/pmac64-cpufreq.c (100%)

Benjamin/Paul/Olof,

Any comments on this?

^ permalink raw reply

* Re: [PATCH 3/3] powerpc/mpc512x: add platform code for MPC5125.
From: Anatolij Gustschin @ 2013-03-30 22:32 UTC (permalink / raw)
  To: Matteo Facchinetti; +Cc: gregkh, linuxppc-dev
In-Reply-To: <1363801314-16967-4-git-send-email-matteo.facchinetti@sirius-es.it>

On Wed, 20 Mar 2013 18:41:54 +0100
Matteo Facchinetti <matteo.facchinetti@sirius-es.it> wrote:
...
> diff --git a/arch/powerpc/boot/dts/mpc5125twr.dts b/arch/powerpc/boot/dts/mpc5125twr.dts
> new file mode 100644
> index 0000000..afcad7a
> --- /dev/null
> +++ b/arch/powerpc/boot/dts/mpc5125twr.dts
...
> +
> +		diu@2100 {
> +			device_type = "display";

device_type is deprecated (for nodes other than cpu and memory).

...
> +		serial@11100 {
> +			device_type = "serial";
> +			compatible = "fsl,mpc5125-psc-uart", "fsl,mpc5125-psc";
> +			port-number = <0>;

port-number property is not used anywhere, please remove.
Remove device_type property, too.

...
> +		// PSC9 uart1 aka ttyPSC1
> +		serial@11900 {
> +			device_type = "serial";
> +			compatible = "fsl,mpc5125-psc-uart", "fsl,mpc5125-psc";
> +			port-number = <1>;

drop device_type and port-number properties.


> diff --git a/arch/powerpc/platforms/512x/clock.c b/arch/powerpc/platforms/512x/clock.c
> index 52d57d2..a8987dc2 100644
> --- a/arch/powerpc/platforms/512x/clock.c
> +++ b/arch/powerpc/platforms/512x/clock.c
> @@ -29,6 +29,8 @@
>  #include <asm/mpc5121.h>
>  #include <asm/clk_interface.h>
>  
> +#include "mpc512x.h"
> +
>  #undef CLK_DEBUG
>  
>  static int clocks_initialized;
> @@ -683,8 +685,13 @@ static void psc_clks_init(void)
>  	struct device_node *np;
>  	struct platform_device *ofdev;
>  	u32 reg;
> +	char *psc_compat;

it should be const char *.

...
> diff --git a/arch/powerpc/platforms/512x/mpc512x.h b/arch/powerpc/platforms/512x/mpc512x.h
> index c32b399..2b97622 100644
> --- a/arch/powerpc/platforms/512x/mpc512x.h
> +++ b/arch/powerpc/platforms/512x/mpc512x.h
> @@ -15,6 +15,7 @@ extern void __init mpc512x_init_IRQ(void);
>  extern void __init mpc512x_init(void);
>  extern int __init mpc5121_clk_init(void);
>  void __init mpc512x_declare_of_platform_devices(void);
> +extern char *mpc512x_select_psc_compat(void);

const char *.

...
> diff --git a/arch/powerpc/platforms/512x/mpc512x_shared.c b/arch/powerpc/platforms/512x/mpc512x_shared.c
> index d30235b..fbf67e9 100644
> --- a/arch/powerpc/platforms/512x/mpc512x_shared.c
> +++ b/arch/powerpc/platforms/512x/mpc512x_shared.c
> @@ -350,6 +350,21 @@ void __init mpc512x_declare_of_platform_devices(void)
>  
>  #define DEFAULT_FIFO_SIZE 16
>  
> +char *mpc512x_select_psc_compat(void)

const char *.

> +{
> +	char *psc_compats[] = {
> +		"fsl,mpc5121-psc",
> +		"fsl,mpc5125-psc"
> +	};
> +	int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(psc_compats); i++)
> +		if (of_find_compatible_node(NULL, NULL, psc_compats[i]))
> +			return psc_compats[i];

I don't like this, better would be to use something like:

        if (of_machine_is_compatible("fsl,mpc5121"))
                return "fsl,mpc5121-psc";

        if (of_machine_is_compatible("fsl,mpc5125"))
                return "fsl,mpc5125-psc";

but note that it will only work if we add these compatibles
to the compatible list of the root nodes in 5121 and 5125
device trees.

Thanks,

Anatolij

^ permalink raw reply

* Re: [PATCH 1/3] powerpc/512x: move mpc5121_generic platform to mpc512x_generic.
From: Anatolij Gustschin @ 2013-03-30 22:01 UTC (permalink / raw)
  To: Matteo Facchinetti; +Cc: gregkh, linuxppc-dev
In-Reply-To: <1363801314-16967-2-git-send-email-matteo.facchinetti@sirius-es.it>

On Wed, 20 Mar 2013 18:41:52 +0100
Matteo Facchinetti <matteo.facchinetti@sirius-es.it> wrote:

> This provides a base for using 512x_generic platform on mpc5125 boards.
> 
> By this way 512x_GENERIC it could be used for all generic mpc512x boards
> and kernel could be compiled with mpc512x_defconfig.
> 
> Signed-off-by: Matteo Facchinetti <matteo.facchinetti@sirius-es.it>
> ---
>  arch/powerpc/configs/mpc512x_defconfig                         |    2 +-
>  arch/powerpc/platforms/512x/Kconfig                            |    8 ++++----
>  arch/powerpc/platforms/512x/Makefile                           |    2 +-
>  .../platforms/512x/{mpc5121_generic.c => mpc512x_generic.c}    |    0
>  4 files changed, 6 insertions(+), 6 deletions(-)
>  rename arch/powerpc/platforms/512x/{mpc5121_generic.c => mpc512x_generic.c} (100%)

with a minor change to mpc512x_generic.c, applied to next.

Thanks,

Anatolij

^ permalink raw reply

* Re: [PATCH v3] cpufreq: Add cpufreq driver for Freescale e500mc SoCs
From: Viresh Kumar @ 2013-03-30 13:52 UTC (permalink / raw)
  To: Yuantian.Tang
  Cc: Rafael J. Wysocki, linuxppc-dev, cpufreq@vger.kernel.org,
	Linux PM list
In-Reply-To: <1364536350-23788-1-git-send-email-Yuantian.Tang@freescale.com>

On Fri, Mar 29, 2013 at 11:22 AM,  <Yuantian.Tang@freescale.com> wrote:
> diff --git a/drivers/cpufreq/ppc-corenet-cpufreq.c b/drivers/cpufreq/ppc-corenet-cpufreq.c

> +static int corenet_cpufreq_cpu_init(struct cpufreq_policy *policy)
> +{

> +       for (i = 0; i < count; i++) {
> +               table[i].index = i;

One more thing, you don't need to set index at all as you aren't using it.
And cpufreq core never uses it.

> +               clk = of_clk_get(data->parent, i);
> +               table[i].frequency = clk_get_rate(clk) / 1000;
> +       }

^ permalink raw reply

* Re: [PATCH for 3.9] powerpc/ps3: Update ps3_defconfig
From: Geoff Levand @ 2013-03-30  0:38 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: cbe-oss-dev, linuxppc-dev
In-Reply-To: <1363655503.1075.40.camel@smoke>

Hi Ben,

It would be nice if I could get this in for 3.9.  Did you miss it?

Thanks.

-Geoff

On Mon, 2013-03-18 at 18:11 -0700, Geoff Levand wrote:
> Refresh and set CONFIG_RD_LZMA=y.
> 
> Signed-off-by: Geoff Levand <geoff@infradead.org>
> ---
> Hi Ben,
> 
> This sets CONFIG_RD_LZMA=y, which is needed to boot with
> newer initrd images.
> 
> Please apply for 3.9.
> 
> Also available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/geoff/ps3-linux.git for-powerpc
> 
> -Geoff

^ permalink raw reply

* Re: [PATCH V2] powerpc/MPIC: Add get_version API both for internal and external use
From: Scott Wood @ 2013-03-29 21:55 UTC (permalink / raw)
  To: Scott Wood; +Cc: linuxppc-dev, hongtao.jia, B07421
In-Reply-To: <1364593919.13310.12@snotra>

On 03/29/2013 04:51:59 PM, Scott Wood wrote:
> On 03/26/2013 12:28:10 AM, Jia Hongtao wrote:
>> diff --git a/arch/powerpc/sysdev/mpic.c b/arch/powerpc/sysdev/mpic.c
>> index d30e6a6..c893a4b 100644
>> --- a/arch/powerpc/sysdev/mpic.c
>> +++ b/arch/powerpc/sysdev/mpic.c
>> @@ -1165,10 +1165,27 @@ static struct irq_domain_ops mpic_host_ops =3D =20
>> {
>>  	.xlate =3D mpic_host_xlate,
>>  };
>>=20
>> +static u32 mpic_get_version(struct mpic *mpic)
>> +{
>> +	u32 brr1;
>> +
>> +	brr1 =3D _mpic_read(mpic->reg_type, &mpic->thiscpuregs,
>> +			MPIC_FSL_BRR1);
>> +
>> +	return brr1 & MPIC_FSL_BRR1_VER;
>> +}
>> +
>>  /*
>>   * Exported functions
>>   */
>>=20
>> +u32 mpic_primary_get_version(void)
>> +{
>> +	struct mpic *mpic =3D mpic_primary;
>> +
>> +	return mpic_get_version(mpic);
>> +}
>=20
> So this just crashes if there is no mpic_primary or it's a non-FSL =20
> MPIC?

...and since it's specifically checking for the FSL version, "fsl" =20
should be in the name.

-Scott=

^ permalink raw reply

* Re: [PATCH 2/2] powerpc/85xx: workaround for chips with MSI hardware errata
From: Scott Wood @ 2013-03-29 21:54 UTC (permalink / raw)
  To: Jia Hongtao; +Cc: hongtao.jia, B07421, linuxppc-dev
In-Reply-To: <1364268527-32068-2-git-send-email-hongtao.jia@freescale.com>

On 03/25/2013 10:28:47 PM, Jia Hongtao wrote:
> The MPIC version 2.0 has a MSI errata (errata PIC1 of mpc8544), It =20
> causes
> that neither MSI nor MSI-X can work fine. This is a workaround to =20
> allow
> MSI-X to function properly.
>=20
> Signed-off-by: Liu Shuo <soniccat.liu@gmail.com>
> Signed-off-by: Li Yang <leoli@freescale.com>
> Signed-off-by: Jia Hongtao <hongtao.jia@freescale.com>
> ---
>  arch/powerpc/sysdev/fsl_msi.c | 47 =20
> ++++++++++++++++++++++++++++++++++++++++---
>  1 file changed, 44 insertions(+), 3 deletions(-)
>=20
> diff --git a/arch/powerpc/sysdev/fsl_msi.c =20
> b/arch/powerpc/sysdev/fsl_msi.c
> index 178c994..d2f8040 100644
> --- a/arch/powerpc/sysdev/fsl_msi.c
> +++ b/arch/powerpc/sysdev/fsl_msi.c
> @@ -28,6 +28,8 @@
>  #include "fsl_msi.h"
>  #include "fsl_pci.h"
>=20
> +#define MSI_HW_ERRATA_ENDIAN 0x00000010

This should probably be kept in the same place as the other =20
msi->features definitions (e.g. FSL_PIC_IP_*).

> +/* MPIC version 2.0 has erratum PIC1 */
> +static int mpic_has_errata(void)
> +{
> +	if (mpic_primary_get_version() =3D=3D 0x0200)
> +		return 1;
> +
> +	return 0;
> +}

mpic_has_erratum_pic1()

> +	if ((features->fsl_pic_ip & FSL_PIC_IP_MASK) =3D=3D =20
> FSL_PIC_IP_MPIC) {
> +		rc =3D mpic_has_errata();
> +		if (rc > 0) {
> +			msi->feature |=3D MSI_HW_ERRATA_ENDIAN;
> +		} else if (rc < 0) {
> +			err =3D rc;
> +			goto error_out;
> +		}

When would mpic_has_errata() ever return a negative value (maybe =20
mpic_primary_get_version could fail, but you don't allow for that in =20
the interface)?

If you're not going to add a way for errors to be returned back, just =20
do:

if (mpic_has_erratum_pic1())
	msi->feature |=3D MSI_HW_ERRATA_ENDIAN;

-Scott=

^ permalink raw reply

* Re: [PATCH V2] powerpc/MPIC: Add get_version API both for internal and external use
From: Scott Wood @ 2013-03-29 21:51 UTC (permalink / raw)
  To: Jia Hongtao; +Cc: hongtao.jia, B07421, linuxppc-dev
In-Reply-To: <1364275690-26790-1-git-send-email-hongtao.jia@freescale.com>

On 03/26/2013 12:28:10 AM, Jia Hongtao wrote:
> MPIC version is useful information for both mpic_alloc() and =20
> mpic_init().
> The patch provide an API to get MPIC version for reusing the code.
> Also, some other IP block may need MPIC version for their own use.
> The API for external use is also provided.
>=20
> Signed-off-by: Jia Hongtao <hongtao.jia@freescale.com>
> Signed-off-by: Li Yang <leoli@freescale.com>
> ---
> Changes for V2:
> * Using mpic_get_version() to implement mpic_primary_get_version()
>=20
>  arch/powerpc/include/asm/mpic.h |  3 +++
>  arch/powerpc/sysdev/mpic.c      | 26 +++++++++++++++++++-------
>  2 files changed, 22 insertions(+), 7 deletions(-)
>=20
> diff --git a/arch/powerpc/include/asm/mpic.h =20
> b/arch/powerpc/include/asm/mpic.h
> index c0f9ef9..7d1222d 100644
> --- a/arch/powerpc/include/asm/mpic.h
> +++ b/arch/powerpc/include/asm/mpic.h
> @@ -393,6 +393,9 @@ struct mpic
>  #define	MPIC_REGSET_STANDARD		MPIC_REGSET(0)	/* =20
> Original MPIC */
>  #define	MPIC_REGSET_TSI108		MPIC_REGSET(1)	/* =20
> Tsi108/109 PIC */
>=20
> +/* Get the version of primary MPIC */
> +extern u32 mpic_primary_get_version(void);
> +
>  /* Allocate the controller structure and setup the linux irq descs
>   * for the range if interrupts passed in. No HW initialization is
>   * actually performed.
> diff --git a/arch/powerpc/sysdev/mpic.c b/arch/powerpc/sysdev/mpic.c
> index d30e6a6..c893a4b 100644
> --- a/arch/powerpc/sysdev/mpic.c
> +++ b/arch/powerpc/sysdev/mpic.c
> @@ -1165,10 +1165,27 @@ static struct irq_domain_ops mpic_host_ops =3D {
>  	.xlate =3D mpic_host_xlate,
>  };
>=20
> +static u32 mpic_get_version(struct mpic *mpic)
> +{
> +	u32 brr1;
> +
> +	brr1 =3D _mpic_read(mpic->reg_type, &mpic->thiscpuregs,
> +			MPIC_FSL_BRR1);
> +
> +	return brr1 & MPIC_FSL_BRR1_VER;
> +}
> +
>  /*
>   * Exported functions
>   */
>=20
> +u32 mpic_primary_get_version(void)
> +{
> +	struct mpic *mpic =3D mpic_primary;
> +
> +	return mpic_get_version(mpic);
> +}

So this just crashes if there is no mpic_primary or it's a non-FSL MPIC?

-Scott=

^ permalink raw reply

* [PATCH] powerpc: Fix typo "CONFIG_ICSWX_PID"
From: Paul Bolle @ 2013-03-29 21:02 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras
  Cc: linuxppc-dev, linux-kernel, Jimi Xenidis

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
---
Untested. As this typo was introduced in v3.3, with commit
9d670280908013004f173b2b86414d9b6918511b ("powerpc: Split ICSWX ACOP and
PID processing"), which actually added PPC_ICSWX_PID, this surely needs
testing.

 arch/powerpc/mm/icswx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/icswx.c b/arch/powerpc/mm/icswx.c
index 8cdbd86..915412e 100644
--- a/arch/powerpc/mm/icswx.c
+++ b/arch/powerpc/mm/icswx.c
@@ -67,7 +67,7 @@
 
 void switch_cop(struct mm_struct *next)
 {
-#ifdef CONFIG_ICSWX_PID
+#ifdef CONFIG_PPC_ICSWX_PID
 	mtspr(SPRN_PID, next->context.cop_pid);
 #endif
 	mtspr(SPRN_ACOP, next->context.acop);
-- 
1.7.11.7

^ permalink raw reply related

* Re: [PATCH V4] powerpc/85xx: Add machine check handler to fix PCIe erratum on mpc85xx
From: Scott Wood @ 2013-03-29 16:33 UTC (permalink / raw)
  To: Jia Hongtao-B38951
  Cc: Wood Scott-B07421, David Laight, linuxppc-dev@lists.ozlabs.org,
	Stuart Yoder
In-Reply-To: <412C8208B4A0464FA894C5F0C278CD5D01C2EBEA@039-SN1MPN1-003.039d.mgd.msft.net>

On 03/29/2013 03:03:51 AM, Jia Hongtao-B38951 wrote:
>=20
>=20
> > -----Original Message-----
> > From: Wood Scott-B07421
> > Sent: Saturday, March 16, 2013 12:35 AM
> > To: Jia Hongtao-B38951
> > Cc: Wood Scott-B07421; David Laight; linuxppc-dev@lists.ozlabs.org;
> > Stuart Yoder
> > Subject: Re: [PATCH V4] powerpc/85xx: Add machine check handler to =20
> fix
> > PCIe erratum on mpc85xx
> >
> > On 03/14/2013 09:47:58 PM, Jia Hongtao-B38951 wrote:
> > >
> > > > -----Original Message-----
> > > > From: Wood Scott-B07421
> > > > Sent: Thursday, March 14, 2013 12:38 AM
> > > > To: David Laight
> > > > Cc: Jia Hongtao-B38951; Wood Scott-B07421;
> > > linuxppc-dev@lists.ozlabs.org;
> > > > Stuart Yoder
> > > > Subject: Re: [PATCH V4] powerpc/85xx: Add machine check handler =20
> to
> > > fix
> > > > PCIe erratum on mpc85xx
> > > >
> > > > On 03/13/2013 04:40:40 AM, David Laight wrote:
> > > > > > Hmm, seems there's no probe_user_address() -- for userspace =20
> we
> > > > > > basically want the same thing minus the KERNEL_DS.  See
> > > > > > arch/powerpc/perf/callchain.c for an example.
> > > > >
> > > > > Isn't that just copy_from_user() ?
> > > >
> > > > Plus pagefault_disable/enable().
> > > >
> > > > -Scott
> > >
> > > pagefault_disable() is identical to preempt_disable(). So I think =20
> this
> > > could not avoid other cpu to swap out the instruction we want to =20
> read
> > > back.
> > > probe_kernel_address() also have the same issue.
> >
> > That's not the point -- the point is to let the page fault handler =20
> know
> > that it should go directly to bad_page_fault().  Do not pass
> > handle_mm_fault().  Do not collect a page from disk.
> >
> > Granted, we're already in atomic context which will have that effect
> > due to being in the machine check handler, but it's better to be
> > explicit about it and not depend on how pagefault_diasble() is
> > implemented.
> >
> > -Scott
>=20
>=20
> Based on the comments I updated the machine check handler.
>=20
> Changes from last version:
> * Check MSR_GS state
> * Check if the instruction is LD
> * Handle the user space issue
>=20
> The updated machine check handler is as following:
>=20
> int fsl_pci_mcheck_exception(struct pt_regs *regs)
> {
>         unsigned int op, rd;
>         u32 inst;
>         int ret;
>         phys_addr_t addr =3D 0;
>=20
>         /* Let KVM/QEMU deal with the exception */
>         if (regs->msr & MSR_GS)
>                 return 0;
>=20
> #ifdef CONFIG_PHYS_64BIT
>         addr =3D mfspr(SPRN_MCARU);
>         addr <<=3D 32;
> #endif
>         addr +=3D mfspr(SPRN_MCAR);
>=20
>         if (is_in_pci_mem_space(addr)) {
>                 if (user_mode(regs)) {
>                         pagefault_disable();
>                         ret =3D copy_from_user(&(inst), (u32 __user =20
> *)regs->nip, sizeof(inst));
>                         pagefault_enable();

You could use get_user() instead of copy_from_user().

>                 } else {
>                         ret =3D probe_kernel_address(regs->nip, inst);
>                 }
>=20
>                 op =3D get_op(inst);
>                 /* Check if the instruction is LD */
>                 if (!ret && (op =3D=3D 111010)) {
>                         rd =3D get_rt(inst);
>                         regs->gpr[rd] =3D 0xffffffff;
>                 }
>=20
>                 regs->nip +=3D 4;
>                 return 1;
>         }
>=20
>         return 0;
> }
>=20
> BTW, I'm still not sure how to deal with LD instruction with update.

You would need to do the update yourself.  Or just say that's a case =20
you don't handle, and return 0.

Again, please check for the size of the load operation.

-Scott=

^ permalink raw reply

* Re: weird elf header issues, is it binutils or my linker script?
From: Segher Boessenkool @ 2013-03-29 12:01 UTC (permalink / raw)
  To: Chris Friesen; +Cc: Paul Mackerras, linuxppc-dev
In-Reply-To: <51545BF3.2090204@genband.com>

> PHDRS
> {
>   headers PT_PHDR PHDRS ;
>   interp PT_INTERP ;
> <snip>
> }
>
> SECTIONS
> {
>   /* Read-only sections, merged into text segment: */
>   PROVIDE (__executable_start = 0xf2000000); . = 0xf2000000 +  
> SIZEOF_HEADERS;
>   .interp         : { *(.interp) } :text :interp
> <snip>
> }
>
> So I'm wondering...is this something wrong with our linker script,  
> or is there a bug
> in our binutils?  I'm no linker expert, but the interpreter  
> sections in the script
> seem to match the binutils documentation that I found and I don't  
> see anything that
> would be messing with the length.
>
> Any suggestions on where to look?

It looks like your .interp input section lacks the required zero- 
termination.


Segher

^ permalink raw reply

* [RFC PATCH v2 5/6] powerpc: select HAVE_CONTEXT_TRACKING for pSeries
From: Li Zhong @ 2013-03-29 10:00 UTC (permalink / raw)
  To: linux-kernel; +Cc: Li Zhong, fweisbec, paulus, paulmck, linuxppc-dev
In-Reply-To: <1364551221-23177-1-git-send-email-zhong@linux.vnet.ibm.com>

Start context tracking support from pSeries.

Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/pseries/Kconfig |    1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/platforms/pseries/Kconfig b/arch/powerpc/platforms/pseries/Kconfig
index 9a0941b..023b288 100644
--- a/arch/powerpc/platforms/pseries/Kconfig
+++ b/arch/powerpc/platforms/pseries/Kconfig
@@ -18,6 +18,7 @@ config PPC_PSERIES
 	select PPC_PCI_CHOICE if EXPERT
 	select ZLIB_DEFLATE
 	select PPC_DOORBELL
+	select HAVE_CONTEXT_TRACKING
 	default y
 
 config PPC_SPLPAR
-- 
1.7.9.5

^ permalink raw reply related

* [RFC PATCH v2 6/6] powerpc: Use generic code for exception handling
From: Li Zhong @ 2013-03-29 10:00 UTC (permalink / raw)
  To: linux-kernel; +Cc: Li Zhong, fweisbec, paulus, paulmck, linuxppc-dev
In-Reply-To: <1364551221-23177-1-git-send-email-zhong@linux.vnet.ibm.com>

After the exception handling moved to generic code, and some changes in
following two commits:
56dd9470d7c8734f055da2a6bac553caf4a468eb
  context_tracking: Move exception handling to generic code
6c1e0256fad84a843d915414e4b5973b7443d48d
  context_tracking: Restore correct previous context state on exception exit

it is able for this patch to replace the implementation in arch code
with the generic code in above commits.

Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/context_tracking.h |   29 ---------------
 arch/powerpc/kernel/exceptions-64s.S        |    4 +--
 arch/powerpc/kernel/traps.c                 |   42 +++++++++++++---------
 arch/powerpc/mm/fault.c                     |    7 ++--
 arch/powerpc/mm/hash_utils_64.c             |   51 ++++++++++++++-------------
 5 files changed, 57 insertions(+), 76 deletions(-)

diff --git a/arch/powerpc/include/asm/context_tracking.h b/arch/powerpc/include/asm/context_tracking.h
index 4da287e..b6f5a33 100644
--- a/arch/powerpc/include/asm/context_tracking.h
+++ b/arch/powerpc/include/asm/context_tracking.h
@@ -1,39 +1,10 @@
 #ifndef _ASM_POWERPC_CONTEXT_TRACKING_H
 #define _ASM_POWERPC_CONTEXT_TRACKING_H
 
-#ifndef __ASSEMBLY__
-#include <linux/context_tracking.h>
-#include <asm/ptrace.h>
-
-/*
- * temporarily defined to avoid potential conflicts with the common
- * implementation, these will be removed by a later patch after the common
- * code enters powerpc tree
- */
-#define exception_enter __exception_enter
-#define exception_exit __exception_exit
-
-static inline void __exception_enter(struct pt_regs *regs)
-{
-	user_exit();
-}
-
-static inline void __exception_exit(struct pt_regs *regs)
-{
-#ifdef CONFIG_CONTEXT_TRACKING
-	if (user_mode(regs))
-		user_enter();
-#endif
-}
-
-#else /* __ASSEMBLY__ */
-
 #ifdef CONFIG_CONTEXT_TRACKING
 #define SCHEDULE_USER bl	.schedule_user
 #else
 #define SCHEDULE_USER bl	.schedule
 #endif
 
-#endif /* !__ASSEMBLY__ */
-
 #endif
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 6d82f4f..a8a5361 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1368,17 +1368,15 @@ END_MMU_FTR_SECTION_IFCLR(MMU_FTR_SLB)
 	rlwimi	r4,r0,32-13,30,30	/* becomes _PAGE_USER access bit */
 	ori	r4,r4,1			/* add _PAGE_PRESENT */
 	rlwimi	r4,r5,22+2,31-2,31-2	/* Set _PAGE_EXEC if trap is 0x400 */
-	addi	r6,r1,STACK_FRAME_OVERHEAD
 
 	/*
 	 * r3 contains the faulting address
 	 * r4 contains the required access permissions
 	 * r5 contains the trap number
-	 * r6 contains the address of pt_regs
 	 *
 	 * at return r3 = 0 for success, 1 for page fault, negative for error
 	 */
-	bl	.hash_page_ct		/* build HPTE if possible */
+	bl	.hash_page		/* build HPTE if possible */
 	cmpdi	r3,0			/* see if hash_page succeeded */
 
 	/* Success */
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 6228b6b..1b46c2d9 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -35,6 +35,7 @@
 #include <linux/kdebug.h>
 #include <linux/debugfs.h>
 #include <linux/ratelimit.h>
+#include <linux/context_tracking.h>
 
 #include <asm/emulated_ops.h>
 #include <asm/pgtable.h>
@@ -60,7 +61,6 @@
 #include <asm/switch_to.h>
 #include <asm/tm.h>
 #include <asm/debug.h>
-#include <asm/context_tracking.h>
 
 #if defined(CONFIG_DEBUGGER) || defined(CONFIG_KEXEC)
 int (*__debugger)(struct pt_regs *regs) __read_mostly;
@@ -669,8 +669,9 @@ int machine_check_generic(struct pt_regs *regs)
 void machine_check_exception(struct pt_regs *regs)
 {
 	int recover = 0;
+	enum ctx_state prev_state;
 
-	exception_enter(regs);
+	prev_state = exception_enter();
 
 	__get_cpu_var(irq_stat).mce_exceptions++;
 
@@ -712,7 +713,7 @@ void machine_check_exception(struct pt_regs *regs)
 		panic("Unrecoverable Machine check");
 
 exit:
-	exception_exit(regs);
+	exception_exit(prev_state);
 }
 
 void SMIException(struct pt_regs *regs)
@@ -722,19 +723,21 @@ void SMIException(struct pt_regs *regs)
 
 void unknown_exception(struct pt_regs *regs)
 {
-	exception_enter(regs);
+	enum ctx_state prev_state;
+	prev_state = exception_enter();
 
 	printk("Bad trap at PC: %lx, SR: %lx, vector=%lx\n",
 	       regs->nip, regs->msr, regs->trap);
 
 	_exception(SIGTRAP, regs, 0, 0);
 
-	exception_exit(regs);
+	exception_exit(prev_state);
 }
 
 void instruction_breakpoint_exception(struct pt_regs *regs)
 {
-	exception_enter(regs);
+	enum ctx_state prev_state;
+	prev_state = exception_enter();
 
 	if (notify_die(DIE_IABR_MATCH, "iabr_match", regs, 5,
 					5, SIGTRAP) == NOTIFY_STOP)
@@ -744,7 +747,7 @@ void instruction_breakpoint_exception(struct pt_regs *regs)
 	_exception(SIGTRAP, regs, TRAP_BRKPT, regs->nip);
 
 exit:
-	exception_exit(regs);
+	exception_exit(prev_state);
 }
 
 void RunModeException(struct pt_regs *regs)
@@ -754,7 +757,8 @@ void RunModeException(struct pt_regs *regs)
 
 void __kprobes single_step_exception(struct pt_regs *regs)
 {
-	exception_enter(regs);
+	enum ctx_state prev_state;
+	prev_state = exception_enter();
 
 	clear_single_step(regs);
 
@@ -767,7 +771,7 @@ void __kprobes single_step_exception(struct pt_regs *regs)
 	_exception(SIGTRAP, regs, TRAP_TRACE, regs->nip);
 
 exit:
-	exception_exit(regs);
+	exception_exit(prev_state);
 }
 
 /*
@@ -1020,9 +1024,10 @@ int is_valid_bugaddr(unsigned long addr)
 void __kprobes program_check_exception(struct pt_regs *regs)
 {
 	unsigned int reason = get_reason(regs);
+	enum ctx_state prev_state;
 	extern int do_mathemu(struct pt_regs *regs);
 
-	exception_enter(regs);
+	prev_state = exception_enter();
 
 	/* We can now get here via a FP Unavailable exception if the core
 	 * has no FPU, in that case the reason flags will be 0 */
@@ -1132,14 +1137,15 @@ void __kprobes program_check_exception(struct pt_regs *regs)
 		_exception(SIGILL, regs, ILL_ILLOPC, regs->nip);
 
 exit:
-	exception_exit(regs);
+	exception_exit(prev_state);
 }
 
 void alignment_exception(struct pt_regs *regs)
 {
 	int sig, code, fixed = 0;
+	enum ctx_state prev_state;
 
-	exception_enter(regs);
+	prev_state = exception_enter();
 
 	/* We restore the interrupt state now */
 	if (!arch_irq_disabled_regs(regs))
@@ -1169,7 +1175,7 @@ void alignment_exception(struct pt_regs *regs)
 		bad_page_fault(regs, regs->dar, sig);
 
 exit:
-	exception_exit(regs);
+	exception_exit(prev_state);
 }
 
 void StackOverflow(struct pt_regs *regs)
@@ -1198,18 +1204,20 @@ void trace_syscall(struct pt_regs *regs)
 
 void kernel_fp_unavailable_exception(struct pt_regs *regs)
 {
-	exception_enter(regs);
+	enum ctx_state prev_state;
+	prev_state = exception_enter();
 
 	printk(KERN_EMERG "Unrecoverable FP Unavailable Exception "
 			  "%lx at %lx\n", regs->trap, regs->nip);
 	die("Unrecoverable FP Unavailable Exception", regs, SIGABRT);
 
-	exception_exit(regs);
+	exception_exit(prev_state);
 }
 
 void altivec_unavailable_exception(struct pt_regs *regs)
 {
-	exception_enter(regs);
+	enum ctx_state prev_state;
+	prev_state = exception_enter();
 
 	if (user_mode(regs)) {
 		/* A user program has executed an altivec instruction,
@@ -1223,7 +1231,7 @@ void altivec_unavailable_exception(struct pt_regs *regs)
 	die("Unrecoverable VMX/Altivec Unavailable Exception", regs, SIGABRT);
 
 exit:
-	exception_exit(regs);
+	exception_exit(prev_state);
 }
 
 void vsx_unavailable_exception(struct pt_regs *regs)
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 108ab17..141835b 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -32,6 +32,7 @@
 #include <linux/perf_event.h>
 #include <linux/magic.h>
 #include <linux/ratelimit.h>
+#include <linux/context_tracking.h>
 
 #include <asm/firmware.h>
 #include <asm/page.h>
@@ -42,7 +43,6 @@
 #include <asm/tlbflush.h>
 #include <asm/siginfo.h>
 #include <asm/debug.h>
-#include <asm/context_tracking.h>
 #include <mm/mmu_decl.h>
 
 #include "icswx.h"
@@ -480,9 +480,10 @@ int __kprobes do_page_fault(struct pt_regs *regs, unsigned long address,
 			    unsigned long error_code)
 {
 	int ret;
-	exception_enter(regs);
+	enum ctx_state prev_state;
+	prev_state = exception_enter();
 	ret = __do_page_fault(regs, address, error_code);
-	exception_exit(regs);
+	exception_exit(prev_state);
 	return ret;
 }
 
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 360fba8..eeab30f 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -33,6 +33,7 @@
 #include <linux/init.h>
 #include <linux/signal.h>
 #include <linux/memblock.h>
+#include <linux/context_tracking.h>
 
 #include <asm/processor.h>
 #include <asm/pgtable.h>
@@ -56,7 +57,6 @@
 #include <asm/fadump.h>
 #include <asm/firmware.h>
 #include <asm/tm.h>
-#include <asm/context_tracking.h>
 
 #ifdef DEBUG
 #define DBG(fmt...) udbg_printf(fmt)
@@ -919,13 +919,17 @@ int hash_page(unsigned long ea, unsigned long access, unsigned long trap)
 	const struct cpumask *tmp;
 	int rc, user_region = 0, local = 0;
 	int psize, ssize;
+	enum ctx_state prev_state;
+
+	prev_state = exception_enter();
 
 	DBG_LOW("hash_page(ea=%016lx, access=%lx, trap=%lx\n",
 		ea, access, trap);
 
 	if ((ea & ~REGION_MASK) >= PGTABLE_RANGE) {
 		DBG_LOW(" out of pgtable range !\n");
- 		return 1;
+		rc = 1;
+		goto exit;
 	}
 
 	/* Get region & vsid */
@@ -935,7 +939,8 @@ int hash_page(unsigned long ea, unsigned long access, unsigned long trap)
 		mm = current->mm;
 		if (! mm) {
 			DBG_LOW(" user region with no mm !\n");
-			return 1;
+			rc = 1;
+			goto exit;
 		}
 		psize = get_slice_psize(mm, ea);
 		ssize = user_segment_size(ea);
@@ -954,14 +959,17 @@ int hash_page(unsigned long ea, unsigned long access, unsigned long trap)
 		/* Not a valid range
 		 * Send the problem up to do_page_fault 
 		 */
-		return 1;
+		rc = 1;
+		goto exit;
 	}
 	DBG_LOW(" mm=%p, mm->pgdir=%p, vsid=%016lx\n", mm, mm->pgd, vsid);
 
 	/* Get pgdir */
 	pgdir = mm->pgd;
-	if (pgdir == NULL)
-		return 1;
+	if (pgdir == NULL) {
+		rc = 1;
+		goto exit;
+	}
 
 	/* Check CPU locality */
 	tmp = cpumask_of(smp_processor_id());
@@ -984,7 +992,8 @@ int hash_page(unsigned long ea, unsigned long access, unsigned long trap)
 	ptep = find_linux_pte_or_hugepte(pgdir, ea, &hugeshift);
 	if (ptep == NULL || !pte_present(*ptep)) {
 		DBG_LOW(" no PTE !\n");
-		return 1;
+		rc = 1;
+		goto exit;
 	}
 
 	/* Add _PAGE_PRESENT to the required access perm */
@@ -995,13 +1004,16 @@ int hash_page(unsigned long ea, unsigned long access, unsigned long trap)
 	 */
 	if (access & ~pte_val(*ptep)) {
 		DBG_LOW(" no access !\n");
-		return 1;
+		rc = 1;
+		goto exit;
 	}
 
 #ifdef CONFIG_HUGETLB_PAGE
-	if (hugeshift)
-		return __hash_page_huge(ea, access, vsid, ptep, trap, local,
+	if (hugeshift) {
+		rc = __hash_page_huge(ea, access, vsid, ptep, trap, local,
 					ssize, hugeshift, psize);
+		goto exit;
+	}
 #endif /* CONFIG_HUGETLB_PAGE */
 
 #ifndef CONFIG_PPC_64K_PAGES
@@ -1081,22 +1093,12 @@ int hash_page(unsigned long ea, unsigned long access, unsigned long trap)
 		pte_val(*(ptep + PTRS_PER_PTE)));
 #endif
 	DBG_LOW(" -> rc=%d\n", rc);
+exit:
+	exception_exit(prev_state);
 	return rc;
 }
 EXPORT_SYMBOL_GPL(hash_page);
 
-int hash_page_ct(unsigned long ea, unsigned long access,
-		 unsigned long trap, struct pt_regs *regs)
-{
-	int ret;
-
-	exception_enter(regs);
-	ret = hash_page(ea, access, trap);
-	exception_exit(regs);
-
-	return ret;
-}
-
 void hash_preload(struct mm_struct *mm, unsigned long ea,
 		  unsigned long access, unsigned long trap)
 {
@@ -1223,7 +1225,8 @@ void flush_hash_range(unsigned long number, int local)
  */
 void low_hash_fault(struct pt_regs *regs, unsigned long address, int rc)
 {
-	exception_enter(regs);
+	enum ctx_state prev_state;
+	prev_state = exception_enter();
 
 	if (user_mode(regs)) {
 #ifdef CONFIG_PPC_SUBPAGE_PROT
@@ -1235,7 +1238,7 @@ void low_hash_fault(struct pt_regs *regs, unsigned long address, int rc)
 	} else
 		bad_page_fault(regs, address, SIGBUS);
 
-	exception_exit(regs);
+	exception_exit(prev_state);
 }
 
 #ifdef CONFIG_DEBUG_PAGEALLOC
-- 
1.7.9.5

^ permalink raw reply related

* [RFC PATCH v2 4/6] powerpc: Use the new schedule_user API on userspace preemption
From: Li Zhong @ 2013-03-29 10:00 UTC (permalink / raw)
  To: linux-kernel; +Cc: Li Zhong, fweisbec, paulus, paulmck, linuxppc-dev
In-Reply-To: <1364551221-23177-1-git-send-email-zhong@linux.vnet.ibm.com>

This patch corresponds to
[PATCH] x86: Use the new schedule_user API on userspace preemption
  commit 0430499ce9d78691f3985962021b16bf8f8a8048

Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/context_tracking.h |   11 +++++++++++
 arch/powerpc/kernel/entry_64.S              |    3 ++-
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/context_tracking.h b/arch/powerpc/include/asm/context_tracking.h
index 377146e..4da287e 100644
--- a/arch/powerpc/include/asm/context_tracking.h
+++ b/arch/powerpc/include/asm/context_tracking.h
@@ -1,6 +1,7 @@
 #ifndef _ASM_POWERPC_CONTEXT_TRACKING_H
 #define _ASM_POWERPC_CONTEXT_TRACKING_H
 
+#ifndef __ASSEMBLY__
 #include <linux/context_tracking.h>
 #include <asm/ptrace.h>
 
@@ -25,4 +26,14 @@ static inline void __exception_exit(struct pt_regs *regs)
 #endif
 }
 
+#else /* __ASSEMBLY__ */
+
+#ifdef CONFIG_CONTEXT_TRACKING
+#define SCHEDULE_USER bl	.schedule_user
+#else
+#define SCHEDULE_USER bl	.schedule
+#endif
+
+#endif /* !__ASSEMBLY__ */
+
 #endif
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 256c5bf..f7e4622 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -33,6 +33,7 @@
 #include <asm/irqflags.h>
 #include <asm/ftrace.h>
 #include <asm/hw_irq.h>
+#include <asm/context_tracking.h>
 
 /*
  * System calls.
@@ -618,7 +619,7 @@ _GLOBAL(ret_from_except_lite)
 	andi.	r0,r4,_TIF_NEED_RESCHED
 	beq	1f
 	bl	.restore_interrupts
-	bl	.schedule
+	SCHEDULE_USER
 	b	.ret_from_except_lite
 
 1:	bl	.save_nvgprs
-- 
1.7.9.5

^ permalink raw reply related

* [RFC PATCH v2 3/6] powerpc: Exit user context on notify resume
From: Li Zhong @ 2013-03-29 10:00 UTC (permalink / raw)
  To: linux-kernel; +Cc: Li Zhong, fweisbec, paulus, paulmck, linuxppc-dev
In-Reply-To: <1364551221-23177-1-git-send-email-zhong@linux.vnet.ibm.com>

This patch allows RCU usage in do_notify_resume, e.g. signal handling.
It corresponds to
[PATCH] x86: Exit RCU extended QS on notify resume
  commit edf55fda35c7dc7f2d9241c3abaddaf759b457c6

Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/signal.c |    5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/powerpc/kernel/signal.c b/arch/powerpc/kernel/signal.c
index cf12eae..d63b502 100644
--- a/arch/powerpc/kernel/signal.c
+++ b/arch/powerpc/kernel/signal.c
@@ -13,6 +13,7 @@
 #include <linux/signal.h>
 #include <linux/uprobes.h>
 #include <linux/key.h>
+#include <linux/context_tracking.h>
 #include <asm/hw_breakpoint.h>
 #include <asm/uaccess.h>
 #include <asm/unistd.h>
@@ -159,6 +160,8 @@ static int do_signal(struct pt_regs *regs)
 
 void do_notify_resume(struct pt_regs *regs, unsigned long thread_info_flags)
 {
+	user_exit();
+
 	if (thread_info_flags & _TIF_UPROBE)
 		uprobe_notify_resume(regs);
 
@@ -169,4 +172,6 @@ void do_notify_resume(struct pt_regs *regs, unsigned long thread_info_flags)
 		clear_thread_flag(TIF_NOTIFY_RESUME);
 		tracehook_notify_resume(regs);
 	}
+
+	user_enter();
 }
-- 
1.7.9.5

^ permalink raw reply related

* [RFC PATCH v2 2/6] powerpc: Exception hooks for context tracking subsystem
From: Li Zhong @ 2013-03-29 10:00 UTC (permalink / raw)
  To: linux-kernel; +Cc: Li Zhong, fweisbec, paulus, paulmck, linuxppc-dev
In-Reply-To: <1364551221-23177-1-git-send-email-zhong@linux.vnet.ibm.com>

This is the exception hooks for context tracking subsystem, including
data access, program check, single step, instruction breakpoint, machine check,
alignment, fp unavailable, altivec assist, unknown exception, whose handlers
might use RCU.

This patch corresponds to
[PATCH] x86: Exception hooks for userspace RCU extended QS
  commit 6ba3c97a38803883c2eee489505796cb0a727122

Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/context_tracking.h |   28 +++++++++
 arch/powerpc/kernel/exceptions-64s.S        |    4 +-
 arch/powerpc/kernel/traps.c                 |   83 ++++++++++++++++++++-------
 arch/powerpc/mm/fault.c                     |   15 ++++-
 arch/powerpc/mm/hash_utils_64.c             |   17 ++++++
 5 files changed, 122 insertions(+), 25 deletions(-)
 create mode 100644 arch/powerpc/include/asm/context_tracking.h

diff --git a/arch/powerpc/include/asm/context_tracking.h b/arch/powerpc/include/asm/context_tracking.h
new file mode 100644
index 0000000..377146e
--- /dev/null
+++ b/arch/powerpc/include/asm/context_tracking.h
@@ -0,0 +1,28 @@
+#ifndef _ASM_POWERPC_CONTEXT_TRACKING_H
+#define _ASM_POWERPC_CONTEXT_TRACKING_H
+
+#include <linux/context_tracking.h>
+#include <asm/ptrace.h>
+
+/*
+ * temporarily defined to avoid potential conflicts with the common
+ * implementation, these will be removed by a later patch after the common
+ * code enters powerpc tree
+ */
+#define exception_enter __exception_enter
+#define exception_exit __exception_exit
+
+static inline void __exception_enter(struct pt_regs *regs)
+{
+	user_exit();
+}
+
+static inline void __exception_exit(struct pt_regs *regs)
+{
+#ifdef CONFIG_CONTEXT_TRACKING
+	if (user_mode(regs))
+		user_enter();
+#endif
+}
+
+#endif
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index a8a5361..6d82f4f 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1368,15 +1368,17 @@ END_MMU_FTR_SECTION_IFCLR(MMU_FTR_SLB)
 	rlwimi	r4,r0,32-13,30,30	/* becomes _PAGE_USER access bit */
 	ori	r4,r4,1			/* add _PAGE_PRESENT */
 	rlwimi	r4,r5,22+2,31-2,31-2	/* Set _PAGE_EXEC if trap is 0x400 */
+	addi	r6,r1,STACK_FRAME_OVERHEAD
 
 	/*
 	 * r3 contains the faulting address
 	 * r4 contains the required access permissions
 	 * r5 contains the trap number
+	 * r6 contains the address of pt_regs
 	 *
 	 * at return r3 = 0 for success, 1 for page fault, negative for error
 	 */
-	bl	.hash_page		/* build HPTE if possible */
+	bl	.hash_page_ct		/* build HPTE if possible */
 	cmpdi	r3,0			/* see if hash_page succeeded */
 
 	/* Success */
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 37cc40e..6228b6b 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -60,6 +60,7 @@
 #include <asm/switch_to.h>
 #include <asm/tm.h>
 #include <asm/debug.h>
+#include <asm/context_tracking.h>
 
 #if defined(CONFIG_DEBUGGER) || defined(CONFIG_KEXEC)
 int (*__debugger)(struct pt_regs *regs) __read_mostly;
@@ -669,6 +670,8 @@ void machine_check_exception(struct pt_regs *regs)
 {
 	int recover = 0;
 
+	exception_enter(regs);
+
 	__get_cpu_var(irq_stat).mce_exceptions++;
 
 	/* See if any machine dependent calls. In theory, we would want
@@ -683,7 +686,7 @@ void machine_check_exception(struct pt_regs *regs)
 		recover = cur_cpu_spec->machine_check(regs);
 
 	if (recover > 0)
-		return;
+		goto exit;
 
 #if defined(CONFIG_8xx) && defined(CONFIG_PCI)
 	/* the qspan pci read routines can cause machine checks -- Cort
@@ -693,20 +696,23 @@ void machine_check_exception(struct pt_regs *regs)
 	 * -- BenH
 	 */
 	bad_page_fault(regs, regs->dar, SIGBUS);
-	return;
+	goto exit;
 #endif
 
 	if (debugger_fault_handler(regs))
-		return;
+		goto exit;
 
 	if (check_io_access(regs))
-		return;
+		goto exit;
 
 	die("Machine check", regs, SIGBUS);
 
 	/* Must die if the interrupt is not recoverable */
 	if (!(regs->msr & MSR_RI))
 		panic("Unrecoverable Machine check");
+
+exit:
+	exception_exit(regs);
 }
 
 void SMIException(struct pt_regs *regs)
@@ -716,20 +722,29 @@ void SMIException(struct pt_regs *regs)
 
 void unknown_exception(struct pt_regs *regs)
 {
+	exception_enter(regs);
+
 	printk("Bad trap at PC: %lx, SR: %lx, vector=%lx\n",
 	       regs->nip, regs->msr, regs->trap);
 
 	_exception(SIGTRAP, regs, 0, 0);
+
+	exception_exit(regs);
 }
 
 void instruction_breakpoint_exception(struct pt_regs *regs)
 {
+	exception_enter(regs);
+
 	if (notify_die(DIE_IABR_MATCH, "iabr_match", regs, 5,
 					5, SIGTRAP) == NOTIFY_STOP)
-		return;
+		goto exit;
 	if (debugger_iabr_match(regs))
-		return;
+		goto exit;
 	_exception(SIGTRAP, regs, TRAP_BRKPT, regs->nip);
+
+exit:
+	exception_exit(regs);
 }
 
 void RunModeException(struct pt_regs *regs)
@@ -739,15 +754,20 @@ void RunModeException(struct pt_regs *regs)
 
 void __kprobes single_step_exception(struct pt_regs *regs)
 {
+	exception_enter(regs);
+
 	clear_single_step(regs);
 
 	if (notify_die(DIE_SSTEP, "single_step", regs, 5,
 					5, SIGTRAP) == NOTIFY_STOP)
-		return;
+		goto exit;
 	if (debugger_sstep(regs))
-		return;
+		goto exit;
 
 	_exception(SIGTRAP, regs, TRAP_TRACE, regs->nip);
+
+exit:
+	exception_exit(regs);
 }
 
 /*
@@ -1002,32 +1022,34 @@ void __kprobes program_check_exception(struct pt_regs *regs)
 	unsigned int reason = get_reason(regs);
 	extern int do_mathemu(struct pt_regs *regs);
 
+	exception_enter(regs);
+
 	/* We can now get here via a FP Unavailable exception if the core
 	 * has no FPU, in that case the reason flags will be 0 */
 
 	if (reason & REASON_FP) {
 		/* IEEE FP exception */
 		parse_fpe(regs);
-		return;
+		goto exit;
 	}
 	if (reason & REASON_TRAP) {
 		/* Debugger is first in line to stop recursive faults in
 		 * rcu_lock, notify_die, or atomic_notifier_call_chain */
 		if (debugger_bpt(regs))
-			return;
+			goto exit;
 
 		/* trap exception */
 		if (notify_die(DIE_BPT, "breakpoint", regs, 5, 5, SIGTRAP)
 				== NOTIFY_STOP)
-			return;
+			goto exit;
 
 		if (!(regs->msr & MSR_PR) &&  /* not user-mode */
 		    report_bug(regs->nip, regs) == BUG_TRAP_TYPE_WARN) {
 			regs->nip += 4;
-			return;
+			goto exit;
 		}
 		_exception(SIGTRAP, regs, TRAP_BRKPT, regs->nip);
-		return;
+		goto exit;
 	}
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 	if (reason & REASON_TM) {
@@ -1043,7 +1065,7 @@ void __kprobes program_check_exception(struct pt_regs *regs)
 		if (!user_mode(regs) &&
 		    report_bug(regs->nip, regs) == BUG_TRAP_TYPE_WARN) {
 			regs->nip += 4;
-			return;
+			goto exit;
 		}
 		/* If usermode caused this, it's done something illegal and
 		 * gets a SIGILL slap on the wrist.  We call it an illegal
@@ -1053,7 +1075,7 @@ void __kprobes program_check_exception(struct pt_regs *regs)
 		 */
 		if (user_mode(regs)) {
 			_exception(SIGILL, regs, ILL_ILLOPN, regs->nip);
-			return;
+			goto exit;
 		} else {
 			printk(KERN_EMERG "Unexpected TM Bad Thing exception "
 			       "at %lx (msr 0x%x)\n", regs->nip, reason);
@@ -1077,16 +1099,16 @@ void __kprobes program_check_exception(struct pt_regs *regs)
 	switch (do_mathemu(regs)) {
 	case 0:
 		emulate_single_step(regs);
-		return;
+		goto exit;
 	case 1: {
 			int code = 0;
 			code = __parse_fpscr(current->thread.fpscr.val);
 			_exception(SIGFPE, regs, code, regs->nip);
-			return;
+			goto exit;
 		}
 	case -EFAULT:
 		_exception(SIGSEGV, regs, SEGV_MAPERR, regs->nip);
-		return;
+		goto exit;
 	}
 	/* fall through on any other errors */
 #endif /* CONFIG_MATH_EMULATION */
@@ -1097,10 +1119,10 @@ void __kprobes program_check_exception(struct pt_regs *regs)
 		case 0:
 			regs->nip += 4;
 			emulate_single_step(regs);
-			return;
+			goto exit;
 		case -EFAULT:
 			_exception(SIGSEGV, regs, SEGV_MAPERR, regs->nip);
-			return;
+			goto exit;
 		}
 	}
 
@@ -1108,12 +1130,17 @@ void __kprobes program_check_exception(struct pt_regs *regs)
 		_exception(SIGILL, regs, ILL_PRVOPC, regs->nip);
 	else
 		_exception(SIGILL, regs, ILL_ILLOPC, regs->nip);
+
+exit:
+	exception_exit(regs);
 }
 
 void alignment_exception(struct pt_regs *regs)
 {
 	int sig, code, fixed = 0;
 
+	exception_enter(regs);
+
 	/* We restore the interrupt state now */
 	if (!arch_irq_disabled_regs(regs))
 		local_irq_enable();
@@ -1125,7 +1152,7 @@ void alignment_exception(struct pt_regs *regs)
 	if (fixed == 1) {
 		regs->nip += 4;	/* skip over emulated instruction */
 		emulate_single_step(regs);
-		return;
+		goto exit;
 	}
 
 	/* Operand address was bad */
@@ -1140,6 +1167,9 @@ void alignment_exception(struct pt_regs *regs)
 		_exception(sig, regs, code, regs->dar);
 	else
 		bad_page_fault(regs, regs->dar, sig);
+
+exit:
+	exception_exit(regs);
 }
 
 void StackOverflow(struct pt_regs *regs)
@@ -1168,23 +1198,32 @@ void trace_syscall(struct pt_regs *regs)
 
 void kernel_fp_unavailable_exception(struct pt_regs *regs)
 {
+	exception_enter(regs);
+
 	printk(KERN_EMERG "Unrecoverable FP Unavailable Exception "
 			  "%lx at %lx\n", regs->trap, regs->nip);
 	die("Unrecoverable FP Unavailable Exception", regs, SIGABRT);
+
+	exception_exit(regs);
 }
 
 void altivec_unavailable_exception(struct pt_regs *regs)
 {
+	exception_enter(regs);
+
 	if (user_mode(regs)) {
 		/* A user program has executed an altivec instruction,
 		   but this kernel doesn't support altivec. */
 		_exception(SIGILL, regs, ILL_ILLOPC, regs->nip);
-		return;
+		goto exit;
 	}
 
 	printk(KERN_EMERG "Unrecoverable VMX/Altivec Unavailable Exception "
 			"%lx at %lx\n", regs->trap, regs->nip);
 	die("Unrecoverable VMX/Altivec Unavailable Exception", regs, SIGABRT);
+
+exit:
+	exception_exit(regs);
 }
 
 void vsx_unavailable_exception(struct pt_regs *regs)
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 229951f..108ab17 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -42,6 +42,7 @@
 #include <asm/tlbflush.h>
 #include <asm/siginfo.h>
 #include <asm/debug.h>
+#include <asm/context_tracking.h>
 #include <mm/mmu_decl.h>
 
 #include "icswx.h"
@@ -193,8 +194,8 @@ static int mm_fault_error(struct pt_regs *regs, unsigned long addr, int fault)
  * The return value is 0 if the fault was handled, or the signal
  * number if this is a kernel fault that can't be handled here.
  */
-int __kprobes do_page_fault(struct pt_regs *regs, unsigned long address,
-			    unsigned long error_code)
+static int __kprobes __do_page_fault(struct pt_regs *regs,
+				unsigned long address, unsigned long error_code)
 {
 	struct vm_area_struct * vma;
 	struct mm_struct *mm = current->mm;
@@ -475,6 +476,16 @@ bad_area_nosemaphore:
 
 }
 
+int __kprobes do_page_fault(struct pt_regs *regs, unsigned long address,
+			    unsigned long error_code)
+{
+	int ret;
+	exception_enter(regs);
+	ret = __do_page_fault(regs, address, error_code);
+	exception_exit(regs);
+	return ret;
+}
+
 /*
  * bad_page_fault is called when we have a bad access from the kernel.
  * It is called from the DSI and ISI handlers in head.S and from some
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 1b6e127..360fba8 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -56,6 +56,7 @@
 #include <asm/fadump.h>
 #include <asm/firmware.h>
 #include <asm/tm.h>
+#include <asm/context_tracking.h>
 
 #ifdef DEBUG
 #define DBG(fmt...) udbg_printf(fmt)
@@ -1084,6 +1085,18 @@ int hash_page(unsigned long ea, unsigned long access, unsigned long trap)
 }
 EXPORT_SYMBOL_GPL(hash_page);
 
+int hash_page_ct(unsigned long ea, unsigned long access,
+		 unsigned long trap, struct pt_regs *regs)
+{
+	int ret;
+
+	exception_enter(regs);
+	ret = hash_page(ea, access, trap);
+	exception_exit(regs);
+
+	return ret;
+}
+
 void hash_preload(struct mm_struct *mm, unsigned long ea,
 		  unsigned long access, unsigned long trap)
 {
@@ -1210,6 +1223,8 @@ void flush_hash_range(unsigned long number, int local)
  */
 void low_hash_fault(struct pt_regs *regs, unsigned long address, int rc)
 {
+	exception_enter(regs);
+
 	if (user_mode(regs)) {
 #ifdef CONFIG_PPC_SUBPAGE_PROT
 		if (rc == -2)
@@ -1219,6 +1234,8 @@ void low_hash_fault(struct pt_regs *regs, unsigned long address, int rc)
 			_exception(SIGBUS, regs, BUS_ADRERR, address);
 	} else
 		bad_page_fault(regs, address, SIGBUS);
+
+	exception_exit(regs);
 }
 
 #ifdef CONFIG_DEBUG_PAGEALLOC
-- 
1.7.9.5

^ permalink raw reply related

* [RFC PATCH v2 1/6] powerpc: Syscall hooks for context tracking subsystem
From: Li Zhong @ 2013-03-29 10:00 UTC (permalink / raw)
  To: linux-kernel; +Cc: Li Zhong, fweisbec, paulus, paulmck, linuxppc-dev
In-Reply-To: <1364551221-23177-1-git-send-email-zhong@linux.vnet.ibm.com>

This is the syscall slow path hooks for context tracking subsystem,
corresponding to
[PATCH] x86: Syscall hooks for userspace RCU extended QS
  commit bf5a3c13b939813d28ce26c01425054c740d6731

TIF_MEMDIE is moved to the second 16-bits (with value 17), as it seems there
is no asm code using it. TIF_NOHZ is added to _TIF_SYCALL_T_OR_A, so it is
better for it to be in the same 16 bits with others in the group, so in the
asm code, andi. with this group could work.

Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 arch/powerpc/include/asm/thread_info.h |    7 +++++--
 arch/powerpc/kernel/ptrace.c           |    5 +++++
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/thread_info.h b/arch/powerpc/include/asm/thread_info.h
index 406b7b9..414a261 100644
--- a/arch/powerpc/include/asm/thread_info.h
+++ b/arch/powerpc/include/asm/thread_info.h
@@ -97,7 +97,7 @@ static inline struct thread_info *current_thread_info(void)
 #define TIF_PERFMON_CTXSW	6	/* perfmon needs ctxsw calls */
 #define TIF_SYSCALL_AUDIT	7	/* syscall auditing active */
 #define TIF_SINGLESTEP		8	/* singlestepping active */
-#define TIF_MEMDIE		9	/* is terminating due to OOM killer */
+#define TIF_NOHZ		9	/* in adaptive nohz mode */
 #define TIF_SECCOMP		10	/* secure computing */
 #define TIF_RESTOREALL		11	/* Restore all regs (implies NOERROR) */
 #define TIF_NOERROR		12	/* Force successful syscall return */
@@ -106,6 +106,7 @@ static inline struct thread_info *current_thread_info(void)
 #define TIF_SYSCALL_TRACEPOINT	15	/* syscall tracepoint instrumentation */
 #define TIF_EMULATE_STACK_STORE	16	/* Is an instruction emulation
 						for stack store? */
+#define TIF_MEMDIE		17	/* is terminating due to OOM killer */
 
 /* as above, but as bit values */
 #define _TIF_SYSCALL_TRACE	(1<<TIF_SYSCALL_TRACE)
@@ -124,8 +125,10 @@ static inline struct thread_info *current_thread_info(void)
 #define _TIF_UPROBE		(1<<TIF_UPROBE)
 #define _TIF_SYSCALL_TRACEPOINT	(1<<TIF_SYSCALL_TRACEPOINT)
 #define _TIF_EMULATE_STACK_STORE	(1<<TIF_EMULATE_STACK_STORE)
+#define _TIF_NOHZ		(1<<TIF_NOHZ)
 #define _TIF_SYSCALL_T_OR_A	(_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \
-				 _TIF_SECCOMP | _TIF_SYSCALL_TRACEPOINT)
+				 _TIF_SECCOMP | _TIF_SYSCALL_TRACEPOINT | \
+				 _TIF_NOHZ)
 
 #define _TIF_USER_WORK_MASK	(_TIF_SIGPENDING | _TIF_NEED_RESCHED | \
 				 _TIF_NOTIFY_RESUME | _TIF_UPROBE)
diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c
index 245c1b6..0b7aad0 100644
--- a/arch/powerpc/kernel/ptrace.c
+++ b/arch/powerpc/kernel/ptrace.c
@@ -32,6 +32,7 @@
 #include <trace/syscall.h>
 #include <linux/hw_breakpoint.h>
 #include <linux/perf_event.h>
+#include <linux/context_tracking.h>
 
 #include <asm/uaccess.h>
 #include <asm/page.h>
@@ -1778,6 +1779,8 @@ long do_syscall_trace_enter(struct pt_regs *regs)
 {
 	long ret = 0;
 
+	user_exit();
+
 	secure_computing_strict(regs->gpr[0]);
 
 	if (test_thread_flag(TIF_SYSCALL_TRACE) &&
@@ -1822,4 +1825,6 @@ void do_syscall_trace_leave(struct pt_regs *regs)
 	step = test_thread_flag(TIF_SINGLESTEP);
 	if (step || test_thread_flag(TIF_SYSCALL_TRACE))
 		tracehook_report_syscall_exit(regs, step);
+
+	user_enter();
 }
-- 
1.7.9.5

^ permalink raw reply related

* [RFC PATCH v2 0/6] powerpc: Support context tracking for Power pSeries
From: Li Zhong @ 2013-03-29 10:00 UTC (permalink / raw)
  To: linux-kernel; +Cc: Li Zhong, fweisbec, paulus, paulmck, linuxppc-dev

These patches try to support context tracking for Power arch, beginning with
64-bit pSeries. The codes are ported from that of the x86_64, and in each
patch, I listed the corresponding patch for x86.

Would you please help review and give your comments?

v2:

I rebased the patches against 3.9-rcs, and also added a patch to replace
the exception handling with the generic code in tip timers/nohz.

I assume these patches would get in through powerpc tree, so I didn't combine
the new patch (#6) with the original one (#2). So that if powerpc tree picks
these, it could pick the first five patches, and apply patch #6 later
when the dependency enters into powerpc tree (maybe on some 3.10-rcs).

I'm also wondering whether it is possible for these to go through
tip timers/nohz, so for now, patches #6 and #2 could be combined into one, and
no need to  worry about the issues caused by arch/common code merging. And it
might also make future changes easier.

Thanks, Zhong

Li Zhong (6):
  powerpc: Syscall hooks for context tracking subsystem
  powerpc: Exception hooks for context tracking subsystem
  powerpc: Exit user context on notify resume
  powerpc: Use the new schedule_user API on userspace preemption
  powerpc: select HAVE_CONTEXT_TRACKING for pSeries
  powerpc: Use generic code for exception handling

 arch/powerpc/include/asm/context_tracking.h |   10 +++
 arch/powerpc/include/asm/thread_info.h      |    7 ++-
 arch/powerpc/kernel/entry_64.S              |    3 +-
 arch/powerpc/kernel/ptrace.c                |    5 ++
 arch/powerpc/kernel/signal.c                |    5 ++
 arch/powerpc/kernel/traps.c                 |   91 ++++++++++++++++++++-------
 arch/powerpc/mm/fault.c                     |   16 ++++-
 arch/powerpc/mm/hash_utils_64.c             |   38 ++++++++---
 arch/powerpc/platforms/pseries/Kconfig      |    1 +
 9 files changed, 140 insertions(+), 36 deletions(-)
 create mode 100644 arch/powerpc/include/asm/context_tracking.h

-- 
1.7.9.5

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox