LinuxPPC-Dev Archive on lore.kernel.org
* Re: [linuxppc-release] [PATCH 2/2] powerpc/85xx: update SEC node in dts for MPC8572DS
From: Timur Tabi @ 2012-02-08 16:39 UTC (permalink / raw)
  To: Jia Hongtao; +Cc: linuxppc-dev
In-Reply-To: <1328693276-12057-2-git-send-email-B38951@freescale.com>

Jia Hongtao wrote:
> Add sec3.1 support
> 
> Signed-off-by: Jin Qing <b24347@freescale.com>
> Signed-off-by: Zhao Chenhui <b35336@freescale.com>
> Signed-off-by: Jia Hongtao <B38951@freescale.com>
> Signed-off-by: Li Yang <leoli@freescale.com>
> ---
>  arch/powerpc/boot/dts/fsl/mpc8572si-post.dtsi |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/arch/powerpc/boot/dts/fsl/mpc8572si-post.dtsi b/arch/powerpc/boot/dts/fsl/mpc8572si-post.dtsi
> index d44e25a..cdda34f 100644
> --- a/arch/powerpc/boot/dts/fsl/mpc8572si-post.dtsi
> +++ b/arch/powerpc/boot/dts/fsl/mpc8572si-post.dtsi
> @@ -184,7 +184,7 @@
>  /include/ "pq3-etsec1-1.dtsi"
>  /include/ "pq3-etsec1-2.dtsi"
>  /include/ "pq3-etsec1-3.dtsi"
> -/include/ "pq3-sec3.0-0.dtsi"
> +/include/ "pq3-sec3.1-0.dtsi"

This is not "adding SEC 3.1 support".  This patch is saying that the 8572
DTS was using the *wrong* SEC version.  If that's true, you need to
explain why it was wrong.

-- 
Timur Tabi
Linux kernel developer at Freescale

* Re: [PATCH 09/24] PCI, powerpc: Register busn_res for root buses
From: Yinghai Lu @ 2012-02-08 17:31 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-arch, Tony Luck, linuxppc-dev, linux-kernel,
	Dominik Brodowski, Paul Mackerras, Jesse Barnes, linux-pci,
	Andrew Morton, Linus Torvalds
In-Reply-To: <CAErSpo6BwznWf3rq4J2Ao+rrgR4yWqQ0mQGygkd2KmBuA1OD8g@mail.gmail.com>

On Wed, Feb 8, 2012 at 7:58 AM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> On Sat, Feb 4, 2012 at 10:57 PM, Yinghai Lu <yinghai@kernel.org> wrote:
>> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
>> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>> Cc: Paul Mackerras <paulus@samba.org>
>> Cc: linuxppc-dev@lists.ozlabs.org
>> ---
>>  arch/powerpc/kernel/pci-common.c |    7 ++++++-
>>  1 files changed, 6 insertions(+), 1 deletions(-)
>>
>> diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
>> index cce98d7..501f29b 100644
>> --- a/arch/powerpc/kernel/pci-common.c
>> +++ b/arch/powerpc/kernel/pci-common.c
>> @@ -1732,6 +1732,8 @@ void __devinit pcibios_scan_phb(struct pci_controller *hose)
>>        bus->secondary = hose->first_busno;
>>        hose->bus = bus;
>>
>> +       pci_bus_insert_busn_res(bus, hose->first_busno, hose->last_busno);
>> +
>>        /* Get probe mode and perform scan */
>>        mode = PCI_PROBE_NORMAL;
>>        if (node && ppc_md.pci_probe_mode)
>> @@ -1742,8 +1744,11 @@ void __devinit pcibios_scan_phb(struct pci_controller *hose)
>>                of_scan_bus(node, bus);
>>        }
>>
>> -       if (mode == PCI_PROBE_NORMAL)
>> +       if (mode == PCI_PROBE_NORMAL) {
>> +               pci_bus_update_busn_res_end(bus, 255);
>>                hose->last_busno = bus->subordinate = pci_scan_child_bus(bus);
>> +               pci_bus_update_busn_res_end(bus, bus->subordinate);
>> +       }
>
> The only architecture-specific thing here is discovering the range of
> bus numbers below a host bridge.  The architecture should not have to
> mess around with pci_bus_update_busn_res_end() like this.  It should
> be able to say "here's my bus number range" (and of course the PCI
> core can default to 0-255 if the arch doesn't supply a range) and the
> core should take care of the rest.

During pci_scan_child_bus(), each child bus's busn_res is inserted
under the parent bus's busn_res.

So we need to make sure the parent's busn_res.end is big enough.


Yinghai
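
The nesting constraint described above can be sketched in user-space C.
This is only a model under stated assumptions: struct busn_range,
busn_fits() and scan_with_widened_parent() are hypothetical names, not
the kernel's struct resource or the helpers from this patch series. It
shows why the parent's end is widened before the scan and trimmed
afterwards.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a bus number resource (not struct resource). */
struct busn_range {
	int start;
	int end;
};

/* A child range can be inserted only if it lies entirely inside the parent. */
static bool busn_fits(const struct busn_range *parent,
		      const struct busn_range *child)
{
	assert(parent->start <= parent->end);
	assert(child->start <= child->end);
	return child->start >= parent->start && child->end <= parent->end;
}

/*
 * Model of the sequence in the quoted hunk: widen the parent to scan_end
 * for the duration of the scan, check that the child fits, then trim the
 * parent to the highest bus number in use (here simply the child's end,
 * standing in for bus->subordinate).
 */
static bool scan_with_widened_parent(struct busn_range *parent,
				     const struct busn_range *child,
				     int scan_end)
{
	bool ok;

	parent->end = scan_end;          /* widen before scanning */
	ok = busn_fits(parent, child);   /* child insertion during the scan */
	if (ok)
		parent->end = child->end; /* trim after scanning */
	return ok;
}
```

With a parent that initially covers only bus 0, a child covering buses
1-3 does not fit until the parent has been widened.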

* Re: tlb flushing on Power
From: Seth Jennings @ 2012-02-08 17:39 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Brian King, Robert Jennings, linuxppc-dev, Nitin Gupta,
	Dave Hansen
In-Reply-To: <1327613953.24487.9.camel@pasglop>

Hey Ben,

Thanks for responding.

On 01/26/2012 03:39 PM, Benjamin Herrenschmidt wrote:
> On Thu, 2012-01-26 at 08:41 -0600, Brian King wrote:
>> CC'ing linuxppc-dev...
>>
>>
>> On 01/26/2012 08:18 AM, Seth Jennings wrote:
>>> Hey Dave,
>>>
>>> So I submitted the zsmalloc patches to lkml at the beginning
>>> of the year
>>>
>>> https://lkml.org/lkml/2012/1/9/389
>>>
>>> I found there are two functions Nitin used in the mapping
>>> functions that are not supported in the powerpc arch:
>>> set_pte() and __flush_tlb_one().
> 
>  .../...
> 
> The arch management of page tables can be tricky indeed :-) I need to
> have a better understanding of what you are doing to see how I can try
> to adapt it to power.

You can look at https://lkml.org/lkml/2012/1/9/389 in zsmalloc-main.c,
zs_[un]map_object() functions for the current uses of set_pte() and
__flush_tlb_one().

> set_pte() is long gone on all archs really (or if it's still there it's
> not meant to be used as is), use set_pte_at().

Problem with set_pte_at() for us is that we don't have an mm_struct to pass
because the mapping is not for a userspace process but for the kernel itself.

However, I do think this is the portable function we need to be using. Just
need to figure out what to pass in for the mm_struct param.

> __flush_tlb_one() doesn't mean anything as an arch independent
> functionality. We have a local_flush_tlb_page() that -might- do what you
> want but why in hell is that patch not using proper existing
> interfaces ?

flush_tlb_page() is the portable function we should be using.  However,
again, it requires a vm_area_struct.  I'm not sure what we should be
passing there.

--
Seth

* Re: [linuxppc-release] [PATCH 2/2] powerpc/85xx: update SEC node in dts for MPC8572DS
From: Kim Phillips @ 2012-02-08 18:53 UTC (permalink / raw)
  To: Timur Tabi; +Cc: linuxppc-dev, Jia Hongtao
In-Reply-To: <4F32A54F.5090305@freescale.com>

On Wed, 8 Feb 2012 10:39:43 -0600
Timur Tabi <timur@freescale.com> wrote:

> Jia Hongtao wrote:
> > -/include/ "pq3-sec3.0-0.dtsi"
> > +/include/ "pq3-sec3.1-0.dtsi"
> 
> This is not "adding SEC 3.1 support".  This patch is saying that the 8572
> DTS was using the *wrong* SEC version.  If that's true, you need to
> explain why it was wrong.

especially since, according to the "MPC8572E PowerQUICC III
Integrated Host Processor Family Reference Manual, Rev. 2", the 8572
has a SEC 3.0.

Kim

* Re: tlb flushing on Power
From: Benjamin Herrenschmidt @ 2012-02-08 21:04 UTC (permalink / raw)
  To: Seth Jennings
  Cc: Brian King, Robert Jennings, linuxppc-dev, Nitin Gupta,
	Dave Hansen
In-Reply-To: <4F32B354.7030306@linux.vnet.ibm.com>


> You can look at https://lkml.org/lkml/2012/1/9/389 in zsmalloc-main.c,
> zs_[un]map_object() functions for the currently uses of set_pte() and
> __flush_tlb_one().
> 
> > set_pte() is long gone on all archs really (or if it's still there it's
> > not meant to be used as is), use set_pte_at().
> 
> Problem with set_pte_at() for us is that we don't have an mm_struct to pass
> because the mapping is not for a userspace process but for the kernel itself.

Then use init_mm

> However, I do think this is the portable function we need to be using. Just
> need to figure out what to pass in for the mm_struct param.
> 
> > __flush_tlb_one() doesn't mean anything as an arch independent
> > functionality. We have a local_flush_tlb_page() that -might- do what you
> > want but why in hell is that patch not using proper existing
> > interfaces ?
> 
> flush_tlb_page() is the portable function we should be using.  However,
> again, it requires a vma_area_struct.  I'm not sure what we should be
> passing there.

Do you need this to be a CPU-local flush or a global one? In the latter
case, flush_tlb_kernel_range() is the right API.

If you want per-cpu, we'll have to add a new arch hook.

Cheers,
Ben.
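
As a rough illustration of the two suggestions above (pass init_mm to
set_pte_at(), and use flush_tlb_kernel_range() for a global flush), here
is a sketch that compiles in user space. Everything in it is a stand-in:
pte_t, struct mm_struct, init_mm and both functions are stubs that only
mirror the kernel's parameter order, and update_kernel_pte() and
SKETCH_PAGE_SIZE are hypothetical names. It is not the zsmalloc code.

```c
#include <assert.h>

/* ---- Stand-ins so the sketch builds outside the kernel ---- */
typedef struct { unsigned long val; } pte_t;
struct mm_struct { int unused; };
static struct mm_struct init_mm;        /* in the kernel: the kernel's own mm */

static const struct mm_struct *last_mm; /* recorded for illustration only */
static unsigned long flushed_start, flushed_end;

#define SKETCH_PAGE_SIZE 4096UL         /* assumed page size for the sketch */

/* Stub with the same parameter order as set_pte_at(mm, addr, ptep, pte). */
static void set_pte_at(struct mm_struct *mm, unsigned long addr,
		       pte_t *ptep, pte_t pte)
{
	(void)addr;
	last_mm = mm;
	*ptep = pte;
}

/* Stub with the same parameter order as flush_tlb_kernel_range(start, end). */
static void flush_tlb_kernel_range(unsigned long start, unsigned long end)
{
	flushed_start = start;
	flushed_end = end;
}

/* Hypothetical helper: update one kernel PTE, then flush that page globally. */
static void update_kernel_pte(unsigned long addr, pte_t *ptep, pte_t pte)
{
	assert((addr & (SKETCH_PAGE_SIZE - 1)) == 0); /* page aligned */
	set_pte_at(&init_mm, addr, ptep, pte);
	flush_tlb_kernel_range(addr, addr + SKETCH_PAGE_SIZE);
}
```

A per-CPU flush is deliberately not modelled, since the message above
says that would need a new arch hook.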

* Re: [PATCH 09/24] PCI, powerpc: Register busn_res for root buses
From: Benjamin Herrenschmidt @ 2012-02-08 22:02 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-arch, Tony Luck, Yinghai Lu, linuxppc-dev, linux-kernel,
	Dominik Brodowski, Paul Mackerras, Jesse Barnes, linux-pci,
	Andrew Morton, Linus Torvalds
In-Reply-To: <CAErSpo6BwznWf3rq4J2Ao+rrgR4yWqQ0mQGygkd2KmBuA1OD8g@mail.gmail.com>

On Wed, 2012-02-08 at 07:58 -0800, Bjorn Helgaas wrote:
> The only architecture-specific thing here is discovering the range of
> bus numbers below a host bridge.  The architecture should not have to
> mess around with pci_bus_update_busn_res_end() like this.  It should
> be able to say "here's my bus number range" (and of course the PCI
> core can default to 0-255 if the arch doesn't supply a range) and the
> core should take care of the rest. 

So it's a bit messy in here because we deal with several things.

What the firmware gives us is the range it assigned, but that isn't
necessarily the HW limits (almost never is in fact).

In some cases we honor it, for example when in "probe only" mode where
we prevent any reassigning, and in some cases we ignore it and let the
PCI core renumber things (typically because the FW "forgot" to set aside
bus numbers for a cardbus slot, that sort of thing).

So it's a bit of a tricky situation.

Off the top of my head, I'm pretty sure that most if not all of our PCI
host bridges simply support a full 0...255 range and there is no sharing
between bridges like on x86, they are just different domains.

But I can't vouch 100% for some of the oddball cases like Pegasos or
some freescale gear.

Cheers,
Ben.
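
The policy described above could be sketched as follows. This is a
user-space model, not powerpc code: struct busn_range and
pick_busn_range() are hypothetical, and the 0..255 fallback assumes the
host bridge really supports the full range, which the message above
explicitly does not guarantee for every platform.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bus number range, inclusive at both ends. */
struct busn_range {
	int first;
	int last;
};

/*
 * Honor the firmware-assigned range when renumbering is forbidden
 * ("probe only"); otherwise offer the full 0..255 range so that the
 * PCI core is free to renumber.
 */
static struct busn_range pick_busn_range(bool probe_only,
					 struct busn_range fw)
{
	struct busn_range full = { 0, 255 };

	assert(fw.first >= 0 && fw.first <= fw.last && fw.last <= 255);
	return probe_only ? fw : full;
}
```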

* Re: [PATCH v2 ] Implement GET_IP/SET_IP for powerpc architecture.
From: Mike Frysinger @ 2012-02-09  2:56 UTC (permalink / raw)
  To: Srikar Dronamraju; +Cc: linuxppc-dev, Benjamin Herrenschmidt
In-Reply-To: <20120208145124.GA16600@linux.vnet.ibm.com>

[-- Attachment #1: Type: Text/Plain, Size: 51 bytes --]

Acked-by: Mike Frysinger <vapier@gentoo.org>
-mike


* [PATCH] powerpc/wsp: Fix IRQ affinity setting
From: Benjamin Herrenschmidt @ 2012-02-09  4:11 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Jimi Xenidis

We call the cache_hwirq_map() function with a Linux IRQ number
but it expects a HW IRQ number. This triggers a BUG on multi-chip
setups in addition to not doing the right thing.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 arch/powerpc/platforms/wsp/ics.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/platforms/wsp/ics.c b/arch/powerpc/platforms/wsp/ics.c
index 5768743..97fe82e 100644
--- a/arch/powerpc/platforms/wsp/ics.c
+++ b/arch/powerpc/platforms/wsp/ics.c
@@ -346,7 +346,7 @@ static int wsp_chip_set_affinity(struct irq_data *d,
 	 * For the moment only implement delivery to all cpus or one cpu.
 	 * Get current irq_server for the given irq
 	 */
-	ret = cache_hwirq_map(ics, d->irq, cpumask);
+	ret = cache_hwirq_map(ics, hw_irq, cpumask);
 	if (ret == -1) {
 		char cpulist[128];
 		cpumask_scnprintf(cpulist, sizeof(cpulist), cpumask);
-- 
1.7.7.3
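
The class of bug fixed above, mixing up the two IRQ numbering spaces,
can be shown with a small user-space model. All names here (struct
ics_model, struct irq_model, cache_map(), NR_HWIRQS) are hypothetical
and are not the wsp ICS code. The model returns -1 for an out-of-range
number, whereas the real driver is reported above to hit a BUG.

```c
#include <assert.h>

#define NR_HWIRQS 8  /* size of the hypothetical per-controller table */

/* Hypothetical controller state: one cached CPU per hardware IRQ. */
struct ics_model {
	unsigned int hwirq_base;        /* first hardware IRQ this chip owns */
	int cpu_for_hwirq[NR_HWIRQS];
};

/* Hypothetical descriptor carrying both numbering spaces. */
struct irq_model {
	unsigned int virq;   /* Linux IRQ number */
	unsigned int hwirq;  /* number the controller understands */
};

/* Returns 0 on success, -1 if hwirq is outside this controller's range. */
static int cache_map(struct ics_model *ics, unsigned int hwirq, int cpu)
{
	assert(ics != 0);
	if (hwirq < ics->hwirq_base || hwirq >= ics->hwirq_base + NR_HWIRQS)
		return -1;
	ics->cpu_for_hwirq[hwirq - ics->hwirq_base] = cpu;
	return 0;
}
```

Passing the Linux number to cache_map() indexes the wrong slot or falls
outside the table; passing the hardware number lands in the right one.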

* [PATCH v2] powerpc: Rework lazy-interrupt handling
From: Benjamin Herrenschmidt @ 2012-02-09  4:25 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: Scott Wood, Stuart Yoder, Anton Blanchard, Laurentiu Tudor,
	Paul Mackerras

From 0ace17ba6960a8788b1bda3770df254cbbc6a244 Mon Sep 17 00:00:00 2001
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Date: Thu, 9 Feb 2012 15:25:04 +1100
Subject: [PATCH] powerpc: Rework lazy-interrupt handling

The current implementation of lazy interrupt handling has some
issues that this patch tries to address.

Except on iSeries, we don't do the various workarounds we need to
do on re-enable when returning from an interrupt, which can do an
implicit re-enable, and thus we may still lose or get delayed
decrementer or doorbell interrupts.

The current scheme also makes it much harder to handle the external
"edge" interrupts provided by some BookE processors when using the
EPR facility (External Proxy) and the Freescale Hypervisor.

We also hard mask on decrementer interrupts which is sub-optimal.

This is an attempt at fixing it all in one go by reworking the way
we do the lazy interrupt disabling.

The base idea is to replace the "hard_enabled" field with a
"irq_happened" field in which we store a bit mask of what interrupt
occurred while soft-disabled.

When re-enabling, either via arch_local_irq_restore() or when returning
from an interrupt, we can now decide what to do by testing bits in that
field. We then implement re-emitting of the lost interrupts via either
a re-use of the existing exception frame (exception exit case) or via
the creation of a new one from assembly code (arch_local_irq_enable),
without the need to trigger a fake one using set_dec() or similar.
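
The scheme described above can be modelled in plain C to make the
bookkeeping concrete. This is a user-space sketch, not the patch's
code: struct cpu_model and the model_*() functions are hypothetical,
only the three flag values are taken from the hw_irq.h hunk in this
patch, and the "replay" here just bumps counters instead of re-using
or building an exception frame.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values as defined in this patch's hw_irq.h hunk. */
#define PACA_HAPPENED      0x01
#define PACA_HAPPENED_EE   0x04
#define PACA_HAPPENED_DEC  0x08

/* User-space model of the per-CPU state; not the real paca_struct. */
struct cpu_model {
	unsigned char soft_enabled;
	unsigned char irq_happened;
	int ee_handled;
	int dec_handled;
};

/* An interrupt arrives: handle it now, or record it if soft-disabled. */
static void model_interrupt(struct cpu_model *c, unsigned char what)
{
	assert(what == PACA_HAPPENED_EE || what == PACA_HAPPENED_DEC);
	if (!c->soft_enabled) {
		c->irq_happened |= PACA_HAPPENED | what;
		return;
	}
	if (what == PACA_HAPPENED_EE)
		c->ee_handled++;
	else
		c->dec_handled++;
}

static void model_irq_disable(struct cpu_model *c)
{
	c->soft_enabled = 0;
}

/* Soft-enable: test the recorded bits, replay each one, clear the field. */
static void model_irq_enable(struct cpu_model *c)
{
	unsigned char happened = c->irq_happened;

	c->irq_happened = 0;
	c->soft_enabled = 1;
	if (happened & PACA_HAPPENED_EE)
		c->ee_handled++;
	if (happened & PACA_HAPPENED_DEC)
		c->dec_handled++;
}
```

The point of the model is that nothing is lost while soft-disabled: the
bits accumulate in irq_happened and are acted on at re-enable time.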

In addition, this adds a few refinements:

 - We no longer hard disable decrementer interrupts that occur
while soft-disabled. We now simply bump the decrementer back to max
(on BookS) or leave it stopped (on BookE) and continue with hard interrupts
enabled, which means that we'll potentially get better sample quality from
performance monitor interrupts.

 - Timer, decrementer and doorbell interrupts now hard-enable
shortly after removing the source of the interrupt, which means
they no longer run entirely hard disabled. Again, this will improve
perf sample quality.

 - On Book3E 64-bit, we now make the performance monitor interrupt
act as an NMI like Book3S (the necessary C code for that to work
appears to already be present in the FSL perf code, notably calling
nmi_enter instead of irq_enter).

There are additional refinements that we can do on top of this patch:

 - We could remove the ps3 workaround from arch_local_irq_enable(),
I believe that it should no longer be necessary

 - We could make "masked" decrementer interrupts act as NMIs when doing
timer-based perf sampling to improve the sample quality.

 - There are additional simplifications of the exception entry/exit path
that I've spotted along the way, such as merging fast_exception_return
with the normal code path.

This patch needs a LOT more testing & review than it had so far !!!

Not-signed-off-by-yet: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---

v2:

- Add hard-enable to decrementer, timer and doorbells
- Fix CR clobber in masked irq handling on BookE
- Make embedded perf interrupt act as an NMI
- Add a PACA_HAPPENED_EE_EDGE for use by FSL if they want
  to retrigger an interrupt without preventing hard-enable

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 arch/powerpc/include/asm/exception-64s.h        |   21 ++-
 arch/powerpc/include/asm/hw_irq.h               |   51 +++++-
 arch/powerpc/include/asm/irqflags.h             |   13 +-
 arch/powerpc/include/asm/paca.h                 |    2 +-
 arch/powerpc/kernel/asm-offsets.c               |    2 +-
 arch/powerpc/kernel/dbell.c                     |   12 ++
 arch/powerpc/kernel/entry_64.S                  |   96 ++++++-----
 arch/powerpc/kernel/exceptions-64e.S            |  210 ++++++++++++++++-------
 arch/powerpc/kernel/exceptions-64s.S            |   90 ++++++----
 arch/powerpc/kernel/head_64.S                   |    9 -
 arch/powerpc/kernel/idle_book3e.S               |    8 +-
 arch/powerpc/kernel/idle_power4.S               |   17 ++-
 arch/powerpc/kernel/idle_power7.S               |   20 ++-
 arch/powerpc/kernel/irq.c                       |  187 ++++++++++++++------
 arch/powerpc/kernel/time.c                      |   15 ++-
 arch/powerpc/platforms/iseries/Makefile         |    2 +-
 arch/powerpc/platforms/iseries/exception.S      |   11 +-
 arch/powerpc/platforms/iseries/misc.S           |   26 ---
 arch/powerpc/platforms/pseries/processor_idle.c |   24 +++-
 19 files changed, 540 insertions(+), 276 deletions(-)
 delete mode 100644 arch/powerpc/platforms/iseries/misc.S

diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
index 8057f4f..b3f42e9 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -232,23 +232,24 @@ label##_hv:						\
 	EXCEPTION_PROLOG_PSERIES(PACA_EXGEN, label##_common,	\
 				 EXC_HV, KVMTEST, vec)
 
-#define __SOFTEN_TEST(h)						\
+#define __SOFTEN_TEST(h, vec)						\
 	lbz	r10,PACASOFTIRQEN(r13);					\
 	cmpwi	r10,0;							\
+	li	r10,(vec)>>8;						\
 	beq	masked_##h##interrupt
-#define _SOFTEN_TEST(h)	__SOFTEN_TEST(h)
+#define _SOFTEN_TEST(h, vec)	__SOFTEN_TEST(h, vec)
 
 #define SOFTEN_TEST_PR(vec)						\
 	KVMTEST_PR(vec);						\
-	_SOFTEN_TEST(EXC_STD)
+	_SOFTEN_TEST(EXC_STD, vec)
 
 #define SOFTEN_TEST_HV(vec)						\
 	KVMTEST(vec);							\
-	_SOFTEN_TEST(EXC_HV)
+	_SOFTEN_TEST(EXC_HV, vec)
 
 #define SOFTEN_TEST_HV_201(vec)						\
 	KVMTEST(vec);							\
-	_SOFTEN_TEST(EXC_STD)
+	_SOFTEN_TEST(EXC_STD, vec)
 
 #define __MASKABLE_EXCEPTION_PSERIES(vec, label, h, extra)		\
 	HMT_MEDIUM;							\
@@ -276,9 +277,9 @@ label##_hv:								\
 #define DISABLE_INTS				\
 	li	r11,0;				\
 	stb	r11,PACASOFTIRQEN(r13);		\
-BEGIN_FW_FTR_SECTION;				\
-	stb	r11,PACAHARDIRQEN(r13);		\
-END_FW_FTR_SECTION_IFCLR(FW_FEATURE_ISERIES);	\
+	lbz	r11,PACAIRQHAPPENED(r13);	\
+	ori	r11,r11,PACA_HAPPENED;		\
+	stb	r11,PACAIRQHAPPENED(r13);	\
 	TRACE_DISABLE_INTS;			\
 BEGIN_FW_FTR_SECTION;				\
 	mfmsr	r10;				\
@@ -289,7 +290,9 @@ END_FW_FTR_SECTION_IFSET(FW_FEATURE_ISERIES)
 #define DISABLE_INTS				\
 	li	r11,0;				\
 	stb	r11,PACASOFTIRQEN(r13);		\
-	stb	r11,PACAHARDIRQEN(r13);		\
+	lbz	r11,PACAIRQHAPPENED(r13);	\
+	ori	r11,r11,PACA_HAPPENED;		\
+	stb	r11,PACAIRQHAPPENED(r13);	\
 	TRACE_DISABLE_INTS
 #endif /* CONFIG_PPC_ISERIES */
 
diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h
index bb712c9..bd33843 100644
--- a/arch/powerpc/include/asm/hw_irq.h
+++ b/arch/powerpc/include/asm/hw_irq.h
@@ -11,11 +11,50 @@
 #include <asm/ptrace.h>
 #include <asm/processor.h>
 
+#ifdef CONFIG_PPC64
+
+/*
+ * PACA flags in paca->irq_happened. On BookS these flags are set
+ * by oring in the interrupt vector shifted right by 8, so what
+ * we actually have in there is:
+ *
+ * EE  :  0x50x >> 8 = 0x05
+ * DEC :  0x90x >> 8 = 0x09
+ *
+ * The bits we test are thus 0x4 and 0x8 respectively, with bit
+ * 0x1 always set when "something happened".
+ *
+ * On BookE, we just arbitrarily use the values defined below.
+ *
+ * Note: That "something happened" bit is important as we set it
+ * when manually hard-disabling, for example in the exception
+ * entry path.
+ *
+ * This allows a subsequent arch_local_irq_restore() to "know"
+ * that it can't just return and has to actually hard enable.
+ *
+ * The PACA_HAPPENED_LEVEL mask is a bit mask of what values
+ * can correspond to a "level" sensitive interrupt, ie, for
+ * such values, we must not hard-enable in timer_interrupt
+ * do_IRQ or doorbell interrupts if one of these bits is set
+ */
+#define PACA_HAPPENED		0x01
+#define PACA_HAPPENED_DBELL	0x02
+#define PACA_HAPPENED_EE	0x04
+#define PACA_HAPPENED_DEC	0x08 /* Or FIT */
+#define PACA_HAPPENED_EE_EDGE	0x10 /* BookE only */
+
+#endif /* CONFIG_PPC64 */
+
+#ifndef __ASSEMBLY__
+
 extern void timer_interrupt(struct pt_regs *);
 
 #ifdef CONFIG_PPC64
 #include <asm/paca.h>
 
+extern void __reemit_interrupt(unsigned int vector);
+
 static inline unsigned long arch_local_save_flags(void)
 {
 	unsigned long flags;
@@ -42,7 +81,6 @@ static inline unsigned long arch_local_irq_disable(void)
 }
 
 extern void arch_local_irq_restore(unsigned long);
-extern void iseries_handle_interrupts(void);
 
 static inline void arch_local_irq_enable(void)
 {
@@ -72,11 +110,11 @@ static inline bool arch_irqs_disabled(void)
 #define __hard_irq_disable()	__mtmsrd(mfmsr() & ~MSR_EE, 1)
 #endif
 
-#define  hard_irq_disable()			\
-	do {					\
-		__hard_irq_disable();		\
-		get_paca()->soft_enabled = 0;	\
-		get_paca()->hard_enabled = 0;	\
+#define  hard_irq_disable()					\
+	do {							\
+		__hard_irq_disable();				\
+		get_paca()->soft_enabled = 0;			\
+		get_paca()->irq_happened |= PACA_HAPPENED;	\
 	} while(0)
 
 #else /* CONFIG_PPC64 */
@@ -149,5 +187,6 @@ static inline bool arch_irqs_disabled(void)
  */
 struct irq_chip;
 
+#endif  /* __ASSEMBLY__ */
 #endif	/* __KERNEL__ */
 #endif	/* _ASM_POWERPC_HW_IRQ_H */
diff --git a/arch/powerpc/include/asm/irqflags.h b/arch/powerpc/include/asm/irqflags.h
index b0b06d8..4bfbf0a 100644
--- a/arch/powerpc/include/asm/irqflags.h
+++ b/arch/powerpc/include/asm/irqflags.h
@@ -47,16 +47,15 @@
 	b	skip;					\
 95:	TRACE_WITH_FRAME_BUFFER(.trace_hardirqs_on)	\
 	li	en,1;
-#define TRACE_AND_RESTORE_IRQ(en)		\
-	TRACE_AND_RESTORE_IRQ_PARTIAL(en,96f);	\
-	stb	en,PACASOFTIRQEN(r13);		\
-96:
 #else
 #define TRACE_ENABLE_INTS
 #define TRACE_DISABLE_INTS
-#define TRACE_AND_RESTORE_IRQ_PARTIAL(en,skip)
-#define TRACE_AND_RESTORE_IRQ(en)		\
-	stb	en,PACASOFTIRQEN(r13)
+#define TRACE_AND_RESTORE_IRQ_PARTIAL(en,skip)		\
+	cmpdi	en,0;					\
+	bne	95f;					\
+	stb	en,PACASOFTIRQEN(r13);			\
+	b	skip;					\
+95:
 #endif
 #endif
 
diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index 269c05a..daf813f 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -132,7 +132,7 @@ struct paca_struct {
 	u64 saved_msr;			/* MSR saved here by enter_rtas */
 	u16 trap_save;			/* Used when bad stack is encountered */
 	u8 soft_enabled;		/* irq soft-enable flag */
-	u8 hard_enabled;		/* set if irqs are enabled in MSR */
+	u8 irq_happened;		/* irq happened while soft-disabled */
 	u8 io_sync;			/* writel() needs spin_unlock sync */
 	u8 irq_work_pending;		/* IRQ_WORK interrupt while soft-disable */
 	u8 nap_state_lost;		/* NV GPR values lost in power7_idle */
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 04caee7..cdd0d26 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -147,7 +147,7 @@ int main(void)
 	DEFINE(PACAKBASE, offsetof(struct paca_struct, kernelbase));
 	DEFINE(PACAKMSR, offsetof(struct paca_struct, kernel_msr));
 	DEFINE(PACASOFTIRQEN, offsetof(struct paca_struct, soft_enabled));
-	DEFINE(PACAHARDIRQEN, offsetof(struct paca_struct, hard_enabled));
+	DEFINE(PACAIRQHAPPENED, offsetof(struct paca_struct, irq_happened));
 	DEFINE(PACACONTEXTID, offsetof(struct paca_struct, context.id));
 #ifdef CONFIG_PPC_MM_SLICES
 	DEFINE(PACALOWSLICESPSIZE, offsetof(struct paca_struct,
diff --git a/arch/powerpc/kernel/dbell.c b/arch/powerpc/kernel/dbell.c
index 2cc451a..16f0e5e 100644
--- a/arch/powerpc/kernel/dbell.c
+++ b/arch/powerpc/kernel/dbell.c
@@ -37,6 +37,18 @@ void doorbell_exception(struct pt_regs *regs)
 
 	irq_enter();
 
+#ifdef CONFIG_PPC64
+	/* Let's hard enable interrupts now that we have reset
+	 * the DEC (or acked it on BookE)
+	 *
+	 * We skip that if there's a pending EE "level" interrupt
+	 * as an optimization
+	 */
+	get_paca()->irq_happened &= ~PACA_HAPPENED;
+	if (!(get_paca()->irq_happened & PACA_HAPPENED_EE))
+		__hard_irq_enable();
+#endif /* CONFIG_PPC64 */
+
 	smp_ipi_demux();
 
 	irq_exit();
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index d834425..82619e7 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -31,6 +31,7 @@
 #include <asm/bug.h>
 #include <asm/ptrace.h>
 #include <asm/irqflags.h>
+#include <asm/hw_irq.h>
 #include <asm/ftrace.h>
 
 /*
@@ -125,19 +126,7 @@ END_FW_FTR_SECTION_IFSET(FW_FEATURE_SPLPAR)
 #endif /* CONFIG_TRACE_IRQFLAGS */
 	li	r10,1
 	stb	r10,PACASOFTIRQEN(r13)
-	stb	r10,PACAHARDIRQEN(r13)
 	std	r10,SOFTE(r1)
-#ifdef CONFIG_PPC_ISERIES
-BEGIN_FW_FTR_SECTION
-	/* Hack for handling interrupts when soft-enabling on iSeries */
-	cmpdi	cr1,r0,0x5555		/* syscall 0x5555 */
-	andi.	r10,r12,MSR_PR		/* from kernel */
-	crand	4*cr0+eq,4*cr1+eq,4*cr0+eq
-	bne	2f
-	b	hardware_interrupt_entry
-2:
-END_FW_FTR_SECTION_IFSET(FW_FEATURE_ISERIES)
-#endif /* CONFIG_PPC_ISERIES */
 
 	/* Hard enable interrupts */
 #ifdef CONFIG_PPC_BOOK3E
@@ -593,23 +582,33 @@ _GLOBAL(ret_from_except_lite)
 	bne	do_work
 #endif
 
+_GLOBAL(fast_exception_return_irq)
 restore:
-BEGIN_FW_FTR_SECTION
 	ld	r5,SOFTE(r1)
-FW_FTR_SECTION_ELSE
-	b	.Liseries_check_pending_irqs
-ALT_FW_FTR_SECTION_END_IFCLR(FW_FEATURE_ISERIES)
-2:
-	TRACE_AND_RESTORE_IRQ(r5);
+	TRACE_AND_RESTORE_IRQ_PARTIAL(r5, 3f);
 
-	/* extract EE bit and use it to restore paca->hard_enabled */
-	ld	r3,_MSR(r1)
-	rldicl	r4,r3,49,63		/* r0 = (r3 >> 15) & 1 */
-	stb	r4,PACAHARDIRQEN(r13)
+	/*
+	 * We are about to soft-enable interrupts (we are hard disabled
+	 * at this point). We check if there's anything that needs to
+	 * be replayed first
+	 */
+	lbz	r0,PACAIRQHAPPENED(r13)
+	cmpwi	cr0,r0,0
+	bne-	4f
+
+	/*
+	 * Get here when nothing happened while soft-disabled, just
+	 * soft-enable and move-on. We will hard-enable as a side
+	 * effect of rfi
+	 */
+2:	li	r0,1
+	stb	r0,PACASOFTIRQEN(r13);
+3:
 
 #ifdef CONFIG_PPC_BOOK3E
 	b	.exception_return_book3e
 #else
+	ld	r3,_MSR(r1)
 	ld	r4,_CTR(r1)
 	ld	r0,_LINK(r1)
 	mtctr	r4
@@ -644,7 +643,8 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
 
 	/*
 	 * r13 is our per cpu area, only restore it if we are returning to
-	 * userspace
+	 * userspace the value stored in the stack frame may belong to
+	 * another CPU.
 	 */
 	andi.	r0,r3,MSR_PR
 	beq	1f
@@ -669,29 +669,36 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
 
 #endif /* CONFIG_PPC_BOOK3E */
 
-.Liseries_check_pending_irqs:
-#ifdef CONFIG_PPC_ISERIES
-	ld	r5,SOFTE(r1)
-	cmpdi	0,r5,0
+	/*
+	 * Something did happen, check if a re-emit is needed
+	 * (this also clears paca->irq_happened)
+	 */
+4:	bl	.__check_irq_reemit
+	cmpwi	cr0,r3,0
 	beq	2b
-	/* Check for pending interrupts (iSeries) */
-	ld	r3,PACALPPACAPTR(r13)
-	ld	r3,LPPACAANYINT(r3)
-	cmpdi	r3,0
-	beq+	2b			/* skip do_IRQ if no interrupts */
 
-	li	r3,0
-	stb	r3,PACASOFTIRQEN(r13)	/* ensure we are soft-disabled */
-#ifdef CONFIG_TRACE_IRQFLAGS
-	bl	.trace_hardirqs_off
-	mfmsr	r10
-#endif
-	ori	r10,r10,MSR_EE
-	mtmsrd	r10			/* hard-enable again */
-	addi	r3,r1,STACK_FRAME_OVERHEAD
+	/*
+	 * We need to re-emit an interrupt. We do so by re-using our
+	 * existing exception frame. We first change the trap value,
+	 * but we need to ensure we preserve the low nibble of it
+	 */
+	ld	r4,_TRAP(r1)
+	clrldi	r4,r4,60
+	or	r4,r4,r3
+	std	r4,_TRAP(r1)
+
+	/*
+	 * Then find the right handler and call it. Interrupts are
+	 * still soft-disabled and we keep them that way.
+	 */
+	cmpwi	cr0,r3,0x500
+	beq	1f
+	addi	r3,r1,STACK_FRAME_OVERHEAD;
+	bl	.timer_interrupt
+	b	.ret_from_except
+1:	addi	r3,r1,STACK_FRAME_OVERHEAD;
 	bl	.do_IRQ
-	b	.ret_from_except_lite		/* loop back and handle more */
-#endif
+	b	.ret_from_except
 
 do_work:
 #ifdef CONFIG_PREEMPT
@@ -713,7 +720,6 @@ do_work:
 	 */
 	li	r0,0
 	stb	r0,PACASOFTIRQEN(r13)
-	stb	r0,PACAHARDIRQEN(r13)
 	TRACE_DISABLE_INTS
 
 	/* Call the scheduler with soft IRQs off */
@@ -728,8 +734,6 @@ do_work:
 	rotldi	r10,r10,16
 	mtmsrd	r10,1
 #endif /* CONFIG_PPC_BOOK3E */
-	li	r0,0
-	stb	r0,PACAHARDIRQEN(r13)
 
 	/* Re-test flags and eventually loop */
 	clrrdi	r9,r1,THREAD_SHIFT
diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S
index 429983c..7fa4096 100644
--- a/arch/powerpc/kernel/exceptions-64e.S
+++ b/arch/powerpc/kernel/exceptions-64e.S
@@ -21,6 +21,7 @@
 #include <asm/exception-64e.h>
 #include <asm/bug.h>
 #include <asm/irqflags.h>
+#include <asm/hw_irq.h>
 #include <asm/ptrace.h>
 #include <asm/ppc-opcode.h>
 #include <asm/mmu.h>
@@ -77,59 +78,55 @@
 #define SPRN_MC_SRR1	SPRN_MCSRR1
 
 #define NORMAL_EXCEPTION_PROLOG(n, addition)				    \
-	EXCEPTION_PROLOG(n, GEN, addition##_GEN)
+	EXCEPTION_PROLOG(n, GEN, addition##_GEN(n))
 
 #define CRIT_EXCEPTION_PROLOG(n, addition)				    \
-	EXCEPTION_PROLOG(n, CRIT, addition##_CRIT)
+	EXCEPTION_PROLOG(n, CRIT, addition##_CRIT(n))
 
 #define DBG_EXCEPTION_PROLOG(n, addition)				    \
-	EXCEPTION_PROLOG(n, DBG, addition##_DBG)
+	EXCEPTION_PROLOG(n, DBG, addition##_DBG(n))
 
 #define MC_EXCEPTION_PROLOG(n, addition)				    \
-	EXCEPTION_PROLOG(n, MC, addition##_MC)
+	EXCEPTION_PROLOG(n, MC, addition##_MC(n))
 
 
 /* Variants of the "addition" argument for the prolog
  */
-#define PROLOG_ADDITION_NONE_GEN
-#define PROLOG_ADDITION_NONE_CRIT
-#define PROLOG_ADDITION_NONE_DBG
-#define PROLOG_ADDITION_NONE_MC
+#define PROLOG_ADDITION_NONE_GEN(n)
+#define PROLOG_ADDITION_NONE_CRIT(n)
+#define PROLOG_ADDITION_NONE_DBG(n)
+#define PROLOG_ADDITION_NONE_MC(n)
 
-#define PROLOG_ADDITION_MASKABLE_GEN					    \
+#define PROLOG_ADDITION_MASKABLE_GEN(n)					    \
 	lbz	r11,PACASOFTIRQEN(r13); /* are irqs soft-disabled ? */	    \
 	cmpwi	cr0,r11,0;		/* yes -> go out of line */	    \
-	beq	masked_interrupt_book3e;
+	beq	masked_interrupt_book3e_##n;
 
-#define PROLOG_ADDITION_2REGS_GEN					    \
+#define PROLOG_ADDITION_2REGS_GEN(n)					    \
 	std	r14,PACA_EXGEN+EX_R14(r13);				    \
 	std	r15,PACA_EXGEN+EX_R15(r13)
 
-#define PROLOG_ADDITION_1REG_GEN					    \
+#define PROLOG_ADDITION_1REG_GEN(n)					    \
 	std	r14,PACA_EXGEN+EX_R14(r13);
 
-#define PROLOG_ADDITION_2REGS_CRIT					    \
+#define PROLOG_ADDITION_2REGS_CRIT(n)					    \
 	std	r14,PACA_EXCRIT+EX_R14(r13);				    \
 	std	r15,PACA_EXCRIT+EX_R15(r13)
 
-#define PROLOG_ADDITION_2REGS_DBG					    \
+#define PROLOG_ADDITION_2REGS_DBG(n)					    \
 	std	r14,PACA_EXDBG+EX_R14(r13);				    \
 	std	r15,PACA_EXDBG+EX_R15(r13)
 
-#define PROLOG_ADDITION_2REGS_MC					    \
+#define PROLOG_ADDITION_2REGS_MC(n)					    \
 	std	r14,PACA_EXMC+EX_R14(r13);				    \
 	std	r15,PACA_EXMC+EX_R15(r13)
 
-#define PROLOG_ADDITION_DOORBELL_GEN					    \
-	lbz	r11,PACASOFTIRQEN(r13); /* are irqs soft-disabled ? */	    \
-	cmpwi	cr0,r11,0;		/* yes -> go out of line */	    \
-	beq	masked_doorbell_book3e
-
 
 /* Core exception code for all exceptions except TLB misses.
  * XXX: Needs to make SPRN_SPRG_GEN depend on exception type
  */
 #define EXCEPTION_COMMON(n, excf, ints)					    \
+exc_##n##_common:							    \
 	std	r0,GPR0(r1);		/* save r0 in stackframe */	    \
 	std	r2,GPR2(r1);		/* save r2 in stackframe */	    \
 	SAVE_4GPRS(3, r1);		/* save r3 - r6 in stackframe */    \
@@ -167,19 +164,25 @@
 	std	r0,RESULT(r1);		/* clear regs->result */	    \
 	ints;
 
-/* Variants for the "ints" argument */
-#define INTS_KEEP
-#define INTS_DISABLE_SOFT						    \
+/* Variants for the "ints" argument. We know r0 is 0 on entry */
+#define INTS_KEEP							    \
 	stb	r0,PACASOFTIRQEN(r13);	/* mark interrupts soft-disabled */ \
 	TRACE_DISABLE_INTS;
-#define INTS_DISABLE_HARD						    \
-	stb	r0,PACAHARDIRQEN(r13); /* and hard disabled */
-#define INTS_DISABLE_ALL						    \
-	INTS_DISABLE_SOFT						    \
-	INTS_DISABLE_HARD
-
-/* This is called by exceptions that used INTS_KEEP (that is did not clear
- * neither soft nor hard IRQ indicators in the PACA. This will restore MSR:EE
+
+/* This second version is meant for exceptions that don't immediately
+ * hard-enable. We set a bit in paca->irq_happened to ensure that
+ * a subsequent call to arch_local_irq_restore() will properly
+ * hard-enable and avoid the fast-path
+ */
+#define INTS_DISABLE							    \
+	stb	r0,PACASOFTIRQEN(r13);	/* mark interrupts soft-disabled */ \
+	lbz	r0,PACAIRQHAPPENED(r13);				    \
+	ori	r0,r0,PACA_HAPPENED;					    \
+	stb	r0,PACAIRQHAPPENED(r13);				    \
+	TRACE_DISABLE_INTS;
+
+/* This is called by exceptions that used INTS_KEEP (that is did
+ * not set hard IRQ indicators in the PACA). This will restore MSR:EE
  * to it's previous value
  *
  * XXX In the long run, we may want to open-code it in order to separate the
@@ -238,7 +241,7 @@ exc_##n##_bad_stack:							    \
 #define MASKABLE_EXCEPTION(trapnum, label, hdlr, ack)			\
 	START_EXCEPTION(label);						\
 	NORMAL_EXCEPTION_PROLOG(trapnum, PROLOG_ADDITION_MASKABLE)	\
-	EXCEPTION_COMMON(trapnum, PACA_EXGEN, INTS_DISABLE_ALL)		\
+	EXCEPTION_COMMON(trapnum, PACA_EXGEN, INTS_DISABLE)		\
 	ack(r8);							\
 	CHECK_NAPPING();						\
 	addi	r3,r1,STACK_FRAME_OVERHEAD;				\
@@ -289,7 +292,7 @@ interrupt_end_book3e:
 /* Critical Input Interrupt */
 	START_EXCEPTION(critical_input);
 	CRIT_EXCEPTION_PROLOG(0x100, PROLOG_ADDITION_NONE)
-//	EXCEPTION_COMMON(0x100, PACA_EXCRIT, INTS_DISABLE_ALL)
+//	EXCEPTION_COMMON(0x100, PACA_EXCRIT, INTS_DISABLE)
 //	bl	special_reg_save_crit
 //	CHECK_NAPPING();
 //	addi	r3,r1,STACK_FRAME_OVERHEAD
@@ -300,7 +303,7 @@ interrupt_end_book3e:
 /* Machine Check Interrupt */
 	START_EXCEPTION(machine_check);
 	CRIT_EXCEPTION_PROLOG(0x200, PROLOG_ADDITION_NONE)
-//	EXCEPTION_COMMON(0x200, PACA_EXMC, INTS_DISABLE_ALL)
+//	EXCEPTION_COMMON(0x200, PACA_EXMC, INTS_DISABLE)
 //	bl	special_reg_save_mc
 //	addi	r3,r1,STACK_FRAME_OVERHEAD
 //	CHECK_NAPPING();
@@ -339,7 +342,7 @@ interrupt_end_book3e:
 	START_EXCEPTION(program);
 	NORMAL_EXCEPTION_PROLOG(0x700, PROLOG_ADDITION_1REG)
 	mfspr	r14,SPRN_ESR
-	EXCEPTION_COMMON(0x700, PACA_EXGEN, INTS_DISABLE_SOFT)
+	EXCEPTION_COMMON(0x700, PACA_EXGEN, INTS_KEEP)
 	std	r14,_DSISR(r1)
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	ld	r14,PACA_EXGEN+EX_R14(r13)
@@ -372,7 +375,7 @@ interrupt_end_book3e:
 /* Watchdog Timer Interrupt */
 	START_EXCEPTION(watchdog);
 	CRIT_EXCEPTION_PROLOG(0x9f0, PROLOG_ADDITION_NONE)
-//	EXCEPTION_COMMON(0x9f0, PACA_EXCRIT, INTS_DISABLE_ALL)
+//	EXCEPTION_COMMON(0x9f0, PACA_EXCRIT, INTS_DISABLE)
 //	bl	special_reg_save_crit
 //	CHECK_NAPPING();
 //	addi	r3,r1,STACK_FRAME_OVERHEAD
@@ -450,7 +453,7 @@ interrupt_end_book3e:
 	mfspr	r15,SPRN_SPRG_CRIT_SCRATCH
 	mtspr	SPRN_SPRG_GEN_SCRATCH,r15
 	mfspr	r14,SPRN_DBSR
-	EXCEPTION_COMMON(0xd00, PACA_EXCRIT, INTS_DISABLE_ALL)
+	EXCEPTION_COMMON(0xd00, PACA_EXCRIT, INTS_DISABLE)
 	std	r14,_DSISR(r1)
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	mr	r4,r14
@@ -515,7 +518,7 @@ kernel_dbg_exc:
 	mfspr	r15,SPRN_SPRG_DBG_SCRATCH
 	mtspr	SPRN_SPRG_GEN_SCRATCH,r15
 	mfspr	r14,SPRN_DBSR
-	EXCEPTION_COMMON(0xd00, PACA_EXDBG, INTS_DISABLE_ALL)
+	EXCEPTION_COMMON(0xd08, PACA_EXDBG, INTS_DISABLE)
 	std	r14,_DSISR(r1)
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	mr	r4,r14
@@ -525,21 +528,22 @@ kernel_dbg_exc:
 	bl	.DebugException
 	b	.ret_from_except
 
-	MASKABLE_EXCEPTION(0x260, perfmon, .performance_monitor_exception, ACK_NONE)
+	START_EXCEPTION(perfmon);
+	NORMAL_EXCEPTION_PROLOG(0x260, PROLOG_ADDITION_NONE)
+	EXCEPTION_COMMON(0x260, PACA_EXGEN, INTS_DISABLE)
+	addi	r3,r1,STACK_FRAME_OVERHEAD
+	ld	r14,PACA_EXGEN+EX_R14(r13)
+	bl	.save_nvgprs
+	bl	.performance_monitor_exception
+	b	.ret_from_except
 
 /* Doorbell interrupt */
-	START_EXCEPTION(doorbell)
-	NORMAL_EXCEPTION_PROLOG(0x2070, PROLOG_ADDITION_DOORBELL)
-	EXCEPTION_COMMON(0x2070, PACA_EXGEN, INTS_DISABLE_ALL)
-	CHECK_NAPPING()
-	addi	r3,r1,STACK_FRAME_OVERHEAD
-	bl	.doorbell_exception
-	b	.ret_from_except_lite
+	MASKABLE_EXCEPTION(0x280, doorbell, .doorbell_exception, ACK_NONE)
 
 /* Doorbell critical Interrupt */
 	START_EXCEPTION(doorbell_crit);
-	CRIT_EXCEPTION_PROLOG(0x2080, PROLOG_ADDITION_NONE)
-//	EXCEPTION_COMMON(0x2080, PACA_EXCRIT, INTS_DISABLE_ALL)
+	CRIT_EXCEPTION_PROLOG(0x2a0, PROLOG_ADDITION_NONE)
+//	EXCEPTION_COMMON(0x280, PACA_EXCRIT, INTS_DISABLE)
 //	bl	special_reg_save_crit
 //	CHECK_NAPPING();
 //	addi	r3,r1,STACK_FRAME_OVERHEAD
@@ -547,38 +551,116 @@ kernel_dbg_exc:
 //	b	ret_from_crit_except
 	b	.
 
+/* Guest Doorbell */
 	MASKABLE_EXCEPTION(0x2c0, guest_doorbell, .unknown_exception, ACK_NONE)
-	MASKABLE_EXCEPTION(0x2e0, guest_doorbell_crit, .unknown_exception, ACK_NONE)
-	MASKABLE_EXCEPTION(0x310, hypercall, .unknown_exception, ACK_NONE)
-	MASKABLE_EXCEPTION(0x320, ehpriv, .unknown_exception, ACK_NONE)
+
+/* Guest Doorbell critical Interrupt */
+	START_EXCEPTION(guest_doorbell_crit);
+	CRIT_EXCEPTION_PROLOG(0x2e0, PROLOG_ADDITION_NONE)
+//	EXCEPTION_COMMON(0x2e0, PACA_EXCRIT, INTS_DISABLE)
+//	bl	special_reg_save_crit
+//	CHECK_NAPPING();
+//	addi	r3,r1,STACK_FRAME_OVERHEAD
+//	bl	.guest_doorbell_critical_exception
+//	b	ret_from_crit_except
+	b	.
+
+/* Hypervisor call */
+	START_EXCEPTION(hypercall);
+	NORMAL_EXCEPTION_PROLOG(0x310, PROLOG_ADDITION_NONE)
+	EXCEPTION_COMMON(0x310, PACA_EXGEN, INTS_KEEP)
+	addi	r3,r1,STACK_FRAME_OVERHEAD
+	bl	.save_nvgprs
+	INTS_RESTORE_HARD
+	bl	.unknown_exception
+	b	.ret_from_except
+
+/* Embedded Hypervisor privileged */
+	START_EXCEPTION(ehpriv);
+	NORMAL_EXCEPTION_PROLOG(0x320, PROLOG_ADDITION_NONE)
+	EXCEPTION_COMMON(0x320, PACA_EXGEN, INTS_KEEP)
+	addi	r3,r1,STACK_FRAME_OVERHEAD
+	bl	.save_nvgprs
+	INTS_RESTORE_HARD
+	bl	.unknown_exception
+	b	.ret_from_except
 
 
 /*
- * An interrupt came in while soft-disabled; clear EE in SRR1,
- * clear paca->hard_enabled and return.
+ * An interrupt came in while soft-disabled. We mark paca->irq_happened
+ * accordingly and, if the interrupt is level sensitive, we hard disable
  */
-masked_doorbell_book3e:
-	mtcr	r10
-	/* Resend the doorbell to fire again when ints enabled */
-	mfspr	r10,SPRN_PIR
-	PPC_MSGSND(r10)
-	b	masked_interrupt_book3e_common
 
-masked_interrupt_book3e:
+masked_interrupt_book3e_0x500:
+	li	r11,PACA_HAPPENED_EE
+	b	masked_interrupt_book3e_full_mask
+
+masked_interrupt_book3e_0x900:
+	ACK_DEC(r11);
+	li	r11,PACA_HAPPENED_DEC
+	b	masked_interrupt_book3e_no_mask
+masked_interrupt_book3e_0x980:
+	ACK_FIT(r11);
+	li	r11,PACA_HAPPENED_DEC
+	b	masked_interrupt_book3e_no_mask
+masked_interrupt_book3e_0x280:
+masked_interrupt_book3e_0x2c0:
+	li	r11,PACA_HAPPENED_DBELL
+	b	masked_interrupt_book3e_no_mask
+
+masked_interrupt_book3e_no_mask:
+	mtcr	r10
+	lbz	r10,PACAIRQHAPPENED(r13)
+	or	r10,r10,r11
+	stb	r10,PACAIRQHAPPENED(r13)
+	b	1f
+masked_interrupt_book3e_full_mask:
 	mtcr	r10
-masked_interrupt_book3e_common:
-	stb	r11,PACAHARDIRQEN(r13)
+	lbz	r10,PACAIRQHAPPENED(r13)
+	or	r10,r10,r11
+	stb	r10,PACAIRQHAPPENED(r13)
 	mfspr	r10,SPRN_SRR1
 	rldicl	r11,r10,48,1		/* clear MSR_EE */
 	rotldi	r10,r11,16
 	mtspr	SPRN_SRR1,r10
-	ld	r10,PACA_EXGEN+EX_R10(r13);	/* restore registers */
+1:	ld	r10,PACA_EXGEN+EX_R10(r13);
 	ld	r11,PACA_EXGEN+EX_R11(r13);
 	mfspr	r13,SPRN_SPRG_GEN_SCRATCH;
 	rfi
 	b	.
 
 /*
+ * Called from arch_local_irq_enable when an interrupt needs
+ * to be resent. r3 contains either 0x500, 0x900, 0x260 or 0x280
+ * to indicate the kind of interrupt. MSR:EE is already off.
+ * We generate a stackframe as if a real interrupt had happened.
+ *
+ * Note: While MSR:EE is off, we need to make sure that _MSR
+ * in the generated frame has EE set to 1 or the exception
+ * handler will not properly re-enable them.
+ */
+_GLOBAL(__reemit_interrupt)
+	/* We are going to jump to the exception common code which
+	 * will retrieve various register values from the PACA which
+	 * we don't give a damn about.
+	 */
+	mflr	r10
+	mfmsr	r11
+	mfcr	r4;
+	mtspr	SPRN_SPRG_GEN_SCRATCH,r13;
+	std	r1,PACA_EXGEN+EX_R1(r13);
+	stw	r4,PACA_EXGEN+EX_CR(r13);
+	ori	r11,r11,MSR_EE
+	subi	r1,r1,INT_FRAME_SIZE;
+	cmpwi	cr0,r3,0x500
+	beq	exc_0x500_common
+	cmpwi	cr0,r3,0x900
+	beq+	exc_0x900_common
+	cmpwi	cr0,r3,0x280
+	beq+	exc_0x280_common
+	blr
+
+/*
  * This is called from 0x300 and 0x400 handlers after the prologs with
  * r14 and r15 containing the fault address and error code, with the
  * original values stashed away in the PACA
@@ -680,6 +762,8 @@ BAD_STACK_TRAMPOLINE(0x000)
 BAD_STACK_TRAMPOLINE(0x100)
 BAD_STACK_TRAMPOLINE(0x200)
 BAD_STACK_TRAMPOLINE(0x260)
+BAD_STACK_TRAMPOLINE(0x280)
+BAD_STACK_TRAMPOLINE(0x2a0)
 BAD_STACK_TRAMPOLINE(0x2c0)
 BAD_STACK_TRAMPOLINE(0x2e0)
 BAD_STACK_TRAMPOLINE(0x300)
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index d4be7bb..896eea8 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -12,6 +12,7 @@
  *
  */
 
+#include <asm/hw_irq.h>
 #include <asm/exception-64s.h>
 #include <asm/ptrace.h>
 
@@ -356,34 +357,60 @@ do_stab_bolted_pSeries:
 	KVM_HANDLER_PR(PACA_EXGEN, EXC_STD, 0xf40)
 
 /*
- * An interrupt came in while soft-disabled; clear EE in SRR1,
- * clear paca->hard_enabled and return.
+ * An interrupt came in while soft-disabled. We set paca->irq_happened,
+ * then, if it was a decrementer interrupt, we bump the dec to max
+ * and return, else we hard disable and return.
  */
-masked_interrupt:
-	stb	r10,PACAHARDIRQEN(r13)
-	mtcrf	0x80,r9
-	ld	r9,PACA_EXGEN+EX_R9(r13)
-	mfspr	r10,SPRN_SRR1
-	rldicl	r10,r10,48,1		/* clear MSR_EE */
-	rotldi	r10,r10,16
-	mtspr	SPRN_SRR1,r10
-	ld	r10,PACA_EXGEN+EX_R10(r13)
-	GET_SCRATCH0(r13)
-	rfid
-	b	.
 
-masked_Hinterrupt:
-	stb	r10,PACAHARDIRQEN(r13)
-	mtcrf	0x80,r9
-	ld	r9,PACA_EXGEN+EX_R9(r13)
-	mfspr	r10,SPRN_HSRR1
-	rldicl	r10,r10,48,1		/* clear MSR_EE */
-	rotldi	r10,r10,16
-	mtspr	SPRN_HSRR1,r10
-	ld	r10,PACA_EXGEN+EX_R10(r13)
-	GET_SCRATCH0(r13)
-	hrfid
+#define MASKED_INTERRUPT(_H)				\
+masked_##_H##interrupt:					\
+	std	r11,PACA_EXGEN+EX_R11(r13);		\
+	lbz	r11,PACAIRQHAPPENED(r13);		\
+	or	r11,r11,r10;				\
+	stb	r11,PACAIRQHAPPENED(r13);		\
+	andi.	r10,r10,PACA_HAPPENED_DEC;		\
+	beq	1f;					\
+	lis	r10,0x7fff;				\
+	ori	r10,r10,0xffff;				\
+	mtspr	SPRN_DEC,r10;				\
+	b	2f;					\
+1:	mfspr	r10,SPRN_##_H##SRR1;			\
+	rldicl	r10,r10,48,1; /* clear MSR_EE */	\
+	rotldi	r10,r10,16;				\
+	mtspr	SPRN_##_H##SRR1,r10;			\
+2:	mtcrf	0x80,r9;				\
+	ld	r9,PACA_EXGEN+EX_R9(r13);		\
+	ld	r10,PACA_EXGEN+EX_R10(r13);		\
+	ld	r11,PACA_EXGEN+EX_R11(r13);		\
+	GET_SCRATCH0(r13);				\
+	##_H##rfid;					\
 	b	.
+	
+	MASKED_INTERRUPT()
+	MASKED_INTERRUPT(H)
+
+/*
+ * Called from arch_local_irq_enable when an interrupt needs
+ * to be resent. r3 contains 0x500 or 0x900 to indicate which
+ * kind of interrupt. MSR:EE is already off. We generate a
+ * stackframe as if a real interrupt had happened.
+ *
+ * Note: While MSR:EE is off, we need to make sure that _MSR
+ * in the generated frame has EE set to 1 or the exception
+ * handler will not properly re-enable them.
+ */
+_GLOBAL(__reemit_interrupt)
+	/* We are going to jump to the exception common code which
+	 * will retrieve various register values from the PACA which
+	 * we don't give a damn about, so we don't bother storing them.
+	 */
+	mfmsr	r12
+	mflr	r11
+	mfcr	r9
+	ori	r12,r12,MSR_EE
+	andi.	r3,r3,0x0800
+	bne	decrementer_common
+	b	hardware_interrupt_common
 
 #ifdef CONFIG_PPC_PSERIES
 /*
@@ -838,18 +865,10 @@ __end_handlers:
  * any task or sent any task a signal, you should use
  * ret_from_except or ret_from_except_lite instead of this.
  */
-fast_exc_return_irq:			/* restores irq state too */
-	ld	r3,SOFTE(r1)
-	TRACE_AND_RESTORE_IRQ(r3);
-	ld	r12,_MSR(r1)
-	rldicl	r4,r12,49,63		/* get MSR_EE to LSB */
-	stb	r4,PACAHARDIRQEN(r13)	/* restore paca->hard_enabled */
-	b	1f
-
 	.globl	fast_exception_return
 fast_exception_return:
 	ld	r12,_MSR(r1)
-1:	ld	r11,_NIP(r1)
+	ld	r11,_NIP(r1)
 	andi.	r3,r12,MSR_RI		/* check if RI is set */
 	beq-	unrecov_fer
 
@@ -973,7 +992,7 @@ BEGIN_FW_FTR_SECTION
 	 * Here we have interrupts hard-disabled, so it is sufficient
 	 * to restore paca->{soft,hard}_enable and get out.
 	 */
-	beq	fast_exc_return_irq	/* Return from exception on success */
+	beq	14f
 END_FW_FTR_SECTION_IFCLR(FW_FEATURE_ISERIES)
 
 	/* For a hash failure, we don't bother re-enabling interrupts */
@@ -1015,6 +1034,7 @@ handle_page_fault:
 	b	.ret_from_except
 
 13:	b	.ret_from_except_lite
+14:	b	.fast_exception_return_irq
 
 /* We have a page fault that hash_page could handle but HV refused
  * the PTE insertion
diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
index 06c7251..ffe08a6 100644
--- a/arch/powerpc/kernel/head_64.S
+++ b/arch/powerpc/kernel/head_64.S
@@ -564,7 +564,6 @@ _GLOBAL(pmac_secondary_start)
 	 */
 	li	r0,0
 	stb	r0,PACASOFTIRQEN(r13)
-	stb	r0,PACAHARDIRQEN(r13)
 
 	/* Create a temp kernel stack for use before relocation is on.	*/
 	ld	r1,PACAEMERGSP(r13)
@@ -621,13 +620,8 @@ __secondary_start:
 #ifdef CONFIG_PPC_ISERIES
 BEGIN_FW_FTR_SECTION
 	ori	r4,r4,MSR_EE
-	li	r8,1
-	stb	r8,PACAHARDIRQEN(r13)
 END_FW_FTR_SECTION_IFSET(FW_FEATURE_ISERIES)
 #endif
-BEGIN_FW_FTR_SECTION
-	stb	r7,PACAHARDIRQEN(r13)
-END_FW_FTR_SECTION_IFCLR(FW_FEATURE_ISERIES)
 	stb	r7,PACASOFTIRQEN(r13)
 
 	mtspr	SPRN_SRR0,r3
@@ -782,11 +776,8 @@ BEGIN_FW_FTR_SECTION
 	mfmsr	r5
 	ori	r5,r5,MSR_EE		/* Hard Enabled on iSeries*/
 	mtmsrd	r5
-	li	r5,1
 END_FW_FTR_SECTION_IFSET(FW_FEATURE_ISERIES)
 #endif
-	stb	r5,PACAHARDIRQEN(r13)	/* Hard Disabled on others */
-
 	bl	.start_kernel
 
 	/* Not reached */
diff --git a/arch/powerpc/kernel/idle_book3e.S b/arch/powerpc/kernel/idle_book3e.S
index 16c002d..b1199f8 100644
--- a/arch/powerpc/kernel/idle_book3e.S
+++ b/arch/powerpc/kernel/idle_book3e.S
@@ -32,11 +32,11 @@ _GLOBAL(book3e_idle)
 	 * since we may otherwise lose it (doorbells etc...). We know
 	 * that since PACAHARDIRQEN will have been cleared in that case.
 	 */
-	lbz	r3,PACAHARDIRQEN(r13)
+	lbz	r3,PACAIRQHAPPENED(r13)
 	cmpwi	cr0,r3,0
-	beqlr
+	bnelr
 
-	/* Now we are going to mark ourselves as soft and hard enables in
+	/* Now we are going to mark ourselves as soft and hard enabled in
 	 * order to be able to take interrupts while asleep. We inform lockdep
 	 * of that. We don't actually turn interrupts on just yet tho.
 	 */
@@ -46,7 +46,6 @@ _GLOBAL(book3e_idle)
 #endif
 	li	r0,1
 	stb	r0,PACASOFTIRQEN(r13)
-	stb	r0,PACAHARDIRQEN(r13)
 	
 	/* Interrupts will make use return to LR, so get something we want
 	 * in there
@@ -59,7 +58,6 @@ _GLOBAL(book3e_idle)
 	/* Mark them off again in the PACA as well */
 	li	r0,0
 	stb	r0,PACASOFTIRQEN(r13)
-	stb	r0,PACAHARDIRQEN(r13)
 
 	/* Tell lockdep about it */
 #ifdef CONFIG_TRACE_IRQFLAGS
diff --git a/arch/powerpc/kernel/idle_power4.S b/arch/powerpc/kernel/idle_power4.S
index ba31954..c30af92 100644
--- a/arch/powerpc/kernel/idle_power4.S
+++ b/arch/powerpc/kernel/idle_power4.S
@@ -29,14 +29,27 @@ END_FTR_SECTION_IFCLR(CPU_FTR_CAN_NAP)
 	cmpwi	0,r4,0
 	beqlr
 
-	/* Go to NAP now */
+	/* Hard disable */
 	mfmsr	r7
 	rldicl	r0,r7,48,1
 	rotldi	r0,r0,16
 	mtmsrd	r0,1			/* hard-disable interrupts */
+
+	/* Check if something happened while soft-disabled */
+	lbz	r0,PACAIRQHAPPENED(r13)
+	cmpwi	cr0,r0,0
+	bnelr
+
+	/*
+	 * Here we mark ourselves soft-enabled. We should probably
+	 * tell lockdep about it, but the interrupt will re-disable
+	 * immediately so it shouldn't be a big issue. If it becomes
+	 * one, then we should implement things the way we do on
+	 * book3e.
+	 */
 	li	r0,1
 	stb	r0,PACASOFTIRQEN(r13)	/* we'll hard-enable shortly */
-	stb	r0,PACAHARDIRQEN(r13)
+
 BEGIN_FTR_SECTION
 	DSSALL
 	sync
diff --git a/arch/powerpc/kernel/idle_power7.S b/arch/powerpc/kernel/idle_power7.S
index fcdff19..61f8cac 100644
--- a/arch/powerpc/kernel/idle_power7.S
+++ b/arch/powerpc/kernel/idle_power7.S
@@ -1,5 +1,5 @@
 /*
- *  This file contains the power_save function for 970-family CPUs.
+ *  This file contains the power_save function for Power7 CPUs
  *
  *  This program is free software; you can redistribute it and/or
  *  modify it under the terms of the GNU General Public License
@@ -51,9 +51,23 @@ _GLOBAL(power7_idle)
 	rldicl	r9,r9,48,1
 	rotldi	r9,r9,16
 	mtmsrd	r9,1			/* hard-disable interrupts */
-	li	r0,0
+
+	/* Check if something happened while soft-disabled */
+	lbz	r0,PACAIRQHAPPENED(r13)
+	cmpwi	cr0,r0,0
+	beq	1f
+	addi	r1,r1,INT_FRAME_SIZE
+	ld	r0,16(r1)
+	mtlr	r0
+	blr
+
+	/*
+	 * Here we mark ourselves soft-disabled (we should already be
+	 * actually...). The interrupt is only going to happen after
+	 * we return to the caller and it does a local_irq_enable()
+	 */
+1:	li	r0,0
 	stb	r0,PACASOFTIRQEN(r13)	/* we'll hard-enable shortly */
-	stb	r0,PACAHARDIRQEN(r13)
 	stb	r0,PACA_NAPSTATELOST(r13)
 
 	/* Continue saving state */
diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index 701d4ac..7e5b94b 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -72,6 +72,7 @@
 #include <asm/paca.h>
 #include <asm/firmware.h>
 #include <asm/lv1call.h>
+#include <asm/irqflags.h>
 #endif
 #define CREATE_TRACE_POINTS
 #include <asm/trace.h>
@@ -99,14 +100,14 @@ EXPORT_SYMBOL(irq_desc);
 
 int distribute_irqs = 1;
 
-static inline notrace unsigned long get_hard_enabled(void)
+static inline notrace unsigned long get_irq_happened(void)
 {
-	unsigned long enabled;
+	unsigned long happened;
 
 	__asm__ __volatile__("lbz %0,%1(13)"
-	: "=r" (enabled) : "i" (offsetof(struct paca_struct, hard_enabled)));
+	: "=r" (happened) : "i" (offsetof(struct paca_struct, irq_happened)));
 
-	return enabled;
+	return happened;
 }
 
 static inline notrace void set_soft_enabled(unsigned long enable)
@@ -115,81 +116,140 @@ static inline notrace void set_soft_enabled(unsigned long enable)
 	: : "r" (enable), "i" (offsetof(struct paca_struct, soft_enabled)));
 }
 
-static inline notrace void decrementer_check_overflow(void)
+static inline int decrementer_check_overflow(void)
 {
 	u64 now = get_tb_or_rtc();
 	u64 *next_tb = &__get_cpu_var(decrementers_next_tb);
 
-	if (now >= *next_tb)
-		set_dec(1);
+	return now >= *next_tb;
 }
 
-notrace void arch_local_irq_restore(unsigned long en)
+/* This is called whenever we are re-enabling interrupts
+ * and returns either 0 (nothing to do) or the vector to re-emit:
+ * 0x500 for an EE, 0x900 for a DEC (or 0x280 for a doorbell on Book3E).
+ *
+ * This is called in two contexts: From arch_local_irq_restore()
+ * before soft-enabling interrupts, and from the exception exit
+ * path when returning from an interrupt from a soft-disabled to
+ * a soft-enabled context. In both cases we have interrupts hard
+ * disabled.
+ *
+ * We take care of only clearing the bits we handled in the
+ * PACA irq_happened field since we can only re-emit one at a
+ * time and we don't want to "lose" one.
+ */
+notrace unsigned int __check_irq_reemit(void)
 {
 	/*
-	 * get_paca()->soft_enabled = en;
-	 * Is it ever valid to use local_irq_restore(0) when soft_enabled is 1?
-	 * That was allowed before, and in such a case we do need to take care
-	 * that gcc will set soft_enabled directly via r13, not choose to use
-	 * an intermediate register, lest we're preempted to a different cpu.
+	 * We use local_paca rather than get_paca() to avoid all
+	 * the debug_smp_processor_id() business in this low level
+	 * function
 	 */
-	set_soft_enabled(en);
-	if (!en)
-		return;
+	unsigned char happened = local_paca->irq_happened;
+
+	/* Clear bit 0 which we wouldn't clear otherwise */
+	local_paca->irq_happened &= ~1;
 
-#ifdef CONFIG_PPC_STD_MMU_64
-	if (firmware_has_feature(FW_FEATURE_ISERIES)) {
-		/*
-		 * Do we need to disable preemption here?  Not really: in the
-		 * unlikely event that we're preempted to a different cpu in
-		 * between getting r13, loading its lppaca_ptr, and loading
-		 * its any_int, we might call iseries_handle_interrupts without
-		 * an interrupt pending on the new cpu, but that's no disaster,
-		 * is it?  And the business of preempting us off the old cpu
-		 * would itself involve a local_irq_restore which handles the
-		 * interrupt to that cpu.
-		 *
-		 * But use "local_paca->lppaca_ptr" instead of "get_lppaca()"
-		 * to avoid any preemption checking added into get_paca().
-		 */
-		if (local_paca->lppaca_ptr->int_dword.any_int)
-			iseries_handle_interrupts();
+	/*
+	 * Force the delivery of pending soft-disabled interrupts on PS3.
+	 * Any HV call will have this side effect.
+	 */
+	if (firmware_has_feature(FW_FEATURE_PS3_LV1)) {
+		u64 tmp, tmp2;
+		lv1_get_version_info(&tmp, &tmp2);
 	}
-#endif /* CONFIG_PPC_STD_MMU_64 */
 
 	/*
-	 * if (get_paca()->hard_enabled) return;
-	 * But again we need to take care that gcc gets hard_enabled directly
-	 * via r13, not choose to use an intermediate register, lest we're
-	 * preempted to a different cpu in between the two instructions.
+	 * We may have missed a decrementer interrupt. We check the
+	 * decrementer itself rather than the paca irq_happened field
+	 * in case we also had a rollover while hard disabled
 	 */
-	if (get_hard_enabled())
-		return;
+	local_paca->irq_happened &= ~PACA_HAPPENED_DEC;
+	if (decrementer_check_overflow())
+		return 0x900;
+
+	/* Finally check if an external interrupt happened */
+	local_paca->irq_happened &= ~PACA_HAPPENED_EE;
+	if (happened & PACA_HAPPENED_EE)
+		return 0x500;
+
+#ifdef CONFIG_PPC_BOOK3E
+	/* Finally check if an EPR external interrupt happened
+	 * this bit is typically set if we need to handle another
+	 * "edge" interrupt from within the MPIC "EPR" handler
+	 */
+	local_paca->irq_happened &= ~PACA_HAPPENED_EE_EDGE;
+	if (happened & PACA_HAPPENED_EE_EDGE)
+		return 0x500;
+
+	local_paca->irq_happened &= ~PACA_HAPPENED_DBELL;
+	if (happened & PACA_HAPPENED_DBELL)
+		return 0x280;
+#endif /* CONFIG_PPC_BOOK3E */
+
+	/* There should be nothing left ! */
+	BUG_ON(local_paca->irq_happened != 0);
+
+	return 0;
+}
 
+notrace void arch_local_irq_restore(unsigned long en)
+{
+	unsigned int reemit;
+
+	/* Write the new soft-enabled value */
+	set_soft_enabled(en);
+	if (!en)
+		return;
 	/*
-	 * Need to hard-enable interrupts here.  Since currently disabled,
-	 * no need to take further asm precautions against preemption; but
-	 * use local_paca instead of get_paca() to avoid preemption checking.
+	 * From this point onward, we can take interrupts, preempt,
+	 * etc... unless we got hard-disabled. We check if an event
+	 * happened. If none happened, we know we can just return.
+	 *
+	 * We may have preempted before the check below, in which case
+	 * we are checking the "new" CPU instead of the old one. This
+	 * is only a problem if an event happened on the "old" CPU.
+	 *
+	 * External interrupt events on non-iseries will have caused
+	 * interrupts to be hard-disabled, so there is no problem, we
+	 * cannot have preempted.
+	 *
+	 * That leaves us with EEs on iSeries or decrementer interrupts,
+	 * which I decided to safely ignore. The preemption would have
+	 * itself been the result of an interrupt, upon which return we
+	 * will have checked for pending events on the old CPU.
 	 */
-	local_paca->hard_enabled = en;
+	if (!get_irq_happened())
+		return;
+	/*
+	 * We need to hard disable to get a trusted value from
+	 * __check_irq_reemit(). We also need to soft-disable
+	 * again to avoid warnings in there due to the use of
+	 * per-cpu variables.
+	 */
+	__hard_irq_disable();
+	set_soft_enabled(0);
 
 	/*
-	 * Trigger the decrementer if we have a pending event. Some processors
-	 * only trigger on edge transitions of the sign bit. We might also
-	 * have disabled interrupts long enough that the decrementer wrapped
-	 * to positive.
+	 * Check if anything needs to be re-emitted. We haven't
+	 * soft-enabled yet to avoid warnings in decrementer_check_overflow
+	 * accessing per-cpu variables
 	 */
-	decrementer_check_overflow();
+	reemit = __check_irq_reemit();
+
+	/* We can soft-enable now */
+	set_soft_enabled(1);
 
 	/*
-	 * Force the delivery of pending soft-disabled interrupts on PS3.
-	 * Any HV call will have this side effect.
+	 * And re-emit if we have to. This will return with interrupts
+	 * hard-enabled.
 	 */
-	if (firmware_has_feature(FW_FEATURE_PS3_LV1)) {
-		u64 tmp, tmp2;
-		lv1_get_version_info(&tmp, &tmp2);
+	if (reemit) {
+		__reemit_interrupt(reemit);
+		return;
 	}
 
+	/* Finally, let's ensure we are hard enabled */
 	__hard_irq_enable();
 }
 EXPORT_SYMBOL(arch_local_irq_restore);
@@ -360,8 +420,27 @@ void do_IRQ(struct pt_regs *regs)
 
 	check_stack_overflow();
 
+	/* Query the platform PIC for the interrupt & ack it */
 	irq = ppc_md.get_irq();
 
+#ifdef CONFIG_PPC64
+	/*
+	 * At this point, we are soft-disabled and hard-disabled.
+	 *
+	 * get_irq() will have caused the PIC to lower the EE line
+	 * so we can improve the quality of perf samples by hard
+	 * enabling in order to let performance interrupts through.
+	 *
+	 * In the event where we might have another interrupt pending
+	 * the worst case is that we take it and hard-disable again
+	 * after setting irq_happened, which will cause us to come
+	 * back when the interrupt exit tests paca->irq_happened again
+	 */
+	get_paca()->irq_happened &= ~PACA_HAPPENED;
+	__hard_irq_enable();
+#endif /* CONFIG_PPC64 */
+
+	/* Process the interrupt */
 	if (irq != NO_IRQ && irq != NO_IRQ_IGNORE)
 		handle_one_irq(irq);
 	else if (irq != NO_IRQ_IGNORE)
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 567dd7c..39f201f 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -259,7 +259,6 @@ void accumulate_stolen_time(void)
 	u64 sst, ust;
 
 	u8 save_soft_enabled = local_paca->soft_enabled;
-	u8 save_hard_enabled = local_paca->hard_enabled;
 
 	/* We are called early in the exception entry, before
 	 * soft/hard_enabled are sync'ed to the expected state
@@ -268,7 +267,6 @@ void accumulate_stolen_time(void)
 	 * complain
 	 */
 	local_paca->soft_enabled = 0;
-	local_paca->hard_enabled = 0;
 
 	sst = scan_dispatch_log(local_paca->starttime_user);
 	ust = scan_dispatch_log(local_paca->starttime);
@@ -277,7 +275,6 @@ void accumulate_stolen_time(void)
 	local_paca->stolen_time += ust + sst;
 
 	local_paca->soft_enabled = save_soft_enabled;
-	local_paca->hard_enabled = save_hard_enabled;
 }
 
 static inline u64 calculate_stolen_time(u64 stop_tb)
@@ -589,6 +586,18 @@ void timer_interrupt(struct pt_regs * regs)
 		do_IRQ(regs);
 #endif
 
+#ifdef CONFIG_PPC64
+	/* Let's hard enable interrupts now that we have reset
+	 * the DEC (or acked it on BookE)
+	 *
+	 * We skip that if there's a pending EE "level" interrupt
+	 * as an optimization
+	 */
+	get_paca()->irq_happened &= ~PACA_HAPPENED;
+	if (!(get_paca()->irq_happened & PACA_HAPPENED_EE))
+		__hard_irq_enable();	
+#endif /* CONFIG_PPC64 */
+
 	old_regs = set_irq_regs(regs);
 	irq_enter();
 
diff --git a/arch/powerpc/platforms/iseries/Makefile b/arch/powerpc/platforms/iseries/Makefile
index a7602b1..7208589 100644
--- a/arch/powerpc/platforms/iseries/Makefile
+++ b/arch/powerpc/platforms/iseries/Makefile
@@ -2,7 +2,7 @@ ccflags-y	:= -mno-minimal-toc
 
 obj-y += exception.o
 obj-y += hvlog.o hvlpconfig.o lpardata.o setup.o dt.o mf.o lpevents.o \
-	hvcall.o proc.o htab.o iommu.o misc.o irq.o
+	hvcall.o proc.o htab.o iommu.o irq.o
 obj-$(CONFIG_PCI) += pci.o
 obj-$(CONFIG_SMP) += smp.o
 obj-$(CONFIG_VIOPATH) += viopath.o vio.o
diff --git a/arch/powerpc/platforms/iseries/exception.S b/arch/powerpc/platforms/iseries/exception.S
index f519ee1..508f863 100644
--- a/arch/powerpc/platforms/iseries/exception.S
+++ b/arch/powerpc/platforms/iseries/exception.S
@@ -32,6 +32,7 @@
 #include <asm/ptrace.h>
 #include <asm/cputable.h>
 #include <asm/mmu.h>
+#include <asm/hw_irq.h>
 
 #include "exception.h"
 
@@ -261,16 +262,20 @@ system_call_iSeries:
 
 decrementer_iSeries_masked:
 	/* We may not have a valid TOC pointer in here. */
-	li	r11,1
+	li	r11,PACA_HAPPENED_DEC
 	ld	r12,PACALPPACAPTR(r13)
 	stb	r11,LPPACADECRINT(r12)
 	li	r12,-1
 	clrldi	r12,r12,33	/* set DEC to 0x7fffffff */
 	mtspr	SPRN_DEC,r12
-	/* fall through */
+	b	1f
 
 hardware_interrupt_iSeries_masked:
-	mtcrf	0x80,r9		/* Restore regs */
+	li	r11,PACA_HAPPENED_EE
+1:	mtcrf	0x80,r9		/* Restore regs */
+	lbz	r10,PACAIRQHAPPENED(r13)
+	or	r11,r10,r11
+	stb	r11,PACAIRQHAPPENED(r13)
 	ld	r12,PACALPPACAPTR(r13)
 	ld	r11,LPPACASRR0(r12)
 	ld	r12,LPPACASRR1(r12)
diff --git a/arch/powerpc/platforms/iseries/misc.S b/arch/powerpc/platforms/iseries/misc.S
deleted file mode 100644
index 2c6ff0f..0000000
--- a/arch/powerpc/platforms/iseries/misc.S
+++ /dev/null
@@ -1,26 +0,0 @@
-/*
- * This file contains miscellaneous low-level functions.
- *    Copyright (C) 1995-2005 IBM Corp
- *
- * Largely rewritten by Cort Dougan (cort@cs.nmt.edu)
- * and Paul Mackerras.
- * Adapted for iSeries by Mike Corrigan (mikejc@us.ibm.com)
- * PPC64 updates by Dave Engebretsen (engebret@us.ibm.com)
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License
- * as published by the Free Software Foundation; either version
- * 2 of the License, or (at your option) any later version.
- */
-
-#include <asm/processor.h>
-#include <asm/asm-offsets.h>
-#include <asm/ppc_asm.h>
-
-	.text
-
-/* Handle pending interrupts in interrupt context */
-_GLOBAL(iseries_handle_interrupts)
-	li	r0,0x5555
-	sc
-	blr
diff --git a/arch/powerpc/platforms/pseries/processor_idle.c b/arch/powerpc/platforms/pseries/processor_idle.c
index 085fd3f..019a529 100644
--- a/arch/powerpc/platforms/pseries/processor_idle.c
+++ b/arch/powerpc/platforms/pseries/processor_idle.c
@@ -96,6 +96,25 @@ out:
 	return index;
 }
 
+static void check_and_cede_processor(void)
+{
+	/*
+	 * Interrupts are soft-disabled at this point,
+	 * but not hard disabled. So an interrupt might have
+	 * occurred before entering NAP, and would be potentially
+	 * lost (edge events, decrementer events, etc...) unless
+	 * we first hard disable then check.
+	 *
+	 * We must use the low level __hard_irq_disable() and not
+	 * hard_irq_disable() as the latter will set a bit in
+	 * paca->irq_happened (to force re-enable later) which we
+	 * don't need nor want here.
+	 */
+	__hard_irq_disable();
+	if (get_paca()->irq_happened == 0)
+		cede_processor();
+}
+
 static int dedicated_cede_loop(struct cpuidle_device *dev,
 				struct cpuidle_driver *drv,
 				int index)
@@ -108,7 +127,8 @@ static int dedicated_cede_loop(struct cpuidle_device *dev,
 
 	ppc64_runlatch_off();
 	HMT_medium();
-	cede_processor();
+
+	check_and_cede_processor();
 
 	get_lppaca()->donate_dedicated_cpu = 0;
 	dev->last_residency =
@@ -132,7 +152,7 @@ static int shared_cede_loop(struct cpuidle_device *dev,
 	 * processor. When returning here, external interrupts
 	 * are enabled.
 	 */
-	cede_processor();
+	check_and_cede_processor();
 
 	dev->last_residency =
 		(int)idle_loop_epilog(in_purr, kt_before);
-- 
1.7.7.3
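
The bookkeeping in the patch above can be modelled in plain user-space C.
What follows is only a sketch under stated assumptions, not the kernel
implementation: every model_* name and the MODEL_HAPPENED_* bit values are
hypothetical (the real PACA_HAPPENED_* constants live in asm/hw_irq.h, which
is not shown here), the decrementer is tracked with a plain bit instead of
being re-checked against the timebase, and all pending events are replayed in
one loop, whereas the patch re-emits one and lets the exception exit path
look again.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit values, chosen for this sketch only. */
enum {
	MODEL_HAPPENED_DEC   = 0x01,
	MODEL_HAPPENED_EE    = 0x02,
	MODEL_HAPPENED_DBELL = 0x04,
};

/* The two per-CPU fields the lazy-disable scheme relies on. */
struct model_paca {
	uint8_t soft_enabled;
	uint8_t irq_happened;
};

/*
 * An interrupt arrives. If we are soft-enabled it is taken at once and
 * its vector is returned. Otherwise it is only recorded and 0 is returned.
 */
static unsigned int model_irq_arrives(struct model_paca *p, uint8_t bit,
				      unsigned int vec)
{
	if (p->soft_enabled)
		return vec;
	p->irq_happened |= bit;
	return 0;
}

/*
 * Pick one pending event to re-emit and clear only that bit, so the
 * others are not lost. Order follows the patch: DEC, then EE, then DBELL.
 */
static unsigned int model_check_reemit(struct model_paca *p)
{
	if (p->irq_happened & MODEL_HAPPENED_DEC) {
		p->irq_happened &= (uint8_t)~MODEL_HAPPENED_DEC;
		return 0x900;
	}
	if (p->irq_happened & MODEL_HAPPENED_EE) {
		p->irq_happened &= (uint8_t)~MODEL_HAPPENED_EE;
		return 0x500;
	}
	if (p->irq_happened & MODEL_HAPPENED_DBELL) {
		p->irq_happened &= (uint8_t)~MODEL_HAPPENED_DBELL;
		return 0x280;
	}
	/* Mirrors the BUG_ON(): nothing may be left over. */
	assert(p->irq_happened == 0);
	return 0;
}

/*
 * Model of the restore path: write the new soft-enabled value and, when
 * enabling, replay what was recorded. Returns how many vectors were
 * written to replayed[] (at most max).
 */
static int model_irq_restore(struct model_paca *p, uint8_t en,
			     unsigned int *replayed, int max)
{
	unsigned int vec;
	int n = 0;

	p->soft_enabled = en;
	if (!en)
		return 0;
	while (n < max && (vec = model_check_reemit(p)) != 0)
		replayed[n++] = vec;
	return n;
}
```

With this model, an EE and a DEC that arrive while soft_enabled is 0 both
return 0 from model_irq_arrives() and are later handed back by
model_irq_restore() as 0x900 followed by 0x500, leaving irq_happened clear.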

^ permalink raw reply related

* [PATCH] powerpc: Fix WARN_ON in decrementer_check_overflow
From: Benjamin Herrenschmidt @ 2012-02-09  5:34 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Hugh Dickins

We use __get_cpu_var() which triggers a false positive warning
in smp_processor_id() thinking interrupts are enabled (at this
point, they are soft-enabled but hard-disabled).

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---

I was initially planning to fix that with a more in-depth rework
of how we do lazy irq disabling on powerpc, but that patch is
becoming too complex for this release so I'll apply this as a
stop-gap and leave the full rework for -next
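
For readers following along, here is a rough user-space sketch of why the
bracket silences the warning. It is not kernel code: every model_* name is
hypothetical, and the test in model_get_cpu_var() is only a loose stand-in
for the debug check behind smp_processor_id(), assumed here to complain
whenever preemption looks possible. The model ignores interrupt state
entirely, which is the part the real check gets wrong in this case.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NR_CPUS 2

static int model_preempt_count;		/* > 0 means preemption is off */
static int model_warnings;		/* how often the debug check fired */
static int model_cur_cpu;		/* CPU we are "running" on */
static int model_dec_fired;		/* stands in for set_dec(1) */
static uint64_t model_next_tb[MODEL_NR_CPUS];

/* Stand-in for __get_cpu_var(): warn if the caller could migrate. */
static uint64_t *model_get_cpu_var(void)
{
	if (model_preempt_count == 0)
		model_warnings++;
	return &model_next_tb[model_cur_cpu];
}

/* Shape of the function before the patch: unguarded per-CPU access. */
static void model_check_overflow_unguarded(uint64_t now)
{
	uint64_t *next_tb = model_get_cpu_var();

	if (now >= *next_tb)
		model_dec_fired++;
}

/* Shape after the patch: the access sits inside a no-preempt bracket. */
static void model_check_overflow_guarded(uint64_t now)
{
	uint64_t *next_tb;

	model_preempt_count++;		/* preempt_disable() */
	next_tb = model_get_cpu_var();
	if (now >= *next_tb)
		model_dec_fired++;
	assert(model_preempt_count > 0);
	model_preempt_count--;		/* preempt_enable() */
}
```

Calling the unguarded variant bumps model_warnings, while the guarded one
leaves it untouched and still triggers the decrementer once "now" has passed
the stored deadline.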
 
diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index 701d4ac..01e2877 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -118,10 +118,14 @@ static inline notrace void set_soft_enabled(unsigned long enable)
 static inline notrace void decrementer_check_overflow(void)
 {
 	u64 now = get_tb_or_rtc();
-	u64 *next_tb = &__get_cpu_var(decrementers_next_tb);
+	u64 *next_tb;
+
+	preempt_disable();
+	next_tb = &__get_cpu_var(decrementers_next_tb);
 
 	if (now >= *next_tb)
 		set_dec(1);
+	preempt_enable();
 }
 
 notrace void arch_local_irq_restore(unsigned long en)

^ permalink raw reply related

* [PATCH 1/2 v4] powerpc/85xx: Add p1020rdb-pc platform support
From: Zhicheng Fan @ 2012-02-09  5:40 UTC (permalink / raw)
  To: galak, linuxppc-dev; +Cc: Zhicheng Fan

From: Zhicheng Fan <b32736@freeescale.com>

Signed-off-by: Zhicheng Fan <b32736@freeescale.com>
---
 arch/powerpc/platforms/85xx/mpc85xx_rdb.c |   26 +++++++++++++++++++++++++-
 1 files changed, 25 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/platforms/85xx/mpc85xx_rdb.c b/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
index dce8aaf..a0b9c92 100644
--- a/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
+++ b/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
@@ -1,7 +1,7 @@
 /*
  * MPC85xx RDB Board Setup
  *
- * Copyright 2009 Freescale Semiconductor Inc.
+ * Copyright 2009,2012 Freescale Semiconductor Inc.
  *
  * This program is free software; you can redistribute  it and/or modify it
  * under  the terms of  the GNU General  Public License as published by the
@@ -168,10 +168,20 @@ qe_fail:
 
 machine_device_initcall(p2020_rdb, mpc85xx_common_publish_devices);
 machine_device_initcall(p1020_rdb, mpc85xx_common_publish_devices);
+machine_device_initcall(p1020_rdb_pc, mpc85xx_common_publish_devices);
 
 /*
  * Called very early, device-tree isn't unflattened
  */
+static int __init p1020_rdb_pc_probe(void)
+{
+	unsigned long root = of_get_flat_dt_root();
+
+	if (of_flat_dt_is_compatible(root, "fsl,P1020RDB-PC"))
+		return 1;
+	return 0;
+}
+
 static int __init p2020_rdb_probe(void)
 {
 	unsigned long root = of_get_flat_dt_root();
@@ -217,3 +227,17 @@ define_machine(p1020_rdb) {
 	.calibrate_decr		= generic_calibrate_decr,
 	.progress		= udbg_progress,
 };
+
+define_machine(p1020_rdb_pc) {
+	.name			= "P1020RDB-PC",
+	.probe			= p1020_rdb_pc_probe,
+	.setup_arch		= mpc85xx_rdb_setup_arch,
+	.init_IRQ		= mpc85xx_rdb_pic_init,
+#ifdef CONFIG_PCI
+	.pcibios_fixup_bus	= fsl_pcibios_fixup_bus,
+#endif
+	.get_irq		= mpic_get_irq,
+	.restart		= fsl_rstcr_restart,
+	.calibrate_decr		= generic_calibrate_decr,
+	.progress		= udbg_progress,
+};
-- 
1.7.0.4

^ permalink raw reply related

* [PATCH 2/2 v4] powerpc/dts: Add dts for p1020rdb-pc board
From: Zhicheng Fan @ 2012-02-09  5:40 UTC (permalink / raw)
  To: galak, linuxppc-dev; +Cc: Zhicheng Fan
In-Reply-To: <1328766036-17697-1-git-send-email-B32736@freescale.com>

From: Zhicheng Fan <b32736@freeescale.com>

P1020RDB-PC Overview
------------------
1Gbyte DDR3 SDRAM
32 Mbyte NAND flash
10 16Mbyte NOR flash
16 Mbyte SPI flash
SD connector to interface with the SD memory card
Real-time clock on I2C bus

PCIe:
- x1 PCIe slot
- x1 mini-PCIe slot

10/100/1000 BaseT Ethernet ports:
- eTSEC1, RGMII: one 10/100/1000 port using VitesseTM VSC7385 L2 switch
- eTSEC2, SGMII: one 10/100/1000 port using VitesseTM VSC8221
- eTSEC3, RGMII: one 10/100/1000 port using AtherosTM AR8021

USB 2.0 port:
- Two USB2.0 Type A receptacles
- One USB2.0 signal to Mini PCIe slot

Dual RJ45 UART ports:
- DUART interface: supports two UARTs up to 115200 bps for console display

Signed-off-by: Zhicheng Fan <b32736@freeescale.com>
---
 arch/powerpc/boot/dts/p1020rdb-pc.dts            |   90 ++++++++
 arch/powerpc/boot/dts/p1020rdb-pc.dtsi           |  247 ++++++++++++++++++++++
 arch/powerpc/boot/dts/p1020rdb-pc_36b.dts        |   90 ++++++++
 arch/powerpc/boot/dts/p1020rdb-pc_camp_core0.dts |   64 ++++++
 arch/powerpc/boot/dts/p1020rdb-pc_camp_core1.dts |  142 +++++++++++++
 5 files changed, 633 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/boot/dts/p1020rdb-pc.dts
 create mode 100644 arch/powerpc/boot/dts/p1020rdb-pc.dtsi
 create mode 100644 arch/powerpc/boot/dts/p1020rdb-pc_36b.dts
 create mode 100644 arch/powerpc/boot/dts/p1020rdb-pc_camp_core0.dts
 create mode 100644 arch/powerpc/boot/dts/p1020rdb-pc_camp_core1.dts

diff --git a/arch/powerpc/boot/dts/p1020rdb-pc.dts b/arch/powerpc/boot/dts/p1020rdb-pc.dts
new file mode 100644
index 0000000..5c333b0
--- /dev/null
+++ b/arch/powerpc/boot/dts/p1020rdb-pc.dts
@@ -0,0 +1,90 @@
+/*
+ * P1020 RDB-PC Device Tree Source
+ *
+ * Copyright 2012 Freescale Semiconductor Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in the
+ *       documentation and/or other materials provided with the distribution.
+ *     * Neither the name of Freescale Semiconductor nor the
+ *       names of its contributors may be used to endorse or promote products
+ *       derived from this software without specific prior written permission.
+ *
+ *
+ * ALTERNATIVELY, this software may be distributed under the terms of the
+ * GNU General Public License ("GPL") as published by the Free Software
+ * Foundation, either version 2 of that License or (at your option) any
+ * later version.
+ *
+ * THIS SOFTWARE IS PROVIDED BY Freescale Semiconductor ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+ * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+ * DISCLAIMED. IN NO EVENT SHALL Freescale Semiconductor BE LIABLE FOR ANY
+ * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+ * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+ * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+ * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/include/ "fsl/p1020si-pre.dtsi"
+/ {
+	model = "fsl,P1020RDB-PC";
+	compatible = "fsl,P1020RDB-PC";
+
+	memory {
+		device_type = "memory";
+	};
+
+	lbc: localbus@ffe05000 {
+		reg = <0 0xffe05000 0 0x1000>;
+
+		/* NOR, NAND Flashes and Vitesse 5 port L2 switch */
+		ranges = <0x0 0x0 0x0 0xef000000 0x01000000
+			  0x1 0x0 0x0 0xff800000 0x00040000
+			  0x2 0x0 0x0 0xffb00000 0x00020000
+			  0x3 0x0 0x0 0xffa00000 0x00020000>;
+	};
+
+	soc: soc@ffe00000 {
+		ranges = <0x0 0x0 0xffe00000 0x100000>;
+	};
+
+	pci0: pcie@ffe09000 {
+		ranges = <0x2000000 0x0 0xa0000000 0 0xa0000000 0x0 0x20000000
+			  0x1000000 0x0 0x00000000 0 0xffc10000 0x0 0x10000>;
+		reg = <0 0xffe09000 0 0x1000>;
+		pcie@0 {
+			ranges = <0x2000000 0x0 0xa0000000
+				  0x2000000 0x0 0xa0000000
+				  0x0 0x20000000
+
+				  0x1000000 0x0 0x0
+				  0x1000000 0x0 0x0
+				  0x0 0x100000>;
+		};
+	};
+
+	pci1: pcie@ffe0a000 {
+		reg = <0 0xffe0a000 0 0x1000>;
+		ranges = <0x2000000 0x0 0x80000000 0 0x80000000 0x0 0x20000000
+			  0x1000000 0x0 0x00000000 0 0xffc00000 0x0 0x10000>;
+		pcie@0 {
+			ranges = <0x2000000 0x0 0x80000000
+				  0x2000000 0x0 0x80000000
+				  0x0 0x20000000
+
+				  0x1000000 0x0 0x0
+				  0x1000000 0x0 0x0
+				  0x0 0x100000>;
+		};
+	};
+};
+
+/include/ "p1020rdb-pc.dtsi"
+/include/ "fsl/p1020si-post.dtsi"
diff --git a/arch/powerpc/boot/dts/p1020rdb-pc.dtsi b/arch/powerpc/boot/dts/p1020rdb-pc.dtsi
new file mode 100644
index 0000000..c952cd3
--- /dev/null
+++ b/arch/powerpc/boot/dts/p1020rdb-pc.dtsi
@@ -0,0 +1,247 @@
+/*
+ * P1020 RDB-PC Device Tree Source stub (no addresses or top-level ranges)
+ *
+ * Copyright 2012 Freescale Semiconductor Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in the
+ *       documentation and/or other materials provided with the distribution.
+ *     * Neither the name of Freescale Semiconductor nor the
+ *       names of its contributors may be used to endorse or promote products
+ *       derived from this software without specific prior written permission.
+ *
+ *
+ * ALTERNATIVELY, this software may be distributed under the terms of the
+ * GNU General Public License ("GPL") as published by the Free Software
+ * Foundation, either version 2 of that License or (at your option) any
+ * later version.
+ *
+ * THIS SOFTWARE IS PROVIDED BY Freescale Semiconductor ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+ * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+ * DISCLAIMED. IN NO EVENT SHALL Freescale Semiconductor BE LIABLE FOR ANY
+ * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+ * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+ * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+ * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+&lbc {
+	nor@0,0 {
+		#address-cells = <1>;
+		#size-cells = <1>;
+		compatible = "cfi-flash";
+		reg = <0x0 0x0 0x1000000>;
+		bank-width = <2>;
+		device-width = <1>;
+
+		partition@0 {
+			/* This location must not be altered  */
+			/* 256KB for Vitesse 7385 Switch firmware */
+			reg = <0x0 0x00040000>;
+			label = "NOR Vitesse-7385 Firmware";
+			read-only;
+		};
+
+		partition@40000 {
+			/* 256KB for DTB Image */
+			reg = <0x00040000 0x00040000>;
+			label = "NOR DTB Image";
+		};
+
+		partition@80000 {
+			/* 3.5 MB for Linux Kernel Image */
+			reg = <0x00080000 0x00380000>;
+			label = "NOR Linux Kernel Image";
+		};
+
+		partition@400000 {
+			/* 11MB for JFFS2 based Root file System */
+			reg = <0x00400000 0x00b00000>;
+			label = "NOR JFFS2 Root File System";
+		};
+
+		partition@f00000 {
+			/* This location must not be altered  */
+			/* 512KB for u-boot Bootloader Image */
+			/* 512KB for u-boot Environment Variables */
+			reg = <0x00f00000 0x00100000>;
+			label = "NOR U-Boot Image";
+			read-only;
+		};
+	};
+
+	nand@1,0 {
+		#address-cells = <1>;
+		#size-cells = <1>;
+		compatible = "fsl,p1020-fcm-nand",
+			     "fsl,elbc-fcm-nand";
+		reg = <0x1 0x0 0x40000>;
+
+		partition@0 {
+			/* This location must not be altered  */
+			/* 1MB for u-boot Bootloader Image */
+			reg = <0x0 0x00100000>;
+			label = "NAND U-Boot Image";
+			read-only;
+		};
+
+		partition@100000 {
+			/* 1MB for DTB Image */
+			reg = <0x00100000 0x00100000>;
+			label = "NAND DTB Image";
+		};
+
+		partition@200000 {
+			/* 4MB for Linux Kernel Image */
+			reg = <0x00200000 0x00400000>;
+			label = "NAND Linux Kernel Image";
+		};
+
+		partition@600000 {
+			/* 4MB for Compressed Root file System Image */
+			reg = <0x00600000 0x00400000>;
+			label = "NAND Compressed RFS Image";
+		};
+
+		partition@a00000 {
+			/* 7MB for JFFS2 based Root file System */
+			reg = <0x00a00000 0x00700000>;
+			label = "NAND JFFS2 Root File System";
+		};
+
+		partition@1100000 {
+			/* 15MB for JFFS2 based Root file System */
+			reg = <0x01100000 0x00f00000>;
+			label = "NAND Writable User area";
+		};
+	};
+
+	L2switch@2,0 {
+		#address-cells = <1>;
+		#size-cells = <1>;
+		compatible = "vitesse-7385";
+		reg = <0x2 0x0 0x20000>;
+	};
+
+	cpld@3,0 {
+		#address-cells = <1>;
+		#size-cells = <1>;
+		compatible = "cpld";
+		reg = <0x3 0x0 0x20000>;
+		read-only;
+	};
+};
+
+&soc {
+	i2c@3000 {
+		rtc@68 {
+			compatible = "pericom,pt7c4338";
+			reg = <0x68>;
+		};
+	};
+
+	spi@7000 {
+		flash@0 {
+			#address-cells = <1>;
+			#size-cells = <1>;
+			compatible = "spansion,s25sl12801";
+			reg = <0>;
+			spi-max-frequency = <40000000>; /* input clock */
+
+			partition@u-boot {
+				/* 512KB for u-boot Bootloader Image */
+				reg = <0x0 0x00080000>;
+				label = "u-boot";
+				read-only;
+			};
+
+			partition@dtb {
+				/* 512KB for DTB Image*/
+				reg = <0x00080000 0x00080000>;
+				label = "dtb";
+			};
+
+			partition@kernel {
+				/* 4MB for Linux Kernel Image */
+				reg = <0x00100000 0x00400000>;
+				label = "kernel";
+			};
+
+			partition@fs {
+				/* 4MB for Compressed RFS Image */
+				reg = <0x00500000 0x00400000>;
+				label = "file system";
+			};
+
+			partition@jffs-fs {
+				/* 7MB for JFFS2 based RFS */
+				reg = <0x00900000 0x00700000>;
+				label = "file system jffs2";
+			};
+		};
+	};
+
+	usb@22000 {
+		phy_type = "ulpi";
+	};
+
+	/* USB2 is shared with localbus, so it must be disabled
+	   by default. We can't put 'status = "disabled";' here
+	   since U-Boot doesn't clear the status property when
+	   it enables USB2. OTOH, U-Boot does create a new node
+	   when there isn't any. So, just comment it out.
+	usb@23000 {
+		phy_type = "ulpi";
+	};
+	*/
+
+	mdio@24000 {
+		phy0: ethernet-phy@0 {
+			interrupt-parent = <&mpic>;
+			interrupts = <3 1>;
+			reg = <0x0>;
+		};
+
+		phy1: ethernet-phy@1 {
+			interrupt-parent = <&mpic>;
+			interrupts = <2 1>;
+			reg = <0x1>;
+		};
+
+		tbi0: tbi-phy@11 {
+			device_type = "tbi-phy";
+			reg = <0x11>;
+		};
+	};
+
+	mdio@25000 {
+		tbi1: tbi-phy@11 {
+			reg = <0x11>;
+			device_type = "tbi-phy";
+		};
+	};
+
+	enet0: ethernet@b0000 {
+		fixed-link = <1 1 1000 0 0>;
+		phy-connection-type = "rgmii-id";
+
+	};
+
+	enet1: ethernet@b1000 {
+		phy-handle = <&phy0>;
+		tbi-handle = <&tbi1>;
+		phy-connection-type = "sgmii";
+	};
+
+	enet2: ethernet@b2000 {
+		phy-handle = <&phy1>;
+		phy-connection-type = "rgmii-id";
+	};
+};
diff --git a/arch/powerpc/boot/dts/p1020rdb-pc_36b.dts b/arch/powerpc/boot/dts/p1020rdb-pc_36b.dts
new file mode 100644
index 0000000..ca736a0
--- /dev/null
+++ b/arch/powerpc/boot/dts/p1020rdb-pc_36b.dts
@@ -0,0 +1,90 @@
+/*
+ * P1020 RDB-PC Device Tree Source (36-bit address map)
+ *
+ * Copyright 2012 Freescale Semiconductor Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in the
+ *       documentation and/or other materials provided with the distribution.
+ *     * Neither the name of Freescale Semiconductor nor the
+ *       names of its contributors may be used to endorse or promote products
+ *       derived from this software without specific prior written permission.
+ *
+ *
+ * ALTERNATIVELY, this software may be distributed under the terms of the
+ * GNU General Public License ("GPL") as published by the Free Software
+ * Foundation, either version 2 of that License or (at your option) any
+ * later version.
+ *
+ * THIS SOFTWARE IS PROVIDED BY Freescale Semiconductor ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+ * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+ * DISCLAIMED. IN NO EVENT SHALL Freescale Semiconductor BE LIABLE FOR ANY
+ * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+ * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+ * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+ * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/include/ "fsl/p1020si-pre.dtsi"
+/ {
+	model = "fsl,P1020RDB-PC";
+	compatible = "fsl,P1020RDB-PC";
+
+	memory {
+		device_type = "memory";
+	};
+
+	lbc: localbus@fffe05000 {
+		reg = <0xf 0xffe05000 0 0x1000>;
+
+		/* NOR, NAND Flashes and Vitesse 5 port L2 switch */
+		ranges = <0x0 0x0 0xf 0xef000000 0x01000000
+			  0x1 0x0 0xf 0xff800000 0x00040000
+			  0x2 0x0 0xf 0xffb00000 0x00040000
+			  0x3 0x0 0xf 0xffa00000 0x00020000>;
+	};
+
+	soc: soc@fffe00000 {
+		ranges = <0x0 0xf 0xffe00000 0x100000>;
+	};
+
+	pci0: pcie@fffe09000 {
+		reg = <0xf 0xffe09000 0 0x1000>;
+		ranges = <0x2000000 0x0 0xc0000000 0xc 0x20000000 0x0 0x20000000
+			  0x1000000 0x0 0x00000000 0xf 0xffc10000 0x0 0x10000>;
+		pcie@0 {
+			ranges = <0x2000000 0x0 0xc0000000
+				  0x2000000 0x0 0xc0000000
+				  0x0 0x20000000
+
+				  0x1000000 0x0 0x0
+				  0x1000000 0x0 0x0
+				  0x0 0x100000>;
+		};
+	};
+
+	pci1: pcie@fffe0a000 {
+		reg = <0xf 0xffe0a000 0 0x1000>;
+		ranges = <0x2000000 0x0 0x80000000 0xc 0x00000000 0x0 0x20000000
+			  0x1000000 0x0 0x00000000 0xf 0xffc00000 0x0 0x10000>;
+		pcie@0 {
+			ranges = <0x2000000 0x0 0x80000000
+				  0x2000000 0x0 0x80000000
+				  0x0 0x20000000
+
+				  0x1000000 0x0 0x0
+				  0x1000000 0x0 0x0
+				  0x0 0x100000>;
+		};
+	};
+};
+
+/include/ "p1020rdb.dtsi"
+/include/ "fsl/p1020si-post.dtsi"
diff --git a/arch/powerpc/boot/dts/p1020rdb-pc_camp_core0.dts b/arch/powerpc/boot/dts/p1020rdb-pc_camp_core0.dts
new file mode 100644
index 0000000..9f93b3a
--- /dev/null
+++ b/arch/powerpc/boot/dts/p1020rdb-pc_camp_core0.dts
@@ -0,0 +1,64 @@
+/*
+ * P1020 RDB-PC  Core0 Device Tree Source in CAMP mode.
+ *
+ * In CAMP mode, each core needs to have its own dts. Only mpic and L2 cache
+ * can be shared, all the other devices must be assigned to one core only.
+ * This dts file allows core0 to have memory, l2, i2c, spi, gpio, tdm, dma, usb,
+ * eth1, eth2, sdhc, crypto, global-util, message, pci0, pci1, msi.
+ *
+ * Please note to add "-b 0" for core0's dts compiling.
+ *
+ * Copyright 2012 Freescale Semiconductor Inc.
+ *
+ * This program is free software; you can redistribute  it and/or modify it
+ * under  the terms of  the GNU General  Public License as published by the
+ * Free Software Foundation;  either version 2 of the  License, or (at your
+ * option) any later version.
+ */
+
+/include/ "p1020rdb-pc.dts"
+
+/ {
+	model = "fsl,P1020RDB-PC";
+	compatible = "fsl,P1020RDB-PC";
+
+	aliases {
+		ethernet1 = &enet1;
+		ethernet2 = &enet2;
+		serial0 = &serial0;
+		pci0 = &pci0;
+		pci1 = &pci1;
+	};
+
+	cpus {
+		PowerPC,P1020@1 {
+			status = "disabled";
+		};
+	};
+
+	memory {
+		device_type = "memory";
+	};
+
+	localbus@ffe05000 {
+		status = "disabled";
+	};
+
+	soc@ffe00000 {
+		serial1: serial@4600 {
+			status = "disabled";
+		};
+
+		enet0: ethernet@b0000 {
+			status = "disabled";
+		};
+
+		mpic: pic@40000 {
+			protected-sources = <
+			42 29 30 34	/* serial1, enet0-queue-group0 */
+			17 18 24 45	/* enet0-queue-group1, crypto */
+			>;
+			pic-no-reset;
+		};
+	};
+};
diff --git a/arch/powerpc/boot/dts/p1020rdb-pc_camp_core1.dts b/arch/powerpc/boot/dts/p1020rdb-pc_camp_core1.dts
new file mode 100644
index 0000000..d3281ab
--- /dev/null
+++ b/arch/powerpc/boot/dts/p1020rdb-pc_camp_core1.dts
@@ -0,0 +1,142 @@
+/*
+ * P1020 RDB-PC Core1 Device Tree Source in CAMP mode.
+ *
+ * In CAMP mode, each core needs to have its own dts. Only mpic and L2 cache
+ * can be shared, all the other devices must be assigned to one core only.
+ * This dts allows core1 to have l2, eth0, crypto.
+ *
+ * Please note to add "-b 1" for core1's dts compiling.
+ *
+ * Copyright 2012 Freescale Semiconductor Inc.
+ *
+ * This program is free software; you can redistribute  it and/or modify it
+ * under  the terms of  the GNU General  Public License as published by the
+ * Free Software Foundation;  either version 2 of the  License, or (at your
+ * option) any later version.
+ */
+
+/include/ "p1020rdb-pc.dts"
+
+/ {
+	model = "fsl,P1020RDB-PC";
+	compatible = "fsl,P1020RDB-PC";
+
+	aliases {
+		ethernet0 = &enet0;
+		serial0 = &serial1;
+		};
+
+	cpus {
+		PowerPC,P1020@0 {
+			status = "disabled";
+		};
+	};
+
+	memory {
+		device_type = "memory";
+	};
+
+	localbus@ffe05000 {
+		status = "disabled";
+	};
+
+	soc@ffe00000 {
+		ecm-law@0 {
+			status = "disabled";
+		};
+
+		ecm@1000 {
+			status = "disabled";
+		};
+
+		memory-controller@2000 {
+			status = "disabled";
+		};
+
+		i2c@3000 {
+			status = "disabled";
+		};
+
+		i2c@3100 {
+			status = "disabled";
+		};
+
+		serial0: serial@4500 {
+			status = "disabled";
+		};
+
+		spi@7000 {
+			status = "disabled";
+		};
+
+		gpio: gpio-controller@f000 {
+			status = "disabled";
+		};
+
+		dma@21300 {
+			status = "disabled";
+		};
+
+		mdio@24000 {
+			status = "disabled";
+		};
+
+		mdio@25000 {
+			status = "disabled";
+		};
+
+		enet1: ethernet@b1000 {
+			status = "disabled";
+		};
+
+		enet2: ethernet@b2000 {
+			status = "disabled";
+		};
+
+		usb@22000 {
+			status = "disabled";
+		};
+
+		sdhci@2e000 {
+			status = "disabled";
+		};
+
+		mpic: pic@40000 {
+			protected-sources = <
+			16 		/* ecm, mem, L2, pci0, pci1 */
+			43 42 59	/* i2c, serial0, spi */
+			47 63 62 	/* gpio, tdm */
+			20 21 22 23	/* dma */
+			03 02 		/* mdio */
+			35 36 40	/* enet1-queue-group0 */
+			51 52 67	/* enet1-queue-group1 */
+			31 32 33	/* enet2-queue-group0 */
+			25 26 27	/* enet2-queue-group1 */
+			28 72 58 	/* usb, sdhci, crypto */
+			0xb0 0xb1 0xb2	/* message */
+			0xb3 0xb4 0xb5
+			0xb6 0xb7
+			0xe0 0xe1 0xe2	/* msi */
+			0xe3 0xe4 0xe5
+			0xe6 0xe7		/* sdhci, crypto , pci */
+			>;
+			pic-no-reset;
+		};
+
+		msi@41600 {
+			status = "disabled";
+		};
+
+		global-utilities@e0000 {	//global utilities block
+			status = "disabled";
+		};
+	};
+
+	pci0: pcie@ffe09000 {
+		status = "disabled";
+	};
+
+	pci1: pcie@ffe0a000 {
+		status = "disabled";
+	};
+};
-- 
1.7.0.4

^ permalink raw reply related

* RE: [RFC] Multi queue support in ethernet/freescale/ucc_geth.c
From: Li Yang-R58472 @ 2012-02-09 10:44 UTC (permalink / raw)
  To: Paul Gortmaker; +Cc: netdev@vger.kernel.org, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <4F2B1138.2090004@windriver.com>

> -----Original Message-----
> From: Paul Gortmaker [mailto:paul.gortmaker@windriver.com]
> Sent: Friday, February 03, 2012 6:42 AM
> To: Li Yang-R58472
> Cc: netdev@vger.kernel.org; linuxppc-dev@lists.ozlabs.org
> Subject: [RFC] Multi queue support in ethernet/freescale/ucc_geth.c
>
> Hi Li,

Hi Paul,

Sorry for the late response due to holidays.

>
> A while back DaveM mentioned that it would be good to break out the ring
> allocations[1] in this driver.
>
> I was looking at it, and in the process noticed this:
>
> $ grep 'numQueues.*=' drivers/net/ethernet/freescale/ucc_geth.c
>         .numQueuesTx = 1,
>         .numQueuesRx = 1,
> $
>
> My interpretation of the above is that there is no way (aside from a code
> edit) to enable multi queue support.
> They are only ever assigned one time, to a value of one.
>
> Assuming I'm not missing something obvious, is the multi queue support
> functional and tested, or just old code that never got tested and
> subsequently enabled?

Previously the device is only used on single core cpu, so we didn't have
the incentive to enable multi-queue.  It is not tested on Linux currently.

>
> The reason I ask, is that the ring allocation code gets rid of the loop
> wrapping it, if the driver is really only meant to ever have just single
> queues for Rx/Tx. And other areas of the driver can also be simplified
> accordingly as well.

Well.  I would prefer the other way which is to add the multi-queue
support as we are using the QE in multi-core SoC and the current driver
is having almost all the code needed for multi-queue except interface to
the protocol layer.

- Leo

^ permalink raw reply

* [PATCH] powerpc/dts: Removed fsl,msi property from dts.
From: Diana Craciun @ 2012-02-09 13:41 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Diana CRACIUN

From: Diana CRACIUN <Diana.Craciun@freescale.com>

The association in the device tree between PCI and MSI
using fsl,msi property was an artificial one and it does
not reflect the actual hardware.

Signed-off-by: Diana CRACIUN <Diana.Craciun@freescale.com>
---
 arch/powerpc/boot/dts/p2041rdb.dts |    3 ---
 arch/powerpc/boot/dts/p3041ds.dts  |    4 ----
 arch/powerpc/boot/dts/p3060qds.dts |    2 --
 arch/powerpc/boot/dts/p4080ds.dts  |    3 ---
 arch/powerpc/boot/dts/p5020ds.dts  |    4 ----
 5 files changed, 0 insertions(+), 16 deletions(-)

diff --git a/arch/powerpc/boot/dts/p2041rdb.dts b/arch/powerpc/boot/dts/p2041rdb.dts
index 4f957db..2852139 100644
--- a/arch/powerpc/boot/dts/p2041rdb.dts
+++ b/arch/powerpc/boot/dts/p2041rdb.dts
@@ -135,7 +135,6 @@
 		reg = <0xf 0xfe200000 0 0x1000>;
 		ranges = <0x02000000 0 0xe0000000 0xc 0x00000000 0x0 0x20000000
 			  0x01000000 0 0x00000000 0xf 0xf8000000 0x0 0x00010000>;
-		fsl,msi = <&msi0>;
 		pcie@0 {
 			ranges = <0x02000000 0 0xe0000000
 				  0x02000000 0 0xe0000000
@@ -151,7 +150,6 @@
 		reg = <0xf 0xfe201000 0 0x1000>;
 		ranges = <0x02000000 0x0 0xe0000000 0xc 0x20000000 0x0 0x20000000
 			  0x01000000 0x0 0x00000000 0xf 0xf8010000 0x0 0x00010000>;
-		fsl,msi = <&msi1>;
 		pcie@0 {
 			ranges = <0x02000000 0 0xe0000000
 				  0x02000000 0 0xe0000000
@@ -167,7 +165,6 @@
 		reg = <0xf 0xfe202000 0 0x1000>;
 		ranges = <0x02000000 0 0xe0000000 0xc 0x40000000 0 0x20000000
 			  0x01000000 0 0x00000000 0xf 0xf8020000 0 0x00010000>;
-		fsl,msi = <&msi2>;
 		pcie@0 {
 			ranges = <0x02000000 0 0xe0000000
 				  0x02000000 0 0xe0000000
diff --git a/arch/powerpc/boot/dts/p3041ds.dts b/arch/powerpc/boot/dts/p3041ds.dts
index f469145..22a215e 100644
--- a/arch/powerpc/boot/dts/p3041ds.dts
+++ b/arch/powerpc/boot/dts/p3041ds.dts
@@ -173,7 +173,6 @@
 		reg = <0xf 0xfe200000 0 0x1000>;
 		ranges = <0x02000000 0 0xe0000000 0xc 0x00000000 0x0 0x20000000
 			  0x01000000 0 0x00000000 0xf 0xf8000000 0x0 0x00010000>;
-		fsl,msi = <&msi0>;
 		pcie@0 {
 			ranges = <0x02000000 0 0xe0000000
 				  0x02000000 0 0xe0000000
@@ -189,7 +188,6 @@
 		reg = <0xf 0xfe201000 0 0x1000>;
 		ranges = <0x02000000 0x0 0xe0000000 0xc 0x20000000 0x0 0x20000000
 			  0x01000000 0x0 0x00000000 0xf 0xf8010000 0x0 0x00010000>;
-		fsl,msi = <&msi1>;
 		pcie@0 {
 			ranges = <0x02000000 0 0xe0000000
 				  0x02000000 0 0xe0000000
@@ -205,7 +203,6 @@
 		reg = <0xf 0xfe202000 0 0x1000>;
 		ranges = <0x02000000 0 0xe0000000 0xc 0x40000000 0 0x20000000
 			  0x01000000 0 0x00000000 0xf 0xf8020000 0 0x00010000>;
-		fsl,msi = <&msi2>;
 		pcie@0 {
 			ranges = <0x02000000 0 0xe0000000
 				  0x02000000 0 0xe0000000
@@ -221,7 +218,6 @@
 		reg = <0xf 0xfe203000 0 0x1000>;
 		ranges = <0x02000000 0 0xe0000000 0xc 0x60000000 0 0x20000000
 			  0x01000000 0 0x00000000 0xf 0xf8030000 0 0x00010000>;
-		fsl,msi = <&msi2>;
 		pcie@0 {
 			ranges = <0x02000000 0 0xe0000000
 				  0x02000000 0 0xe0000000
diff --git a/arch/powerpc/boot/dts/p3060qds.dts b/arch/powerpc/boot/dts/p3060qds.dts
index 529042e..9ae875c 100644
--- a/arch/powerpc/boot/dts/p3060qds.dts
+++ b/arch/powerpc/boot/dts/p3060qds.dts
@@ -212,7 +212,6 @@
 		reg = <0xf 0xfe200000 0 0x1000>;
 		ranges = <0x02000000 0 0xe0000000 0xc 0x00000000 0x0 0x20000000
 			  0x01000000 0 0x00000000 0xf 0xf8000000 0x0 0x00010000>;
-		fsl,msi = <&msi0>;
 		pcie@0 {
 			ranges = <0x02000000 0 0xe0000000
 				  0x02000000 0 0xe0000000
@@ -228,7 +227,6 @@
 		reg = <0xf 0xfe201000 0 0x1000>;
 		ranges = <0x02000000 0x0 0xe0000000 0xc 0x20000000 0x0 0x20000000
 			  0x01000000 0x0 0x00000000 0xf 0xf8010000 0x0 0x00010000>;
-		fsl,msi = <&msi1>;
 		pcie@0 {
 			ranges = <0x02000000 0 0xe0000000
 				  0x02000000 0 0xe0000000
diff --git a/arch/powerpc/boot/dts/p4080ds.dts b/arch/powerpc/boot/dts/p4080ds.dts
index 6d60e54..3e20460 100644
--- a/arch/powerpc/boot/dts/p4080ds.dts
+++ b/arch/powerpc/boot/dts/p4080ds.dts
@@ -141,7 +141,6 @@
 		reg = <0xf 0xfe200000 0 0x1000>;
 		ranges = <0x02000000 0 0xe0000000 0xc 0x00000000 0x0 0x20000000
 			  0x01000000 0 0x00000000 0xf 0xf8000000 0x0 0x00010000>;
-		fsl,msi = <&msi0>;
 		pcie@0 {
 			ranges = <0x02000000 0 0xe0000000
 				  0x02000000 0 0xe0000000
@@ -157,7 +156,6 @@
 		reg = <0xf 0xfe201000 0 0x1000>;
 		ranges = <0x02000000 0x0 0xe0000000 0xc 0x20000000 0x0 0x20000000
 			  0x01000000 0x0 0x00000000 0xf 0xf8010000 0x0 0x00010000>;
-		fsl,msi = <&msi1>;
 		pcie@0 {
 			ranges = <0x02000000 0 0xe0000000
 				  0x02000000 0 0xe0000000
@@ -173,7 +171,6 @@
 		reg = <0xf 0xfe202000 0 0x1000>;
 		ranges = <0x02000000 0 0xe0000000 0xc 0x40000000 0 0x20000000
 			  0x01000000 0 0x00000000 0xf 0xf8020000 0 0x00010000>;
-		fsl,msi = <&msi2>;
 		pcie@0 {
 			ranges = <0x02000000 0 0xe0000000
 				  0x02000000 0 0xe0000000
diff --git a/arch/powerpc/boot/dts/p5020ds.dts b/arch/powerpc/boot/dts/p5020ds.dts
index 1c25068..27c07ed 100644
--- a/arch/powerpc/boot/dts/p5020ds.dts
+++ b/arch/powerpc/boot/dts/p5020ds.dts
@@ -173,7 +173,6 @@
 		reg = <0xf 0xfe200000 0 0x1000>;
 		ranges = <0x02000000 0 0xe0000000 0xc 0x00000000 0x0 0x20000000
 			  0x01000000 0 0x00000000 0xf 0xf8000000 0x0 0x00010000>;
-		fsl,msi = <&msi0>;
 		pcie@0 {
 			ranges = <0x02000000 0 0xe0000000
 				  0x02000000 0 0xe0000000
@@ -189,7 +188,6 @@
 		reg = <0xf 0xfe201000 0 0x1000>;
 		ranges = <0x02000000 0x0 0xe0000000 0xc 0x20000000 0x0 0x20000000
 			  0x01000000 0x0 0x00000000 0xf 0xf8010000 0x0 0x00010000>;
-		fsl,msi = <&msi1>;
 		pcie@0 {
 			ranges = <0x02000000 0 0xe0000000
 				  0x02000000 0 0xe0000000
@@ -205,7 +203,6 @@
 		reg = <0xf 0xfe202000 0 0x1000>;
 		ranges = <0x02000000 0 0xe0000000 0xc 0x40000000 0 0x20000000
 			  0x01000000 0 0x00000000 0xf 0xf8020000 0 0x00010000>;
-		fsl,msi = <&msi2>;
 		pcie@0 {
 			ranges = <0x02000000 0 0xe0000000
 				  0x02000000 0 0xe0000000
@@ -221,7 +218,6 @@
 		reg = <0xf 0xfe203000 0 0x1000>;
 		ranges = <0x02000000 0 0xe0000000 0xc 0x60000000 0 0x20000000
 			  0x01000000 0 0x00000000 0xf 0xf8030000 0 0x00010000>;
-		fsl,msi = <&msi2>;
 		pcie@0 {
 			ranges = <0x02000000 0 0xe0000000
 				  0x02000000 0 0xe0000000
-- 
1.7.3.4

^ permalink raw reply related

* Re: [PATCH] powerpc/dts: Removed fsl,msi property from dts.
From: Tabi Timur-B04825 @ 2012-02-09 16:04 UTC (permalink / raw)
  To: Craciun Diana Madalina-STFD002; +Cc: linuxppc-dev@lists.ozlabs.org
In-Reply-To: <1328794860-12592-1-git-send-email-diana.craciun@freescale.com>

On Thu, Feb 9, 2012 at 7:41 AM, Diana Craciun
<diana.craciun@freescale.com> wrote:
> From: Diana CRACIUN <Diana.Craciun@freescale.com>
>
> The association in the device tree between PCI and MSI
> using fsl,msi property was an artificial one and it does
> not reflect the actual hardware.
>
> Signed-off-by: Diana CRACIUN <Diana.Craciun@freescale.com>

Acked-by: Timur Tabi <timur@freescale.com>

-- 
Timur Tabi
Linux kernel developer at Freescale

^ permalink raw reply

* Re: [PATCH 2/2 v4] powerpc/dts: Add dts for p1020rdb-pc board
From: Tabi Timur-B04825 @ 2012-02-09 16:08 UTC (permalink / raw)
  To: Fan Zhicheng-B32736; +Cc: linuxppc-dev@lists.ozlabs.org, Zhicheng Fan
In-Reply-To: <1328766036-17697-2-git-send-email-B32736@freescale.com>

On Wed, Feb 8, 2012 at 11:40 PM, Zhicheng Fan <B32736@freescale.com> wrote:
>
>  arch/powerpc/boot/dts/p1020rdb-pc.dts            |   90 ++++++++
>  arch/powerpc/boot/dts/p1020rdb-pc.dtsi           |  247 ++++++++++++++++++++++
>  arch/powerpc/boot/dts/p1020rdb-pc_36b.dts        |   90 ++++++++

If we're going to support both 32-bit and 36-bit dts files, then we
have to label both DTS files properly.  Do not assume that 32-bit is
the "default", because on some platforms, 36-bit is the default.

p1020rdb-pc.dts should be called p1020rdb-pc_32b.dts.

-- 
Timur Tabi
Linux kernel developer at Freescale

^ permalink raw reply

* Re: [PATCH 1/2 v4] powerpc/85xx: Add p1020rdb-pc platform support
From: Tabi Timur-B04825 @ 2012-02-09 16:09 UTC (permalink / raw)
  To: Fan Zhicheng-B32736; +Cc: linuxppc-dev@lists.ozlabs.org, Zhicheng Fan
In-Reply-To: <1328766036-17697-1-git-send-email-B32736@freescale.com>

On Wed, Feb 8, 2012 at 11:40 PM, Zhicheng Fan <B32736@freescale.com> wrote:

> +static int __init p1020_rdb_pc_probe(void)
> +{
> +	unsigned long root = of_get_flat_dt_root();
> +
> +	if (of_flat_dt_is_compatible(root, "fsl,P1020RDB-PC"))
> +		return 1;
> +	return 0;
> +}

static int __init p1020_rdb_pc_probe(void)
{
       unsigned long root = of_get_flat_dt_root();

       return of_flat_dt_is_compatible(root, "fsl,P1020RDB-PC");
}

-- 
Timur Tabi
Linux kernel developer at Freescale


* Re: [PATCH 1/2 v2] P1025RDB: Add Quicc Engine support
From: Tabi Timur-B04825 @ 2012-02-09 16:17 UTC (permalink / raw)
  To: Fan Zhicheng-B32736; +Cc: linuxppc-dev@lists.ozlabs.org, Zhicheng Fan
In-Reply-To: <1328694779-24024-2-git-send-email-B32736@freescale.com>

> +#ifdef CONFIG_QUICC_ENGINE
> +       np = of_find_compatible_node(NULL, NULL, "fsl,qe-ic");
> +       if (np) {
> +               qe_ic_init(np, 0, qe_ic_cascade_low_mpic,
> +                               qe_ic_cascade_high_mpic);
> +               of_node_put(np);
> +
> +       } else
> +               pr_err("Could not find qe-ic node\n");

Since you have to use pr_err instead of dev_err, please add a prefix
to the message.  Like this:

pr_err("mpc85xx-rdb: could not find qe-ic node\n");

or maybe something like this:

pr_err("%s: could not find qe-ic node\n", __func__);
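
A minimal userspace sketch of the prefix idea, in case it helps. In kernel
code the usual way to get a file-wide prefix is to define pr_fmt() before the
includes; here pr_err_sketch() and report_missing_qe_ic() are made-up names
that format into a buffer so the result can be looked at, they are not the
kernel's definitions.

```c
#include <stdio.h>

/*
 * Userspace stand-in for the kernel's pr_fmt() convention: defining it once
 * gives every message in the file the same prefix.
 */
#define pr_fmt(fmt) "mpc85xx-rdb: " fmt

/*
 * Mock of pr_err() (an assumption, not the kernel definition): it formats
 * into a caller-supplied buffer so the resulting text can be inspected.
 */
#define pr_err_sketch(buf, len, fmt, ...) \
	snprintf((buf), (len), pr_fmt(fmt), ##__VA_ARGS__)

/* Builds the prefixed message and returns its length. */
static int report_missing_qe_ic(char *buf, size_t len)
{
	return pr_err_sketch(buf, len, "could not find %s node\n", "qe-ic");
}
```

With the prefix defined once, individual call sites stay short and the log
still says which code complained.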


> +       if (machine_is(p1025_rdb)) {
> +
> +               __be32 __iomem *pmuxcr;
> +
> +               np = of_find_node_by_name(NULL, "global-utilities");
> +
> +               if (np) {
> +                       pmuxcr = of_iomap(np, 0) + MPC85xx_PMUXCR_OFFSET;

Use the ccsr_guts_85xx structure instead of hard-coded offsets.

MPC85xx_PMUXCR_OFFSET should be deleted.
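
To illustrate the difference, here is a rough userspace sketch. The struct
below is a hypothetical, heavily trimmed register block (guts_sketch), not the
real ccsr_guts_85xx layout, and the 0x60 offset is only a placeholder for the
example. The point is that a named field lets the compiler compute the address
and check the type, where the macro form is plain pointer arithmetic.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed register block; NOT the kernel's ccsr_guts_85xx. */
struct guts_sketch {
	uint32_t porpllsr;        /* 0x00 */
	uint8_t  reserved[0x5c];  /* 0x04..0x5f, registers omitted here */
	uint32_t pmuxcr;          /* 0x60 (placeholder offset) */
};

/* Hard-coded style: the offset lives in a macro, away from the layout. */
#define PMUXCR_OFFSET_SKETCH 0x60

static volatile uint32_t *pmuxcr_by_offset(void *base)
{
	return (volatile uint32_t *)((uint8_t *)base + PMUXCR_OFFSET_SKETCH);
}

/* Structure style: the field name carries both the offset and the type. */
static volatile uint32_t *pmuxcr_by_struct(struct guts_sketch *guts)
{
	return &guts->pmuxcr;
}
```

Both helpers return the same address, but only the second one keeps working
unchanged if the layout description is corrected in one place.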

> +
> +                       if (!pmuxcr)
> +                               pr_err(KERN_EMERG "Error: Alternate function"
> +                                       " signal multiplex control register not"
> +                                       " mapped!\n");

A missing node in the device tree is NOT an emergency.  Also, the
KERN_xxx macros are not supposed to be used in a pr_xxx macro.  Please
don't blindly copy/paste code from somewhere else without thinking
about it.
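
A small userspace sketch of why the combination is wrong, assuming pr_err()
already prepends its own level string to the format. The "<0>" and "<3>"
markers and the *_SKETCH names are mock definitions written for this example,
not copied from the kernel headers.

```c
#include <stdio.h>

/* Mock level markers, modelled on the textual "<n>" log-level prefixes. */
#define KERN_EMERG_SKETCH "<0>"
#define KERN_ERR_SKETCH   "<3>"

/* Mock pr_err(): always prepends the error level to the format string. */
#define pr_err_sketch(buf, len, fmt, ...) \
	snprintf((buf), (len), KERN_ERR_SKETCH fmt, ##__VA_ARGS__)

/* Pattern from the patch: a second level marker ends up inside the text. */
static int log_with_extra_level(char *buf, size_t len)
{
	return pr_err_sketch(buf, len, KERN_EMERG_SKETCH "%s not mapped\n",
			     "pmuxcr");
}

/* Corrected pattern: let pr_err() supply the level on its own. */
static int log_plain(char *buf, size_t len)
{
	return pr_err_sketch(buf, len, "%s not mapped\n", "pmuxcr");
}
```

In the first helper the emergency marker is no longer at the start of the
string, so it is just stray text in the message rather than a log level.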

-- 
Timur Tabi
Linux kernel developer at Freescale


* Re: [PATCH v2] powerpc: Rework lazy-interrupt handling
From: Tudor Laurentiu @ 2012-02-09 17:03 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Scott Wood, Stuart Yoder, Anton Blanchard, linuxppc-dev,
	Paul Mackerras
In-Reply-To: <1328761532.2903.53.camel@pasglop>

Hi Ben,

Small comment inline.

On 02/09/2012 06:25 AM, Benjamin Herrenschmidt wrote:
>  From 0ace17ba6960a8788b1bda3770df254cbbc6a244 Mon Sep 17 00:00:00 2001
> From: Benjamin Herrenschmidt<benh@kernel.crashing.org>
> Date: Thu, 9 Feb 2012 15:25:04 +1100
> Subject: [PATCH] powerpc: Rework lazy-interrupt handling
>
> The current implementation of lazy interrupts handling has some
> issues that this tries to address.
>
> Except on iSeries, we don't do the various workarounds we need to
> do on re-enable when returning from an interrupt, which can do an
> implicit re-enable, and thus we may still lose or get delayed
> decrementer or doorbell interrupts.
>
> The current scheme also makes it much harder to handle the external
> "edge" interrupts provided by some BookE processors when using the
> EPR facility (External Proxy) and the Freescale Hypervisor.
>
> We also hard mask on decrementer interrupts which is sub-optimal.
>
> This is an attempt at fixing it all in one go by reworking the way
> we do the lazy interrupt disabling.
>
> The base idea is to replace the "hard_enabled" field with a
> "irq_happened" field in which we store a bit mask of what interrupt
> occurred while soft-disabled.
>
> When re-enabling, either via arch_local_irq_restore() or when returning
> from an interrupt, we can now decide what to do by testing bits in that
> field. We then implement re-emitting of the lost interrupts via either
> a re-use of the existing exception frame (exception exit case) or via
> the creation of a new one from assembly code (arch_local_irq_enable),
> without the need to trigger a fake one using set_dec() or similar.
>
> In addition, this adds a few refinements:
>
>   - We no longer  hard disable decrementer interrupts that occur
> while soft-disabled. We now simply bump the decrementer back to max
> (on BookS) or leave it stopped (on BookE) and continue with hard interrupts
> enabled, which means that we'll potentially get better sample quality from
> performance monitor interrupts.
>
>   - Timer, decrementer and doorbell interrupts now hard-enable
> shortly after removing the source of the interrupt, which means
> they no longer run entirely hard disabled. Again, this will improve
> perf sample quality.
>
>   - On Book3E 64-bit, we now make the performance monitor interrupt
> act as an NMI like Book3S (the necessary C code for that to work
> appear to already be present in the FSL perf code, notably calling
> nmi_enter instead of irq_enter).
>
> There are additional refinements that we can do on top of this patch:
>
>   - We could remove the ps3 workaround from arch_local_irq_enable(),
> I believe that it should no longer be necessary
>
>   - We could make "masked" decrementer interrupts act as NMIs when doing
> timer-based perf sampling to improve the sample quality.
>
>   - There are additional simplifications of the exception entry/exit path
> that I've spotted along the way, such as merging fast_exception_return
> with the normal code path.
>
> This patch needs a LOT more testing&  review than it had so far !!!
>
> Not-signed-off-by-yet: Benjamin Herrenschmidt<benh@kernel.crashing.org>
> ---
>
> v2:
>
> - Add hard-enable to decrementer, timer and doorbells
> - Fix CR clobber in masked irq handling on BookE
> - Make embedded perf interrupt act as an NMI
> - Add a PACA_HAPPENED_EE_EDGE for use by FSL if they want
>    to retrigger an interrupt without preventing hard-enable
>
> Signed-off-by: Benjamin Herrenschmidt<benh@kernel.crashing.org>
> ---
>   arch/powerpc/include/asm/exception-64s.h        |   21 ++-
>   arch/powerpc/include/asm/hw_irq.h               |   51 +++++-
>   arch/powerpc/include/asm/irqflags.h             |   13 +-
>   arch/powerpc/include/asm/paca.h                 |    2 +-
>   arch/powerpc/kernel/asm-offsets.c               |    2 +-
>   arch/powerpc/kernel/dbell.c                     |   12 ++
>   arch/powerpc/kernel/entry_64.S                  |   96 ++++++-----
>   arch/powerpc/kernel/exceptions-64e.S            |  210 ++++++++++++++++-------
>   arch/powerpc/kernel/exceptions-64s.S            |   90 ++++++----
>   arch/powerpc/kernel/head_64.S                   |    9 -
>   arch/powerpc/kernel/idle_book3e.S               |    8 +-
>   arch/powerpc/kernel/idle_power4.S               |   17 ++-
>   arch/powerpc/kernel/idle_power7.S               |   20 ++-
>   arch/powerpc/kernel/irq.c                       |  187 ++++++++++++++------
>   arch/powerpc/kernel/time.c                      |   15 ++-
>   arch/powerpc/platforms/iseries/Makefile         |    2 +-
>   arch/powerpc/platforms/iseries/exception.S      |   11 +-
>   arch/powerpc/platforms/iseries/misc.S           |   26 ---
>   arch/powerpc/platforms/pseries/processor_idle.c |   24 +++-
>   19 files changed, 540 insertions(+), 276 deletions(-)
>   delete mode 100644 arch/powerpc/platforms/iseries/misc.S
>

[snip]
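
For readers skimming the changelog above, the base idea (record what fired
while soft-disabled, replay it on re-enable) can be sketched in plain C
roughly like this. This is only a model under my own assumptions: the
HAPPENED_* values, cpu_sketch and both functions are hypothetical names, and
the real patch does the replay from assembly with proper exception frames.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values; the patch's PACA_HAPPENED_* are not shown here. */
enum {
	HAPPENED_EE    = 1 << 0,
	HAPPENED_DEC   = 1 << 1,
	HAPPENED_DBELL = 1 << 2,
};

struct cpu_sketch {
	bool soft_enabled;     /* models the soft-enable flag */
	uint8_t irq_happened;  /* bit mask of interrupts seen while disabled */
	unsigned int handled;  /* number of interrupts actually delivered */
};

/* An interrupt arrives: deliver it, or just record it if soft-disabled. */
static void take_interrupt(struct cpu_sketch *c, uint8_t kind)
{
	if (!c->soft_enabled) {
		c->irq_happened |= kind;
		return;
	}
	c->handled++;
}

/* Restore the soft-enable state; on enable, replay each recorded kind once. */
static void irq_restore_sketch(struct cpu_sketch *c, bool enable)
{
	c->soft_enabled = enable;
	if (!enable)
		return;
	while (c->irq_happened) {
		/* isolate the lowest set bit, clear it, deliver that kind */
		uint8_t bit = (uint8_t)(c->irq_happened & -c->irq_happened);

		c->irq_happened &= (uint8_t)~bit;
		c->handled++;
	}
}
```

Note that two decrementer events recorded while disabled collapse into a
single replay, which is what a bit mask (rather than a counter) implies.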

> diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S
> index 429983c..7fa4096 100644
> --- a/arch/powerpc/kernel/exceptions-64e.S
> +++ b/arch/powerpc/kernel/exceptions-64e.S
> @@ -21,6 +21,7 @@
>   #include<asm/exception-64e.h>
>   #include<asm/bug.h>
>   #include<asm/irqflags.h>
> +#include<asm/hw_irq.h>
>   #include<asm/ptrace.h>
>   #include<asm/ppc-opcode.h>
>   #include<asm/mmu.h>
> @@ -77,59 +78,55 @@
>   #define SPRN_MC_SRR1	SPRN_MCSRR1
>
>   #define NORMAL_EXCEPTION_PROLOG(n, addition)				    \
> -	EXCEPTION_PROLOG(n, GEN, addition##_GEN)
> +	EXCEPTION_PROLOG(n, GEN, addition##_GEN(n))
>
>   #define CRIT_EXCEPTION_PROLOG(n, addition)				    \
> -	EXCEPTION_PROLOG(n, CRIT, addition##_CRIT)
> +	EXCEPTION_PROLOG(n, CRIT, addition##_CRIT(n))
>
>   #define DBG_EXCEPTION_PROLOG(n, addition)				    \
> -	EXCEPTION_PROLOG(n, DBG, addition##_DBG)
> +	EXCEPTION_PROLOG(n, DBG, addition##_DBG(n))
>
>   #define MC_EXCEPTION_PROLOG(n, addition)				    \
> -	EXCEPTION_PROLOG(n, MC, addition##_MC)
> +	EXCEPTION_PROLOG(n, MC, addition##_MC(n))
>
>
>   /* Variants of the "addition" argument for the prolog
>    */
> -#define PROLOG_ADDITION_NONE_GEN
> -#define PROLOG_ADDITION_NONE_CRIT
> -#define PROLOG_ADDITION_NONE_DBG
> -#define PROLOG_ADDITION_NONE_MC
> +#define PROLOG_ADDITION_NONE_GEN(n)
> +#define PROLOG_ADDITION_NONE_CRIT(n)
> +#define PROLOG_ADDITION_NONE_DBG(n)
> +#define PROLOG_ADDITION_NONE_MC(n)
>
> -#define PROLOG_ADDITION_MASKABLE_GEN					    \
> +#define PROLOG_ADDITION_MASKABLE_GEN(n)					    \
>   	lbz	r11,PACASOFTIRQEN(r13); /* are irqs soft-disabled ? */	    \
>   	cmpwi	cr0,r11,0;		/* yes ->  go out of line */	    \
> -	beq	masked_interrupt_book3e;
> +	beq	masked_interrupt_book3e_##n;
>
> -#define PROLOG_ADDITION_2REGS_GEN					    \
> +#define PROLOG_ADDITION_2REGS_GEN(n)					    \
>   	std	r14,PACA_EXGEN+EX_R14(r13);				    \
>   	std	r15,PACA_EXGEN+EX_R15(r13)
>
> -#define PROLOG_ADDITION_1REG_GEN					    \
> +#define PROLOG_ADDITION_1REG_GEN(n)					    \
>   	std	r14,PACA_EXGEN+EX_R14(r13);
>
> -#define PROLOG_ADDITION_2REGS_CRIT					    \
> +#define PROLOG_ADDITION_2REGS_CRIT(n)					    \
>   	std	r14,PACA_EXCRIT+EX_R14(r13);				    \
>   	std	r15,PACA_EXCRIT+EX_R15(r13)
>
> -#define PROLOG_ADDITION_2REGS_DBG					    \
> +#define PROLOG_ADDITION_2REGS_DBG(n)					    \
>   	std	r14,PACA_EXDBG+EX_R14(r13);				    \
>   	std	r15,PACA_EXDBG+EX_R15(r13)
>
> -#define PROLOG_ADDITION_2REGS_MC					    \
> +#define PROLOG_ADDITION_2REGS_MC(n)					    \
>   	std	r14,PACA_EXMC+EX_R14(r13);				    \
>   	std	r15,PACA_EXMC+EX_R15(r13)
>
> -#define PROLOG_ADDITION_DOORBELL_GEN					    \
> -	lbz	r11,PACASOFTIRQEN(r13); /* are irqs soft-disabled ? */	    \
> -	cmpwi	cr0,r11,0;		/* yes ->  go out of line */	    \
> -	beq	masked_doorbell_book3e
> -
>
>   /* Core exception code for all exceptions except TLB misses.
>    * XXX: Needs to make SPRN_SPRG_GEN depend on exception type
>    */
>   #define EXCEPTION_COMMON(n, excf, ints)					    \
> +exc_##n##_common:							    \
>   	std	r0,GPR0(r1);		/* save r0 in stackframe */	    \
>   	std	r2,GPR2(r1);		/* save r2 in stackframe */	    \
>   	SAVE_4GPRS(3, r1);		/* save r3 - r6 in stackframe */    \
> @@ -167,19 +164,25 @@
>   	std	r0,RESULT(r1);		/* clear regs->result */	    \
>   	ints;
>
> -/* Variants for the "ints" argument */
> -#define INTS_KEEP
> -#define INTS_DISABLE_SOFT						    \
> +/* Variants for the "ints" argument. We know r0 is 0 on entry */
> +#define INTS_KEEP							    \
>   	stb	r0,PACASOFTIRQEN(r13);	/* mark interrupts soft-disabled */ \
>   	TRACE_DISABLE_INTS;
> -#define INTS_DISABLE_HARD						    \
> -	stb	r0,PACAHARDIRQEN(r13); /* and hard disabled */
> -#define INTS_DISABLE_ALL						    \
> -	INTS_DISABLE_SOFT						    \
> -	INTS_DISABLE_HARD
> -
> -/* This is called by exceptions that used INTS_KEEP (that is did not clear
> - * neither soft nor hard IRQ indicators in the PACA. This will restore MSR:EE
> +
> +/* This second version is meant for exceptions that don't immediately
> + * hard-enable. We set a bit in paca->irq_happened to ensure that
> + * a subsequent call to arch_local_irq_restore() will properly
> + * hard-enable and avoid the fast-path
> + */
> +#define INTS_DISABLE							    \
> +	stb	r0,PACASOFTIRQEN(r13);	/* mark interrupts soft-disabled */ \
> +	lbz	r0,PACAIRQHAPPENED(r13);				    \
> +	ori	r0,r0,PACA_HAPPENED;					    \
> +	stb	r0,PACAIRQHAPPENED(r13);				    \
> +	TRACE_DISABLE_INTS;
> +
> +/* This is called by exceptions that used INTS_KEEP (that is did
> + * not set hard IRQ indicators in the PACA). This will restore MSR:EE
>    * to it's previous value
>    *
>    * XXX In the long run, we may want to open-code it in order to separate the
> @@ -238,7 +241,7 @@ exc_##n##_bad_stack:							    \
>   #define MASKABLE_EXCEPTION(trapnum, label, hdlr, ack)			\
>   	START_EXCEPTION(label);						\
>   	NORMAL_EXCEPTION_PROLOG(trapnum, PROLOG_ADDITION_MASKABLE)	\
> -	EXCEPTION_COMMON(trapnum, PACA_EXGEN, INTS_DISABLE_ALL)		\
> +	EXCEPTION_COMMON(trapnum, PACA_EXGEN, INTS_DISABLE)		\
>   	ack(r8);							\
>   	CHECK_NAPPING();						\
>   	addi	r3,r1,STACK_FRAME_OVERHEAD;				\
> @@ -289,7 +292,7 @@ interrupt_end_book3e:
>   /* Critical Input Interrupt */
>   	START_EXCEPTION(critical_input);
>   	CRIT_EXCEPTION_PROLOG(0x100, PROLOG_ADDITION_NONE)
> -//	EXCEPTION_COMMON(0x100, PACA_EXCRIT, INTS_DISABLE_ALL)
> +//	EXCEPTION_COMMON(0x100, PACA_EXCRIT, INTS_DISABLE)
>   //	bl	special_reg_save_crit
>   //	CHECK_NAPPING();
>   //	addi	r3,r1,STACK_FRAME_OVERHEAD
> @@ -300,7 +303,7 @@ interrupt_end_book3e:
>   /* Machine Check Interrupt */
>   	START_EXCEPTION(machine_check);
>   	CRIT_EXCEPTION_PROLOG(0x200, PROLOG_ADDITION_NONE)
> -//	EXCEPTION_COMMON(0x200, PACA_EXMC, INTS_DISABLE_ALL)
> +//	EXCEPTION_COMMON(0x200, PACA_EXMC, INTS_DISABLE)
>   //	bl	special_reg_save_mc
>   //	addi	r3,r1,STACK_FRAME_OVERHEAD
>   //	CHECK_NAPPING();
> @@ -339,7 +342,7 @@ interrupt_end_book3e:
>   	START_EXCEPTION(program);
>   	NORMAL_EXCEPTION_PROLOG(0x700, PROLOG_ADDITION_1REG)
>   	mfspr	r14,SPRN_ESR
> -	EXCEPTION_COMMON(0x700, PACA_EXGEN, INTS_DISABLE_SOFT)
> +	EXCEPTION_COMMON(0x700, PACA_EXGEN, INTS_KEEP)
>   	std	r14,_DSISR(r1)
>   	addi	r3,r1,STACK_FRAME_OVERHEAD
>   	ld	r14,PACA_EXGEN+EX_R14(r13)
> @@ -372,7 +375,7 @@ interrupt_end_book3e:
>   /* Watchdog Timer Interrupt */
>   	START_EXCEPTION(watchdog);
>   	CRIT_EXCEPTION_PROLOG(0x9f0, PROLOG_ADDITION_NONE)
> -//	EXCEPTION_COMMON(0x9f0, PACA_EXCRIT, INTS_DISABLE_ALL)
> +//	EXCEPTION_COMMON(0x9f0, PACA_EXCRIT, INTS_DISABLE)
>   //	bl	special_reg_save_crit
>   //	CHECK_NAPPING();
>   //	addi	r3,r1,STACK_FRAME_OVERHEAD
> @@ -450,7 +453,7 @@ interrupt_end_book3e:
>   	mfspr	r15,SPRN_SPRG_CRIT_SCRATCH
>   	mtspr	SPRN_SPRG_GEN_SCRATCH,r15
>   	mfspr	r14,SPRN_DBSR
> -	EXCEPTION_COMMON(0xd00, PACA_EXCRIT, INTS_DISABLE_ALL)
> +	EXCEPTION_COMMON(0xd00, PACA_EXCRIT, INTS_DISABLE)
>   	std	r14,_DSISR(r1)
>   	addi	r3,r1,STACK_FRAME_OVERHEAD
>   	mr	r4,r14
> @@ -515,7 +518,7 @@ kernel_dbg_exc:
>   	mfspr	r15,SPRN_SPRG_DBG_SCRATCH
>   	mtspr	SPRN_SPRG_GEN_SCRATCH,r15
>   	mfspr	r14,SPRN_DBSR
> -	EXCEPTION_COMMON(0xd00, PACA_EXDBG, INTS_DISABLE_ALL)
> +	EXCEPTION_COMMON(0xd08, PACA_EXDBG, INTS_DISABLE)
>   	std	r14,_DSISR(r1)
>   	addi	r3,r1,STACK_FRAME_OVERHEAD
>   	mr	r4,r14
> @@ -525,21 +528,22 @@ kernel_dbg_exc:
>   	bl	.DebugException
>   	b	.ret_from_except
>
> -	MASKABLE_EXCEPTION(0x260, perfmon, .performance_monitor_exception, ACK_NONE)
> +	START_EXCEPTION(perfmon);
> +	NORMAL_EXCEPTION_PROLOG(0x260, PROLOG_ADDITION_NONE)
> +	EXCEPTION_COMMON(0x260, PACA_EXGEN, INTS_DISABLE)
> +	addi	r3,r1,STACK_FRAME_OVERHEAD
> +	ld	r14,PACA_EXGEN+EX_R14(r13)
> +	bl	.save_nvgprs
> +	bl	.performance_monitor_exception
> +	b	.ret_from_except
>
>   /* Doorbell interrupt */
> -	START_EXCEPTION(doorbell)
> -	NORMAL_EXCEPTION_PROLOG(0x2070, PROLOG_ADDITION_DOORBELL)
> -	EXCEPTION_COMMON(0x2070, PACA_EXGEN, INTS_DISABLE_ALL)
> -	CHECK_NAPPING()
> -	addi	r3,r1,STACK_FRAME_OVERHEAD
> -	bl	.doorbell_exception
> -	b	.ret_from_except_lite
> +	MASKABLE_EXCEPTION(0x280, doorbell, .doorbell_exception, ACK_NONE)
>
>   /* Doorbell critical Interrupt */
>   	START_EXCEPTION(doorbell_crit);
> -	CRIT_EXCEPTION_PROLOG(0x2080, PROLOG_ADDITION_NONE)
> -//	EXCEPTION_COMMON(0x2080, PACA_EXCRIT, INTS_DISABLE_ALL)
> +	CRIT_EXCEPTION_PROLOG(0x2a0, PROLOG_ADDITION_NONE)
> +//	EXCEPTION_COMMON(0x280, PACA_EXCRIT, INTS_DISABLE)
>   //	bl	special_reg_save_crit
>   //	CHECK_NAPPING();
>   //	addi	r3,r1,STACK_FRAME_OVERHEAD
> @@ -547,38 +551,116 @@ kernel_dbg_exc:
>   //	b	ret_from_crit_except
>   	b	.
>
> +/* Guest Doorbell */
>   	MASKABLE_EXCEPTION(0x2c0, guest_doorbell, .unknown_exception, ACK_NONE)
> -	MASKABLE_EXCEPTION(0x2e0, guest_doorbell_crit, .unknown_exception, ACK_NONE)
> -	MASKABLE_EXCEPTION(0x310, hypercall, .unknown_exception, ACK_NONE)
> -	MASKABLE_EXCEPTION(0x320, ehpriv, .unknown_exception, ACK_NONE)
> +
> +/* Guest Doorbell critical Interrupt */
> +	START_EXCEPTION(guest_doorbell_crit);
> +	CRIT_EXCEPTION_PROLOG(0x2e0, PROLOG_ADDITION_NONE)
> +//	EXCEPTION_COMMON(0x2e0, PACA_EXCRIT, INTS_DISABLE)
> +//	bl	special_reg_save_crit
> +//	CHECK_NAPPING();
> +//	addi	r3,r1,STACK_FRAME_OVERHEAD
> +//	bl	.guest_doorbell_critical_exception
> +//	b	ret_from_crit_except
> +	b	.
> +
> +/* Hypervisor call */
> +	START_EXCEPTION(hypercall);
> +	NORMAL_EXCEPTION_PROLOG(0x310, PROLOG_ADDITION_NONE)
> +	EXCEPTION_COMMON(0x310, PACA_EXGEN, INTS_KEEP)
> +	addi	r3,r1,STACK_FRAME_OVERHEAD
> +	bl	.save_nvgprs
> +	INTS_RESTORE_HARD
> +	bl	.unknown_exception
> +	b	.ret_from_except
> +
> +/* Embedded Hypervisor priviledged  */
> +	START_EXCEPTION(ehpriv);
> +	NORMAL_EXCEPTION_PROLOG(0x320, PROLOG_ADDITION_NONE)
> +	EXCEPTION_COMMON(0x320, PACA_EXGEN, INTS_KEEP)
> +	addi	r3,r1,STACK_FRAME_OVERHEAD
> +	bl	.save_nvgprs
> +	INTS_RESTORE_HARD
> +	bl	.unknown_exception
> +	b	.ret_from_except
>
>
>   /*
> - * An interrupt came in while soft-disabled; clear EE in SRR1,
> - * clear paca->hard_enabled and return.
> + * An interrupt came in while soft-disabled; We mark paca->irq_happened
> + * accordingly and if the interrupt is level sensitive, we hard disable
>    */
> -masked_doorbell_book3e:
> -	mtcr	r10
> -	/* Resend the doorbell to fire again when ints enabled */
> -	mfspr	r10,SPRN_PIR
> -	PPC_MSGSND(r10)
> -	b	masked_interrupt_book3e_common
>
> -masked_interrupt_book3e:
> +masked_interrupt_book3e_0x500:
> +	li	r11,PACA_HAPPENED_EE
> +	b	masked_interrupt_book3e_full_mask
> +
> +masked_interrupt_book3e_0x900:
> +	ACK_DEC(r11);
> +	li	r11,PACA_HAPPENED_DEC
> +	b	masked_interrupt_book3e_no_mask
> +masked_interrupt_book3e_0x980:
> +	ACK_FIT(r11);
> +	li	r11,PACA_HAPPENED_DEC
> +	b	masked_interrupt_book3e_no_mask
> +masked_interrupt_book3e_0x280:
> +masked_interrupt_book3e_0x2c0:
> +	li	r11,PACA_HAPPENED_DBELL
> +	b	masked_interrupt_book3e_no_mask
> +
> +masked_interrupt_book3e_no_mask:
> +	mtcr	r10
> +	lbz	r10,PACAIRQHAPPENED(r13)
> +	ori	r10,r10,r11

Shouldn't this be an 'or'?

> +	stb	r10,PACAIRQHAPPENED(r13)
> +	b	1f
> +masked_interrupt_book3e_full_mask:
>   	mtcr	r10
> -masked_interrupt_book3e_common:
> -	stb	r11,PACAHARDIRQEN(r13)
> +	lbz	r10,PACAIRQHAPPENED(r13)
> +	ori	r10,r10,r11

Same comment.

> +	stb	r10,PACAIRQHAPPENED(r13)
>   	mfspr	r10,SPRN_SRR1
>   	rldicl	r11,r10,48,1		/* clear MSR_EE */
>   	rotldi	r10,r11,16
>   	mtspr	SPRN_SRR1,r10
> -	ld	r10,PACA_EXGEN+EX_R10(r13);	/* restore registers */
> +1:	ld	r10,PACA_EXGEN+EX_R10(r13);
>   	ld	r11,PACA_EXGEN+EX_R11(r13);
>   	mfspr	r13,SPRN_SPRG_GEN_SCRATCH;
>   	rfi
>   	b	.
>
>   /*
> + * Called from arch_local_irq_enable when an interrupt needs
> + * to be resent. r3 contains either 0x500,0x900,0x260 or 0x280
> + * to indicate the kind of interrupt. MSR:EE is already off.
> + * We generate a stackframe like if a real interrupt had happened.
> + *
> + * Note: While MSR:EE is off, we need to make sure that _MSR
> + * in the generated frame has EE set to 1 or the exception
> + * handler will not properly re-enable them.
> + */
> +_GLOBAL(__reemit_interrupt)
> +	/* We are going to jump to the exception common code which
> +	 * will retrieve various register values from the PACA which
> +	 * we don't give a damn about.
> +	 */
> +	mflr	r10
> +	mfmsr	r11
> +	mfcr	r4;
> +	mtspr	SPRN_SPRG_GEN_SCRATCH,r13;
> +	std	r1,PACA_EXGEN+EX_R1(r13);
> +	stw	r4,PACA_EXGEN+EX_CR(r13);
> +	ori	r11,r11,MSR_EE
> +	subi	r1,r1,INT_FRAME_SIZE;
> +	cmpwi	cr0,r3,0x500
> +	beq	exc_0x500_common
> +	cmpwi	cr0,r3,0x900
> +	beq+	exc_0x900_common
> +	cmpwi	cr0,r3,0x280
> +	beq+	exc_0x280_common
> +	blr
> +
> +/*
>    * This is called from 0x300 and 0x400 handlers after the prologs with
>    * r14 and r15 containing the fault address and error code, with the
>    * original values stashed away in the PACA
> @@ -680,6 +762,8 @@ BAD_STACK_TRAMPOLINE(0x000)
>   BAD_STACK_TRAMPOLINE(0x100)
>   BAD_STACK_TRAMPOLINE(0x200)
>   BAD_STACK_TRAMPOLINE(0x260)
> +BAD_STACK_TRAMPOLINE(0x280)
> +BAD_STACK_TRAMPOLINE(0x2a0)
>   BAD_STACK_TRAMPOLINE(0x2c0)
>   BAD_STACK_TRAMPOLINE(0x2e0)
>   BAD_STACK_TRAMPOLINE(0x300)

---
Best Regards, Laurentiu


* Re: [PATCH 09/24] PCI, powerpc: Register busn_res for root buses
From: Bjorn Helgaas @ 2012-02-09 19:24 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: linux-arch, Tony Luck, Yinghai Lu, linuxppc-dev, linux-kernel,
	Dominik Brodowski, Paul Mackerras, Jesse Barnes, linux-pci,
	Andrew Morton, Linus Torvalds
In-Reply-To: <1328738567.2903.45.camel@pasglop>

On Wed, Feb 8, 2012 at 2:02 PM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
> On Wed, 2012-02-08 at 07:58 -0800, Bjorn Helgaas wrote:
>> The only architecture-specific thing here is discovering the range of
>> bus numbers below a host bridge.  The architecture should not have to
>> mess around with pci_bus_update_busn_res_end() like this.  It should
>> be able to say "here's my bus number range" (and of course the PCI
>> core can default to 0-255 if the arch doesn't supply a range) and the
>> core should take care of the rest.
>
> So it's a bit messy in here because we deal with several things.
>
> What the firmware gives us is the range it assigned, but that isn't
> necessarily the HW limits (almost never is in fact).
>
> In some cases we honor it, for example when in "probe only" mode where
> we prevent any reassigning, and in some case, we ignore it and let the
> PCI core renumber things (typically because the FW "forgot" to set aside
> bus numbers for a cardbus slot for example, that sort of things).
>
> So it's a bit of a tricky situation.
>
> Off the top of my head, I'm pretty sure that most if not all of our PCI
> host bridges simply support a full 0...255 range and there is no sharing
> between bridges like on x86, they are just different domains.

My point is that the interface between the arch and the PCI core
should be simply the arch telling the core "this is the range of bus
numbers you can use."  If the firmware doesn't give you the HW limits,
that's the arch's problem.  If you want to assume 0..255 are
available, again, that's the arch's decision.

But the answer to the question "what bus numbers are available to me"
depends only on the host bridge HW configuration.  It does not depend
on what pci_scan_child_bus() found.  Therefore, I think we can come up
with a design where pci_bus_update_busn_res_end() is unnecessary.
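
Something along these lines is what I have in mind, as a very rough sketch.
busn_range_sketch and root_busn_range() are invented for illustration and are
not existing PCI core interfaces: the arch optionally hands in a range, and
the core falls back to 0..255 when it does not.

```c
#include <stddef.h>

/* Hypothetical description of the bus numbers below one host bridge. */
struct busn_range_sketch {
	unsigned int start;
	unsigned int end;
};

/*
 * Core-side helper: use the range the arch supplied, or default to the
 * full 0..255 range when the arch has nothing to say (from_arch == NULL).
 */
static struct busn_range_sketch
root_busn_range(const struct busn_range_sketch *from_arch)
{
	struct busn_range_sketch range = { 0, 255 };

	if (from_arch)
		range = *from_arch;
	return range;
}
```

Nothing in this depends on what the child bus scan found, which is the
property I am after.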

Bjorn


* Re: [PATCH v2] powerpc: Rework lazy-interrupt handling
From: Benjamin Herrenschmidt @ 2012-02-09 20:47 UTC (permalink / raw)
  To: Tudor Laurentiu
  Cc: Scott Wood, Stuart Yoder, Anton Blanchard, linuxppc-dev,
	Paul Mackerras
In-Reply-To: <4F33FC65.40702@freescale.com>

On Thu, 2012-02-09 at 19:03 +0200, Tudor Laurentiu wrote:

> > +masked_interrupt_book3e_0x900:
> > +	ACK_DEC(r11);
> > +	li	r11,PACA_HAPPENED_DEC
> > +	b	masked_interrupt_book3e_no_mask
> > +masked_interrupt_book3e_0x980:
> > +	ACK_FIT(r11);
> > +	li	r11,PACA_HAPPENED_DEC
> > +	b	masked_interrupt_book3e_no_mask
> > +masked_interrupt_book3e_0x280:
> > +masked_interrupt_book3e_0x2c0:
> > +	li	r11,PACA_HAPPENED_DBELL
> > +	b	masked_interrupt_book3e_no_mask
> > +
> > +masked_interrupt_book3e_no_mask:
> > +	mtcr	r10
> > +	lbz	r10,PACAIRQHAPPENED(r13)
> > +	ori	r10,r10,r11
> 
> Shouldn't this be an 'or'?

Yes, absolutely. This is a typo/thinko I do all the time ... oops.
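
For the record, a small C model of the difference. 'or' combines two
registers, 'ori' combines a register with a 16-bit immediate. The sketch
assumes the symbol r11 reaches the assembler as the bare number 11 (I have
not double-checked that for this tree), and HAPPENED_DBELL_SKETCH is a
made-up value standing in for whatever r11 holds at run time.

```c
#include <stdint.h>

/* or rA,rS,rB: OR the contents of two registers. */
static uint64_t ppc_or(uint64_t rs, uint64_t rb)
{
	return rs | rb;
}

/* ori rA,rS,UI: OR a register with a 16-bit unsigned immediate. */
static uint64_t ppc_ori(uint64_t rs, uint16_t ui)
{
	return rs | ui;
}

/* Hypothetical flag value standing in for the run-time content of r11. */
#define HAPPENED_DBELL_SKETCH 0x08u

/* Intended update: merge the flag held in r11 into irq_happened. */
static uint64_t update_with_or(uint64_t irq_happened)
{
	return ppc_or(irq_happened, HAPPENED_DBELL_SKETCH);
}

/* As posted, under the assumption that "r11" is read as the constant 11. */
static uint64_t update_with_ori(uint64_t irq_happened)
{
	return ppc_ori(irq_happened, 11);
}
```

Under that assumption the posted code would OR in the constant 11 regardless
of which interrupt happened, so the recorded bits would be wrong.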

> > +	stb	r10,PACAIRQHAPPENED(r13)
> > +	b	1f
> > +masked_interrupt_book3e_full_mask:
> >   	mtcr	r10
> > -masked_interrupt_book3e_common:
> > -	stb	r11,PACAHARDIRQEN(r13)
> > +	lbz	r10,PACAIRQHAPPENED(r13)
> > +	ori	r10,r10,r11
> 
> Same comment.

I'll respin and fix.

Cheers,
Ben

> > +	stb	r10,PACAIRQHAPPENED(r13)
> >   	mfspr	r10,SPRN_SRR1
> >   	rldicl	r11,r10,48,1		/* clear MSR_EE */
> >   	rotldi	r10,r11,16
> >   	mtspr	SPRN_SRR1,r10
> > -	ld	r10,PACA_EXGEN+EX_R10(r13);	/* restore registers */
> > +1:	ld	r10,PACA_EXGEN+EX_R10(r13);
> >   	ld	r11,PACA_EXGEN+EX_R11(r13);
> >   	mfspr	r13,SPRN_SPRG_GEN_SCRATCH;
> >   	rfi
> >   	b	.
> >
> >   /*
> > + * Called from arch_local_irq_enable when an interrupt needs
> > + * to be resent. r3 contains either 0x500,0x900,0x260 or 0x280
> > + * to indicate the kind of interrupt. MSR:EE is already off.
> > + * We generate a stackframe like if a real interrupt had happened.
> > + *
> > + * Note: While MSR:EE is off, we need to make sure that _MSR
> > + * in the generated frame has EE set to 1 or the exception
> > + * handler will not properly re-enable them.
> > + */
> > +_GLOBAL(__reemit_interrupt)
> > +	/* We are going to jump to the exception common code which
> > +	 * will retrieve various register values from the PACA which
> > +	 * we don't give a damn about.
> > +	 */
> > +	mflr	r10
> > +	mfmsr	r11
> > +	mfcr	r4;
> > +	mtspr	SPRN_SPRG_GEN_SCRATCH,r13;
> > +	std	r1,PACA_EXGEN+EX_R1(r13);
> > +	stw	r4,PACA_EXGEN+EX_CR(r13);
> > +	ori	r11,r11,MSR_EE
> > +	subi	r1,r1,INT_FRAME_SIZE;
> > +	cmpwi	cr0,r3,0x500
> > +	beq	exc_0x500_common
> > +	cmpwi	cr0,r3,0x900
> > +	beq+	exc_0x900_common
> > +	cmpwi	cr0,r3,0x280
> > +	beq+	exc_0x280_common
> > +	blr
> > +
> > +/*
> >    * This is called from 0x300 and 0x400 handlers after the prologs with
> >    * r14 and r15 containing the fault address and error code, with the
> >    * original values stashed away in the PACA
> > @@ -680,6 +762,8 @@ BAD_STACK_TRAMPOLINE(0x000)
> >   BAD_STACK_TRAMPOLINE(0x100)
> >   BAD_STACK_TRAMPOLINE(0x200)
> >   BAD_STACK_TRAMPOLINE(0x260)
> > +BAD_STACK_TRAMPOLINE(0x280)
> > +BAD_STACK_TRAMPOLINE(0x2a0)
> >   BAD_STACK_TRAMPOLINE(0x2c0)
> >   BAD_STACK_TRAMPOLINE(0x2e0)
> >   BAD_STACK_TRAMPOLINE(0x300)
> 
> ---
> Best Regards, Laurentiu


* Re: [PATCH 09/24] PCI, powerpc: Register busn_res for root buses
From: Benjamin Herrenschmidt @ 2012-02-09 21:35 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-arch, Tony Luck, Yinghai Lu, linuxppc-dev, linux-kernel,
	Dominik Brodowski, Paul Mackerras, Jesse Barnes, linux-pci,
	Andrew Morton, Linus Torvalds
In-Reply-To: <CAErSpo7ekGn-+Mnyz7KL=fVFH3Ed19AeE3MKkZWJ4bd0Jo7kCw@mail.gmail.com>

On Thu, 2012-02-09 at 11:24 -0800, Bjorn Helgaas wrote:
> My point is that the interface between the arch and the PCI core
> should be simply the arch telling the core "this is the range of bus
> numbers you can use."  If the firmware doesn't give you the HW limits,
> that's the arch's problem.  If you want to assume 0..255 are
> available, again, that's the arch's decision.
> 
> But the answer to the question "what bus numbers are available to me"
> depends only on the host bridge HW configuration.  It does not depend
> on what pci_scan_child_bus() found.  Therefore, I think we can come up
> with a design where pci_bus_update_busn_res_end() is unnecessary.

In an ideal world yes. In a world where there are reverse engineered
platforms on which we aren't 100% sure how things actually work under the
hood and have the code just adapt on "what's there" (and try to fix it
up -sometimes-), things can get a bit murky :-)

But yes, I see your point. As for what is the "correct" setting that
needs to be done so that the patch doesn't end up a regression for us,
I'll have to dig into some ancient HW to dbl check a few things. I hope
0...255 will just work but I can't guarantee it.

What I'll probably do is constrain the core to the values in
hose->min/max, and update selected platforms to put 0..255 in there when
I know for sure they can cope.

Cheers,
Ben.


* Re: [PATCH] powerpc: Fix WARN_ON in decrementer_check_overflow
From: Hugh Dickins @ 2012-02-09 22:25 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: linuxppc-dev
In-Reply-To: <1328765653.2903.59.camel@pasglop>

On Thu, 9 Feb 2012, Benjamin Herrenschmidt wrote:
> We use __get_cpu_var() which triggers a false positive warning
> in smp_processor_id() thinking interrupts are enabled (at this
> point, they are soft-enabled but hard-disabled).
> 
> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> ---
> 
> I was initially planning to fix that with a more in-depth rework
> of how we do lazy irq disabling on powerpc, but that patch is
> becoming too complex for this release so I'll apply this as a
> stop-gap and leave the full rework for -next

Okay, thanks for the update, that's equivalent to the get_cpu_var
plus put_cpu_var patch I had generally been running with successfully.
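
To spell out the equivalence, a userspace sketch. In the kernel,
get_cpu_var() disables preemption before handing back the per-CPU variable
and put_cpu_var() re-enables it. Everything below (the *_sketch names, the
counter, the single fake per-CPU slot) is a mock written for illustration,
not kernel code.

```c
/* Mock preemption counter and one fake per-CPU variable. */
static int preempt_count_sketch;
static unsigned long long next_tb_sketch = 100;
static int dec_fired_sketch;

static void preempt_disable_sketch(void) { preempt_count_sketch++; }
static void preempt_enable_sketch(void)  { preempt_count_sketch--; }

/* Mock get/put pair: the accessor itself brackets the preempt-off region. */
static unsigned long long *get_next_tb_sketch(void)
{
	preempt_disable_sketch();
	return &next_tb_sketch;
}

static void put_next_tb_sketch(void)
{
	preempt_enable_sketch();
}

/* Form used in the patch: explicit disable, raw access, explicit enable. */
static void check_overflow_explicit(unsigned long long now)
{
	unsigned long long *next_tb;

	preempt_disable_sketch();
	next_tb = &next_tb_sketch;
	if (now >= *next_tb)
		dec_fired_sketch++;   /* stands in for set_dec(1) */
	preempt_enable_sketch();
}

/* Form with the get/put pair: same region, same access. */
static void check_overflow_paired(unsigned long long now)
{
	unsigned long long *next_tb = get_next_tb_sketch();

	if (now >= *next_tb)
		dec_fired_sketch++;
	put_next_tb_sketch();
}
```

Both variants leave the preemption count balanced and touch the variable only
inside the preempt-off region, which is all the warning cares about.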

(I was not at all confident that it was a sufficient fix, and have
to say "generally" above because one time I got something that looked
like recursive calls overflowing the stack, and IIRC something "irq"
did appear in each frame of the trace.  But probably no connection,
never seen again, and I don't recall even which -rc or -next it was.)

Hugh

>  
> diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
> index 701d4ac..01e2877 100644
> --- a/arch/powerpc/kernel/irq.c
> +++ b/arch/powerpc/kernel/irq.c
> @@ -118,10 +118,14 @@ static inline notrace void set_soft_enabled(unsigned long enable)
>  static inline notrace void decrementer_check_overflow(void)
>  {
>  	u64 now = get_tb_or_rtc();
> -	u64 *next_tb = &__get_cpu_var(decrementers_next_tb);
> +	u64 *next_tb;
> +
> +	preempt_disable();
> +	next_tb = &__get_cpu_var(decrementers_next_tb);
>  
>  	if (now >= *next_tb)
>  		set_dec(1);
> +	preempt_enable();
>  }
>  
>  notrace void arch_local_irq_restore(unsigned long en)


* Re: [PATCH] powerpc: Fix WARN_ON in decrementer_check_overflow
From: Benjamin Herrenschmidt @ 2012-02-09 22:45 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: linuxppc-dev
In-Reply-To: <alpine.LSU.2.00.1202091411170.1263@eggly.anvils>

On Thu, 2012-02-09 at 14:25 -0800, Hugh Dickins wrote:
> 
> Okay, thanks for the update, that's equivalent to the get_cpu_var
> plus put_cpu_var patch I had generally been running with successfully.
> 
> (I was not at all confident that it was a sufficient fix, and have
> to say "generally" above because one time I got something that looked
> like recursive calls overflowing the stack, and IIRC something "irq"
> did appear in each frame of the trace.  But probably no connection,
> never seen again, and I don't recall even which -rc or -next it was.)

If you ever see that again, please shoot me the log !

Cheers,
Ben.


* RE: [PATCH SDK1.2 1/3] powerpc/fsl-pci: Unify pci/pcie initialization code
From: Jia Hongtao-B38951 @ 2012-02-10  2:27 UTC (permalink / raw)
  To: Kumar Gala
  Cc: linuxppc-dev@lists.ozlabs.org, Li Yang-R58472, Jia Hongtao-B38951
In-Reply-To: <412C8208B4A0464FA894C5F0C278CD5D010060C9@039-SN1MPN1-005.039d.mgd.msft.net>

Hi Kumar,
This series of patches has been pending for a long time.
I'd like to know whether they look good or not so I can do further work on them.
This is rather urgent for me.
Thanks a lot for your attention.

-----Original Message-----
From: Jia Hongtao-B38951
Sent: Tuesday, January 10, 2012 3:31 PM
To: Gala Kumar-B11780
Cc: Li Yang-R58472; Jia Hongtao-B38951; linuxppc-dev@lists.ozlabs.org
Subject: RE: [PATCH SDK1.2 1/3] powerpc/fsl-pci: Unify pci/pcie initialization code

Hi Kumar,
Do you have any comments on this series of patches?
Looking forward to your answer.
Thanks.

--Jia Hongtao.

-----Original Message-----
From: Jia Hongtao-B38951
Sent: Wednesday, December 21, 2011 3:11 PM
To: linuxppc-dev@lists.ozlabs.org
Cc: Li Yang-R58472; Gala Kumar-B11780; Jia Hongtao-B38951
Subject: [PATCH SDK1.2 1/3] powerpc/fsl-pci: Unify pci/pcie initialization code

Unify the Freescale PCI/PCIe initialization by converting fsl_pci into
a platform driver.

Previously, PCI/PCIe initialization lived in the platform code, which
initialized the PCI bridge based on the EP/RC or host/agent settings.

Signed-off-by: Jia Hongtao <B38951@freescale.com>
Signed-off-by: Li Yang <leoli@freescale.com>
---
 arch/powerpc/platforms/85xx/p1022_ds.c |   39 +++++++----------------
 arch/powerpc/sysdev/fsl_pci.c          |   53 ++++++++++++++++++++++++++++++++
 2 files changed, 65 insertions(+), 27 deletions(-)
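
For readers skimming the patch, here is a minimal standalone sketch of the
register-offset test used to pick the primary bridge, which this patch moves
from p1022_ds_setup_arch() into fsl_pci_probe(). It is an illustration only
and not part of the patch: is_primary_offset() and the two macro names are
hypothetical, and the mask 0xfffff and offset 0x8000 are taken from the
p1022_ds.c code being removed.

```c
#include <assert.h>	/* only needed for the checks mentioned in the note below */
#include <stdint.h>

/* Hypothetical names; the values come from the removed platform code. */
#define BRIDGE_OFFSET_MASK	0xfffffu	/* low 20 bits of the register base */
#define PRIMARY_OFFSET		0x8000u		/* hexadecimal, not decimal 8000 */

/*
 * Returns 1 when the low 20 bits of the controller's register base equal
 * 0x8000, i.e. the value that is passed as the second argument of
 * fsl_add_bridge() in the patch; returns 0 otherwise.
 */
static int is_primary_offset(uint64_t reg_start)
{
	return (reg_start & BRIDGE_OFFSET_MASK) == PRIMARY_OFFSET;
}
```

With this helper, a base such as 0xffe08000 is reported as primary and
0xffe09000 is not. The constant has to be written in hex: decimal 8000 is
0x1f40, which is a different offset.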

diff --git a/arch/powerpc/platforms/85xx/p1022_ds.c b/arch/powerpc/platforms/85xx/p1022_ds.c
index 2bf4342..41de2c1 100644
--- a/arch/powerpc/platforms/85xx/p1022_ds.c
+++ b/arch/powerpc/platforms/85xx/p1022_ds.c
@@ -277,32 +277,9 @@ void __init mpc85xx_smp_init(void);
  */
 static void __init p1022_ds_setup_arch(void)
 {
-#ifdef CONFIG_PCI
-	struct device_node *np;
-#endif
-	dma_addr_t max = 0xffffffff;
-
 	if (ppc_md.progress)
 		ppc_md.progress("p1022_ds_setup_arch()", 0);
 
-#ifdef CONFIG_PCI
-	for_each_compatible_node(np, "pci", "fsl,p1022-pcie") {
-		struct resource rsrc;
-		struct pci_controller *hose;
-
-		of_address_to_resource(np, 0, &rsrc);
-
-		if ((rsrc.start & 0xfffff) == 0x8000)
-			fsl_add_bridge(np, 1);
-		else
-			fsl_add_bridge(np, 0);
-
-		hose = pci_find_hose_for_OF_device(np);
-		max = min(max, hose->dma_window_base_cur +
-			  hose->dma_window_size);
-	}
-#endif
-
 #if defined(CONFIG_FB_FSL_DIU) || defined(CONFIG_FB_FSL_DIU_MODULE)
 	diu_ops.get_pixel_format	= p1022ds_get_pixel_format;
 	diu_ops.set_gamma_table		= p1022ds_set_gamma_table;
@@ -316,11 +293,8 @@ static void __init p1022_ds_setup_arch(void)
 #endif
 
 #ifdef CONFIG_SWIOTLB
-	if (memblock_end_of_DRAM() > max) {
+	if (memblock_end_of_DRAM() > 0xffffffff)
 		ppc_swiotlb_enable = 1;
-		set_pci_dma_ops(&swiotlb_dma_ops);
-		ppc_md.pci_dma_dev_setup = pci_dma_dev_setup_swiotlb;
-	}
 #endif
 
 	pr_info("Freescale P1022 DS reference board\n");
@@ -339,6 +313,17 @@ static int __init p1022_ds_publish_devices(void)
 }
 machine_device_initcall(p1022_ds, p1022_ds_publish_devices);
 
+static struct of_device_id __initdata p1022_pci_ids[] = {
+	{ .compatible = "fsl,p1022-pcie", },
+	{},
+};
+
+static int __init p1022_ds_publish_pci_device(void)
+{
+	return of_platform_bus_probe(NULL, p1022_pci_ids, NULL);
+}
+machine_arch_initcall(p1022_ds, p1022_ds_publish_pci_device);
+
 machine_arch_initcall(p1022_ds, swiotlb_setup_bus_notifier);
 
 /*
diff --git a/arch/powerpc/sysdev/fsl_pci.c b/arch/powerpc/sysdev/fsl_pci.c
index 4ce547e..a0f305d 100644
--- a/arch/powerpc/sysdev/fsl_pci.c
+++ b/arch/powerpc/sysdev/fsl_pci.c
@@ -712,3 +712,56 @@ u64 fsl_pci_immrbar_base(struct pci_controller *hose)
 
 	return 0;
 }
+
+static const struct of_device_id pci_ids[] = {
+	{ .compatible = "fsl,mpc8540-pci", },
+	{ .compatible = "fsl,mpc8548-pcie", },
+	{ .compatible = "fsl,p1022-pcie", },
+	{},
+};
+
+static int __devinit fsl_pci_probe(struct platform_device *pdev)
+{
+	struct pci_controller *hose;
+
+	if (of_match_node(pci_ids, pdev->dev.of_node)) {
+		struct resource rsrc;
+		of_address_to_resource(pdev->dev.of_node, 0, &rsrc);
+		if ((rsrc.start & 0xfffff) == 0x8000)
+			fsl_add_bridge(pdev->dev.of_node, 1);
+		else
+			fsl_add_bridge(pdev->dev.of_node, 0);
+
+#ifdef CONFIG_SWIOTLB
+		hose = pci_find_hose_for_OF_device(pdev->dev.of_node);
+		/*
+		 * if we couldn't map all of DRAM via the dma windows
+		 * we need SWIOTLB to handle buffers located outside of
+		 * dma capable memory region
+		 */
+		if (memblock_end_of_DRAM() > hose->dma_window_base_cur
+				+ hose->dma_window_size) {
+			ppc_swiotlb_enable = 1;
+			set_pci_dma_ops(&swiotlb_dma_ops);
+			ppc_md.pci_dma_dev_setup = pci_dma_dev_setup_swiotlb;
+		}
+#endif
+
+	}
+
+	return 0;
+}
+
+static struct platform_driver fsl_pci_driver = {
+	.driver = {
+		.name = "fsl-pci",
+		.of_match_table = pci_ids,
+	},
+	.probe = fsl_pci_probe,
+};
+
+static int __init fsl_pci_init(void)
+{
+	return platform_driver_register(&fsl_pci_driver);
+}
+arch_initcall(fsl_pci_init);
--
1.7.5.1
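
The SWIOTLB test added in fsl_pci_probe() can be read as a single comparison:
bounce buffering is needed when DRAM ends above the top of the inbound DMA
window. A minimal standalone sketch follows, assuming plain integers in place
of kernel state: needs_swiotlb() and its parameters are hypothetical
stand-ins for memblock_end_of_DRAM(), hose->dma_window_base_cur and
hose->dma_window_size, and nothing here is part of the patch.

```c
#include <assert.h>	/* only needed for the checks mentioned in the note below */
#include <stdint.h>

/*
 * Hypothetical helper mirroring the condition in the patch:
 *   memblock_end_of_DRAM() > hose->dma_window_base_cur + hose->dma_window_size
 * Returns 1 when some DRAM lies beyond the DMA window, 0 otherwise.
 */
static int needs_swiotlb(uint64_t dram_end, uint64_t win_base, uint64_t win_size)
{
	return dram_end > win_base + win_size;
}
```

For example, with a 4 GiB window starting at 0, a board with 2 GiB of DRAM
does not need SWIOTLB while a board with 8 GiB does. DRAM that ends exactly
at the top of the window is still fully covered, because the comparison is
strict.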

^ permalink raw reply related

