LinuxPPC-Dev Archive on lore.kernel.org

LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed

* Re: [PATCH 2/2] powerpc/44x: Add more changes for APM821XX EMAC driver
From: David Miller @ 2012-02-29 18:25 UTC (permalink / raw)
  To: jwboyer; +Cc: dhdang, linux-kernel, paulus, netdev, linuxppc-dev
In-Reply-To: <CA+5PVA5hciQSvfkodX-oP_kUZueiTp=0+t8X_0iHQ+ehU0ecOQ@mail.gmail.com>

From: Josh Boyer <jwboyer@gmail.com>
Date: Wed, 29 Feb 2012 08:43:46 -0500

> On Fri, Feb 17, 2012 at 3:07 AM, Duc Dang <dhdang@apm.com> wrote:
>> This patch includes:
>>
>> =A0Configure EMAC PHY clock source (clock from PHY or internal clock=
).
>>
>> =A0Do not advertise PHY half duplex capability as APM821XX EMAC does=
 not
>> support half duplex mode.
>>
>> =A0Add changes to support configuring jumbo frame for APM821XX EMAC.=

>>
>> Signed-off-by: Duc Dang <dhdang@apm.com>
> =

> This should have been sent to netdev.  CC'ing them now.
> =

> Ben and David, I can take this change through the 4xx tree if it look=
s OK to
> both of you.  The pre-requisite DTS patch will go through my tree, so=
 it might
> make sense to keep them together.

Well the patch has coding style problems, for one:

>> +			dev->features |=3D (EMAC_APM821XX_REQ_JUMBO_FRAME_SIZE
>> +					| EMAC_FTR_APM821XX_NO_HALF_DUPLEX
>> +					| EMAC_FTR_460EX_PHY_CLK_FIX);

Should be:

>> +			dev->features |=3D (EMAC_APM821XX_REQ_JUMBO_FRAME_SIZE |
>> +					  EMAC_FTR_APM821XX_NO_HALF_DUPLEX |
>> +					  EMAC_FTR_460EX_PHY_CLK_FIX);

And this:

>> +		dev->phy_feat_exc =3D (SUPPORTED_1000baseT_Half
>> +					| SUPPORTED_100baseT_Half
>> +					| SUPPORTED_10baseT_Half);

Should be:

>> +		dev->phy_feat_exc =3D (SUPPORTED_1000baseT_Half |
>> +				     SUPPORTED_100baseT_Half |
>> +				     SUPPORTED_10baseT_Half);

^ permalink raw reply

* [PATCH v2] bootmem/sparsemem: remove limit constraint in alloc_bootmem_section
From: Nishanth Aravamudan @ 2012-02-29 18:12 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Anton Blanchard, Dave Hansen, linux-mm, Paul Mackerras,
	Johannes Weiner, Andrew Morton, Robert Jennings, linuxppc-dev
In-Reply-To: <20120228154732.GE1199@suse.de>

On 28.02.2012 [15:47:32 +0000], Mel Gorman wrote:
> On Fri, Feb 24, 2012 at 11:33:58AM -0800, Nishanth Aravamudan wrote:
> > While testing AMS (Active Memory Sharing) / CMO (Cooperative Memory
> > Overcommit) on powerpc, we tripped the following:
> > 
> > kernel BUG at mm/bootmem.c:483!
> > cpu 0x0: Vector: 700 (Program Check) at [c000000000c03940]
> >     pc: c000000000a62bd8: .alloc_bootmem_core+0x90/0x39c
> >     lr: c000000000a64bcc: .sparse_early_usemaps_alloc_node+0x84/0x29c
> >     sp: c000000000c03bc0
> >    msr: 8000000000021032
> >   current = 0xc000000000b0cce0
> >   paca    = 0xc000000001d80000
> >     pid   = 0, comm = swapper
> > kernel BUG at mm/bootmem.c:483!
> > enter ? for help
> > [c000000000c03c80] c000000000a64bcc
> > .sparse_early_usemaps_alloc_node+0x84/0x29c
> > [c000000000c03d50] c000000000a64f10 .sparse_init+0x12c/0x28c
> > [c000000000c03e20] c000000000a474f4 .setup_arch+0x20c/0x294
> > [c000000000c03ee0] c000000000a4079c .start_kernel+0xb4/0x460
> > [c000000000c03f90] c000000000009670 .start_here_common+0x1c/0x2c
> > 
> > This is
> > 
> >         BUG_ON(limit && goal + size > limit);
> > 
> > and after some debugging, it seems that
> > 
> > 	goal = 0x7ffff000000
> > 	limit = 0x80000000000
> > 
> > and sparse_early_usemaps_alloc_node ->
> > sparse_early_usemaps_alloc_pgdat_section -> alloc_bootmem_section calls
> > 
> > 	return alloc_bootmem_section(usemap_size() * count, section_nr);
> > 
> > This is on a system with 8TB available via the AMS pool, and as a quirk
> > of AMS in firmware, all of that memory shows up in node 0. So, we end up
> > with an allocation that will fail the goal/limit constraints. In theory,
> > we could "fall-back" to alloc_bootmem_node() in
> > sparse_early_usemaps_alloc_node(), but since we actually have HOTREMOVE
> > defined, we'll BUG_ON() instead. A simple solution appears to be to
> > disable the limit check if the size of the allocation in
> > alloc_bootmem_secition exceeds the section size.
> > 
> > Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
> > Cc: Dave Hansen <haveblue@us.ibm.com>
> > Cc: Anton Blanchard <anton@au1.ibm.com>
> > Cc: Paul Mackerras <paulus@samba.org>
> > Cc: Ben Herrenschmidt <benh@kernel.crashing.org>
> > Cc: Robert Jennings <rcj@linux.vnet.ibm.com>
> > Cc: linux-mm@kvack.org
> > Cc: linuxppc-dev@lists.ozlabs.org
> > ---
> >  include/linux/mmzone.h |    2 ++
> >  mm/bootmem.c           |    5 ++++-
> >  2 files changed, 6 insertions(+), 1 deletions(-)
> > 
> > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> > index 650ba2f..4176834 100644
> > --- a/include/linux/mmzone.h
> > +++ b/include/linux/mmzone.h
> > @@ -967,6 +967,8 @@ static inline unsigned long early_pfn_to_nid(unsigned long pfn)
> >   * PA_SECTION_SHIFT		physical address to/from section number
> >   * PFN_SECTION_SHIFT		pfn to/from section number
> >   */
> > +#define BYTES_PER_SECTION	(1UL << SECTION_SIZE_BITS)
> > +
> >  #define SECTIONS_SHIFT		(MAX_PHYSMEM_BITS - SECTION_SIZE_BITS)
> >  
> >  #define PA_SECTION_SHIFT	(SECTION_SIZE_BITS)
> > diff --git a/mm/bootmem.c b/mm/bootmem.c
> > index 668e94d..5cbbc76 100644
> > --- a/mm/bootmem.c
> > +++ b/mm/bootmem.c
> > @@ -770,7 +770,10 @@ void * __init alloc_bootmem_section(unsigned long size,
> >  
> >  	pfn = section_nr_to_pfn(section_nr);
> >  	goal = pfn << PAGE_SHIFT;
> > -	limit = section_nr_to_pfn(section_nr + 1) << PAGE_SHIFT;
> > +	if (size > BYTES_PER_SECTION)
> > +		limit = 0;
> > +	else
> > +		limit = section_nr_to_pfn(section_nr + 1) << PAGE_SHIFT;
> 
> As it's ok to spill the allocation over to an adjacent section, why not
> just make limit==0 unconditionally. That would avoid defining
> BYTES_PER_SECTION.

Something like this?

Andrew, presuming Mel & Johannes give their, ack this should presumably
supersede the patch you pulled into -mm.

Thanks,
Nish

-------

While testing AMS (Active Memory Sharing) / CMO (Cooperative Memory
Overcommit) on powerpc, we tripped the following:

kernel BUG at mm/bootmem.c:483!
cpu 0x0: Vector: 700 (Program Check) at [c000000000c03940]
    pc: c000000000a62bd8: .alloc_bootmem_core+0x90/0x39c
    lr: c000000000a64bcc: .sparse_early_usemaps_alloc_node+0x84/0x29c
    sp: c000000000c03bc0
   msr: 8000000000021032
  current = 0xc000000000b0cce0
  paca    = 0xc000000001d80000
    pid   = 0, comm = swapper
kernel BUG at mm/bootmem.c:483!
enter ? for help
[c000000000c03c80] c000000000a64bcc
.sparse_early_usemaps_alloc_node+0x84/0x29c
[c000000000c03d50] c000000000a64f10 .sparse_init+0x12c/0x28c
[c000000000c03e20] c000000000a474f4 .setup_arch+0x20c/0x294
[c000000000c03ee0] c000000000a4079c .start_kernel+0xb4/0x460
[c000000000c03f90] c000000000009670 .start_here_common+0x1c/0x2c

This is

        BUG_ON(limit && goal + size > limit);

and after some debugging, it seems that

	goal = 0x7ffff000000
	limit = 0x80000000000

and sparse_early_usemaps_alloc_node ->
sparse_early_usemaps_alloc_pgdat_section calls

	return alloc_bootmem_section(usemap_size() * count, section_nr);

This is on a system with 8TB available via the AMS pool, and as a quirk
of AMS in firmware, all of that memory shows up in node 0. So, we end up
with an allocation that will fail the goal/limit constraints. In theory,
we could "fall-back" to alloc_bootmem_node() in
sparse_early_usemaps_alloc_node(), but since we actually have HOTREMOVE
defined, we'll BUG_ON() instead. A simple solution appears to be to
unconditionally remove the limit condition in alloc_bootmem_section,
meaning allocations are allowed to cross section boundaries (necessary
for systems of this size).

Johannes Weiner pointed out that if alloc_bootmem_section() no longer
guarantees section-locality, we need check_usemap_section_nr() to print
possible cross-dependencies between node descriptors and the usemaps
allocated through it. That makes the two loops in
sparse_early_usemaps_alloc_node() identical, so re-factor the code a
bit.

Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>

---
v2: Unconditionally set limit to 0. Fold in Johannes' changes to
sparse_early_usemaps_alloc_node.

diff --git a/mm/bootmem.c b/mm/bootmem.c
index 668e94d..9c9ae09 100644
--- a/mm/bootmem.c
+++ b/mm/bootmem.c
@@ -770,7 +770,7 @@ void * __init alloc_bootmem_section(unsigned long size,
 
 	pfn = section_nr_to_pfn(section_nr);
 	goal = pfn << PAGE_SHIFT;
-	limit = section_nr_to_pfn(section_nr + 1) << PAGE_SHIFT;
+	limit = 0;
 	bdata = &bootmem_node_data[early_pfn_to_nid(pfn)];
 
 	return alloc_bootmem_core(bdata, size, SMP_CACHE_BYTES, goal, limit);
diff --git a/mm/sparse.c b/mm/sparse.c
index 61d7cde..a8bc7d3 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -353,29 +353,21 @@ static void __init sparse_early_usemaps_alloc_node(unsigned long**usemap_map,
 
 	usemap = sparse_early_usemaps_alloc_pgdat_section(NODE_DATA(nodeid),
 								 usemap_count);
-	if (usemap) {
-		for (pnum = pnum_begin; pnum < pnum_end; pnum++) {
-			if (!present_section_nr(pnum))
-				continue;
-			usemap_map[pnum] = usemap;
-			usemap += size;
+	if (!usemap) {
+		usemap = alloc_bootmem_node(NODE_DATA(nodeid), size * usemap_count);
+		if (!usemap) {
+			printk(KERN_WARNING "%s: allocation failed\n", __func__);
+			return;
 		}
-		return;
 	}
 
-	usemap = alloc_bootmem_node(NODE_DATA(nodeid), size * usemap_count);
-	if (usemap) {
-		for (pnum = pnum_begin; pnum < pnum_end; pnum++) {
-			if (!present_section_nr(pnum))
-				continue;
-			usemap_map[pnum] = usemap;
-			usemap += size;
-			check_usemap_section_nr(nodeid, usemap_map[pnum]);
-		}
-		return;
+	for (pnum = pnum_begin; pnum < pnum_end; pnum++) {
+		if (!present_section_nr(pnum))
+			continue;
+		usemap_map[pnum] = usemap;
+		usemap += size;
+		check_usemap_section_nr(nodeid, usemap_map[pnum]);
 	}
-
-	printk(KERN_WARNING "%s: allocation failed\n", __func__);
 }
 
 #ifndef CONFIG_SPARSEMEM_VMEMMAP

-- 
Nishanth Aravamudan <nacc@us.ibm.com>
IBM Linux Technology Center

^ permalink raw reply related

* Re: [PATCH] KVM: PPC: Don't sync timebase when inside KVM
From: Scott Wood @ 2012-02-29 17:50 UTC (permalink / raw)
  To: Alexander Graf; +Cc: linuxppc-dev, kvm, kvm-ppc
In-Reply-To: <1330481769-24390-1-git-send-email-agraf@suse.de>

On 02/28/2012 08:16 PM, Alexander Graf wrote:
> When we know that we're running inside of a KVM guest, we don't have to
> worry about synchronizing timebases between different CPUs, since the
> host already took care of that.
> 
> This fixes CPU overcommit scenarios where vCPUs could hang forever trying
> to sync each other while not being scheduled.
> 
> Reported-by: Stuart Yoder <B08248@freescale.com>
> Signed-off-by: Alexander Graf <agraf@suse.de>

This should apply to any hypervisor, not just KVM.  On book3e, Power ISA
says timebase is read-only on virtualized implementations.  My
understanding is that book3s is paravirt-only (guest state is not
considered an implementation of the Power ISA), and it says "Writing the
Time Base is privileged, and can be done only in hypervisor state".

Which platforms are you seeing this on?  If it's on Freescale chips,
U-Boot should be doing the sync and Linux should never do it, even in
the absence of a hypervisor.

-Scott

^ permalink raw reply

* Re: [PATCH 1/2] powerpc/44x: Fix PCI MSI support for APM821xx SoC and Bluestone board
From: Josh Boyer @ 2012-02-29 14:18 UTC (permalink / raw)
  To: Mai La
  Cc: open-source-review, Tirumala R Marri, linux-kernel,
	Paul Mackerras, linuxppc-dev
In-Reply-To: <1330505221-3678-1-git-send-email-mla@apm.com>

On Wed, Feb 29, 2012 at 3:47 AM, Mai La <mla@apm.com> wrote:
> This patch consists of:
> - Enable PCI MSI as default for Bluestone board
> - Define number of MSI interrupt for Maui APM821xx

What is Maui?  Is that the same thing as Bluestone?

> - Fix returning ENODEV as finding MSI node
> - Fix MSI physical high and low address
> - Keep MSI data logically
>
> Signed-off-by: Mai La <mla@apm.com>

Wow.  So there are a lot of bugfixes here.  I'm surprised this ever worked =
at
all with some of the things you're fixing.  Nice to see.

> ---
> =A0arch/powerpc/platforms/44x/Kconfig | =A0 =A02 ++
> =A0arch/powerpc/sysdev/ppc4xx_msi.c =A0 | =A0 28 ++++++++++++++++++------=
----
> =A02 files changed, 20 insertions(+), 10 deletions(-)
>
> diff --git a/arch/powerpc/platforms/44x/Kconfig b/arch/powerpc/platforms/=
44x/Kconfig
> index fcf6bf2..9f04ce3 100644
> --- a/arch/powerpc/platforms/44x/Kconfig
> +++ b/arch/powerpc/platforms/44x/Kconfig
> @@ -23,6 +23,8 @@ config BLUESTONE
> =A0 =A0 =A0 =A0default n
> =A0 =A0 =A0 =A0select PPC44x_SIMPLE
> =A0 =A0 =A0 =A0select APM821xx
> + =A0 =A0 =A0 select PCI_MSI
> + =A0 =A0 =A0 select PPC4xx_MSI
> =A0 =A0 =A0 =A0select IBM_EMAC_RGMII
> =A0 =A0 =A0 =A0help
> =A0 =A0 =A0 =A0 =A0This option enables support for the APM APM821xx Evalu=
ation board.
> diff --git a/arch/powerpc/sysdev/ppc4xx_msi.c b/arch/powerpc/sysdev/ppc4x=
x_msi.c
> index 1c2d7af..6103908 100644
> --- a/arch/powerpc/sysdev/ppc4xx_msi.c
> +++ b/arch/powerpc/sysdev/ppc4xx_msi.c
> @@ -31,7 +31,7 @@
> =A0#include <asm/prom.h>
> =A0#include <asm/hw_irq.h>
> =A0#include <asm/ppc-pci.h>
> -#include <boot/dcr.h>
> +#include <asm/dcr.h>
> =A0#include <asm/dcr-regs.h>
> =A0#include <asm/msi_bitmap.h>
>
> @@ -43,7 +43,12 @@
> =A0#define PEIH_FLUSH0 =A0 =A00x30
> =A0#define PEIH_FLUSH1 =A0 =A00x38
> =A0#define PEIH_CNTRST =A0 =A00x48
> +
> +#ifdef CONFIG_APM821xx
> +#define NR_MSI_IRQS =A0 =A08
> +#else
> =A0#define NR_MSI_IRQS =A0 =A04
> +#endif

Hm.  Do you think this is going to change quite a bit depending on which So=
C
is being used?  If so, it might be better to introduce a Kconfig variable
that just defines this instead.  Something like:

	config 4xx_MSI_IRQS
	   int "Number of MSI IRQs"
	   depends on 4xx
	   default "8" if APM821xx
	   default "4" if !APM821xx

If there aren't going to be a wide variety of numbers, then the simple ifde=
f
you have is probably sufficient.

> =A0struct ppc4xx_msi {
> =A0 =A0 =A0 =A0u32 msi_addr_lo;
> @@ -150,12 +155,11 @@ static int ppc4xx_setup_pcieh_hw(struct platform_de=
vice *dev,
> =A0 =A0 =A0 =A0if (!sdr_addr)
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0return -1;
>
> - =A0 =A0 =A0 SDR0_WRITE(sdr_addr, (u64)res.start >> 32); =A0 =A0 =A0/*HI=
GH addr */
> - =A0 =A0 =A0 SDR0_WRITE(sdr_addr + 1, res.start & 0xFFFFFFFF); /* Low ad=
dr */
> -
> + =A0 =A0 =A0 mtdcri(SDR0, *sdr_addr, res.start >> 32); =A0 =A0 =A0 /*HIG=
H addr */
> + =A0 =A0 =A0 mtdcri(SDR0, *sdr_addr + 1, res.start & 0xFFFFFFFF);/* Low =
addr */

Don't you still want the (u64) cast on res.start?

> CONFIDENTIALITY NOTICE: This e-mail message, including any attachments,
> is for the sole use of the intended recipient(s) and contains information
> that is confidential and proprietary to AppliedMicro Corporation or its s=
ubsidiaries.
> It is to be used solely for the purpose of furthering the parties' busine=
ss relationship.
> All unauthorized review, use, disclosure or distribution is prohibited.
> If you are not the intended recipient, please contact the sender by reply=
 e-mail
> and destroy all copies of the original message.

Is there a way you can drop this?  Others from APM seem to have figured out
how to do that, so hopefully it won't be a big problem.

josh

^ permalink raw reply

* Re: [PATCH 1/3] powerpc/44x: The bug fixed support for APM821xx SoC and Bluestone board
From: Josh Boyer @ 2012-02-29 13:54 UTC (permalink / raw)
  To: Vinh Nguyen Huu Tuong; +Cc: Paul Mackerras, linuxppc-dev, linux-kernel
In-Reply-To: <1324385014-30725-1-git-send-email-vhtnguyen@apm.com>

On Tue, Dec 20, 2011 at 7:43 AM, Vinh Nguyen Huu Tuong
<vhtnguyen@apm.com> wrote:
> This patch consists of:
> - Fix the pvr mask for checking pvr in cputable.c
> - Fix the cpu name as consistent with cpu name is describled in dts file
>
> Signed-off-by: Vinh Nguyen Huu Tuong <vhtnguyen@apm.com>
> ---

I was waiting to see if you would submit a new series with patch 3/3 fixed for
the comments I made.  Seems you haven't yet or I missed it entirely.  For now,
I'll take this patch as it's stand-alone.  The DTS and PCI driver patches will
need to be submitted together again.

josh

^ permalink raw reply

* Re: [PATCH 2/2] powerpc/44x: Add more changes for APM821XX EMAC driver
From: Josh Boyer @ 2012-02-29 13:43 UTC (permalink / raw)
  To: Duc Dang; +Cc: netdev, Paul Mackerras, linuxppc-dev, linux-kernel
In-Reply-To: <1329466058-15969-1-git-send-email-dhdang@apm.com>

On Fri, Feb 17, 2012 at 3:07 AM, Duc Dang <dhdang@apm.com> wrote:
> This patch includes:
>
> =A0Configure EMAC PHY clock source (clock from PHY or internal clock).
>
> =A0Do not advertise PHY half duplex capability as APM821XX EMAC does not
> support half duplex mode.
>
> =A0Add changes to support configuring jumbo frame for APM821XX EMAC.
>
> Signed-off-by: Duc Dang <dhdang@apm.com>

This should have been sent to netdev.  CC'ing them now.

Ben and David, I can take this change through the 4xx tree if it looks OK t=
o
both of you.  The pre-requisite DTS patch will go through my tree, so it mi=
ght
make sense to keep them together.

josh

> ---
>  drivers/net/ethernet/ibm/emac/core.c |   26 +++++++++++++++++++++++++-
>  drivers/net/ethernet/ibm/emac/core.h |   13 +++++++++++--
>  drivers/net/ethernet/ibm/emac/emac.h |    5 ++++-
>  3 files changed, 40 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/ethernet/ibm/emac/core.c b/drivers/net/ethernet/=
ibm/emac/core.c
> index ed79b2d..de620f1 100644
> --- a/drivers/net/ethernet/ibm/emac/core.c
> +++ b/drivers/net/ethernet/ibm/emac/core.c
> @@ -434,6 +434,11 @@ static inline u32 emac_iff2rmr(struct net_device *nd=
ev)
>  	else if (!netdev_mc_empty(ndev))
>  		r |=3D EMAC_RMR_MAE;
>
> +	if (emac_has_feature(dev, EMAC_APM821XX_REQ_JUMBO_FRAME_SIZE)) {
> +		r &=3D ~EMAC4_RMR_MJS_MASK;
> +		r |=3D EMAC4_RMR_MJS(ndev->mtu);
> +	}
> +
>  	return r;
>  }
>
> @@ -965,6 +970,7 @@ static int emac_resize_rx_ring(struct emac_instance *=
dev, int new_mtu)
>  	int rx_sync_size =3D emac_rx_sync_size(new_mtu);
>  	int rx_skb_size =3D emac_rx_skb_size(new_mtu);
>  	int i, ret =3D 0;
> +	int mr1_jumbo_bit_change =3D 0;
>
>  	mutex_lock(&dev->link_lock);
>  	emac_netif_stop(dev);
> @@ -1013,7 +1019,14 @@ static int emac_resize_rx_ring(struct emac_instanc=
e *dev, int new_mtu)
>  	}
>   skip:
>  	/* Check if we need to change "Jumbo" bit in MR1 */
> -	if ((new_mtu > ETH_DATA_LEN) ^ (dev->ndev->mtu > ETH_DATA_LEN)) {
> +	if (emac_has_feature(dev, EMAC_APM821XX_REQ_JUMBO_FRAME_SIZE))
> +		mr1_jumbo_bit_change =3D (new_mtu > ETH_DATA_LEN) ||
> +				(dev->ndev->mtu > ETH_DATA_LEN);
> +	else
> +		mr1_jumbo_bit_change =3D (new_mtu > ETH_DATA_LEN) ^
> +				(dev->ndev->mtu > ETH_DATA_LEN);
> +
> +	if (mr1_jumbo_bit_change) {
>  		/* This is to prevent starting RX channel in emac_rx_enable() */
>  		set_bit(MAL_COMMAC_RX_STOPPED, &dev->commac.flags);
>
> @@ -2471,6 +2484,7 @@ static int __devinit emac_init_phy(struct emac_inst=
ance *dev)
>
>  	/* Disable any PHY features not supported by the platform */
>  	dev->phy.def->features &=3D ~dev->phy_feat_exc;
> +	dev->phy.features &=3D ~dev->phy_feat_exc;
>
>  	/* Setup initial link parameters */
>  	if (dev->phy.features & SUPPORTED_Autoneg) {
> @@ -2568,6 +2582,10 @@ static int __devinit emac_init_config(struct emac_=
instance *dev)
>  		if (of_device_is_compatible(np, "ibm,emac-405ex") ||
>  		    of_device_is_compatible(np, "ibm,emac-405exr"))
>  			dev->features |=3D EMAC_FTR_440EP_PHY_CLK_FIX;
> +		if (of_device_is_compatible(np, "ibm,emac-apm821xx"))
> +			dev->features |=3D (EMAC_APM821XX_REQ_JUMBO_FRAME_SIZE
> +					| EMAC_FTR_APM821XX_NO_HALF_DUPLEX
> +					| EMAC_FTR_460EX_PHY_CLK_FIX);
>  	} else if (of_device_is_compatible(np, "ibm,emac4")) {
>  		dev->features |=3D EMAC_FTR_EMAC4;
>  		if (of_device_is_compatible(np, "ibm,emac-440gx"))
> @@ -2818,6 +2836,12 @@ static int __devinit emac_probe(struct platform_de=
vice *ofdev)
>  	dev->stop_timeout =3D STOP_TIMEOUT_100;
>  	INIT_DELAYED_WORK(&dev->link_work, emac_link_timer);
>
> +	/* Some SoCs like APM821xx does not support Half Duplex mode. */
> +	if (emac_has_feature(dev, EMAC_FTR_APM821XX_NO_HALF_DUPLEX))
> +		dev->phy_feat_exc =3D (SUPPORTED_1000baseT_Half
> +					| SUPPORTED_100baseT_Half
> +					| SUPPORTED_10baseT_Half);
> +
>  	/* Find PHY if any */
>  	err =3D emac_init_phy(dev);
>  	if (err !=3D 0)
> diff --git a/drivers/net/ethernet/ibm/emac/core.h b/drivers/net/ethernet/=
ibm/emac/core.h
> index fa3ec57..9dea85a 100644
> --- a/drivers/net/ethernet/ibm/emac/core.h
> +++ b/drivers/net/ethernet/ibm/emac/core.h
> @@ -325,7 +325,14 @@ struct emac_instance {
>   * Set if we need phy clock workaround for 460ex or 460gt
>   */
>  #define EMAC_FTR_460EX_PHY_CLK_FIX	0x00000400
> -
> +/*
> + * APM821xx requires Jumbo frame size set explicitly
> + */
> +#define EMAC_APM821XX_REQ_JUMBO_FRAME_SIZE	0x00000800
> +/*
> + * APM821xx does not support Half Duplex mode
> + */
> +#define EMAC_FTR_APM821XX_NO_HALF_DUPLEX	0x00001000
>
>  /* Right now, we don't quite handle the always/possible masks on the
>   * most optimal way as we don't have a way to say something like
> @@ -353,7 +360,9 @@ enum {
>  	    EMAC_FTR_NO_FLOW_CONTROL_40x |
>  #endif
>  	EMAC_FTR_460EX_PHY_CLK_FIX |
> -	EMAC_FTR_440EP_PHY_CLK_FIX,
> +	EMAC_FTR_440EP_PHY_CLK_FIX |
> +	EMAC_APM821XX_REQ_JUMBO_FRAME_SIZE |
> +	EMAC_FTR_APM821XX_NO_HALF_DUPLEX,
>  };
>
>  static inline int emac_has_feature(struct emac_instance *dev,
> diff --git a/drivers/net/ethernet/ibm/emac/emac.h b/drivers/net/ethernet/=
ibm/emac/emac.h
> index 1568278..36bcd69 100644
> --- a/drivers/net/ethernet/ibm/emac/emac.h
> +++ b/drivers/net/ethernet/ibm/emac/emac.h
> @@ -212,7 +212,10 @@ struct emac_regs {
>  #define EMAC4_RMR_RFAF_64_1024		0x00000006
>  #define EMAC4_RMR_RFAF_128_2048		0x00000007
>  #define EMAC4_RMR_BASE			EMAC4_RMR_RFAF_128_2048
> -
> +#if defined(CONFIG_APM821xx)
> +#define EMAC4_RMR_MJS_MASK              0x0001fff8
> +#define EMAC4_RMR_MJS(s)                (((s) << 3) & EMAC4_RMR_MJS_MASK=
)
> +#endif
>  /* EMACx_ISR & EMACx_ISER */
>  #define EMAC4_ISR_TXPE			0x20000000
>  #define EMAC4_ISR_RXPE			0x10000000
> --
> 1.7.5.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" i=
n
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply

* Re: [PATCH 20/21] Introduce struct eeh_stats for EEH
From: Michael Ellerman @ 2012-02-29 12:56 UTC (permalink / raw)
  To: Gavin Shan; +Cc: linuxppc-dev
In-Reply-To: <1330409051-8941-21-git-send-email-shangw@linux.vnet.ibm.com>

[-- Attachment #1: Type: text/plain, Size: 1405 bytes --]

On Tue, 2012-02-28 at 14:04 +0800, Gavin Shan wrote:
> With the original EEH implementation, the EEH global statistics
> are maintained by individual global variables. That makes the
> code a little hard to maintain.

Hi Gavin,

> @@ -1174,21 +1182,24 @@ static int proc_eeh_show(struct seq_file *m, void *v)
>  {
>  	if (0 == eeh_subsystem_enabled) {
>  		seq_printf(m, "EEH Subsystem is globally disabled\n");
> -		seq_printf(m, "eeh_total_mmio_ffs=%ld\n", total_mmio_ffs);
> +		seq_printf(m, "eeh_total_mmio_ffs=%d\n", eeh_stats.total_mmio_ffs);
>  	} else {
>  		seq_printf(m, "EEH Subsystem is enabled\n");
>  		seq_printf(m,
> -				"no device=%ld\n"
> -				"no device node=%ld\n"
> -				"no config address=%ld\n"
> -				"check not wanted=%ld\n"
> -				"eeh_total_mmio_ffs=%ld\n"
> -				"eeh_false_positives=%ld\n"
> -				"eeh_slot_resets=%ld\n",
> -				no_device, no_dn, no_cfg_addr, 
> -				ignored_check, total_mmio_ffs, 
> -				false_positives,
> -				slot_resets);
> +				"no device           =%d\n"
> +				"no device node      =%d\n"
> +				"no config address   =%d\n"
> +				"check not wanted    =%d\n"
> +				"eeh_total_mmio_ffs  =%d\n"
> +				"eeh_false_positives =%d\n"
> +				"eeh_slot_resets     =%d\n",

There *might* be tools out there that parse this output, so I'd say
don't change it unless you have to - and I don't think you have to?

cheers

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply

* Re: [PATCH] sparsemem/bootmem: catch greater than section size allocations
From: Johannes Weiner @ 2012-02-29  9:17 UTC (permalink / raw)
  To: Nishanth Aravamudan
  Cc: Anton Blanchard, Dave Hansen, linux-mm, Paul Mackerras,
	Nishanth Aravamudan, Andrew Morton, Robert Jennings, linuxppc-dev
In-Reply-To: <20120228201151.GC5136@linux.vnet.ibm.com>

On Tue, Feb 28, 2012 at 12:11:51PM -0800, Nishanth Aravamudan wrote:
> On 28.02.2012 [14:53:26 +0100], Johannes Weiner wrote:
> > On Fri, Feb 24, 2012 at 11:33:58AM -0800, Nishanth Aravamudan wrote:
> > > While testing AMS (Active Memory Sharing) / CMO (Cooperative Memory
> > > Overcommit) on powerpc, we tripped the following:
> > > 
> > > kernel BUG at mm/bootmem.c:483!
> > > cpu 0x0: Vector: 700 (Program Check) at [c000000000c03940]
> > >     pc: c000000000a62bd8: .alloc_bootmem_core+0x90/0x39c
> > >     lr: c000000000a64bcc: .sparse_early_usemaps_alloc_node+0x84/0x29c
> > >     sp: c000000000c03bc0
> > >    msr: 8000000000021032
> > >   current = 0xc000000000b0cce0
> > >   paca    = 0xc000000001d80000
> > >     pid   = 0, comm = swapper
> > > kernel BUG at mm/bootmem.c:483!
> > > enter ? for help
> > > [c000000000c03c80] c000000000a64bcc
> > > .sparse_early_usemaps_alloc_node+0x84/0x29c
> > > [c000000000c03d50] c000000000a64f10 .sparse_init+0x12c/0x28c
> > > [c000000000c03e20] c000000000a474f4 .setup_arch+0x20c/0x294
> > > [c000000000c03ee0] c000000000a4079c .start_kernel+0xb4/0x460
> > > [c000000000c03f90] c000000000009670 .start_here_common+0x1c/0x2c
> > > 
> > > This is
> > > 
> > >         BUG_ON(limit && goal + size > limit);
> > > 
> > > and after some debugging, it seems that
> > > 
> > > 	goal = 0x7ffff000000
> > > 	limit = 0x80000000000
> > > 
> > > and sparse_early_usemaps_alloc_node ->
> > > sparse_early_usemaps_alloc_pgdat_section -> alloc_bootmem_section calls
> > > 
> > > 	return alloc_bootmem_section(usemap_size() * count, section_nr);
> > > 
> > > This is on a system with 8TB available via the AMS pool, and as a quirk
> > > of AMS in firmware, all of that memory shows up in node 0. So, we end up
> > > with an allocation that will fail the goal/limit constraints. In theory,
> > > we could "fall-back" to alloc_bootmem_node() in
> > > sparse_early_usemaps_alloc_node(), but since we actually have HOTREMOVE
> > > defined, we'll BUG_ON() instead. A simple solution appears to be to
> > > disable the limit check if the size of the allocation in
> > > alloc_bootmem_secition exceeds the section size.
> > 
> > It makes sense to allow the usemaps to spill over to subsequent
> > sections instead of panicking, so FWIW:
> > 
> > Acked-by: Johannes Weiner <hannes@cmpxchg.org>
> > 
> > That being said, it would be good if check_usemap_section_nr() printed
> > the cross-dependencies between pgdats and sections when the usemaps of
> > a node spilled over to other sections than the ones holding the pgdat.
> > 
> > How about this?
> > 
> > ---
> > From: Johannes Weiner <hannes@cmpxchg.org>
> > Subject: sparsemem/bootmem: catch greater than section size allocations fix
> > 
> > If alloc_bootmem_section() no longer guarantees section-locality, we
> > need check_usemap_section_nr() to print possible cross-dependencies
> > between node descriptors and the usemaps allocated through it.
> > 
> > Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> > ---
> > 
> > diff --git a/mm/sparse.c b/mm/sparse.c
> > index 61d7cde..9e032dc 100644
> > --- a/mm/sparse.c
> > +++ b/mm/sparse.c
> > @@ -359,6 +359,7 @@ static void __init sparse_early_usemaps_alloc_node(unsigned long**usemap_map,
> >  				continue;
> >  			usemap_map[pnum] = usemap;
> >  			usemap += size;
> > +			check_usemap_section_nr(nodeid, usemap_map[pnum]);
> >  		}
> >  		return;
> >  	}
> 
> This makes sense to me -- ok if I fold it into the re-worked patch
> (based upon Mel's comments)?

Sure thing!

> > Furthermore, I wonder if we can remove the sparse-specific stuff from
> > bootmem.c as well, as now even more so than before, calculating the
> > desired area is really none of bootmem's business.
> > 
> > Would something like this be okay?
> > 
> > ---
> > From: Johannes Weiner <hannes@cmpxchg.org>
> > Subject: [patch] mm: remove sparsemem allocation details from the bootmem allocator
> > 
> > alloc_bootmem_section() derives allocation area constraints from the
> > specified sparsemem section.  This is a bit specific for a generic
> > memory allocator like bootmem, though, so move it over to sparsemem.
> > 
> > Since __alloc_bootmem_node() already retries failed allocations with
> > relaxed area constraints, the fallback code in sparsemem.c can be
> > removed and the code becomes a bit more compact overall.
> > 
> > Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> 
> I've not tested it, but the intention seems sensible. I think it should
> remain a separate change.

Yes, I agree.  I'll resend it in a bit as stand-alone patch.

^ permalink raw reply

* [PATCH 1/2] powerpc/44x: Fix PCI MSI support for APM821xx SoC and Bluestone board
From: Mai La @ 2012-02-29  8:47 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Josh Boyer, Matt Porter,
	Tirumala R Marri, Grant Likely, Michael Neuling, Kumar Gala,
	Anton Blanchard, linuxppc-dev, linux-kernel
  Cc: open-source-review, Mai La

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain, Size: 4148 bytes --]

This patch consists of:
- Enable PCI MSI as default for Bluestone board 
- Define number of MSI interrupt for Maui APM821xx
- Fix returning ENODEV as finding MSI node
- Fix MSI physical high and low address
- Keep MSI data logically

Signed-off-by: Mai La <mla@apm.com>
---
 arch/powerpc/platforms/44x/Kconfig |    2 ++
 arch/powerpc/sysdev/ppc4xx_msi.c   |   28 ++++++++++++++++++----------
 2 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/platforms/44x/Kconfig b/arch/powerpc/platforms/44x/Kconfig
index fcf6bf2..9f04ce3 100644
--- a/arch/powerpc/platforms/44x/Kconfig
+++ b/arch/powerpc/platforms/44x/Kconfig
@@ -23,6 +23,8 @@ config BLUESTONE
 	default n
 	select PPC44x_SIMPLE
 	select APM821xx
+	select PCI_MSI
+	select PPC4xx_MSI
 	select IBM_EMAC_RGMII
 	help
 	  This option enables support for the APM APM821xx Evaluation board.
diff --git a/arch/powerpc/sysdev/ppc4xx_msi.c b/arch/powerpc/sysdev/ppc4xx_msi.c
index 1c2d7af..6103908 100644
--- a/arch/powerpc/sysdev/ppc4xx_msi.c
+++ b/arch/powerpc/sysdev/ppc4xx_msi.c
@@ -31,7 +31,7 @@
 #include <asm/prom.h>
 #include <asm/hw_irq.h>
 #include <asm/ppc-pci.h>
-#include <boot/dcr.h>
+#include <asm/dcr.h>
 #include <asm/dcr-regs.h>
 #include <asm/msi_bitmap.h>
 
@@ -43,7 +43,12 @@
 #define PEIH_FLUSH0	0x30
 #define PEIH_FLUSH1	0x38
 #define PEIH_CNTRST	0x48
+
+#ifdef CONFIG_APM821xx
+#define NR_MSI_IRQS	8
+#else
 #define NR_MSI_IRQS	4
+#endif
 
 struct ppc4xx_msi {
 	u32 msi_addr_lo;
@@ -150,12 +155,11 @@ static int ppc4xx_setup_pcieh_hw(struct platform_device *dev,
 	if (!sdr_addr)
 		return -1;
 
-	SDR0_WRITE(sdr_addr, (u64)res.start >> 32);	 /*HIGH addr */
-	SDR0_WRITE(sdr_addr + 1, res.start & 0xFFFFFFFF); /* Low addr */
-
+	mtdcri(SDR0, *sdr_addr, res.start >> 32);	/*HIGH addr */
+	mtdcri(SDR0, *sdr_addr + 1, res.start & 0xFFFFFFFF);/* Low addr */
 
 	msi->msi_dev = of_find_node_by_name(NULL, "ppc4xx-msi");
-	if (msi->msi_dev)
+	if (!msi->msi_dev)
 		return -ENODEV;
 
 	msi->msi_regs = of_iomap(msi->msi_dev, 0);
@@ -167,9 +171,12 @@ static int ppc4xx_setup_pcieh_hw(struct platform_device *dev,
 		(u32) (msi->msi_regs + PEIH_TERMADH), (u32) (msi->msi_regs));
 
 	msi_virt = dma_alloc_coherent(&dev->dev, 64, &msi_phys, GFP_KERNEL);
-	msi->msi_addr_hi = 0x0;
-	msi->msi_addr_lo = (u32) msi_phys;
-	dev_dbg(&dev->dev, "PCIE-MSI: msi address 0x%x\n", msi->msi_addr_lo);
+	if (!msi_virt)
+		return -ENOMEM;
+	msi->msi_addr_hi = (u32)(msi_phys >> 32);
+	msi->msi_addr_lo = (u32)(msi_phys & 0xffffffff);
+	dev_dbg(&dev->dev, "PCIE-MSI: msi address high 0x%x, low 0x%x\n",
+		msi->msi_addr_hi, msi->msi_addr_lo);
 
 	/* Progam the Interrupt handler Termination addr registers */
 	out_be32(msi->msi_regs + PEIH_TERMADH, msi->msi_addr_hi);
@@ -185,6 +192,8 @@ static int ppc4xx_setup_pcieh_hw(struct platform_device *dev,
 	out_be32(msi->msi_regs + PEIH_MSIED, *msi_data);
 	out_be32(msi->msi_regs + PEIH_MSIMK, *msi_mask);
 
+	dma_free_coherent(&dev->dev, 64, msi_virt, msi_phys);
+
 	return 0;
 }
 
@@ -215,8 +224,6 @@ static int __devinit ppc4xx_msi_probe(struct platform_device *dev)
 	struct resource res;
 	int err = 0;
 
-	msi = &ppc4xx_msi;/*keep the msi data for further use*/
-
 	dev_dbg(&dev->dev, "PCIE-MSI: Setting up MSI support...\n");
 
 	msi = kzalloc(sizeof(struct ppc4xx_msi), GFP_KERNEL);
@@ -242,6 +249,7 @@ static int __devinit ppc4xx_msi_probe(struct platform_device *dev)
 		dev_err(&dev->dev, "Error allocating MSI bitmap\n");
 		goto error_out;
 	}
+	ppc4xx_msi = *msi;
 
 	ppc_md.setup_msi_irqs = ppc4xx_setup_msi_irqs;
 	ppc_md.teardown_msi_irqs = ppc4xx_teardown_msi_irqs;
-- 
1.7.3.4

CONFIDENTIALITY NOTICE: This e-mail message, including any attachments, 
is for the sole use of the intended recipient(s) and contains information 
that is confidential and proprietary to AppliedMicro Corporation or its subsidiaries. 
It is to be used solely for the purpose of furthering the parties' business relationship. 
All unauthorized review, use, disclosure or distribution is prohibited. 
If you are not the intended recipient, please contact the sender by reply e-mail 
and destroy all copies of the original message.

^ permalink raw reply related

* [PATCH 2/2] powerpc/44x: Add PCI MSI node for APM821xx SoC and Bluestone board in DTS
From: Mai La @ 2012-02-29  8:47 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Josh Boyer, Matt Porter,
	Tirumala R Marri, Grant Likely, Michael Neuling, Kumar Gala,
	Anton Blanchard, linuxppc-dev, linux-kernel
  Cc: open-source-review, Mai La

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain, Size: 1552 bytes --]


Signed-off-by: Mai La <mla@apm.com>
---
 arch/powerpc/boot/dts/bluestone.dts |   24 ++++++++++++++++++++++++
 1 files changed, 24 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/boot/dts/bluestone.dts b/arch/powerpc/boot/dts/bluestone.dts
index 2a56a0d..8ea6325 100644
--- a/arch/powerpc/boot/dts/bluestone.dts
+++ b/arch/powerpc/boot/dts/bluestone.dts
@@ -250,5 +250,29 @@
 			};
 		};
 
+		MSI: ppc4xx-msi@C10000000 {
+			compatible = "amcc,ppc4xx-msi", "ppc4xx-msi";
+			reg = < 0xC 0x10000000 0x100
+				0xC 0x10000000 0x100>;
+			sdr-base = <0x36C>;
+			msi-data = <0x00004440>;
+			msi-mask = <0x0000ffe0>;
+			interrupts =<0 1 2 3 4 5 6 7>;
+			interrupt-parent = <&MSI>;
+			#interrupt-cells = <1>;
+			#address-cells = <0>;
+			#size-cells = <0>;
+			msi-available-ranges = <0x0 0x100>;
+			interrupt-map = <
+				0 &UIC3 0x18 1
+				1 &UIC3 0x19 1
+				2 &UIC3 0x1A 1
+				3 &UIC3 0x1B 1
+				4 &UIC3 0x1C 1
+				5 &UIC3 0x1D 1
+				6 &UIC3 0x1E 1
+				7 &UIC3 0x1F 1
+			>;
+		};
 	};
 };
-- 
1.7.3.4

CONFIDENTIALITY NOTICE: This e-mail message, including any attachments, 
is for the sole use of the intended recipient(s) and contains information 
that is confidential and proprietary to AppliedMicro Corporation or its subsidiaries. 
It is to be used solely for the purpose of furthering the parties' business relationship. 
All unauthorized review, use, disclosure or distribution is prohibited. 
If you are not the intended recipient, please contact the sender by reply e-mail 
and destroy all copies of the original message.

^ permalink raw reply related

* Re: [PATCH v5 00/21] EEH reorganization
From: Gavin Shan @ 2012-02-29  3:04 UTC (permalink / raw)
  To: benh; +Cc: linuxppc-dev
In-Reply-To: <1330409051-8941-1-git-send-email-shangw@linux.vnet.ibm.com>

Hi Ben,

Could you pls take a look on this when you have time?

Thanks,
Gavin

> This series of patches is going to reorganize EEH so that it could support
> multiple platforms in future. The requirements were raised from the aspects.
> 
> 	* The original EEH implementation only support pSeries platform, which
> 	  would be regarded as guest system. Platform powernv is coming and EEH
> 	  needs to be supported on powernv as well.
> 	* Different platforms might be running based on variable firmware.Further
> 	  more, the firmware would supply different EEH interfaces to kernel.
> 	  Therefore, we have to do necessary abstraction on current EEH implementation.
> 
> In order to accomodate the requirements, the series of patches have reorganized
> current EEH implementation.
> 
> 	* The original implementation looks not clean enough. Necessary cleanup
> 	  will be done in some of the patches.
> 	* struct eeh_ops has been introduced so that EEH core components and platform
> 	  dependent implementation could be split up. That make it possible for EEH
> 	  to be supported on multiple platforms.
> 	* struct eeh_dev has been introduced to replace struct pci_dn so that EEH module
> 	  works independently as much as possible.
> 	* EEH global statistics will be maintained in a collective fashion.
> 
> v1 -> v2:
> 
> 	* If possible, to add "eeh_" prefix for function names.
> 	* The format of leading function comments won't be changed in order not to
> 	  break kernel document automatic generation (e.g. by "make pdfdocs").
> 	* The name of local variables won't be changed if there're no explicit reasons.
> 	* Represent the PE's state in bitmap fasion.
> 	* Some function names have been adjusted so that they look shorter and
> 	  meaningful.
> 	* Platform operation name has been changed to "pseries".
> 	* Merge those patches for cleanup if possible.
> 	* The line length is kept as appropriately short if possible.
> 	* Fixup on alignment & spacing issues.
> 
> v2 -> v3:
> 	* Split cleanup patch into 2: one for comment cleanup and another one for
> 	  renaming function names.
> 	* Try to use pr_warning/pr_info/pr_debug instead of printk() function call.
> 	* Function names are adjusted a little bit so that they looks more meaningful
> 	  according to comments from Michael/Ben.
> 	* Useful comment has been kept according to Michael's comments.
> 	* struct eeh_ops::set_eeh has been changed to eeh_ops::set_option.
> 	* struct eeh_ops::name has been changed to "char *".
> 	* Remove file name from the source file.
> 	* Copyright (C) format has been changed since "(C)" isn't encouraged to use.
> 	* The header files included in the source file have been sorted alphabetically.
> 	* eeh_platform_init() has been replaced by eeh_pseries_init() to avoid duplicate
> 	  functions when kernel supports multiple platforms.
> 	* "F/W" has been changed to "Firmware".
> 	* The maximal wait time to retrieve PE's state has been covered by macro.
> 	* It also include changes according to the minor comments from Michael.
> 
> v3 -> v4:
> 	* Fix some typo included in the commit messages.
> 	* Reduce code nesting according to Ram's suggestions.
> 	* Addtinal pr_warning on failure of configuring bridges.
> 
> v4 -> v5:
> 	* OF node and PCI device are tracing the corresponding eeh device.
> 	  That has been changed to "struct eeh_dev *" instead of the original
> 	  "void *".
> 	* The conversion between OF node, PCI device, eeh device is changed
> 	  to inline functions instead of the original macros.
> 	* The "struct eeh_stats" has been moved from eeh.h to eeh.c. Besides,
> 	  the individual members of the struct have been changed to fixed-type
> 	  "unsigned int". 
> 
> 
> The series of patches (v5) has been verified on Firebird-L machine. In order to carry out
> the test, you have to install IBM Power Tools from IBM internal yum source. Following
> command is used to force EEH check on ethernet interface, which could be recovered eventually
> by EEH and device driver successfully. You could keep pinging to the blade before issuing
> the following command to force EEH. You should see the network interface can't be reached for
> a moment and everything will be recovered couple of seconds after the forced EEH error. At the
> same time, you should see EEH error log out of system console. 
> 
> 	* errinjct eeh -v -f 0 -p U78AE.001.WZS00M9-P1-C18-L1-T2 -a 0x0 -m 0x0
> 
> -----
> 
> arch/powerpc/include/asm/device.h            |    3 +
> arch/powerpc/include/asm/eeh.h               |  134 +++-
> arch/powerpc/include/asm/eeh_event.h         |   33 +-
> arch/powerpc/include/asm/ppc-pci.h           |   89 +--
> arch/powerpc/kernel/of_platform.c            |    3 +
> arch/powerpc/kernel/rtas_pci.c               |    3 +
> arch/powerpc/platforms/pseries/Makefile      |    3 +-
> arch/powerpc/platforms/pseries/eeh.c         | 1044 ++++++++++++--------------
> arch/powerpc/platforms/pseries/eeh_cache.c   |   44 +-
> arch/powerpc/platforms/pseries/eeh_dev.c     |  102 +++
> arch/powerpc/platforms/pseries/eeh_driver.c  |  213 +++---
> arch/powerpc/platforms/pseries/eeh_event.c   |   55 +-
> arch/powerpc/platforms/pseries/eeh_pseries.c |  565 ++++++++++++++
> arch/powerpc/platforms/pseries/eeh_sysfs.c   |   25 +-
> arch/powerpc/platforms/pseries/msi.c         |    2 +-
> arch/powerpc/platforms/pseries/pci_dlpar.c   |    3 +
> arch/powerpc/platforms/pseries/setup.c       |    7 +-
> include/linux/of.h                           |   10 +
> include/linux/pci.h                          |    7 +
> 19 files changed, 1477 insertions(+), 868 deletions(-)
> 
> Thanks,
> Gavin

^ permalink raw reply

* RE: [PATCH V3] fsl-sata: add support for interrupt coalsecing feature
From: Liu Qiang-B32616 @ 2012-02-29  2:54 UTC (permalink / raw)
  To: jgarzik@pobox.com
  Cc: linux-ide@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
	linux-kernel@vger.kernel.org, Li Yang-R58472
In-Reply-To: <CADRPPNTmRvPVymJmW4vagU+2M4DmqeDSEj=4=R_=YDJekOTccQ@mail.gmail.com>

SGkgSmVmZiwNCg0KRG8geW91IHBsYW4gdG8gYXBwbHkgaXQgdG8gdXBzdHJlYW0sIG9yIGFueSBz
dWdnZXN0aW9ucz8gVGhhbmtzLg0KDQo+IC0tLS0tT3JpZ2luYWwgTWVzc2FnZS0tLS0tDQo+IEZy
b206IGxpbnV4LWlkZS1vd25lckB2Z2VyLmtlcm5lbC5vcmcgW21haWx0bzpsaW51eC1pZGUtDQo+
IG93bmVyQHZnZXIua2VybmVsLm9yZ10gT24gQmVoYWxmIE9mIExpIFlhbmcNCj4gU2VudDogV2Vk
bmVzZGF5LCBGZWJydWFyeSAxNSwgMjAxMiAzOjUxIFBNDQo+IFRvOiBMaXUgUWlhbmctQjMyNjE2
DQo+IENjOiBqZ2FyemlrQHBvYm94LmNvbTsgbGludXgtaWRlQHZnZXIua2VybmVsLm9yZzsgbGlu
dXgtDQo+IGtlcm5lbEB2Z2VyLmtlcm5lbC5vcmc7IGxpbnV4cHBjLWRldkBsaXN0cy5vemxhYnMu
b3JnDQo+IFN1YmplY3Q6IFJlOiBbUEFUQ0ggVjNdIGZzbC1zYXRhOiBhZGQgc3VwcG9ydCBmb3Ig
aW50ZXJydXB0IGNvYWxzZWNpbmcNCj4gZmVhdHVyZQ0KPiANCj4gT24gV2VkLCBGZWIgMTUsIDIw
MTIgYXQgMzo0MCBQTSwgUWlhbmcgTGl1IDxxaWFuZy5saXVAZnJlZXNjYWxlLmNvbT4NCj4gd3Jv
dGU6DQo+ID4gQWRkcyBzdXBwb3J0IGZvciBpbnRlcnJ1cHQgY29hbGVzY2luZyBmZWF0dXJlIHRv
IHJlZHVjZSBpbnRlcnJ1cHQNCj4gZXZlbnRzLg0KPiA+IFByb3ZpZGVzIGEgbWVjaGFuaXNtIG9m
IGFkanVzdGluZyBjb2FsZXNjaW5nIGNvdW50IGFuZCB0aW1lb3V0IHRpY2sgYnkNCj4gPiBzeXNm
cyBhdCBydW50aW1lLCBzbyB0aGF0IHRyYWRlb2ZmIG9mIGxhdGVuY3kgYW5kIENQVSBsb2FkIGNh
biBiZSBtYWRlDQo+ID4gZGVwZW5kaW5nIG9uIGRpZmZlcmVudCBhcHBsaWNhdGlvbnMuDQo+ID4N
Cj4gPiBTaWduZWQtb2ZmLWJ5OiBRaWFuZyBMaXUgPHFpYW5nLmxpdUBmcmVlc2NhbGUuY29tPg0K
PiANCj4gQWNrZWQtYnk6IExpIFlhbmcgPGxlb2xpQGZyZWVzY2FsZS5jb20+DQo+IA0KPiAtIExl
bw0KPiAtLQ0KPiBUbyB1bnN1YnNjcmliZSBmcm9tIHRoaXMgbGlzdDogc2VuZCB0aGUgbGluZSAi
dW5zdWJzY3JpYmUgbGludXgtaWRlIiBpbg0KPiB0aGUgYm9keSBvZiBhIG1lc3NhZ2UgdG8gbWFq
b3Jkb21vQHZnZXIua2VybmVsLm9yZyBNb3JlIG1ham9yZG9tbyBpbmZvIGF0DQo+IGh0dHA6Ly92
Z2VyLmtlcm5lbC5vcmcvbWFqb3Jkb21vLWluZm8uaHRtbA0KDQo=

^ permalink raw reply

* Re: [PATCH 20/21] Introduce struct eeh_stats for EEH
From: Gavin Shan @ 2012-02-29  2:25 UTC (permalink / raw)
  To: linuxppc-dev
In-Reply-To: <1330409051-8941-21-git-send-email-shangw@linux.vnet.ibm.com>

With the original EEH implementation, the EEH global statistics
are maintained by individual global variables. That makes the
code a little hard to maintain.

The patch introduces extra struct eeh_stats for the EEH global
statistics so that it can be maintained in collective fashion.

It's the rework on the corresponding v5 patch. According to
the comments from David Laight, the EEH global statistics have
been changed for a litte bit so that they have fixed-type of
"u64". Also, the format used to print them has been changed to
"%llu" based on David's suggestion.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/pseries/eeh.c |   65 ++++++++++++++++++++--------------
 1 files changed, 38 insertions(+), 27 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/eeh.c b/arch/powerpc/platforms/pseries/eeh.c
index 9b1fd0c..753ec8a 100644
--- a/arch/powerpc/platforms/pseries/eeh.c
+++ b/arch/powerpc/platforms/pseries/eeh.c
@@ -102,14 +102,22 @@ static DEFINE_RAW_SPINLOCK(confirm_error_lock);
 #define EEH_PCI_REGS_LOG_LEN 4096
 static unsigned char pci_regs_buf[EEH_PCI_REGS_LOG_LEN];
 
-/* System monitoring statistics */
-static unsigned long no_device;
-static unsigned long no_dn;
-static unsigned long no_cfg_addr;
-static unsigned long ignored_check;
-static unsigned long total_mmio_ffs;
-static unsigned long false_positives;
-static unsigned long slot_resets;
+/*
+ * The struct is used to maintain the EEH global statistic
+ * information. Besides, the EEH global statistics will be
+ * exported to user space through procfs
+ */
+struct eeh_stats {
+	u64 no_device;		/* PCI device not found		*/
+	u64 no_dn;		/* OF node not found		*/
+	u64 no_cfg_addr;	/* Config address not found	*/
+	u64 ignored_check;	/* EEH check skipped		*/
+	u64 total_mmio_ffs;	/* Total EEH checks		*/
+	u64 false_positives;	/* Unnecessary EEH checks	*/
+	u64 slot_resets;	/* PE reset			*/
+};
+
+static struct eeh_stats eeh_stats;
 
 #define IS_BRIDGE(class_code) (((class_code)<<16) == PCI_BASE_CLASS_BRIDGE)
 
@@ -392,13 +400,13 @@ int eeh_dn_check_failure(struct device_node *dn, struct pci_dev *dev)
 	int rc = 0;
 	const char *location;
 
-	total_mmio_ffs++;
+	eeh_stats.total_mmio_ffs++;
 
 	if (!eeh_subsystem_enabled)
 		return 0;
 
 	if (!dn) {
-		no_dn++;
+		eeh_stats.no_dn++;
 		return 0;
 	}
 	dn = eeh_find_device_pe(dn);
@@ -407,14 +415,14 @@ int eeh_dn_check_failure(struct device_node *dn, struct pci_dev *dev)
 	/* Access to IO BARs might get this far and still not want checking. */
 	if (!(edev->mode & EEH_MODE_SUPPORTED) ||
 	    edev->mode & EEH_MODE_NOCHECK) {
-		ignored_check++;
+		eeh_stats.ignored_check++;
 		pr_debug("EEH: Ignored check (%x) for %s %s\n",
 			edev->mode, eeh_pci_name(dev), dn->full_name);
 		return 0;
 	}
 
 	if (!edev->config_addr && !edev->pe_config_addr) {
-		no_cfg_addr++;
+		eeh_stats.no_cfg_addr++;
 		return 0;
 	}
 
@@ -460,13 +468,13 @@ int eeh_dn_check_failure(struct device_node *dn, struct pci_dev *dev)
 	    (ret == EEH_STATE_NOT_SUPPORT) ||
 	    (ret & (EEH_STATE_MMIO_ACTIVE | EEH_STATE_DMA_ACTIVE)) ==
 	    (EEH_STATE_MMIO_ACTIVE | EEH_STATE_DMA_ACTIVE)) {
-		false_positives++;
+		eeh_stats.false_positives++;
 		edev->false_positives ++;
 		rc = 0;
 		goto dn_unlock;
 	}
 
-	slot_resets++;
+	eeh_stats.slot_resets++;
  
 	/* Avoid repeated reports of this failure, including problems
 	 * with other functions on this device, and functions under
@@ -513,7 +521,7 @@ unsigned long eeh_check_failure(const volatile void __iomem *token, unsigned lon
 	addr = eeh_token_to_phys((unsigned long __force) token);
 	dev = pci_addr_cache_get_device(addr);
 	if (!dev) {
-		no_device++;
+		eeh_stats.no_device++;
 		return val;
 	}
 
@@ -1174,21 +1182,24 @@ static int proc_eeh_show(struct seq_file *m, void *v)
 {
 	if (0 == eeh_subsystem_enabled) {
 		seq_printf(m, "EEH Subsystem is globally disabled\n");
-		seq_printf(m, "eeh_total_mmio_ffs=%ld\n", total_mmio_ffs);
+		seq_printf(m, "eeh_total_mmio_ffs=%llu\n", eeh_stats.total_mmio_ffs);
 	} else {
 		seq_printf(m, "EEH Subsystem is enabled\n");
 		seq_printf(m,
-				"no device=%ld\n"
-				"no device node=%ld\n"
-				"no config address=%ld\n"
-				"check not wanted=%ld\n"
-				"eeh_total_mmio_ffs=%ld\n"
-				"eeh_false_positives=%ld\n"
-				"eeh_slot_resets=%ld\n",
-				no_device, no_dn, no_cfg_addr, 
-				ignored_check, total_mmio_ffs, 
-				false_positives,
-				slot_resets);
+				"no device           =%llu\n"
+				"no device node      =%llu\n"
+				"no config address   =%llu\n"
+				"check not wanted    =%llu\n"
+				"eeh_total_mmio_ffs  =%llu\n"
+				"eeh_false_positives =%llu\n"
+				"eeh_slot_resets     =%llu\n",
+				eeh_stats.no_device,
+				eeh_stats.no_dn,
+				eeh_stats.no_cfg_addr,
+				eeh_stats.ignored_check,
+				eeh_stats.total_mmio_ffs,
+				eeh_stats.false_positives,
+				eeh_stats.slot_resets);
 	}
 
 	return 0;
-- 
1.7.5.4

 

^ permalink raw reply related

* [PATCH] KVM: PPC: Don't sync timebase when inside KVM
From: Alexander Graf @ 2012-02-29  2:16 UTC (permalink / raw)
  To: kvm-ppc; +Cc: Scott Wood, linuxppc-dev, kvm

When we know that we're running inside of a KVM guest, we don't have to
worry about synchronizing timebases between different CPUs, since the
host already took care of that.

This fixes CPU overcommit scenarios where vCPUs could hang forever trying
to sync each other while not being scheduled.

Reported-by: Stuart Yoder <B08248@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kernel/smp.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 46695fe..670b453 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -49,6 +49,8 @@
 #ifdef CONFIG_PPC64
 #include <asm/paca.h>
 #endif
+#include <linux/kvm_para.h>
+#include <asm/kvm_para.h>
 
 #ifdef DEBUG
 #include <asm/udbg.h>
@@ -541,7 +543,7 @@ int __cpuinit __cpu_up(unsigned int cpu)
 
 	DBG("Processor %u found.\n", cpu);
 
-	if (smp_ops->give_timebase)
+	if (!kvm_para_available() && smp_ops->give_timebase)
 		smp_ops->give_timebase();
 
 	/* Wait until cpu puts itself in the online map */
@@ -626,7 +628,7 @@ void __devinit start_secondary(void *unused)
 
 	if (smp_ops->setup_cpu)
 		smp_ops->setup_cpu(cpu);
-	if (smp_ops->take_timebase)
+	if (!kvm_para_available() && smp_ops->take_timebase)
 		smp_ops->take_timebase();
 
 	secondary_cpu_time_init();
-- 
1.6.0.2

^ permalink raw reply related

* Re: [PATCH 20/21] Introduce struct eeh_stats for EEH
From: Gavin Shan @ 2012-02-29  1:08 UTC (permalink / raw)
  To: David Laight, linuxppc-dev
In-Reply-To: <AE90C24D6B3A694183C094C60CF0A2F6026B6E82@saturn3.aculab.com>

>  
> > +struct eeh_stats {
> > +	unsigned int no_device;		/* PCI device not found */
> ...
> > +				"no device           =%d\n"
> ...
> 
> Use %u (for all the stats), you really don't want negative
> values printed.

Yes. 

> I've NFI how long wrapping these counters might take!
> If it is feasable (maybe much above 100Hz) then you
> need 64bit counters.
> 

I think it's better to use "u64" here ;-)

> 	David
> 

Thanks,
Gavin
 

^ permalink raw reply

* [PATCH 35/38] KVM: PPC: booke: Support perfmon interrupts
From: Alexander Graf @ 2012-02-29  0:10 UTC (permalink / raw)
  To: kvm-ppc; +Cc: Scott Wood, linuxppc-dev, kvm
In-Reply-To: <1330474206-14794-1-git-send-email-agraf@suse.de>

When during guest context we get a performance monitor interrupt, we
currently bail out and oops. Let's route it to its correct handler
instead.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/booke.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 7df3f3a..ee39c8a 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -679,6 +679,10 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
 		r = RESUME_GUEST;
 		break;
 
+	case BOOKE_INTERRUPT_PERFORMANCE_MONITOR:
+		r = RESUME_GUEST;
+		break;
+
 	case BOOKE_INTERRUPT_HV_PRIV:
 		r = emulation_exit(run, vcpu);
 		break;
-- 
1.6.0.2

^ permalink raw reply related

* [PATCH 37/38] KVM: PPC: booke: Reinject performance monitor interrupts
From: Alexander Graf @ 2012-02-29  0:10 UTC (permalink / raw)
  To: kvm-ppc; +Cc: Scott Wood, linuxppc-dev, kvm
In-Reply-To: <1330474206-14794-1-git-send-email-agraf@suse.de>

When we get a performance monitor interrupt, we need to make sure that
the host receives it. So reinject it like we reinject the other host
destined interrupts.

Signed-off-by: Alexander Graf <agraf@suse.de>

---

v2 -> v3:

  - call regs sync directly
---
 arch/powerpc/include/asm/hw_irq.h |    1 +
 arch/powerpc/kvm/booke.c          |    4 ++++
 2 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h
index bb712c9..904e66c 100644
--- a/arch/powerpc/include/asm/hw_irq.h
+++ b/arch/powerpc/include/asm/hw_irq.h
@@ -12,6 +12,7 @@
 #include <asm/processor.h>
 
 extern void timer_interrupt(struct pt_regs *);
+extern void performance_monitor_exception(struct pt_regs *regs);
 
 #ifdef CONFIG_PPC64
 #include <asm/paca.h>
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 488936b..8e8aa4c 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -634,6 +634,10 @@ static void kvmppc_restart_interrupt(struct kvm_vcpu *vcpu,
 	case BOOKE_INTERRUPT_MACHINE_CHECK:
 		/* FIXME */
 		break;
+	case BOOKE_INTERRUPT_PERFORMANCE_MONITOR:
+		kvmppc_fill_pt_regs(&regs);
+		performance_monitor_exception(&regs);
+		break;
 	}
 }
 
-- 
1.6.0.2

^ permalink raw reply related

* [PATCH 38/38] KVM: PPC: Booke: only prepare to enter when we enter
From: Alexander Graf @ 2012-02-29  0:10 UTC (permalink / raw)
  To: kvm-ppc; +Cc: Scott Wood, linuxppc-dev, kvm
In-Reply-To: <1330474206-14794-1-git-send-email-agraf@suse.de>

So far, we've always called prepare_to_enter even when all we did was return
to the host. This patch changes that semantic to only call prepare_to_enter
when we actually want to get back into the guest.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/booke.c |   18 ++++++++++--------
 1 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 8e8aa4c..9f27258 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -464,7 +464,7 @@ int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
  *
  * returns !0 if a signal is pending and check_signal is true
  */
-static int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu, bool check_signal)
+static int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
 {
 	int r = 0;
 
@@ -477,7 +477,7 @@ static int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu, bool check_signal)
 			continue;
 		}
 
-		if (check_signal && signal_pending(current)) {
+		if (signal_pending(current)) {
 			r = 1;
 			break;
 		}
@@ -509,7 +509,7 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
 	}
 
 	local_irq_disable();
-	if (kvmppc_prepare_to_enter(vcpu, true)) {
+	if (kvmppc_prepare_to_enter(vcpu)) {
 		kvm_run->exit_reason = KVM_EXIT_INTR;
 		ret = -EINTR;
 		goto out;
@@ -946,11 +946,13 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
 	 * To avoid clobbering exit_reason, only check for signals if we
 	 * aren't already exiting to userspace for some other reason.
 	 */
-	local_irq_disable();
-	if (kvmppc_prepare_to_enter(vcpu, !(r & RESUME_HOST))) {
-		run->exit_reason = KVM_EXIT_INTR;
-		r = (-EINTR << 2) | RESUME_HOST | (r & RESUME_FLAG_NV);
-		kvmppc_account_exit(vcpu, SIGNAL_EXITS);
+	if (!(r & RESUME_HOST)) {
+		local_irq_disable();
+		if (kvmppc_prepare_to_enter(vcpu)) {
+			run->exit_reason = KVM_EXIT_INTR;
+			r = (-EINTR << 2) | RESUME_HOST | (r & RESUME_FLAG_NV);
+			kvmppc_account_exit(vcpu, SIGNAL_EXITS);
+		}
 	}
 
 	return r;
-- 
1.6.0.2

^ permalink raw reply related

* [PATCH 34/38] KVM: PPC: e500: fix typo in tlb code
From: Alexander Graf @ 2012-02-29  0:10 UTC (permalink / raw)
  To: kvm-ppc; +Cc: Scott Wood, linuxppc-dev, kvm
In-Reply-To: <1330474206-14794-1-git-send-email-agraf@suse.de>

The tlbncfg registers should be populated with their respective TLB's
values. Fix the obvious typo.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/e500_tlb.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/e500_tlb.c b/arch/powerpc/kvm/e500_tlb.c
index 279e10a..e05232b 100644
--- a/arch/powerpc/kvm/e500_tlb.c
+++ b/arch/powerpc/kvm/e500_tlb.c
@@ -1268,8 +1268,8 @@ int kvmppc_e500_tlb_init(struct kvmppc_vcpu_e500 *vcpu_e500)
 
 	vcpu->arch.tlbcfg[1] = mfspr(SPRN_TLB1CFG) &
 			     ~(TLBnCFG_N_ENTRY | TLBnCFG_ASSOC);
-	vcpu->arch.tlbcfg[0] |= vcpu_e500->gtlb_params[1].entries;
-	vcpu->arch.tlbcfg[0] |=
+	vcpu->arch.tlbcfg[1] |= vcpu_e500->gtlb_params[1].entries;
+	vcpu->arch.tlbcfg[1] |=
 		vcpu_e500->gtlb_params[1].ways << TLBnCFG_ASSOC_SHIFT;
 
 	return 0;
-- 
1.6.0.2

^ permalink raw reply related

* [PATCH 36/38] KVM: PPC: booke: expose good state on irq reinject
From: Alexander Graf @ 2012-02-29  0:10 UTC (permalink / raw)
  To: kvm-ppc; +Cc: Scott Wood, linuxppc-dev, kvm
In-Reply-To: <1330474206-14794-1-git-send-email-agraf@suse.de>

When reinjecting an interrupt into the host interrupt handler after we're
back in host kernel land, we need to tell the kernel where the interrupt
happened. We can't tell it that we were in guest state, because that might
lead to random code walking host addresses. So instead, we tell it that
we came from the interrupt reinject code.

This helps getting reasonable numbers out of perf.

Signed-off-by: Alexander Graf <agraf@suse.de>

---

v2 -> v3:

  - actually sync host state
  - no need for vcpu in sync
---
 arch/powerpc/kvm/booke.c |   56 +++++++++++++++++++++++++++++++++------------
 1 files changed, 41 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index ee39c8a..488936b 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -595,37 +595,63 @@ static int emulation_exit(struct kvm_run *run, struct kvm_vcpu *vcpu)
 	}
 }
 
-/**
- * kvmppc_handle_exit
- *
- * Return value is in the form (errcode<<2 | RESUME_FLAG_HOST | RESUME_FLAG_NV)
- */
-int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
-                       unsigned int exit_nr)
+static void kvmppc_fill_pt_regs(struct pt_regs *regs)
 {
-	int r = RESUME_HOST;
+	ulong r1, ip, msr, lr;
+
+	asm("mr %0, 1" : "=r"(r1));
+	asm("mflr %0" : "=r"(lr));
+	asm("mfmsr %0" : "=r"(msr));
+	asm("bl 1f; 1: mflr %0" : "=r"(ip));
+
+	memset(regs, 0, sizeof(*regs));
+	regs->gpr[1] = r1;
+	regs->nip = ip;
+	regs->msr = msr;
+	regs->link = lr;
+}
 
-	/* update before a new last_exit_type is rewritten */
-	kvmppc_update_timing_stats(vcpu);
+static void kvmppc_restart_interrupt(struct kvm_vcpu *vcpu,
+				     unsigned int exit_nr)
+{
+	struct pt_regs regs;
 
 	switch (exit_nr) {
 	case BOOKE_INTERRUPT_EXTERNAL:
-		do_IRQ(current->thread.regs);
+		kvmppc_fill_pt_regs(&regs);
+		do_IRQ(&regs);
 		break;
-
 	case BOOKE_INTERRUPT_DECREMENTER:
-		timer_interrupt(current->thread.regs);
+		kvmppc_fill_pt_regs(&regs);
+		timer_interrupt(&regs);
 		break;
-
 #if defined(CONFIG_PPC_FSL_BOOK3E) || defined(CONFIG_PPC_BOOK3E_64)
 	case BOOKE_INTERRUPT_DOORBELL:
-		doorbell_exception(current->thread.regs);
+		kvmppc_fill_pt_regs(&regs);
+		doorbell_exception(&regs);
 		break;
 #endif
 	case BOOKE_INTERRUPT_MACHINE_CHECK:
 		/* FIXME */
 		break;
 	}
+}
+
+/**
+ * kvmppc_handle_exit
+ *
+ * Return value is in the form (errcode<<2 | RESUME_FLAG_HOST | RESUME_FLAG_NV)
+ */
+int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
+                       unsigned int exit_nr)
+{
+	int r = RESUME_HOST;
+
+	/* update before a new last_exit_type is rewritten */
+	kvmppc_update_timing_stats(vcpu);
+
+	/* restart interrupts if they were meant for the host */
+	kvmppc_restart_interrupt(vcpu, exit_nr);
 
 	local_irq_enable();
 
-- 
1.6.0.2

^ permalink raw reply related

* [PATCH 10/38] KVM: PPC: e500: Track TLB1 entries with a bitmap
From: Alexander Graf @ 2012-02-29  0:09 UTC (permalink / raw)
  To: kvm-ppc; +Cc: Scott Wood, linuxppc-dev, kvm
In-Reply-To: <1330474206-14794-1-git-send-email-agraf@suse.de>

From: Scott Wood <scottwood@freescale.com>

Rather than invalidate everything when a TLB1 entry needs to be
taken down, keep track of which host TLB1 entries are used for
a given guest TLB1 entry, and invalidate just those entries.

Based on code from Ashish Kalra <Ashish.Kalra@freescale.com>
and Liu Yu <yu.liu@freescale.com>.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/e500.h     |    5 +++
 arch/powerpc/kvm/e500_tlb.c |   72 ++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 72 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h
index 34cef08..f4dee55 100644
--- a/arch/powerpc/kvm/e500.h
+++ b/arch/powerpc/kvm/e500.h
@@ -2,6 +2,7 @@
  * Copyright (C) 2008-2011 Freescale Semiconductor, Inc. All rights reserved.
  *
  * Author: Yu Liu <yu.liu@freescale.com>
+ *         Ashish Kalra <ashish.kalra@freescale.com>
  *
  * Description:
  * This file is based on arch/powerpc/kvm/44x_tlb.h and
@@ -25,6 +26,7 @@
 
 #define E500_TLB_VALID 1
 #define E500_TLB_DIRTY 2
+#define E500_TLB_BITMAP 4
 
 struct tlbe_ref {
 	pfn_t pfn;
@@ -82,6 +84,9 @@ struct kvmppc_vcpu_e500 {
 	struct page **shared_tlb_pages;
 	int num_shared_tlb_pages;
 
+	u64 *g2h_tlb1_map;
+	unsigned int *h2g_tlb1_rmap;
+
 #ifdef CONFIG_KVM_E500
 	u32 pid[E500_PID_NUM];
 
diff --git a/arch/powerpc/kvm/e500_tlb.c b/arch/powerpc/kvm/e500_tlb.c
index 9925fc6..c8ce51d 100644
--- a/arch/powerpc/kvm/e500_tlb.c
+++ b/arch/powerpc/kvm/e500_tlb.c
@@ -2,6 +2,7 @@
  * Copyright (C) 2008-2011 Freescale Semiconductor, Inc. All rights reserved.
  *
  * Author: Yu Liu, yu.liu@freescale.com
+ *         Ashish Kalra, ashish.kalra@freescale.com
  *
  * Description:
  * This file is based on arch/powerpc/kvm/44x_tlb.c,
@@ -175,8 +176,28 @@ static void inval_gtlbe_on_host(struct kvmppc_vcpu_e500 *vcpu_e500,
 	struct kvm_book3e_206_tlb_entry *gtlbe =
 		get_entry(vcpu_e500, tlbsel, esel);
 
-	if (tlbsel == 1) {
-		kvmppc_e500_tlbil_all(vcpu_e500);
+	if (tlbsel == 1 &&
+	    vcpu_e500->gtlb_priv[1][esel].ref.flags & E500_TLB_BITMAP) {
+		u64 tmp = vcpu_e500->g2h_tlb1_map[esel];
+		int hw_tlb_indx;
+		unsigned long flags;
+
+		local_irq_save(flags);
+		while (tmp) {
+			hw_tlb_indx = __ilog2_u64(tmp & -tmp);
+			mtspr(SPRN_MAS0,
+			      MAS0_TLBSEL(1) |
+			      MAS0_ESEL(to_htlb1_esel(hw_tlb_indx)));
+			mtspr(SPRN_MAS1, 0);
+			asm volatile("tlbwe");
+			vcpu_e500->h2g_tlb1_rmap[hw_tlb_indx] = 0;
+			tmp &= tmp - 1;
+		}
+		mb();
+		vcpu_e500->g2h_tlb1_map[esel] = 0;
+		vcpu_e500->gtlb_priv[1][esel].ref.flags &= ~E500_TLB_BITMAP;
+		local_irq_restore(flags);
+
 		return;
 	}
 
@@ -282,6 +303,16 @@ static inline void kvmppc_e500_ref_release(struct tlbe_ref *ref)
 	}
 }
 
+static void clear_tlb1_bitmap(struct kvmppc_vcpu_e500 *vcpu_e500)
+{
+	if (vcpu_e500->g2h_tlb1_map)
+		memset(vcpu_e500->g2h_tlb1_map,
+		       sizeof(u64) * vcpu_e500->gtlb_params[1].entries, 0);
+	if (vcpu_e500->h2g_tlb1_rmap)
+		memset(vcpu_e500->h2g_tlb1_rmap,
+		       sizeof(unsigned int) * host_tlb_params[1].entries, 0);
+}
+
 static void clear_tlb_privs(struct kvmppc_vcpu_e500 *vcpu_e500)
 {
 	int tlbsel = 0;
@@ -511,7 +542,7 @@ static void kvmppc_e500_tlb0_map(struct kvmppc_vcpu_e500 *vcpu_e500,
 /* XXX for both one-one and one-to-many , for now use TLB1 */
 static int kvmppc_e500_tlb1_map(struct kvmppc_vcpu_e500 *vcpu_e500,
 		u64 gvaddr, gfn_t gfn, struct kvm_book3e_206_tlb_entry *gtlbe,
-		struct kvm_book3e_206_tlb_entry *stlbe)
+		struct kvm_book3e_206_tlb_entry *stlbe, int esel)
 {
 	struct tlbe_ref *ref;
 	unsigned int victim;
@@ -524,6 +555,14 @@ static int kvmppc_e500_tlb1_map(struct kvmppc_vcpu_e500 *vcpu_e500,
 	ref = &vcpu_e500->tlb_refs[1][victim];
 	kvmppc_e500_shadow_map(vcpu_e500, gvaddr, gfn, gtlbe, 1, stlbe, ref);
 
+	vcpu_e500->g2h_tlb1_map[esel] |= (u64)1 << victim;
+	vcpu_e500->gtlb_priv[1][esel].ref.flags |= E500_TLB_BITMAP;
+	if (vcpu_e500->h2g_tlb1_rmap[victim]) {
+		unsigned int idx = vcpu_e500->h2g_tlb1_rmap[victim];
+		vcpu_e500->g2h_tlb1_map[idx] &= ~(1ULL << victim);
+	}
+	vcpu_e500->h2g_tlb1_rmap[victim] = esel;
+
 	return victim;
 }
 
@@ -728,7 +767,7 @@ int kvmppc_e500_emul_tlbwe(struct kvm_vcpu *vcpu)
 			 * are mapped on the fly. */
 			stlbsel = 1;
 			sesel = kvmppc_e500_tlb1_map(vcpu_e500, eaddr,
-				    raddr >> PAGE_SHIFT, gtlbe, &stlbe);
+				    raddr >> PAGE_SHIFT, gtlbe, &stlbe, esel);
 			break;
 
 		default:
@@ -856,7 +895,7 @@ void kvmppc_mmu_map(struct kvm_vcpu *vcpu, u64 eaddr, gpa_t gpaddr,
 
 		stlbsel = 1;
 		sesel = kvmppc_e500_tlb1_map(vcpu_e500, eaddr, gfn,
-					     gtlbe, &stlbe);
+					     gtlbe, &stlbe, esel);
 		break;
 	}
 
@@ -872,6 +911,9 @@ static void free_gtlb(struct kvmppc_vcpu_e500 *vcpu_e500)
 {
 	int i;
 
+	clear_tlb1_bitmap(vcpu_e500);
+	kfree(vcpu_e500->g2h_tlb1_map);
+
 	clear_tlb_refs(vcpu_e500);
 	kfree(vcpu_e500->gtlb_priv[0]);
 	kfree(vcpu_e500->gtlb_priv[1]);
@@ -932,6 +974,7 @@ int kvm_vcpu_ioctl_config_tlb(struct kvm_vcpu *vcpu,
 	char *virt;
 	struct page **pages;
 	struct tlbe_priv *privs[2] = {};
+	u64 *g2h_bitmap = NULL;
 	size_t array_len;
 	u32 sets;
 	int num_pages, ret, i;
@@ -993,10 +1036,16 @@ int kvm_vcpu_ioctl_config_tlb(struct kvm_vcpu *vcpu,
 	if (!privs[0] || !privs[1])
 		goto err_put_page;
 
+	g2h_bitmap = kzalloc(sizeof(u64) * params.tlb_sizes[1],
+	                     GFP_KERNEL);
+	if (!g2h_bitmap)
+		goto err_put_page;
+
 	free_gtlb(vcpu_e500);
 
 	vcpu_e500->gtlb_priv[0] = privs[0];
 	vcpu_e500->gtlb_priv[1] = privs[1];
+	vcpu_e500->g2h_tlb1_map = g2h_bitmap;
 
 	vcpu_e500->gtlb_arch = (struct kvm_book3e_206_tlb_entry *)
 		(virt + (cfg->array & (PAGE_SIZE - 1)));
@@ -1129,6 +1178,18 @@ int kvmppc_e500_tlb_init(struct kvmppc_vcpu_e500 *vcpu_e500)
 	if (!vcpu_e500->gtlb_priv[1])
 		goto err;
 
+	vcpu_e500->g2h_tlb1_map = kzalloc(sizeof(unsigned int) *
+					  vcpu_e500->gtlb_params[1].entries,
+					  GFP_KERNEL);
+	if (!vcpu_e500->g2h_tlb1_map)
+		goto err;
+
+	vcpu_e500->h2g_tlb1_rmap = kzalloc(sizeof(unsigned int) *
+					   host_tlb_params[1].entries,
+					   GFP_KERNEL);
+	if (!vcpu_e500->h2g_tlb1_rmap)
+		goto err;
+
 	/* Init TLB configuration register */
 	vcpu->arch.tlbcfg[0] = mfspr(SPRN_TLB0CFG) &
 			     ~(TLBnCFG_N_ENTRY | TLBnCFG_ASSOC);
@@ -1154,6 +1215,7 @@ err:
 void kvmppc_e500_tlb_uninit(struct kvmppc_vcpu_e500 *vcpu_e500)
 {
 	free_gtlb(vcpu_e500);
+	kfree(vcpu_e500->h2g_tlb1_rmap);
 	kfree(vcpu_e500->tlb_refs[0]);
 	kfree(vcpu_e500->tlb_refs[1]);
 }
-- 
1.6.0.2

^ permalink raw reply related

* [PATCH 21/38] KVM: PPC: make e500v2 kvm and e500mc cpu mutually exclusive
From: Alexander Graf @ 2012-02-29  0:09 UTC (permalink / raw)
  To: kvm-ppc; +Cc: Scott Wood, linuxppc-dev, kvm
In-Reply-To: <1330474206-14794-1-git-send-email-agraf@suse.de>

We can't run e500v2 kvm on e500mc kernels, so indicate that by
making the 2 options mutually exclusive in kconfig.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/Kconfig |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig
index 44a998d..f4dacb9 100644
--- a/arch/powerpc/kvm/Kconfig
+++ b/arch/powerpc/kvm/Kconfig
@@ -120,7 +120,7 @@ config KVM_EXIT_TIMING
 
 config KVM_E500V2
 	bool "KVM support for PowerPC E500v2 processors"
-	depends on EXPERIMENTAL && E500
+	depends on EXPERIMENTAL && E500 && !PPC_E500MC
 	select KVM
 	select KVM_MMIO
 	---help---
-- 
1.6.0.2

^ permalink raw reply related

* [PATCH 31/38] KVM: PPC: booke: Readd debug abort code for machine check
From: Alexander Graf @ 2012-02-29  0:09 UTC (permalink / raw)
  To: kvm-ppc; +Cc: Scott Wood, linuxppc-dev, kvm
In-Reply-To: <1330474206-14794-1-git-send-email-agraf@suse.de>

When during guest execution we get a machine check interrupt, we don't
know how to handle it yet. So let's add the error printing code back
again that we dropped accidently earlier and tell user space that something
went really wrong.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/booke.c |    7 ++++++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 11b0625..af02d9d 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -634,7 +634,12 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
 
 	switch (exit_nr) {
 	case BOOKE_INTERRUPT_MACHINE_CHECK:
-		r = RESUME_GUEST;
+		printk("MACHINE CHECK: %lx\n", mfspr(SPRN_MCSR));
+		kvmppc_dump_vcpu(vcpu);
+		/* For debugging, send invalid exit reason to user space */
+		run->hw.hardware_exit_reason = ~1ULL << 32;
+		run->hw.hardware_exit_reason |= mfspr(SPRN_MCSR);
+		r = RESUME_HOST;
 		break;
 
 	case BOOKE_INTERRUPT_EXTERNAL:
-- 
1.6.0.2

^ permalink raw reply related

* [PATCH 32/38] KVM: PPC: booke: add GS documentation for program interrupt
From: Alexander Graf @ 2012-02-29  0:10 UTC (permalink / raw)
  To: kvm-ppc; +Cc: Scott Wood, linuxppc-dev, kvm
In-Reply-To: <1330474206-14794-1-git-send-email-agraf@suse.de>

The comment for program interrupts triggered when using bookehv was
misleading. Update it to mention why MSR_GS indicates that we have
to inject an interrupt into the guest again, not emulate it.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/booke.c |   10 ++++++++--
 1 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index af02d9d..7df3f3a 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -685,8 +685,14 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
 
 	case BOOKE_INTERRUPT_PROGRAM:
 		if (vcpu->arch.shared->msr & (MSR_PR | MSR_GS)) {
-			/* Program traps generated by user-level software must be handled
-			 * by the guest kernel. */
+			/*
+			 * Program traps generated by user-level software must
+			 * be handled by the guest kernel.
+			 *
+			 * In GS mode, hypervisor privileged instructions trap
+			 * on BOOKE_INTERRUPT_HV_PRIV, not here, so these are
+			 * actual program interrupts, handled by the guest.
+			 */
 			kvmppc_core_queue_program(vcpu, vcpu->arch.fault_esr);
 			r = RESUME_GUEST;
 			kvmppc_account_exit(vcpu, USR_PR_INST);
-- 
1.6.0.2

^ permalink raw reply related

* [PATCH 33/38] KVM: PPC: bookehv: remove unused code
From: Alexander Graf @ 2012-02-29  0:10 UTC (permalink / raw)
  To: kvm-ppc; +Cc: Scott Wood, linuxppc-dev, kvm
In-Reply-To: <1330474206-14794-1-git-send-email-agraf@suse.de>

There was some unused code in the exit code path that must have been
a leftover from earlier iterations. While it did no harm, it's superfluous
and thus should be removed.

Signed-off-by: Alexander Graf <agraf@suse.de>

---

v2 -> v3:

  - fix commit message
  - also remove "lwz	r9, VCPU_KVM(r4)" which was as superfluous
---
 arch/powerpc/kvm/bookehv_interrupts.S |    7 -------
 1 files changed, 0 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/kvm/bookehv_interrupts.S b/arch/powerpc/kvm/bookehv_interrupts.S
index 021d087..63fc5f0 100644
--- a/arch/powerpc/kvm/bookehv_interrupts.S
+++ b/arch/powerpc/kvm/bookehv_interrupts.S
@@ -91,10 +91,6 @@
 	PPC_STL	r9, VCPU_TIMING_EXIT_TBU(r4)
 #endif
 
-	.if	\flags & NEED_EMU
-	lwz	r9, VCPU_KVM(r4)
-	.endif
-
 	oris	r8, r6, MSR_CE@h
 #ifdef CONFIG_64BIT
 	std	r6, (VCPU_SHARED_MSR)(r11)
@@ -112,9 +108,6 @@
 	 * appropriate for the exception type).
 	 */
 	cmpw	r6, r8
-	.if	\flags & NEED_EMU
-	lwz	r9, KVM_LPID(r9)
-	.endif
 	beq	1f
 	mfmsr	r7
 	.if	\srr0 != SPRN_MCSRR0 && \srr0 != SPRN_CSRR0
-- 
1.6.0.2

^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox