LinuxPPC-Dev Archive on lore.kernel.org

LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH] powerpc/prom: remove limit on maximum size of properties
From: Nishanth Aravamudan @ 2012-02-27 18:55 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Nishanth Aravamudan, Anton Blanchard, Paul Mackerras,
	linuxppc-dev, Robert Jennings
In-Reply-To: <1330298534.20389.53.camel@pasglop>

Hi Ben,

On 27.02.2012 [10:22:14 +1100], Benjamin Herrenschmidt wrote:
> On Fri, 2012-02-24 at 16:23 -0800, Nishanth Aravamudan wrote:
> > On a 16TB system (using AMS/CMO), I get:
> > 
> > WARNING: ignoring large property [/ibm,dynamic-reconfiguration-memory] ibm,dynamic-memory length 0x000000000017ffec
> > 
> > and significantly less memory is thus shown to the partition. As far as
> > I can tell, the constant used is arbitrary, but bump it up to 2MB, which
> > covers the above property (approximately 1.5MB).
> > 
> > With this patch, the kernel does see all of the system memory on the
> > 16TB system.
> 
> Why not go all the way to either removing the limit, or setting it to
> something much bigger ? That's just asking to break again when we get an
> even bigger system.

Fair point -- sorry, was just trying to get something out the door that
I had tested, since I have limited access to the hardware in question.

> The limit was originally set because of Apple machines carrying ROM
> images in the device-tree, at a time where we were much more memory
> constrained than we are now.
> 
> But even then, it never represented such a large gain and in the end,
> was probably not -that- useful.
> 
> I'd say bump it to something really large like 16M or remove the limit
> alltogether.

So, in terms of removing the limit altogether, is that a literal
statement? I don't know this code particularly well, but only see
references to the constant, at least in prom_init.c:

powerpc/prom: remove limit on maximum size of properties
    
On a 16TB system (using AMS/CMO), I get:
    
WARNING: ignoring large property [/ibm,dynamic-reconfiguration-memory] ibm,dynamic-memory length 0x000000000017ffec
    
and significantly less memory is thus shown to the partition. As far as
I can tell, the constant used is arbitrary. Ben Herrenschmidt provided
additional background that
    
> The limit was originally set because of Apple machines carrying ROM
> images in the device-tree, at a time where we were much more memory
> constrained than we are now.
    
and that it is likely not very useful any longer.
    
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
    
---
With this patch, the kernel should see all of the system memory on the
16TB system, but I'm unable to test due to lack of hardware access. I'm
working on getting access again, but not confident that will happen
soon.

diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
index eca626e..e2d5990 100644
--- a/arch/powerpc/kernel/prom_init.c
+++ b/arch/powerpc/kernel/prom_init.c
@@ -48,14 +48,6 @@
 #include <linux/linux_logo.h>
 
 /*
- * Properties whose value is longer than this get excluded from our
- * copy of the device tree. This value does need to be big enough to
- * ensure that we don't lose things like the interrupt-map property
- * on a PCI-PCI bridge.
- */
-#define MAX_PROPERTY_LENGTH	(1UL * 1024 * 1024)
-
-/*
  * Eventually bump that one up
  */
 #define DEVTREE_CHUNK_SIZE	0x100000
@@ -2273,13 +2265,6 @@ static void __init scan_dt_build_struct(phandle node, unsigned long *mem_start,
 		/* sanity checks */
 		if (l == PROM_ERROR)
 			continue;
-		if (l > MAX_PROPERTY_LENGTH) {
-			prom_printf("WARNING: ignoring large property ");
-			/* It seems OF doesn't null-terminate the path :-( */
-			prom_printf("[%s] ", path);
-			prom_printf("%s length 0x%x\n", RELOC(pname), l);
-			continue;
-		}
 
 		/* push property head */
 		dt_push_token(OF_DT_PROP, mem_start, mem_end);

-- 
Nishanth Aravamudan <nacc@us.ibm.com>
IBM Linux Technology Center

^ permalink raw reply related

* Re: [PATCH 1/2] powerpc/e500: make load_up_spe a normal fuction
From: Scott Wood @ 2012-02-27 19:19 UTC (permalink / raw)
  To: Olivia Yin; +Cc: Liu Yu, linuxppc-dev, kvm, kvm-ppc
In-Reply-To: <1330340388-30296-1-git-send-email-hong-hua.yin@freescale.com>

On 02/27/2012 04:59 AM, Olivia Yin wrote:
> So that we can call it in kernel.
> 
> Signed-off-by: Liu Yu <yu.liu@freescale.com>

Explain why we want this, and point out that this makes it similar to
load_up_fpu.

> ---
>  arch/powerpc/kernel/head_fsl_booke.S |   23 ++++++-----------------
>  1 files changed, 6 insertions(+), 17 deletions(-)

When posting a patch authored by someone else, more or less unchanged,
you should put a From: line in the body of the e-mail.

git send-email will do this automatically if you preserve the authorship
in the git commit.

Also, you should add your own Signed-off-by.

-Scott

^ permalink raw reply

* Re: [PATCH 24/37] KVM: PPC: booke: rework rescheduling checks
From: Scott Wood @ 2012-02-27 19:28 UTC (permalink / raw)
  To: Alexander Graf; +Cc: linuxppc-dev, kvm, kvm-ppc
In-Reply-To: <1330093591-19523-25-git-send-email-agraf@suse.de>

On 02/24/2012 08:26 AM, Alexander Graf wrote:
> -void kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
> +int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
>  {
>  	unsigned long *pending = &vcpu->arch.pending_exceptions;
>  	unsigned long old_pending = vcpu->arch.pending_exceptions;
> @@ -283,6 +283,8 @@ void kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
>  
>  	/* Tell the guest about our interrupt status */
>  	kvmppc_update_int_pending(vcpu, *pending, old_pending);
> +
> +	return 0;
>  }
>  
>  pfn_t kvmppc_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn)
> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
> index 9979be1..3fcec2c 100644
> --- a/arch/powerpc/kvm/booke.c
> +++ b/arch/powerpc/kvm/booke.c
> @@ -439,8 +439,9 @@ static void kvmppc_core_check_exceptions(struct kvm_vcpu *vcpu)
>  }
>  
>  /* Check pending exceptions and deliver one, if possible. */
> -void kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
> +int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
>  {
> +	int r = 0;
>  	WARN_ON_ONCE(!irqs_disabled());
>  
>  	kvmppc_core_check_exceptions(vcpu);
> @@ -451,8 +452,44 @@ void kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
>  		local_irq_disable();
>  
>  		kvmppc_set_exit_type(vcpu, EMULATED_MTMSRWE_EXITS);
> -		kvmppc_core_check_exceptions(vcpu);
> +		r = 1;
>  	};
> +
> +	return r;
> +}
> +
> +/*
> + * Common checks before entering the guest world.  Call with interrupts
> + * disabled.
> + *
> + * returns !0 if a signal is pending and check_signal is true
> + */
> +static int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu, bool check_signal)
> +{
> +	int r = 0;
> +
> +	WARN_ON_ONCE(!irqs_disabled());
> +	while (true) {
> +		if (need_resched()) {
> +			local_irq_enable();
> +			cond_resched();
> +			local_irq_disable();
> +			continue;
> +		}
> +
> +		if (kvmppc_core_prepare_to_enter(vcpu)) {
> +			/* interrupts got enabled in between, so we
> +			   are back at square 1 */
> +			continue;
> +		}
> +
> +
> +		if (check_signal && signal_pending(current))
> +			r = 1;

If there is a signal pending and MSR[WE] is set, we'll loop forever
without reaching this check.

-Scott

^ permalink raw reply

* Re: [PATCH 36/37] KVM: PPC: booke: expose guest registers on irq reinject
From: Scott Wood @ 2012-02-27 19:45 UTC (permalink / raw)
  To: Alexander Graf; +Cc: linuxppc-dev, kvm, kvm-ppc
In-Reply-To: <810950A7-CD5A-46A5-865B-1209A5532DCC@suse.de>

On 02/26/2012 05:59 AM, Alexander Graf wrote:
> 
> On 25.02.2012, at 00:40, Scott Wood wrote:
> 
>> On 02/24/2012 08:26 AM, Alexander Graf wrote:
>>> +static void kvmppc_fill_pt_regs(struct kvm_vcpu *vcpu, struct pt_regs *regs)
>>> {
>>> -	int r = RESUME_HOST;
>>> +	int i;
>>>
>>> -	/* update before a new last_exit_type is rewritten */
>>> -	kvmppc_update_timing_stats(vcpu);
>>> +	for (i = 0; i < 32; i++)
>>> +		regs->gpr[i] = kvmppc_get_gpr(vcpu, i);
>>> +	regs->nip = vcpu->arch.pc;
>>> +	regs->msr = vcpu->arch.shared->msr;
>>> +	regs->ctr = vcpu->arch.ctr;
>>> +	regs->link = vcpu->arch.lr;
>>> +	regs->xer = kvmppc_get_xer(vcpu);
>>> +	regs->ccr = kvmppc_get_cr(vcpu);
>>> +	regs->dar = get_guest_dear(vcpu);
>>> +	regs->dsisr = get_guest_esr(vcpu);
>>> +}
>>
>> How much overhead does this add to every interrupt?  Can't we keep this
>> to the minimum that perf cares about?
> 
> I would rather not make assumptions on what perf cares about - maybe we want to one day implement "perf kvm" and then perf could rely on pretty much anything in there.

In that case I think we should be populating a real pt_regs from the
start, as in my original patchset.

I only agreed to take it out because I thought the set of things we'd
copy would be minimal.  This seems like a lot of overhead.

I'm not familiar with "perf kvm", but if it's kvm-specific surely the
KVM code should know/dictate what it can rely on?  Or maybe there can be
a debug option that enables full pt_regs (similar to exit timing)?

Could we just set regs to NULL when the debug option isn't enabled?

>>> +static void kvmppc_restart_interrupt(struct kvm_vcpu *vcpu,
>>> +				     unsigned int exit_nr)
>>> +{
>>> +	struct pt_regs regs = *current->thread.regs;
>>>
>>> +	kvmppc_fill_pt_regs(vcpu, &regs);
>>
>> Why are you copying out of current->thread.regs?  That's old junk data,
>> set by some previous exception and possibly overwritten since.
> 
> Because it gives us good default values for anything we don't set. Do you have other recommendations?

It does not give good default values for anything.  It is junk,
unallocated memory, overwritten by who knows what.  Same as the memory
you're copying to.

To avoid garbage in fields we don't set, fill it with zeroes first.

-Scott

^ permalink raw reply

* [PATCH net-next] ibm emac: delete module references; phy.c only supported as built-in
From: Paul Gortmaker @ 2012-02-27 19:47 UTC (permalink / raw)
  To: netdev; +Cc: linuxppc-dev, Paul Gortmaker

The Makefile has it as "ibm_emac-y := mal.o core.o phy.o" so there is
no way this can be built modular, so remove all references to module
support.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

diff --git a/drivers/net/ethernet/ibm/emac/phy.c b/drivers/net/ethernet/ibm/emac/phy.c
index d3b9d10..fb96387 100644
--- a/drivers/net/ethernet/ibm/emac/phy.c
+++ b/drivers/net/ethernet/ibm/emac/phy.c
@@ -11,13 +11,15 @@
  * Copyright 2007 Benjamin Herrenschmidt, IBM Corp.
  *                <benh@kernel.crashing.org>
  *
+ * Originally listed as MODULE_LICENSE("GPL") in the code, but that
+ * was removed, since this code is only supported as non-modular.
+ *
  * Based on the arch/ppc version of the driver:
  *
  * (c) 2003, Benjamin Herrenscmidt (benh@kernel.crashing.org)
  * (c) 2004-2005, Eugene Surovegin <ebs@ebshome.net>
  *
  */
-#include <linux/module.h>
 #include <linux/kernel.h>
 #include <linux/types.h>
 #include <linux/netdevice.h>
@@ -537,5 +539,3 @@ int emac_mii_phy_probe(struct mii_phy *phy, int address)
 
 	return 0;
 }
-
-MODULE_LICENSE("GPL");
-- 
1.7.9.1

^ permalink raw reply related

* Re: [PATCH v6 4/4] KVM: PPC: epapr: Update other hypercall invoking
From: Scott Wood @ 2012-02-27 19:50 UTC (permalink / raw)
  To: Liu Yu; +Cc: linuxppc-dev, B07421, agraf, kvm-ppc, kvm
In-Reply-To: <1329988942-19610-4-git-send-email-yu.liu@freescale.com>

On 02/23/2012 03:22 AM, Liu Yu wrote:
> diff --git a/drivers/virt/Kconfig b/drivers/virt/Kconfig
> index 2dcdbc9..99ebdde 100644
> --- a/drivers/virt/Kconfig
> +++ b/drivers/virt/Kconfig
> @@ -15,6 +15,7 @@ if VIRT_DRIVERS
>  config FSL_HV_MANAGER
>  	tristate "Freescale hypervisor management driver"
>  	depends on FSL_SOC
> +	select EPAPR_PARAVIRT
>  	help
>            The Freescale hypervisor management driver provides several services
>  	  to drivers and applications related to the Freescale hypervisor:

What about the byte channel driver, and possibly others?

Grep for the hypercalls to make sure you got everything that uses this.

-Scott

^ permalink raw reply

* Re: [PATCH net-next] ibm emac: delete module references; phy.c only supported as built-in
From: David Miller @ 2012-02-27 19:57 UTC (permalink / raw)
  To: paul.gortmaker; +Cc: netdev, linuxppc-dev
In-Reply-To: <1330372024-30974-1-git-send-email-paul.gortmaker@windriver.com>

From: Paul Gortmaker <paul.gortmaker@windriver.com>
Date: Mon, 27 Feb 2012 14:47:04 -0500

> The Makefile has it as "ibm_emac-y := mal.o core.o phy.o" so there is
> no way this can be built modular, so remove all references to module
> support.

That doesn't mean it's only buildable statically.

"ibm_emacs-y :=" is merely a way to tell make what objects go into ibm_emac.{o,ko}

IBM_EMAC is tristate

^ permalink raw reply

* Re: [PATCH net-next] ibm emac: delete module references; phy.c only supported as built-in
From: Ben Hutchings @ 2012-02-27 19:56 UTC (permalink / raw)
  To: Paul Gortmaker; +Cc: netdev, linuxppc-dev
In-Reply-To: <1330372024-30974-1-git-send-email-paul.gortmaker@windriver.com>

On Mon, 2012-02-27 at 14:47 -0500, Paul Gortmaker wrote:
> The Makefile has it as "ibm_emac-y := mal.o core.o phy.o" so there is
> no way this can be built modular, so remove all references to module
> support.

No, that's nonsense.  You need to look at whether ibm_emac.o can be
added to obj-m (which it can).

Ben.

> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
> 
> diff --git a/drivers/net/ethernet/ibm/emac/phy.c b/drivers/net/ethernet/ibm/emac/phy.c
> index d3b9d10..fb96387 100644
> --- a/drivers/net/ethernet/ibm/emac/phy.c
> +++ b/drivers/net/ethernet/ibm/emac/phy.c
> @@ -11,13 +11,15 @@
>   * Copyright 2007 Benjamin Herrenschmidt, IBM Corp.
>   *                <benh@kernel.crashing.org>
>   *
> + * Originally listed as MODULE_LICENSE("GPL") in the code, but that
> + * was removed, since this code is only supported as non-modular.
> + *
>   * Based on the arch/ppc version of the driver:
>   *
>   * (c) 2003, Benjamin Herrenscmidt (benh@kernel.crashing.org)
>   * (c) 2004-2005, Eugene Surovegin <ebs@ebshome.net>
>   *
>   */
> -#include <linux/module.h>
>  #include <linux/kernel.h>
>  #include <linux/types.h>
>  #include <linux/netdevice.h>
> @@ -537,5 +539,3 @@ int emac_mii_phy_probe(struct mii_phy *phy, int address)
>  
>  	return 0;
>  }
> -
> -MODULE_LICENSE("GPL");

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: [PATCH net-next] ibm emac: delete module references; phy.c only supported as built-in
From: Paul Gortmaker @ 2012-02-27 20:49 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linuxppc-dev
In-Reply-To: <20120227.145753.1111293673513137413.davem@davemloft.net>

[Re: [PATCH net-next] ibm emac: delete module references; phy.c only supported as built-in] On 27/02/2012 (Mon 14:57) David Miller wrote:

> From: Paul Gortmaker <paul.gortmaker@windriver.com>
> Date: Mon, 27 Feb 2012 14:47:04 -0500
> 
> > The Makefile has it as "ibm_emac-y := mal.o core.o phy.o" so there is
> > no way this can be built modular, so remove all references to module
> > support.
> 
> That doesn't mean it's only buildable statically.
> 
> "ibm_emacs-y :=" is merely a way to tell make what objects go into ibm_emac.{o,ko}
> 
> IBM_EMAC is tristate

Argh, yes.  Sorry for the noise.  Since phy.c doesn't call any modular
functionality directly, the patch "worked" and I didn't catch the
error in my reasoning.

Paul.

^ permalink raw reply

* Re: [PATCH v6 1/4] KVM: PPC: epapr: Factor out the epapr init
From: Scott Wood @ 2012-02-27 20:52 UTC (permalink / raw)
  To: Liu Yu; +Cc: linuxppc-dev, B07421, agraf, kvm-ppc, kvm
In-Reply-To: <1329988942-19610-1-git-send-email-yu.liu@freescale.com>

On 02/23/2012 03:22 AM, Liu Yu wrote:
> +static int __init epapr_paravirt_init(void)
> +{
> +	struct device_node *hyper_node;
> +	const u32 *insts;
> +	int len, i;
> +
> +	hyper_node = of_find_node_by_path("/hypervisor");
> +	if (!hyper_node)
> +		return -ENODEV;
> +
> +	insts = of_get_property(hyper_node, "hcall-instructions", &len);
> +	if (!insts)
> +		return 0;

-ENODEV here too.

> +	if (!(len % 4) && len <= (4 * 4)) {
> +		for (i = 0; i < (len / 4); i++)
> +			patch_instruction(epapr_hypercall_start + i, insts[i]);
> +
> +		epapr_paravirt_enabled = true;
> +	} else {
> +		printk(KERN_WARNING
> +		       "ePAPR paravirt: hcall-instructions format error\n");
> +	}

Do this:

if (error) {
	print error
	return error code
}

continue with function

Not this:

if (!error) {
	continue with function
} else {
	report the error from several lines back
}

> @@ -33,6 +34,14 @@ config KVM_GUEST
>  
>  	  In case of doubt, say Y
>  
> +config EPAPR_PARAVIRT
> +	bool "ePAPR para-virtualization support"
> +	default n
> +	help
> +	  Used to enalbe ePAPR complied para-virtualization support for guest.
> +
> +	  In case of doubt, say Y

s/Used to enalbe/Enable/

s/complied/compliant/ (or just s/complied//)

-Scott

^ permalink raw reply

* Re: [PATCH 1/2] powerpc: Move GE GPIO and PIC drivers
From: Benjamin Herrenschmidt @ 2012-02-27 21:32 UTC (permalink / raw)
  To: Martyn Welch; +Cc: Wim Van Sebroeck, linuxppc-dev, linux-kernel
In-Reply-To: <4F4B8BB2.1060004@ge.com>

On Mon, 2012-02-27 at 13:57 +0000, Martyn Welch wrote:

> This patch (or one like it) has been around for a while now. Kumar wanted me
> to put them here rather than sysdev[1], but I'm easy either way.

Ah well, I disagree with Kumar here :-) One thing you can do is put all
your platforms files including the drivers in platforms/ge, that works
too. What I don't want is to have stray files at the top-level of
platforms.

> > Also, use git mv so that the file moves appear as such in the history,
> > this will make review easier by clearly separating the move from actual
> > changes to the files.
> > 
> 
> Hmm, thought I'd done that. Will try again.

Could be the way you exported the patch, I don't remember the git option
off the top of my head but there's a way to make it show the moves as
such.

Cheers,
Ben.

^ permalink raw reply

* Re: [PATCH 09/24] PCI, powerpc: Register busn_res for root buses
From: Bjorn Helgaas @ 2012-02-27 22:44 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: linux-arch, Tony Luck, linuxppc-dev, linux-kernel,
	Dominik Brodowski, Paul Mackerras, Jesse Barnes, linux-pci,
	Andrew Morton, Linus Torvalds
In-Reply-To: <CAE9FiQXx6wXaes1pbpM7SAHzndgr=N2D2g6LEWsB2_pf8+mgSA@mail.gmail.com>

On Sat, Feb 25, 2012 at 12:47 AM, Yinghai Lu <yinghai@kernel.org> wrote:
> On Fri, Feb 24, 2012 at 2:24 PM, Jesse Barnes <jbarnes@virtuousgeek.org> =
wrote:
>> On Thu, 23 Feb 2012 12:51:30 -0800
>> Bjorn Helgaas <bhelgaas@google.com> wrote:
>>> 2) We already have a way to add resources to a root bus: the
>>> pci_add_resource() used to add I/O port and MMIO apertures. =A0I think
>>> it'd be a lot simpler to just use that same interface for the bus
>>> number aperture, e.g.,
>>>
>>> =A0 =A0 pci_add_resource(&resources, hose->io_space);
>>> =A0 =A0 pci_add_resource(&resources, hose->mem_space);
>>> =A0 =A0 pci_add_resource(&resources, hose->busnr_space);
>>> =A0 =A0 bus =3D pci_scan_root_bus(dev, next_busno, pci_ops, sysdata, &r=
esources);
>>>
>>> This is actually a bit redundant, since "next_busno" should be the
>>> same as hose->busnr_space->start. =A0So if we adopted this approach, we
>>> might want to eventually drop the "next_busno" argument.
>>
>> Yeah that would be nice, the call would certainly make more sense that
>> way.
>
> no, I don't think so.
>
> using pci_add_resource will need to create dummy resource abut bus range.

I don't see the problem here.  The bus number aperture is a
fundamental property of a host bridge, and any firmware or native
bridge driver that tells you about a bridge but doesn't tell you the
bus number aperture is just broken.

Every arch already has struct resources for the MMIO and IO port
regions available on the bus.  ACPI already has a resource for the bus
number range.  It makes sense to me that the arch should supply a bus
number resource.

Conceptually, it's just like the MMIO and IO resources, so it makes
sense to me that the code for bus numbers should look like the code
for MMIO and IO ports.

> there is lots of pci_scan_root_bus(), =A0and those user does not bus end
> yet before scan.
> so could just hide pci_insert_busn_res in pci_scan_root_bus, and
> update busn_res end there.

pci_scan_child_bus() does NOT tell you the end of the bus number
aperture, and we shouldn't pretend that it does.  It might give you a
lower bound on the end of the aperture (as long as you're willing to
trust the current PCI config and you don't change anything).

> other arch like x86, ia64, powerpc, sparc, will insert exact bus range
> between pci_create_root_bus and
> pci_scan_child_bus, will not need to update busn_res end.
>
> please check v7 of this patchset.
>
> =A0 =A0 =A0 =A0git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linu=
x-yinghai.git
> for-pci-busn-alloc

I looked at your git tree, but I can't tell whether what's there is v7
or not and it's too much trouble to try to figure it out.

Bjorn

^ permalink raw reply

* Re: linux-next: build failure after merge of the final tree
From: Benjamin Herrenschmidt @ 2012-02-27 23:30 UTC (permalink / raw)
  To: Stephen Rothwell
  Cc: Bjorn Helgaas, linux-next, ppc-dev, linux-kernel, Jesse Barnes
In-Reply-To: <1330334360.11728.3.camel@pasglop>

On Mon, 2012-02-27 at 20:19 +1100, Benjamin Herrenschmidt wrote:
> On Mon, 2012-02-27 at 17:37 +1100, Stephen Rothwell wrote:
> >         pci_add_resource_offset(resources, res,
> > -                       (resource_size_t) hose->io_base_virt - _IO_BASE);
> > +                       (resource_size_t)(unsigned long)hose->io_base_virt - _IO_BASE);
> 
> We have to be careful here as we do want sign extension to happen (yeah
> it's odd, but it's the way we do IOs on ppc32 :-) Maybe I should change
> it one day).
> 
> So we probably want to do:
> 
> 	 (resource_size_t)(long long)(hose->io_base_virt - _IO_BASE)

Oops ... that was meant to read (long) not (long long)... Any ways, I
more or less convinced myself that even without the sign extension it
would still work, since the IO port number is eventually cast to an
unsigned int by the accessors, so as long as the low 32-bits are correct
(and they'll be with or without the sign extension), we should be fine.
It's just that something trying to print the resource might end up
displaying garbage in the top bits.

Cheers,
Ben.


> Basically, IO resources are relative to _IO_BASE which on ppc32 is
> basically the virtual address where we map the first PHB IO space.
> 
> Subsequent PHB mappings can end up below _IO_BASE, leading to negative
> resource values for IO BARs on those busses. It all works fine because
> even an unsigned addition will do the right thing as long as the value
> is fully sign extended.
> 
> Cheers,
> Ben.

^ permalink raw reply

* Re: [PATCH 14/21] Introduce EEH device
From: Gavin Shan @ 2012-02-28  1:13 UTC (permalink / raw)
  To: Stephen Rothwell; +Cc: linuxppc-dev
In-Reply-To: <20120224234610.922da1a3a2523ee5b6319390@canb.auug.org.au>

> Hi Gavin,
> 
> On Fri, 24 Feb 2012 17:38:11 +0800 Gavin Shan <shangw@linux.vnet.ibm.com> wrote:
> >
> > +#define EEH_DEV_TO_OF_NODE(edev)	(edev->dn)
> > +#define EEH_DEV_TO_PCI_DEV(edev)	(edev->pdev)
> > +#define OF_NODE_TO_EEH_DEV(dn)		((struct eeh_dev *)(dn->edev))
> > +#define PCI_DEV_TO_EEH_DEV(pdev)	((struct eeh_dev *)(pdev->dev.archdata.edev))
> 
> You need to put parentheses around the use of macro arguments ... or,
> even better, make these into "static inline" accessor functions (with
> lower case names).
> 

Thanks for your comments, Stephen. I'd like to make "static inline"
functions for them in next revision ;-)

Thanks,
Gavin

> -- 
> Cheers,
> Stephen Rothwell                    sfr@canb.auug.org.au
> http://www.canb.auug.org.au/~sfr/

^ permalink raw reply

* Re: [PATCH 20/21] Introduce struct eeh_stats for EEH
From: Gavin Shan @ 2012-02-28  1:19 UTC (permalink / raw)
  To: Stephen Rothwell; +Cc: linuxppc-dev
In-Reply-To: <20120225000141.73c0de577597e675a0fb5021@canb.auug.org.au>

> Hi Gavin,
> 
> On Fri, 24 Feb 2012 17:38:17 +0800 Gavin Shan <shangw@linux.vnet.ibm.com> wrote:
> >
> > diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
> > index 1310971..226c9a5 100644
> > --- a/arch/powerpc/include/asm/eeh.h
> > +++ b/arch/powerpc/include/asm/eeh.h
> > @@ -98,6 +98,21 @@ struct eeh_ops {
> >  	int (*configure_bridge)(struct device_node *dn);
> >  };
> >  
> > +/*
> > + * The struct is used to maintain the EEH global statistic
> > + * information. Besides, the EEH global statistics will be
> > + * exported to user space through procfs
> > + */
> > +struct eeh_stats {
> > +	unsigned long no_device;	/* PCI device not found		*/
> > +	unsigned long no_dn;		/* OF node not found		*/
> > +	unsigned long no_cfg_addr;	/* Config address not found	*/
> > +	unsigned long ignored_check;	/* EEH check skipped		*/
> > +	unsigned long total_mmio_ffs;	/* Total EEH checks		*/
> > +	unsigned long false_positives;	/* Unnecessary EEH checks	*/
> > +	unsigned long slot_resets;	/* PE reset			*/
> > +};
> 
> If this is used in only one place, there is not much point in putting it
> in a header file.
> 

Thanks, Stephen. I'll move it to eeh.c in next revision.

Thanks,
Gavin

> -- 
> Cheers,
> Stephen Rothwell                    sfr@canb.auug.org.au
> http://www.canb.auug.org.au/~sfr/

^ permalink raw reply

* Re: [PATCH 20/21] Introduce struct eeh_stats for EEH
From: Gavin Shan @ 2012-02-28  1:22 UTC (permalink / raw)
  To: David Laight; +Cc: linuxppc-dev
In-Reply-To: <AE90C24D6B3A694183C094C60CF0A2F6026B6E79@saturn3.aculab.com>

>  
> > +/*
> > + * The struct is used to maintain the EEH global statistic
> > + * information. Besides, the EEH global statistics will be
> > + * exported to user space through procfs
> > + */
> > +struct eeh_stats {
> > +	unsigned long no_device;	/* PCI device not found
> */
> > +	unsigned long no_dn;		/* OF node not found
> */
> > +	unsigned long no_cfg_addr;	/* Config address not  found
> */
> > +	unsigned long ignored_check;	/* EEH check skipped
> */
> > +	unsigned long total_mmio_ffs;	/* Total EEH checks
> */
> > +	unsigned long false_positives;	/* Unnecessary EEH checks
> */
> > +	unsigned long slot_resets;	/* PE reset
> */
> > +};
> 
> Why 'unsigned long', surely either 'unsigned int'
> or a fixed-width type.
> 

Thanks, David. Could you pls give more information on
how the fixed-width types benefits us?

Thanks,
Gavin
 
> 	David
> 
> 

^ permalink raw reply

* Re: [PATCH 14/21] Introduce EEH device
From: Gavin Shan @ 2012-02-28  1:26 UTC (permalink / raw)
  To: Stephen Rothwell; +Cc: linuxppc-dev
In-Reply-To: <20120224235012.63b01e51749cb09bc0195d38@canb.auug.org.au>

> Hi Gavin,
> 
> On Fri, 24 Feb 2012 17:38:11 +0800 Gavin Shan <shangw@linux.vnet.ibm.com> wrote:
> >
> > diff --git a/arch/powerpc/include/asm/device.h b/arch/powerpc/include/asm/device.h
> > index d57c08a..4668344 100644
> > --- a/arch/powerpc/include/asm/device.h
> > +++ b/arch/powerpc/include/asm/device.h
> > @@ -31,6 +31,9 @@ struct dev_archdata {
> >  #ifdef CONFIG_SWIOTLB
> >  	dma_addr_t		max_direct_dma_addr;
> >  #endif
> > +#ifdef CONFIG_EEH
> > +	void			*edev;
> > +#endif
> >  };
> >  
> >  struct pdev_archdata {
> > diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
> > index ad8f318..1310971 100644
> > --- a/arch/powerpc/include/asm/eeh.h
> > +++ b/arch/powerpc/include/asm/eeh.h
> > +#define OF_NODE_TO_EEH_DEV(dn)		((struct eeh_dev *)(dn->edev))
> > +#define PCI_DEV_TO_EEH_DEV(pdev)	((struct eeh_dev *)(pdev->dev.archdata.edev))
> 
> If the edev fields of dev_archdata and device_node are always going to be
> "struct eeh_dev *", why not declare then as such and avoid the casting?
> 

Thanks for your comments, Stephen. I'll change it in next revision accordingly.

Thanks,
Gavin

> -- 
> Cheers,
> Stephen Rothwell                    sfr@canb.auug.org.au
> http://www.canb.auug.org.au/~sfr/

^ permalink raw reply

* linux-next: manual merge of the devicetree tree with the powerpc tree
From: Stephen Rothwell @ 2012-02-28  2:06 UTC (permalink / raw)
  To: Grant Likely; +Cc: linux-next, Paul Mackerras, linux-kernel, linuxppc-dev

[-- Attachment #1: Type: text/plain, Size: 538 bytes --]

Hi Grant,

Today's linux-next merge of the devicetree tree got a conflict in
arch/powerpc/platforms/iseries/Kconfig between commit 3d066d77cf46
("powerpc: remove CONFIG_PPC_ISERIES from the architecture Kconfig
files") from the powerpc tree and commit 0f22dd395fc4 ("of: Only compile
OF_DYNAMIC on PowerPC pseries and iseries") from the devicetree tree.

The former commit removes the file (as the legacy iSeries platform is
being removed), so I did that.
-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply

* [PATCH 08/18] PCI, powerpc: Register busn_res for root buses
From: Yinghai Lu @ 2012-02-28  2:09 UTC (permalink / raw)
  To: Jesse Barnes, Benjamin Herrenschmidt, Tony Luck
  Cc: linux-arch, Greg Kroah-Hartman, linuxppc-dev, linux-kernel,
	Dominik Brodowski, Yinghai Lu, linux-pci, Bjorn Helgaas,
	Paul Mackerras, Andrew Morton, Linus Torvalds
In-Reply-To: <1330395009-29260-1-git-send-email-yinghai@kernel.org>

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@lists.ozlabs.org
---
 arch/powerpc/kernel/pci-common.c |    7 ++++++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index 910b9de..ae5ae5f 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -1660,6 +1660,8 @@ void __devinit pcibios_scan_phb(struct pci_controller *hose)
 	bus->secondary = hose->first_busno;
 	hose->bus = bus;
 
+	pci_bus_insert_busn_res(bus, hose->first_busno, hose->last_busno);
+
 	/* Get probe mode and perform scan */
 	mode = PCI_PROBE_NORMAL;
 	if (node && ppc_md.pci_probe_mode)
@@ -1670,8 +1672,11 @@ void __devinit pcibios_scan_phb(struct pci_controller *hose)
 		of_scan_bus(node, bus);
 	}
 
-	if (mode == PCI_PROBE_NORMAL)
+	if (mode == PCI_PROBE_NORMAL) {
+		pci_bus_update_busn_res_end(bus, 255);
 		hose->last_busno = bus->subordinate = pci_scan_child_bus(bus);
+		pci_bus_update_busn_res_end(bus, bus->subordinate);
+	}
 
 	/* Platform gets a chance to do some global fixups before
 	 * we proceed to resource allocation
-- 
1.7.7

^ permalink raw reply related

* linux-next: manual merge of the irqdomain tree with the powerpc tree
From: Stephen Rothwell @ 2012-02-28  4:23 UTC (permalink / raw)
  To: Grant Likely; +Cc: linux-next, Paul Mackerras, linux-kernel, linuxppc-dev

[-- Attachment #1: Type: text/plain, Size: 1200 bytes --]

Hi Grant,

Today's linux-next merge of the irqdomain tree got a conflict in
arch/powerpc/sysdev/mpic.c between commit fe83364f0bf1 ("powerpc/mpic:
Fix allocation of reverse-map for multi-ISU mpics") from the powerpc tree
and commit a8db8cf0d894 ("irq_domain: Replace irq_alloc_host() with
revmap-specific initializers") from the irqdomain tree.

I fixed it up (see below) and can carry the fix as necessary.
-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au

diff --cc arch/powerpc/sysdev/mpic.c
index 16eb743,c83a512..0000000
--- a/arch/powerpc/sysdev/mpic.c
+++ b/arch/powerpc/sysdev/mpic.c
@@@ -1355,9 -1345,9 +1355,9 @@@ struct mpic * __init mpic_alloc(struct 
  	mpic->isu_shift = 1 + __ilog2(mpic->isu_size - 1);
  	mpic->isu_mask = (1 << mpic->isu_shift) - 1;
  
- 	mpic->irqhost = irq_alloc_host(mpic->node, IRQ_HOST_MAP_LINEAR,
+ 	mpic->irqhost = irq_domain_add_linear(mpic->node,
 -				       isu_size ? isu_size : mpic->num_sources,
 -				       &mpic_host_ops, mpic);
 +				       last_irq + 1, &mpic_host_ops,
- 				       intvec_top + 1);
++				       mpic);
  
  	/*
  	 * FIXME: The code leaks the MPIC object and mappings here; this

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply

* Re: linux-next: manual merge of the devicetree tree with the powerpc tree
From: Grant Likely @ 2012-02-28  4:38 UTC (permalink / raw)
  To: Stephen Rothwell; +Cc: linux-next, Paul Mackerras, linux-kernel, linuxppc-dev
In-Reply-To: <20120228130634.3c511b096bc78f9b99e804fe@canb.auug.org.au>

On Mon, Feb 27, 2012 at 7:06 PM, Stephen Rothwell <sfr@canb.auug.org.au> wr=
ote:
> Hi Grant,
>
> Today's linux-next merge of the devicetree tree got a conflict in
> arch/powerpc/platforms/iseries/Kconfig between commit 3d066d77cf46
> ("powerpc: remove CONFIG_PPC_ISERIES from the architecture Kconfig
> files") from the powerpc tree and commit 0f22dd395fc4 ("of: Only compile
> OF_DYNAMIC on PowerPC pseries and iseries") from the devicetree tree.
>
> The former commit removes the file (as the legacy iSeries platform is
> being removed), so I did that.

Okay, thanks.

g.

> --
> Cheers,
> Stephen Rothwell =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0sfr@canb.auug.org=
.au



--=20
Grant Likely, B.Sc., P.Eng.
Secret Lab Technologies Ltd.

^ permalink raw reply

* Re: [PATCH 08/18] PCI, powerpc: Register busn_res for root buses
From: Bjorn Helgaas @ 2012-02-28  5:36 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: linux-arch, Tony Luck, linuxppc-dev, linux-pci, linux-kernel,
	Dominik Brodowski, Paul Mackerras, Jesse Barnes,
	Greg Kroah-Hartman, Andrew Morton, Linus Torvalds
In-Reply-To: <1330395009-29260-9-git-send-email-yinghai@kernel.org>

On Mon, Feb 27, 2012 at 7:09 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: linuxppc-dev@lists.ozlabs.org
> ---
> =A0arch/powerpc/kernel/pci-common.c | =A0 =A07 ++++++-
> =A01 files changed, 6 insertions(+), 1 deletions(-)
>
> diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-c=
ommon.c
> index 910b9de..ae5ae5f 100644
> --- a/arch/powerpc/kernel/pci-common.c
> +++ b/arch/powerpc/kernel/pci-common.c
> @@ -1660,6 +1660,8 @@ void __devinit pcibios_scan_phb(struct pci_controll=
er *hose)
> =A0 =A0 =A0 =A0bus->secondary =3D hose->first_busno;
> =A0 =A0 =A0 =A0hose->bus =3D bus;
>
> + =A0 =A0 =A0 pci_bus_insert_busn_res(bus, hose->first_busno, hose->last_=
busno);
> +
> =A0 =A0 =A0 =A0/* Get probe mode and perform scan */
> =A0 =A0 =A0 =A0mode =3D PCI_PROBE_NORMAL;
> =A0 =A0 =A0 =A0if (node && ppc_md.pci_probe_mode)
> @@ -1670,8 +1672,11 @@ void __devinit pcibios_scan_phb(struct pci_control=
ler *hose)
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0of_scan_bus(node, bus);
> =A0 =A0 =A0 =A0}
>
> - =A0 =A0 =A0 if (mode =3D=3D PCI_PROBE_NORMAL)
> + =A0 =A0 =A0 if (mode =3D=3D PCI_PROBE_NORMAL) {
> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 pci_bus_update_busn_res_end(bus, 255);
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0hose->last_busno =3D bus->subordinate =3D =
pci_scan_child_bus(bus);
> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 pci_bus_update_busn_res_end(bus, bus->subor=
dinate);
> + =A0 =A0 =A0 }

There's a lot of powerpc code that does this:

    bus_range =3D of_get_property(pcictrl, "bus-range", &len);
    hose->first_busno =3D bus_range[0];
    hose->last_busno =3D bus_range[1];

That *looks* like it is discovering the bus number aperture.  Is it?
If it is, why are we using the largest bus number found by
pci_scan_child_bus() rather than "last_busno"?

Bjorn

^ permalink raw reply

* [PATCH v5 00/21] EEH reorganization
From: Gavin Shan @ 2012-02-28  6:03 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: shangw

This series of patches is going to reorganize EEH so that it could support
multiple platforms in future. The requirements were raised from the aspects.

	* The original EEH implementation only support pSeries platform, which
	  would be regarded as guest system. Platform powernv is coming and EEH
	  needs to be supported on powernv as well.
	* Different platforms might be running based on variable firmware.Further
	  more, the firmware would supply different EEH interfaces to kernel.
	  Therefore, we have to do necessary abstraction on current EEH implementation.

In order to accomodate the requirements, the series of patches have reorganized
current EEH implementation.

	* The original implementation looks not clean enough. Necessary cleanup
	  will be done in some of the patches.
	* struct eeh_ops has been introduced so that EEH core components and platform
	  dependent implementation could be split up. That make it possible for EEH
	  to be supported on multiple platforms.
	* struct eeh_dev has been introduced to replace struct pci_dn so that EEH module
	  works independently as much as possible.
	* EEH global statistics will be maintained in a collective fashion.

v1 -> v2:

	* If possible, to add "eeh_" prefix for function names.
	* The format of leading function comments won't be changed in order not to
	  break kernel document automatic generation (e.g. by "make pdfdocs").
	* The name of local variables won't be changed if there're no explicit reasons.
	* Represent the PE's state in bitmap fasion.
	* Some function names have been adjusted so that they look shorter and
	  meaningful.
	* Platform operation name has been changed to "pseries".
	* Merge those patches for cleanup if possible.
	* The line length is kept as appropriately short if possible.
	* Fixup on alignment & spacing issues.

v2 -> v3:
	* Split cleanup patch into 2: one for comment cleanup and another one for
	  renaming function names.
	* Try to use pr_warning/pr_info/pr_debug instead of printk() function call.
	* Function names are adjusted a little bit so that they looks more meaningful
	  according to comments from Michael/Ben.
	* Useful comment has been kept according to Michael's comments.
	* struct eeh_ops::set_eeh has been changed to eeh_ops::set_option.
	* struct eeh_ops::name has been changed to "char *".
	* Remove file name from the source file.
	* Copyright (C) format has been changed since "(C)" isn't encouraged to use.
	* The header files included in the source file have been sorted alphabetically.
	* eeh_platform_init() has been replaced by eeh_pseries_init() to avoid duplicate
	  functions when kernel supports multiple platforms.
	* "F/W" has been changed to "Firmware".
	* The maximal wait time to retrieve PE's state has been covered by macro.
	* It also include changes according to the minor comments from Michael.

v3 -> v4:
	* Fix some typo included in the commit messages.
	* Reduce code nesting according to Ram's suggestions.
	* Addtinal pr_warning on failure of configuring bridges.

v4 -> v5:
	* OF node and PCI device are tracing the corresponding eeh device.
	  That has been changed to "struct eeh_dev *" instead of the original
	  "void *".
	* The conversion between OF node, PCI device, eeh device is changed
	  to inline functions instead of the original macros.
	* The "struct eeh_stats" has been moved from eeh.h to eeh.c. Besides,
	  the individual members of the struct have been changed to fixed-type
	  "unsigned int". 


The series of patches (v5) has been verified on Firebird-L machine. In order to carry out
the test, you have to install IBM Power Tools from IBM internal yum source. Following
command is used to force EEH check on ethernet interface, which could be recovered eventually
by EEH and device driver successfully. You could keep pinging to the blade before issuing
the following command to force EEH. You should see the network interface can't be reached for
a moment and everything will be recovered couple of seconds after the forced EEH error. At the
same time, you should see EEH error log out of system console. 

	* errinjct eeh -v -f 0 -p U78AE.001.WZS00M9-P1-C18-L1-T2 -a 0x0 -m 0x0

-----

arch/powerpc/include/asm/device.h            |    3 +
arch/powerpc/include/asm/eeh.h               |  134 +++-
arch/powerpc/include/asm/eeh_event.h         |   33 +-
arch/powerpc/include/asm/ppc-pci.h           |   89 +--
arch/powerpc/kernel/of_platform.c            |    3 +
arch/powerpc/kernel/rtas_pci.c               |    3 +
arch/powerpc/platforms/pseries/Makefile      |    3 +-
arch/powerpc/platforms/pseries/eeh.c         | 1044 ++++++++++++--------------
arch/powerpc/platforms/pseries/eeh_cache.c   |   44 +-
arch/powerpc/platforms/pseries/eeh_dev.c     |  102 +++
arch/powerpc/platforms/pseries/eeh_driver.c  |  213 +++---
arch/powerpc/platforms/pseries/eeh_event.c   |   55 +-
arch/powerpc/platforms/pseries/eeh_pseries.c |  565 ++++++++++++++
arch/powerpc/platforms/pseries/eeh_sysfs.c   |   25 +-
arch/powerpc/platforms/pseries/msi.c         |    2 +-
arch/powerpc/platforms/pseries/pci_dlpar.c   |    3 +
arch/powerpc/platforms/pseries/setup.c       |    7 +-
include/linux/of.h                           |   10 +
include/linux/pci.h                          |    7 +
19 files changed, 1477 insertions(+), 868 deletions(-)

Thanks,
Gavin

^ permalink raw reply

* [PATCH 06/21] pSeries platform EEH PE address retrieval
From: Gavin Shan @ 2012-02-28  6:03 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: shangw
In-Reply-To: <1330409051-8941-1-git-send-email-shangw@linux.vnet.ibm.com>

There're 2 types of addresses used for EEH operations. The first
one would be BDF (Bus/Device/Function) address which is retrieved
from the reg property of the corresponding FDT node. Another one
is PE address that should be enquired from firmware through RTAS
call on pSeries platform. When issuing EEH operation, the PE address
has precedence over BDF address.

The patch implements retrieving PE address according to the given
BDF address on pSeries platform. Also, the struct eeh_early_enable_info
has been removed since the information can be figured out from
dn->pdn->phb->buid directly and that simplifies the code.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/pseries/eeh.c         |   67 +------------------------
 arch/powerpc/platforms/pseries/eeh_pseries.c |   46 +++++++++++++++++-
 2 files changed, 48 insertions(+), 65 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/eeh.c b/arch/powerpc/platforms/pseries/eeh.c
index 70a9617..00797e0 100644
--- a/arch/powerpc/platforms/pseries/eeh.c
+++ b/arch/powerpc/platforms/pseries/eeh.c
@@ -91,8 +91,6 @@ static int ibm_set_slot_reset;
 static int ibm_read_slot_reset_state;
 static int ibm_read_slot_reset_state2;
 static int ibm_slot_error_detail;
-static int ibm_get_config_addr_info;
-static int ibm_get_config_addr_info2;
 static int ibm_configure_bridge;
 static int ibm_configure_pe;
 
@@ -1048,56 +1046,6 @@ void eeh_configure_bridge(struct pci_dn *pdn)
 	}
 }
 
-#define EEH_ENABLE 1
-
-struct eeh_early_enable_info {
-	unsigned int buid_hi;
-	unsigned int buid_lo;
-};
-
-/**
- * eeh_get_pe_addr - Retrieve PE address with given BDF address
- * @config_addr: BDF address
- * @info: BUID of the associated PHB
- *
- * There're 2 kinds of addresses existing in EEH core components:
- * BDF address and PE address. Besides, there has dedicated platform
- * dependent function call to retrieve the PE address according to
- * the given BDF address. Further more, we prefer PE address on BDF
- * address in EEH core components.
- */
-static int eeh_get_pe_addr(int config_addr,
-                        struct eeh_early_enable_info *info)
-{
-	unsigned int rets[3];
-	int ret;
-
-	/* Use latest config-addr token on power6 */
-	if (ibm_get_config_addr_info2 != RTAS_UNKNOWN_SERVICE) {
-		/* Make sure we have a PE in hand */
-		ret = rtas_call(ibm_get_config_addr_info2, 4, 2, rets,
-			config_addr, info->buid_hi, info->buid_lo, 1);
-		if (ret || (rets[0]==0))
-			return 0;
-
-		ret = rtas_call(ibm_get_config_addr_info2, 4, 2, rets,
-			config_addr, info->buid_hi, info->buid_lo, 0);
-		if (ret)
-			return 0;
-		return rets[0];
-	}
-
-	/* Use older config-addr token on power5 */
-	if (ibm_get_config_addr_info != RTAS_UNKNOWN_SERVICE) {
-		ret = rtas_call(ibm_get_config_addr_info, 4, 2, rets,
-			config_addr, info->buid_hi, info->buid_lo, 0);
-		if (ret)
-			return 0;
-		return rets[0];
-	}
-	return 0;
-}
-
 /**
  * eeh_early_enable - Early enable EEH on the indicated device
  * @dn: device node
@@ -1110,7 +1058,6 @@ static int eeh_get_pe_addr(int config_addr,
 static void *eeh_early_enable(struct device_node *dn, void *data)
 {
 	unsigned int rets[3];
-	struct eeh_early_enable_info *info = data;
 	int ret;
 	const u32 *class_code = of_get_property(dn, "class-code", NULL);
 	const u32 *vendor_id = of_get_property(dn, "vendor-id", NULL);
@@ -1155,7 +1102,7 @@ static void *eeh_early_enable(struct device_node *dn, void *data)
 			/* If the newer, better, ibm,get-config-addr-info is supported, 
 			 * then use that instead.
 			 */
-			pdn->eeh_pe_config_addr = eeh_get_pe_addr(pdn->eeh_config_addr, info);
+			pdn->eeh_pe_config_addr = eeh_ops->get_pe_addr(dn);
 
 			/* Some older systems (Power4) allow the
 			 * ibm,set-eeh-option call to succeed even on nodes
@@ -1264,7 +1211,6 @@ int __exit eeh_ops_unregister(const char *name)
 void __init eeh_init(void)
 {
 	struct device_node *phb, *np;
-	struct eeh_early_enable_info info;
 	int ret;
 
 	/* call platform initialization function */
@@ -1289,8 +1235,6 @@ void __init eeh_init(void)
 	ibm_read_slot_reset_state2 = rtas_token("ibm,read-slot-reset-state2");
 	ibm_read_slot_reset_state = rtas_token("ibm,read-slot-reset-state");
 	ibm_slot_error_detail = rtas_token("ibm,slot-error-detail");
-	ibm_get_config_addr_info = rtas_token("ibm,get-config-addr-info");
-	ibm_get_config_addr_info2 = rtas_token("ibm,get-config-addr-info2");
 	ibm_configure_bridge = rtas_token("ibm,configure-bridge");
 	ibm_configure_pe = rtas_token("ibm,configure-pe");
 
@@ -1313,9 +1257,7 @@ void __init eeh_init(void)
 		if (buid == 0 || PCI_DN(phb) == NULL)
 			continue;
 
-		info.buid_lo = BUID_LO(buid);
-		info.buid_hi = BUID_HI(buid);
-		traverse_pci_devices(phb, eeh_early_enable, &info);
+		traverse_pci_devices(phb, eeh_early_enable, NULL);
 	}
 
 	if (eeh_subsystem_enabled)
@@ -1339,7 +1281,6 @@ void __init eeh_init(void)
 static void eeh_add_device_early(struct device_node *dn)
 {
 	struct pci_controller *phb;
-	struct eeh_early_enable_info info;
 
 	if (!dn || !PCI_DN(dn))
 		return;
@@ -1349,9 +1290,7 @@ static void eeh_add_device_early(struct device_node *dn)
 	if (NULL == phb || 0 == phb->buid)
 		return;
 
-	info.buid_hi = BUID_HI(phb->buid);
-	info.buid_lo = BUID_LO(phb->buid);
-	eeh_early_enable(dn, &info);
+	eeh_early_enable(dn, NULL);
 }
 
 /**
diff --git a/arch/powerpc/platforms/pseries/eeh_pseries.c b/arch/powerpc/platforms/pseries/eeh_pseries.c
index c48a9e6..2b9543a 100644
--- a/arch/powerpc/platforms/pseries/eeh_pseries.c
+++ b/arch/powerpc/platforms/pseries/eeh_pseries.c
@@ -176,7 +176,51 @@ static int pseries_eeh_set_option(struct device_node *dn, int option)
  */
 static int pseries_eeh_get_pe_addr(struct device_node *dn)
 {
-	return 0;
+	struct pci_dn *pdn;
+	int ret = 0;
+	int rets[3];
+
+	pdn = PCI_DN(dn);
+
+	if (ibm_get_config_addr_info2 != RTAS_UNKNOWN_SERVICE) {
+		/*
+		 * First of all, we need to make sure there has one PE
+		 * associated with the device. Otherwise, PE address is
+		 * meaningless.
+		 */
+		ret = rtas_call(ibm_get_config_addr_info2, 4, 2, rets,
+				pdn->eeh_config_addr, BUID_HI(pdn->phb->buid),
+				BUID_LO(pdn->phb->buid), 1);
+		if (ret || (rets[0] == 0))
+			return 0;
+
+		/* Retrieve the associated PE config address */
+		ret = rtas_call(ibm_get_config_addr_info2, 4, 2, rets,
+				pdn->eeh_config_addr, BUID_HI(pdn->phb->buid),
+				BUID_LO(pdn->phb->buid), 0);
+		if (ret) {
+			pr_warning("%s: Failed to get PE address for %s\n",
+				__func__, dn->full_name);
+			return 0;
+		}
+
+		return rets[0];
+	}
+
+	if (ibm_get_config_addr_info != RTAS_UNKNOWN_SERVICE) {
+		ret = rtas_call(ibm_get_config_addr_info, 4, 2, rets,
+				pdn->eeh_config_addr, BUID_HI(pdn->phb->buid),
+				BUID_LO(pdn->phb->buid), 0);
+		if (ret) {
+			pr_warning("%s: Failed to get PE address for %s\n",
+				__func__, dn->full_name);
+			return 0;
+		}
+
+		return rets[0];
+	}
+
+	return ret;
 }
 
 /**
-- 
1.7.5.4

^ permalink raw reply related

* [PATCH 08/21] pSeries platform EEH wait PE state
From: Gavin Shan @ 2012-02-28  6:03 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: shangw
In-Reply-To: <1330409051-8941-1-git-send-email-shangw@linux.vnet.ibm.com>

On pSeries platform, the PE state might be temporarily unavailable.
In that case, the firmware will return the corresponding wait time.
That means the kernel has to wait for appropriate time in order to
get the PE state.

The patch does the implementation for that. Besides, the function
has been abstracted through struct eeh_ops::wait_state so that EEH core
components could support multiple platforms in future.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/ppc-pci.h           |    1 -
 arch/powerpc/platforms/pseries/eeh.c         |   46 +------------------------
 arch/powerpc/platforms/pseries/eeh_driver.c  |    2 +-
 arch/powerpc/platforms/pseries/eeh_pseries.c |   47 +++++++++++++++++++++++++-
 4 files changed, 49 insertions(+), 47 deletions(-)

diff --git a/arch/powerpc/include/asm/ppc-pci.h b/arch/powerpc/include/asm/ppc-pci.h
index 6150349..1cfb2b0 100644
--- a/arch/powerpc/include/asm/ppc-pci.h
+++ b/arch/powerpc/include/asm/ppc-pci.h
@@ -58,7 +58,6 @@ struct pci_dev *pci_get_device_by_addr(unsigned long addr);
 void eeh_slot_error_detail (struct pci_dn *pdn, int severity);
 int eeh_pci_enable(struct pci_dn *pdn, int function);
 int eeh_reset_pe(struct pci_dn *);
-int eeh_wait_for_slot_status(struct pci_dn *pdn, int max_wait_msecs);
 void eeh_restore_bars(struct pci_dn *);
 void eeh_configure_bridge(struct pci_dn *);
 int rtas_write_config(struct pci_dn *, int where, int size, u32 val);
diff --git a/arch/powerpc/platforms/pseries/eeh.c b/arch/powerpc/platforms/pseries/eeh.c
index 8d11f1f..b5b03d4 100644
--- a/arch/powerpc/platforms/pseries/eeh.c
+++ b/arch/powerpc/platforms/pseries/eeh.c
@@ -287,48 +287,6 @@ void eeh_slot_error_detail(struct pci_dn *pdn, int severity)
 }
 
 /**
- * eeh_wait_for_slot_status - Returns error status of slot
- * @pdn: pci device node
- * @max_wait_msecs: maximum number to millisecs to wait
- *
- * Return negative value if a permanent error, else return
- * Partition Endpoint (PE) status value.
- *
- * If @max_wait_msecs is positive, then this routine will
- * sleep until a valid status can be obtained, or until
- * the max allowed wait time is exceeded, in which case
- * a -2 is returned.
- */
-int eeh_wait_for_slot_status(struct pci_dn *pdn, int max_wait_msecs)
-{
-	int rc;
-	int mwait;
-
-	while (1) {
-		rc = eeh_ops->get_state(pdn->node, &mwait);
-		if (rc != EEH_STATE_UNAVAILABLE)
-			return rc;
-
-		if (max_wait_msecs <= 0) break;
-
-		if (mwait <= 0) {
-			printk(KERN_WARNING "EEH: Firmware returned bad wait value=%d\n",
-				mwait);
-			mwait = 1000;
-		} else if (mwait > 300*1000) {
-			printk(KERN_WARNING "EEH: Firmware is taking too long, time=%d\n",
-				mwait);
-			mwait = 300*1000;
-		}
-		max_wait_msecs -= mwait;
-		msleep(mwait);
-	}
-
-	printk(KERN_WARNING "EEH: Timed out waiting for slot status\n");
-	return -2;
-}
-
-/**
  * eeh_token_to_phys - Convert EEH address token to phys address
  * @token: I/O token, should be address in the form 0xA....
  *
@@ -640,7 +598,7 @@ int eeh_pci_enable(struct pci_dn *pdn, int function)
 		printk(KERN_WARNING "EEH: Unexpected state change %d, err=%d dn=%s\n",
 		        function, rc, pdn->node->full_name);
 
-	rc = eeh_wait_for_slot_status(pdn, PCI_BUS_RESET_WAIT_MSEC);
+	rc = eeh_ops->wait_state(pdn->node, PCI_BUS_RESET_WAIT_MSEC);
 	if (rc > 0 && (rc & EEH_STATE_MMIO_ENABLED) &&
 	   (function == EEH_OPT_THAW_MMIO))
 		return 0;
@@ -838,7 +796,7 @@ int eeh_reset_pe(struct pci_dn *pdn)
 	for (i=0; i<3; i++) {
 		eeh_reset_pe_once(pdn);
 
-		rc = eeh_wait_for_slot_status(pdn, PCI_BUS_RESET_WAIT_MSEC);
+		rc = eeh_ops->wait_state(pdn->node, PCI_BUS_RESET_WAIT_MSEC);
 		if (rc == (EEH_STATE_MMIO_ACTIVE | EEH_STATE_DMA_ACTIVE))
 			return 0;
 
diff --git a/arch/powerpc/platforms/pseries/eeh_driver.c b/arch/powerpc/platforms/pseries/eeh_driver.c
index 4c6e0c1c..584defe 100644
--- a/arch/powerpc/platforms/pseries/eeh_driver.c
+++ b/arch/powerpc/platforms/pseries/eeh_driver.c
@@ -396,7 +396,7 @@ struct pci_dn * handle_eeh_events (struct eeh_event *event)
 
 	/* Get the current PCI slot state. This can take a long time,
 	 * sometimes over 3 seconds for certain systems. */
-	rc = eeh_wait_for_slot_status (frozen_pdn, MAX_WAIT_FOR_RECOVERY*1000);
+	rc = eeh_ops->wait_state(frozen_pdn->node, MAX_WAIT_FOR_RECOVERY*1000);
 	if (rc < 0 || rc == EEH_STATE_NOT_SUPPORT) {
 		printk(KERN_WARNING "EEH: Permanent failure\n");
 		goto hard_fail;
diff --git a/arch/powerpc/platforms/pseries/eeh_pseries.c b/arch/powerpc/platforms/pseries/eeh_pseries.c
index 39567b2..7b60131 100644
--- a/arch/powerpc/platforms/pseries/eeh_pseries.c
+++ b/arch/powerpc/platforms/pseries/eeh_pseries.c
@@ -331,7 +331,52 @@ static int pseries_eeh_reset(struct device_node *dn, int option)
  */
 static int pseries_eeh_wait_state(struct device_node *dn, int max_wait)
 {
-	return 0;
+	int ret;
+	int mwait;
+
+	/*
+	 * According to PAPR, the state of PE might be temporarily
+	 * unavailable. Under the circumstance, we have to wait
+	 * for indicated time determined by firmware. The maximal
+	 * wait time is 5 minutes, which is acquired from the original
+	 * EEH implementation. Also, the original implementation
+	 * also defined the minimal wait time as 1 second.
+	 */
+#define EEH_STATE_MIN_WAIT_TIME	(1000)
+#define EEH_STATE_MAX_WAIT_TIME	(300 * 1000)
+
+	while (1) {
+		ret = pseries_eeh_get_state(dn, &mwait);
+
+		/*
+		 * If the PE's state is temporarily unavailable,
+		 * we have to wait for the specified time. Otherwise,
+		 * the PE's state will be returned immediately.
+		 */
+		if (ret != EEH_STATE_UNAVAILABLE)
+			return ret;
+
+		if (max_wait <= 0) {
+			pr_warning("%s: Timeout when getting PE's state (%d)\n",
+				__func__, max_wait);
+			return EEH_STATE_NOT_SUPPORT;
+		}
+
+		if (mwait <= 0) {
+			pr_warning("%s: Firmware returned bad wait value %d\n",
+				__func__, mwait);
+			mwait = EEH_STATE_MIN_WAIT_TIME;
+		} else if (mwait > EEH_STATE_MAX_WAIT_TIME) {
+			pr_warning("%s: Firmware returned too long wait value %d\n",
+				__func__, mwait);
+			mwait = EEH_STATE_MAX_WAIT_TIME;
+		}
+
+		max_wait -= mwait;
+		msleep(mwait);
+	}
+
+	return EEH_STATE_NOT_SUPPORT;
 }
 
 /**
-- 
1.7.5.4

^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox