LinuxPPC-Dev Archive on lore.kernel.org
* Re: [PATCH] arch/powerpc/lib/copy_32.S: Use alternate memcpy for MPC512x and MPC52xx
From: Scott Wood @ 2010-07-09 16:18 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Albrecht Dreß, Steve Deiters, linuxppc-dev, David Woodhouse
In-Reply-To: <E0D806F7-A109-4AAF-9899-429659E2C3D9@kernel.crashing.org>

On Fri, 9 Jul 2010 14:59:09 +0200
Segher Boessenkool <segher@kernel.crashing.org> wrote:

> >>> Actually, this is something which might need closer attention -
> >>> and maybe some support in the device tree indicating which read or
> >>> write width a device can accept?
> >>
> >> There already is "device-width"; the drivers never should use any
> >> other access width unless they *know* that will work.
> >
> > Wouldn't you want to use "bank-width" instead?
> 
> We were talking about single devices.  But, sure, when you have
> multiple devices in parallel the driver needs to know about that.
> 
> > It would be nice to have a device tree property that can specify
> > that all access widths supported by the CPU will work, though.
> 
> Oh please no.  A device binding should not depend on what CPU there
> is in the system.  There could be multiple CPUs of different
> architectures, even.

What I meant by that was that the flash interface was claiming that it
is not the limiting factor in which access widths are useable -- it
would be a way to claim that it is as flexible as ordinary memory in
that regard.

If there is a transaction size that is capable of being presented to
this component that it cannot handle, it would not present this
property.

> To figure out how to access a device, the driver looks at the device's
> node, and all its parent nodes (or asks generic code to do that, or
> platform code).

"looks" or "should look"? :-)

If there are transaction sizes supported by the CPU that won't work
with a given device through no fault of that device (or the interface
to that device for which we don't have a separate node), then in
theory, yes, it should be described at a higher level.

In reality, device tree parsing code is not AI, so rather than say "the
driver looks at this and figures it out" it would be better to provide
a more specific proposal of how a device tree might express this and
what the driver would look for, if you think the simple solution is
not expressive enough.

-Scott

* Re: kernel boot stuck at udbg_putc_cpm()
From: Scott Wood @ 2010-07-09 15:59 UTC (permalink / raw)
  To: Shawn Jin; +Cc: ppcdev
In-Reply-To: <AANLkTinNSzP-WvxC2kqZop2kvd1kSUMDuwI01GOBXApE@mail.gmail.com>

On Fri, 9 Jul 2010 00:35:43 -0700
Shawn Jin <shawnxjin@gmail.com> wrote:

> I changed my toolchain and rebuilt the kernel image. This time all the
> messages below magically displayed on the serial port. :-D Are all
> these the early debugging messages?

Yes, it's an alternate output for the regular console (there are
sometimes more messages, if you hook up .progress in your ppc_md, but
that's mainly of interest if you don't get this far).

> Now the kernel stuck at the while loop that waits for transmitter fifo
> to be empty. It seems that the CPM UART stopped working in the middle
> of printing a message. I'm using minicom to connect to the serial
> port. I heard minicom is problematic. Will it be the cause here?

I doubt it...

You're probably getting to the point where udbg is disabled because the
real serial driver is trying to take over -- and something's going
wrong with the real serial port driver.  Check to make sure the brg
config is correct (both the input clock and the baud rate you're trying
to switch to).  Commenting out the call to cpm_set_brg can be
a quick way of determining if that's the problem.

-Scott

* Re: [PATCH] arch/powerpc/lib/copy_32.S: Use alternate memcpy for MPC512x and MPC52xx
From: Segher Boessenkool @ 2010-07-09 13:03 UTC (permalink / raw)
  To: Albrecht Dreß; +Cc: David Woodhouse, Steve Deiters, linuxppc-dev
In-Reply-To: <1278619791.1801.4@antares>

> Hmm, unfortunately, its usage is not clearly documented in
> mtd-physmap.txt,

It's pretty clear I think.  Patches for making it better are welcome  
of course.

> so I never thought of this parameter.  And IMHO the problem goes  
> further - basically *any* chip which is attached to the LPB can be  
> affected by this problem, so it might be better to have a more  
> general approach like a "chip select property".

You cannot treat devices on the LPB as random access, that's all.
Drivers that assume they can, cannot be used for devices on the LPB.
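
To make "not random access" concrete, here is a minimal sketch of a copy
routine for a window that, by assumption, tolerates only naturally aligned
16-bit reads. The helper name copy_from_bus16() and the 16-bit width are
made up for illustration; this is neither the memcpy from the patch nor an
existing kernel accessor, and the real width would have to come from the
device's description.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy len bytes out of a device window using only aligned 16-bit reads.
 * Hypothetical helper for illustration, not a kernel API.
 */
static void copy_from_bus16(void *dst, const volatile uint16_t *src, size_t len)
{
	unsigned char *d = dst;

	assert(dst != NULL && src != NULL);

	while (len >= 2) {
		uint16_t w = *src++;	/* one full-width read from the device */

		memcpy(d, &w, 2);	/* destination is ordinary RAM */
		d += 2;
		len -= 2;
	}
	if (len) {
		uint16_t w = *src;	/* the tail is still read at full width */

		memcpy(d, &w, 1);	/* keep only the first byte in memory order */
	}
}
```

Copying five bytes this way issues three 16-bit reads and no byte-wide
ones; only the destination side, which is normal memory, sees odd sizes.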


Segher

* Re: [PATCH] arch/powerpc/lib/copy_32.S: Use alternate memcpy for MPC512x and MPC52xx
From: Segher Boessenkool @ 2010-07-09 12:59 UTC (permalink / raw)
  To: Scott Wood
  Cc: Albrecht Dreß, Steve Deiters, linuxppc-dev, David Woodhouse
In-Reply-To: <20100708150904.79feffdd@schlenkerla.am.freescale.net>

>>> Actually, this is something which might need closer attention -
>>> and maybe some support in the device tree indicating which read or
>>> write width a device can accept?
>>
>> There already is "device-width"; the drivers never should use any
>> other access width unless they *know* that will work.
>
> Wouldn't you want to use "bank-width" instead?

We were talking about single devices.  But, sure, when you have
multiple devices in parallel the driver needs to know about that.

> It would be nice to have a device tree property that can specify that
> all access widths supported by the CPU will work, though.

Oh please no.  A device binding should not depend on what CPU there
is in the system.  There could be multiple CPUs of different
architectures, even.

To figure out how to access a device, the driver looks at the device's
node, and all its parent nodes (or asks generic code to do that, or
platform code).


Segher

* [PATCH] powerpc/40x: Distinguish AMCC PowerPC 405EX and 405EXr correctly
From: Lee Nipper @ 2010-07-09 11:17 UTC (permalink / raw)
  To: jwboyer; +Cc: linuxppc-dev, Lee Nipper

The recent AMCC 405EX Rev D without Security uses a PVR value
that matches the old 405EXr Rev A/B with Security.
The 405EX Rev D without Security would be shown
incorrectly as an 405EXr. The pvr_mask of 0xffff0004
is no longer sufficient to distinguish the 405EX from 405EXr.

This patch replaces 2 entries in the cpu_specs table
and adds 8 more, each using pvr_mask of 0xffff000f
and appropriate pvr_value to distinguish the AMCC
PowerPC 405EX and 405EXr instances.
The cpu_name for these entries now includes the
Rev, in similar fashion to the 440GX.

Signed-off-by: Lee Nipper <lee.nipper@gmail.com>
---
Patch applies against v2.6.35-rc4.
Tested with 405EX Rev C and Rev D.
Followed u-boot arch/powerpc/include/asm/processor.h for 405EX[r] PVR values.
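
For readers following along, the masking problem can be reproduced with a
simplified model of the table lookup, assuming the first entry whose masked
PVR equals its pvr_value wins. The PVR values are copied from this patch;
struct pvr_entry and pvr_lookup() are illustrative names, not the kernel's
code, and new_table lists only a subset of the added entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct pvr_entry {
	uint32_t mask;
	uint32_t value;
	const char *name;
};

/* The two entries this patch replaces. */
static const struct pvr_entry old_table[] = {
	{ 0xffff0004, 0x12910004, "405EX" },
	{ 0xffff0004, 0x12910000, "405EXr" },
};

/* A subset of the entries this patch adds. */
static const struct pvr_entry new_table[] = {
	{ 0xffff000f, 0x12910007, "405EX Rev. A/B" },
	{ 0xffff000f, 0x12910003, "405EX Rev. D" },
	{ 0xffff000f, 0x12910005, "405EX Rev. D" },
	{ 0xffff000f, 0x12910001, "405EXr Rev. A/B" },
	{ 0xffff000f, 0x12910000, "405EXr Rev. D" },
};

/* Return the name of the first matching entry, or NULL if none matches. */
static const char *pvr_lookup(const struct pvr_entry *t, size_t n, uint32_t pvr)
{
	assert(t != NULL);

	for (size_t i = 0; i < n; i++)
		if ((pvr & t[i].mask) == t[i].value)
			return t[i].name;
	return NULL;
}
```

With the old mask, PVR 0x12910003 (405EX Rev. D without Security) reduces
to 0x12910000 and lands on the 405EXr entry; with mask 0xffff000f it
matches its own entry.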

 arch/powerpc/kernel/cputable.c |  118 +++++++++++++++++++++++++++++++++++++---
 1 files changed, 111 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/kernel/cputable.c b/arch/powerpc/kernel/cputable.c
index 87aa0f3..65e2b4e 100644
--- a/arch/powerpc/kernel/cputable.c
+++ b/arch/powerpc/kernel/cputable.c
@@ -1364,10 +1364,10 @@ static struct cpu_spec __initdata cpu_specs[] = {
 		.machine_check		= machine_check_4xx,
 		.platform		= "ppc405",
 	},
-	{	/* 405EX */
-		.pvr_mask		= 0xffff0004,
-		.pvr_value		= 0x12910004,
-		.cpu_name		= "405EX",
+	{	/* 405EX Rev. A/B with Security */
+		.pvr_mask		= 0xffff000f,
+		.pvr_value		= 0x12910007,
+		.cpu_name		= "405EX Rev. A/B",
 		.cpu_features		= CPU_FTRS_40X,
 		.cpu_user_features	= PPC_FEATURE_32 |
 			PPC_FEATURE_HAS_MMU | PPC_FEATURE_HAS_4xxMAC,
@@ -1377,10 +1377,114 @@ static struct cpu_spec __initdata cpu_specs[] = {
 		.machine_check		= machine_check_4xx,
 		.platform		= "ppc405",
 	},
-	{	/* 405EXr */
-		.pvr_mask		= 0xffff0004,
+	{	/* 405EX Rev. C without Security */
+		.pvr_mask		= 0xffff000f,
+		.pvr_value		= 0x1291000d,
+		.cpu_name		= "405EX Rev. C",
+		.cpu_features		= CPU_FTRS_40X,
+		.cpu_user_features	= PPC_FEATURE_32 |
+			PPC_FEATURE_HAS_MMU | PPC_FEATURE_HAS_4xxMAC,
+		.mmu_features		= MMU_FTR_TYPE_40x,
+		.icache_bsize		= 32,
+		.dcache_bsize		= 32,
+		.machine_check		= machine_check_4xx,
+		.platform		= "ppc405",
+	},
+	{	/* 405EX Rev. C with Security */
+		.pvr_mask		= 0xffff000f,
+		.pvr_value		= 0x1291000f,
+		.cpu_name		= "405EX Rev. C",
+		.cpu_features		= CPU_FTRS_40X,
+		.cpu_user_features	= PPC_FEATURE_32 |
+			PPC_FEATURE_HAS_MMU | PPC_FEATURE_HAS_4xxMAC,
+		.mmu_features		= MMU_FTR_TYPE_40x,
+		.icache_bsize		= 32,
+		.dcache_bsize		= 32,
+		.machine_check		= machine_check_4xx,
+		.platform		= "ppc405",
+	},
+	{	/* 405EX Rev. D without Security */
+		.pvr_mask		= 0xffff000f,
+		.pvr_value		= 0x12910003,
+		.cpu_name		= "405EX Rev. D",
+		.cpu_features		= CPU_FTRS_40X,
+		.cpu_user_features	= PPC_FEATURE_32 |
+			PPC_FEATURE_HAS_MMU | PPC_FEATURE_HAS_4xxMAC,
+		.mmu_features		= MMU_FTR_TYPE_40x,
+		.icache_bsize		= 32,
+		.dcache_bsize		= 32,
+		.machine_check		= machine_check_4xx,
+		.platform		= "ppc405",
+	},
+	{	/* 405EX Rev. D with Security */
+		.pvr_mask		= 0xffff000f,
+		.pvr_value		= 0x12910005,
+		.cpu_name		= "405EX Rev. D",
+		.cpu_features		= CPU_FTRS_40X,
+		.cpu_user_features	= PPC_FEATURE_32 |
+			PPC_FEATURE_HAS_MMU | PPC_FEATURE_HAS_4xxMAC,
+		.mmu_features		= MMU_FTR_TYPE_40x,
+		.icache_bsize		= 32,
+		.dcache_bsize		= 32,
+		.machine_check		= machine_check_4xx,
+		.platform		= "ppc405",
+	},
+	{	/* 405EXr Rev. A/B without Security */
+		.pvr_mask		= 0xffff000f,
+		.pvr_value		= 0x12910001,
+		.cpu_name		= "405EXr Rev. A/B",
+		.cpu_features		= CPU_FTRS_40X,
+		.cpu_user_features	= PPC_FEATURE_32 |
+			PPC_FEATURE_HAS_MMU | PPC_FEATURE_HAS_4xxMAC,
+		.mmu_features		= MMU_FTR_TYPE_40x,
+		.icache_bsize		= 32,
+		.dcache_bsize		= 32,
+		.machine_check		= machine_check_4xx,
+		.platform		= "ppc405",
+	},
+	{	/* 405EXr Rev. C without Security */
+		.pvr_mask		= 0xffff000f,
+		.pvr_value		= 0x12910009,
+		.cpu_name		= "405EXr Rev. C",
+		.cpu_features		= CPU_FTRS_40X,
+		.cpu_user_features	= PPC_FEATURE_32 |
+			PPC_FEATURE_HAS_MMU | PPC_FEATURE_HAS_4xxMAC,
+		.mmu_features		= MMU_FTR_TYPE_40x,
+		.icache_bsize		= 32,
+		.dcache_bsize		= 32,
+		.machine_check		= machine_check_4xx,
+		.platform		= "ppc405",
+	},
+	{	/* 405EXr Rev. C with Security */
+		.pvr_mask		= 0xffff000f,
+		.pvr_value		= 0x1291000b,
+		.cpu_name		= "405EXr Rev. C",
+		.cpu_features		= CPU_FTRS_40X,
+		.cpu_user_features	= PPC_FEATURE_32 |
+			PPC_FEATURE_HAS_MMU | PPC_FEATURE_HAS_4xxMAC,
+		.mmu_features		= MMU_FTR_TYPE_40x,
+		.icache_bsize		= 32,
+		.dcache_bsize		= 32,
+		.machine_check		= machine_check_4xx,
+		.platform		= "ppc405",
+	},
+	{	/* 405EXr Rev. D without Security */
+		.pvr_mask		= 0xffff000f,
 		.pvr_value		= 0x12910000,
-		.cpu_name		= "405EXr",
+		.cpu_name		= "405EXr Rev. D",
+		.cpu_features		= CPU_FTRS_40X,
+		.cpu_user_features	= PPC_FEATURE_32 |
+			PPC_FEATURE_HAS_MMU | PPC_FEATURE_HAS_4xxMAC,
+		.mmu_features		= MMU_FTR_TYPE_40x,
+		.icache_bsize		= 32,
+		.dcache_bsize		= 32,
+		.machine_check		= machine_check_4xx,
+		.platform		= "ppc405",
+	},
+	{	/* 405EXr Rev. D with Security */
+		.pvr_mask		= 0xffff000f,
+		.pvr_value		= 0x12910002,
+		.cpu_name		= "405EXr Rev. D",
 		.cpu_features		= CPU_FTRS_40X,
 		.cpu_user_features	= PPC_FEATURE_32 |
 			PPC_FEATURE_HAS_MMU | PPC_FEATURE_HAS_4xxMAC,
-- 
1.6.0.4

* Re: [PATCH 27/27] KVM: PPC: Add Documentation about PV interface
From: Alexander Graf @ 2010-07-09  9:15 UTC (permalink / raw)
  To: MJ embd; +Cc: linuxppc-dev, KVM list, kvm-ppc
In-Reply-To: <AANLkTil6RekYBNnpM2MjSH7rcrHQZdHRwK5cBrvKQKhM@mail.gmail.com>


On 09.07.2010, at 11:11, MJ embd wrote:

> On Thu, Jul 1, 2010 at 4:13 PM, Alexander Graf <agraf@suse.de> wrote:
>> We just introduced a new PV interface that screams for documentation. So here
>> it is - a shiny new and awesome text file describing the internal works of
>> the PPC KVM paravirtual interface.
>>
>> Signed-off-by: Alexander Graf <agraf@suse.de>
>>
>> +
>> +
>> +Some instructions require more logic to determine what's going on than a load
>> +or store instruction can deliver. To enable patching of those, we keep some
>> +RAM around where we can live translate instructions to. What happens is the
>> +following:
>> +
>> +       1) copy emulation code to memory
>> +       2) patch that code to fit the emulated instruction
>> +       3) patch that code to return to the original pc + 4
>> +       4) patch the original instruction to branch to the new code
>> +
>> +That way we can inject an arbitrary amount of code as replacement for a single
>> +instruction. This allows us to check for pending interrupts when setting EE=1
>> +for example.
>> +
>
> Which patch does this mapping ? Can you please point to that.

The branch patching is in patch 22/27. For the respective users, see
patch 23-26/27.
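
As a rough illustration of the four quoted steps, and explicitly not the
code from patch 22/27, here is a toy model in plain C. The assumptions:
guest memory is a flat array of 32-bit words starting at address 0, the
first word of the template has a zeroed register field that gets filled in
from the original instruction, and the last template word is a placeholder
for the return branch. patch_insn(), ppc_branch() and ppc_branch_target()
are made-up names; only the relative branch encoding (opcode 18, AA=0,
LK=0) is taken from the PowerPC ISA.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PPC_NOP		0x60000000u	/* ori 0,0,0 */
#define PPC_REG_FIELD	0x03e00000u	/* RT/RS field of an instruction word */

/* Encode "b target" for an instruction that sits at address pc. */
static uint32_t ppc_branch(uint32_t pc, uint32_t target)
{
	uint32_t off = target - pc;

	/* The offset must be word aligned and fit the signed 26-bit field. */
	assert((off & 3) == 0);
	assert(off < 0x02000000u || off >= 0xfe000000u);
	return 0x48000000u | (off & 0x03fffffcu);
}

/* Decode the target of a relative branch; used only to check the result. */
static uint32_t ppc_branch_target(uint32_t pc, uint32_t insn)
{
	uint32_t off = insn & 0x03fffffcu;

	if (off & 0x02000000u)
		off |= 0xfc000000u;	/* sign extend */
	return pc + off;
}

static void patch_insn(uint32_t *mem, uint32_t insn_addr, uint32_t tramp_addr,
		       const uint32_t *tmpl, size_t tmpl_words)
{
	uint32_t orig = mem[insn_addr / 4];
	uint32_t *tramp = &mem[tramp_addr / 4];
	uint32_t last = tramp_addr + 4 * (uint32_t)(tmpl_words - 1);

	/* 1) copy emulation code to memory */
	memcpy(tramp, tmpl, tmpl_words * sizeof(*tmpl));
	/* 2) fit it to the emulated instruction: reuse its register field */
	tramp[0] |= orig & PPC_REG_FIELD;
	/* 3) return to the original pc + 4 */
	tramp[tmpl_words - 1] = ppc_branch(last, insn_addr + 4);
	/* 4) make the original instruction branch to the new code */
	mem[insn_addr / 4] = ppc_branch(insn_addr, tramp_addr);
}
```

Because the original instruction is replaced by a single branch, the
emulation code can be any length; the only constraint in this model is
that both branches stay within the reach of the 26-bit offset.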


Alex

* Re: [PATCH 27/27] KVM: PPC: Add Documentation about PV interface
From: MJ embd @ 2010-07-09  9:11 UTC (permalink / raw)
  To: Alexander Graf; +Cc: linuxppc-dev, KVM list, kvm-ppc
In-Reply-To: <1277980982-12433-28-git-send-email-agraf@suse.de>

On Thu, Jul 1, 2010 at 4:13 PM, Alexander Graf <agraf@suse.de> wrote:
> We just introduced a new PV interface that screams for documentation. So here
> it is - a shiny new and awesome text file describing the internal works of
> the PPC KVM paravirtual interface.
>
> Signed-off-by: Alexander Graf <agraf@suse.de>
>
> ---
>
> v1 -> v2:
>
>  - clarify guest implementation
>  - clarify that privileged instructions still work
>  - explain safe MSR bits
>  - Fix dsisr patch description
>  - change hypervisor calls to use new register values
> ---
>  Documentation/kvm/ppc-pv.txt |  185 ++++++++++++++++++++++++++++++++++++++++++
>  1 files changed, 185 insertions(+), 0 deletions(-)
>  create mode 100644 Documentation/kvm/ppc-pv.txt
>
> diff --git a/Documentation/kvm/ppc-pv.txt b/Documentation/kvm/ppc-pv.txt
> new file mode 100644
> index 0000000..82de6c6
> --- /dev/null
> +++ b/Documentation/kvm/ppc-pv.txt
> @@ -0,0 +1,185 @@
> +The PPC KVM paravirtual interface
> +=================================
> +
> +The basic execution principle by which KVM on PowerPC works is to run all kernel
> +space code in PR=1 which is user space. This way we trap all privileged
> +instructions and can emulate them accordingly.
> +
> +Unfortunately that is also the downfall. There are quite some privileged
> +instructions that needlessly return us to the hypervisor even though they
> +could be handled differently.
> +
> +This is what the PPC PV interface helps with. It takes privileged instructions
> +and transforms them into unprivileged ones with some help from the hypervisor.
> +This cuts down virtualization costs by about 50% on some of my benchmarks.
> +
> +The code for that interface can be found in arch/powerpc/kernel/kvm*
> +
> +Querying for existence
> +======================
> +
> +To find out if we're running on KVM or not, we overlay the PVR register. Usually
> +the PVR register contains an id that identifies your CPU type. If, however, you
> +pass KVM_PVR_PARA in the register that you want the PVR result in, the register
> +still contains KVM_PVR_PARA after the mfpvr call.
> +
> +       LOAD_REG_IMM(r5, KVM_PVR_PARA)
> +       mfpvr   r5
> +       [r5 still contains KVM_PVR_PARA]
> +
> +Once determined to run under a PV capable KVM, you can now use hypercalls as
> +described below.
> +
> +PPC hypercalls
> +==============
> +
> +The only viable ways to reliably get from guest context to host context are:
> +
> +       1) Call an invalid instruction
> +       2) Call the "sc" instruction with a parameter to "sc"
> +       3) Call the "sc" instruction with parameters in GPRs
> +
> +Method 1 is always a bad idea. Invalid instructions can be replaced later on
> +by valid instructions, rendering the interface broken.
> +
> +Method 2 also has downfalls. If the parameter to "sc" is != 0 the spec is
> +rather unclear if the sc is targeted directly for the hypervisor or the
> +supervisor. It would also require that we read the syscall issuing instruction
> +every time a syscall is issued, slowing down guest syscalls.
> +
> +Method 3 is what KVM uses. We pass magic constants (KVM_SC_MAGIC_R0 and
> +KVM_SC_MAGIC_R3) in r0 and r3 respectively. If a syscall instruction with these
> +magic values arrives from the guest's kernel mode, we take the syscall as a
> +hypercall.
> +
> +The parameters are as follows:
> +
> +       r0              KVM_SC_MAGIC_R0
> +       r3              KVM_SC_MAGIC_R3           Return code
> +       r4              Hypercall number
> +       r5              First parameter
> +       r6              Second parameter
> +       r7              Third parameter
> +       r8              Fourth parameter
> +
> +Hypercall definitions are shared in generic code, so the same hypercall numbers
> +apply for x86 and powerpc alike.
> +
> +The magic page
> +==============
> +
> +To enable communication between the hypervisor and guest there is a new shared
> +page that contains parts of supervisor visible register state. The guest can
> +map this shared page using the KVM hypercall KVM_HC_PPC_MAP_MAGIC_PAGE.
> +
> +With this hypercall issued the guest always gets the magic page mapped at the
> +desired location in effective and physical address space. For now, we always
> +map the page to -4096. This way we can access it using absolute load and store
> +functions. The following instruction reads the first field of the magic page:
> +
> +       ld      rX, -4096(0)
> +
> +The interface is designed to be extensible should there be need later to add
> +additional registers to the magic page. If you add fields to the magic page,
> +also define a new hypercall feature to indicate that the host can give you more
> +registers. Only if the host supports the additional features, make use of them.
> +
> +The magic page has the following layout as described in
> +arch/powerpc/include/asm/kvm_para.h:
> +
> +struct kvm_vcpu_arch_shared {
> +       __u64 scratch1;
> +       __u64 scratch2;
> +       __u64 scratch3;
> +       __u64 critical;         /* Guest may not get interrupts if == r1 */
> +       __u64 sprg0;
> +       __u64 sprg1;
> +       __u64 sprg2;
> +       __u64 sprg3;
> +       __u64 srr0;
> +       __u64 srr1;
> +       __u64 dar;
> +       __u64 msr;
> +       __u32 dsisr;
> +       __u32 int_pending;      /* Tells the guest if we have an interrupt */
> +};
> +
> +Additions to the page must only occur at the end. Struct fields are always 32
> +bit aligned.
> +
> +MSR bits
> +========
> +
> +The MSR contains bits that require hypervisor intervention and bits that do
> +not require direct hypervisor intervention because they only get interpreted
> +when entering the guest or don't have any impact on the hypervisor's behavior.
> +
> +The following bits are safe to be set inside the guest:
> +
> +  MSR_EE
> +  MSR_RI
> +  MSR_CR
> +  MSR_ME
> +
> +If any other bit changes in the MSR, please still use mtmsr(d).
> +
> +Patched instructions
> +====================
> +
> +The "ld" and "std" instructions are transormed to "lwz" and "stw" instructions
> +respectively on 32 bit systems with an added offset of 4 to accomodate for big
> +endianness.
> +
> +The following is a list of mapping the Linux kernel performs when running as
> +guest. Implementing any of those mappings is optional, as the instruction traps
> +also act on the shared page. So calling privileged instructions still works as
> +before.
> +
> +From                   To
> +====                   ==
> +
> +mfmsr  rX              ld      rX, magic_page->msr
> +mfsprg rX, 0           ld      rX, magic_page->sprg0
> +mfsprg rX, 1           ld      rX, magic_page->sprg1
> +mfsprg rX, 2           ld      rX, magic_page->sprg2
> +mfsprg rX, 3           ld      rX, magic_page->sprg3
> +mfsrr0 rX              ld      rX, magic_page->srr0
> +mfsrr1 rX              ld      rX, magic_page->srr1
> +mfdar  rX              ld      rX, magic_page->dar
> +mfdsisr        rX              lwz     rX, magic_page->dsisr
> +
> +mtmsr  rX              std     rX, magic_page->msr
> +mtsprg 0, rX           std     rX, magic_page->sprg0
> +mtsprg 1, rX           std     rX, magic_page->sprg1
> +mtsprg 2, rX           std     rX, magic_page->sprg2
> +mtsprg 3, rX           std     rX, magic_page->sprg3
> +mtsrr0 rX              std     rX, magic_page->srr0
> +mtsrr1 rX              std     rX, magic_page->srr1
> +mtdar  rX              std     rX, magic_page->dar
> +mtdsisr        rX              stw     rX, magic_page->dsisr
> +
> +tlbsync                        nop
> +
> +mtmsrd rX, 0           b       <special mtmsr section>
> +mtmsr                  b       <special mtmsr section>
> +
> +mtmsrd rX, 1           b       <special mtmsrd section>
> +
> +[BookE only]
> +wrteei [0|1]           b       <special wrteei section>
> +
> +
> +Some instructions require more logic to determine what's going on than a load
> +or store instruction can deliver. To enable patching of those, we keep some
> +RAM around where we can live translate instructions to. What happens is the
> +following:
> +
> +       1) copy emulation code to memory
> +       2) patch that code to fit the emulated instruction
> +       3) patch that code to return to the original pc + 4
> +       4) patch the original instruction to branch to the new code
> +
> +That way we can inject an arbitrary amount of code as replacement for a single
> +instruction. This allows us to check for pending interrupts when setting EE=1
> +for example.
> +

Which patch does this mapping ? Can you please point to that.


> --
> 1.6.0.2
>



-- 
-mj

* Re: [PATCH 02/13] powerpc/book3e: Hack to get gdb moving along on Book3E 64-bit
From: K.Prasad @ 2010-07-09  8:52 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: linuxppc-dev
In-Reply-To: <1278656215-24705-2-git-send-email-benh@kernel.crashing.org>

On Fri, Jul 09, 2010 at 04:16:44PM +1000, Benjamin Herrenschmidt wrote:

Hi,
   A few questions and some trivial comments below.

> Our handling of debug interrupts on Book3E 64-bit is not quite
> the way it should be just yet. This is a workaround to let gdb
> work at least for now. We ensure that when context switching,
> we set the appropriate DBCR0 value for the new task. We also
> make sure that we turn off MSR[DE] within the kernel, and set
> it as part of the bits that get set when going back to userspace.
> 

I think I'm missing the code where MSR_DE is set before returning
to user-space? I just found one instance where MSR_USER64 (which now
includes MSR_DE) is used (in start_thread()). If not set, we'll lose the
interrupts caused in IDM too.

> In the long run, we will probably set the userspace DBCR0 on the
> exception exit code path and ensure we have some proper kernel
> value to set on the way into the kernel, a bit like ppc32 does,
> but that will take more work.

The effort to port ppc32 BookIII E debug register usage to use generic
hw-breakpoint interfaces (linuxppc-dev message-id:
20100629165152.GA8586@in.ibm.com), in its final form, should clean up
most of this code. Even the hook switch_booke_debug_regs() in
__switch_to() should be done away with.

> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> ---
>  arch/powerpc/include/asm/reg_booke.h |    4 ++--
>  arch/powerpc/kernel/process.c        |   22 ++++++++++++++++++++++
>  2 files changed, 24 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/reg_booke.h b/arch/powerpc/include/asm/reg_booke.h
> index 2360317..66dc6f0 100644
> --- a/arch/powerpc/include/asm/reg_booke.h
> +++ b/arch/powerpc/include/asm/reg_booke.h
> @@ -29,8 +29,8 @@
>  #if defined(CONFIG_PPC_BOOK3E_64)
>  #define MSR_		MSR_ME | MSR_CE
>  #define MSR_KERNEL      MSR_ | MSR_CM
> -#define MSR_USER32	MSR_ | MSR_PR | MSR_EE
> -#define MSR_USER64	MSR_USER32 | MSR_CM
> +#define MSR_USER32	MSR_ | MSR_PR | MSR_EE | MSR_DE
> +#define MSR_USER64	MSR_USER32 | MSR_CM | MSR_DE

MSR_DE is included twice in MSR_USER64 (once through MSR_USER32).
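
The repeat is harmless, since OR-ing the same bit in twice changes nothing,
so it can simply be dropped. A small sketch, using parenthesized copies of
the macros and bit positions that are assumptions made for illustration
(check reg_booke.h for the real definitions):

```c
#include <assert.h>

/* Assumed bit positions, for illustration only. */
#define MSR_CM	(1UL << 31)
#define MSR_CE	(1UL << 17)
#define MSR_EE	(1UL << 15)
#define MSR_PR	(1UL << 14)
#define MSR_ME	(1UL << 12)
#define MSR_DE	(1UL << 9)

#define MSR_		(MSR_ME | MSR_CE)
#define MSR_USER32	(MSR_ | MSR_PR | MSR_EE | MSR_DE)

/* As posted: MSR_DE is OR-ed in again on top of MSR_USER32. */
#define MSR_USER64_POSTED	(MSR_USER32 | MSR_CM | MSR_DE)
/* Without the repeat. */
#define MSR_USER64_TRIMMED	(MSR_USER32 | MSR_CM)

/* Checked at compile time: both spellings give the same bit pattern. */
static_assert(MSR_USER64_POSTED == MSR_USER64_TRIMMED,
	      "the second MSR_DE is redundant");

/* Returns nonzero if MSR_DE is set in the trimmed user MSR value. */
static int user64_has_de(void)
{
	return (MSR_USER64_TRIMMED & MSR_DE) != 0;
}
```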

>  #elif defined (CONFIG_40x)
>  #define MSR_KERNEL	(MSR_ME|MSR_RI|MSR_IR|MSR_DR|MSR_CE)
>  #define MSR_USER	(MSR_KERNEL|MSR_PR|MSR_EE)
> diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> index 1e78453..551f671 100644
> --- a/arch/powerpc/kernel/process.c
> +++ b/arch/powerpc/kernel/process.c
> @@ -477,6 +477,28 @@ struct task_struct *__switch_to(struct task_struct *prev,
>  	new_thread = &new->thread;
>  	old_thread = &current->thread;
> 
> +#if defined(CONFIG_PPC_BOOK3E_64)
> +	/* XXX Current Book3E code doesn't deal with kernel side DBCR0,

You may want to use the style
	/*
	 * ......

> +	 * we always hold the user values, so we set it now.
> +	 *
> +	 * However, we ensure the kernel MSR:DE is appropriately cleared too
> +	 * to avoid spurrious single step exceptions in the kernel.
                 ^^^spurious^^^

> +	 *
> +	 * This will have to change to merge with the ppc32 code at some point,
> +	 * but I don't like much what ppc32 is doing today so there's some
> +	 * thinking needed there
> +	 */
> +	if ((new_thread->dbcr0 | old_thread->dbcr0) & DBCR0_IDM) {
> +		u32 dbcr0;

thread->dbcr0 is defined as "unsigned long" in processor.h however "u32 dbcr0"
here must be fine (given that DBCR0 uses 32-bits in ppc32 and uses only 32:63
bits in BOOKIIIE_64). Should dbcr<0-n> be made u32, given that there
will be no 64-bit long value to store (or am I missing something)?

Thanks,
K.Prasad


> +
> +		mtmsr(mfmsr() & ~MSR_DE);
> +		isync();
> +		dbcr0 = mfspr(SPRN_DBCR0);
> +		dbcr0 = (dbcr0 & DBCR0_EDM) | new_thread->dbcr0;
> +		mtspr(SPRN_DBCR0, dbcr0);
> +	}
> +#endif /* CONFIG_PPC64_BOOK3E */
> +
>  #ifdef CONFIG_PPC64
>  	/*
>  	 * Collect processor utilization data per process
> -- 
> 1.6.3.3

* Re: Oops while running fs_racer test on a POWER6 box against latest git
From: Nick Piggin @ 2010-07-09  8:35 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Latchesar Ionkov, Nick Piggin, LKML, linuxppc-dev@ozlabs.org,
	Alexander Viro, Ron Minnich, divya, hch@lst.de,
	maciej.rutecki@gmail.com
In-Reply-To: <4C36D0F8.7070303@fusionio.com>

On Fri, Jul 09, 2010 at 09:34:16AM +0200, Jens Axboe wrote:
> On 2010-07-09 08:57, divya wrote:
> > On Friday 02 July 2010 12:16 PM, divya wrote:
> >> On Thursday 01 July 2010 11:55 PM, Maciej Rutecki wrote:
> >>> On środa, 30 czerwca 2010 o 13:22:27 divya wrote:
> >>>> While running fs_racer test from LTP on a POWER6 box against latest
> >>>> git(2.6.35-rc3-git4 - commitid 984bc9601f64fd) came across the 
> >>>> following
> >>>> warning followed by multiple oops.
> >>>>
> >>> I created a Bugzilla entry at
> >>> https://bugzilla.kernel.org/show_bug.cgi?id=16324
> >>> for your bug report, please add your address to the CC list in there, 
> >>> thanks!
> >>>
> >>>
> >> Here I find a cleaner back trace while running fs_racer test from LTP 
> >> on a POWER6
> >> box against the latest git(2.6.35-rc3-git5 - commitid 980019d74e4b242)
> >>
> >> Badness at kernel/mutex-debug.c:64
> >> BUG: key (null) not in .data!
> >> NIP: c0000000000be9e8 LR: c0000000000be9cc CTR: 0000000000000000
> >> REGS: c00000010bb176f0 TRAP: 0700   Not tainted  
> >> (2.6.35-rc3-git5-autotest)
> >> BUG: key 00000000000001d8 not in .data!
> >> BUG: key 00000000000001e0 not in .data!
> >> BUG: key 00000000000001e8 not in .data!
> >> MSR: 8000000000029032
> >> Unable to handle kernel paging request for data at address 0x00000028
> >> Faulting instruction address: 0xc0000000003ad0ec
> >> Oops: Kernel access of bad area, sig: 11 [#1]
> >> SMP NR_CPUS=1024 NUMA pSeries
> >> last sysfs file: 
> >> /sys/devices/system/cpu/cpu19/cache/index1/shared_cpu_map
> >> Page fault in user mode with in_atomic() = 1 mm = c00000010943e600
> >> Modules linked in:
> >> NIP = fff9e98fc40  MSR = 800000004001d032
> >>  ipv6 fuse loop
> >> Unable to handle kernel paging request for unknown fault
> >>  dm_mod
> >> Faulting instruction address: 0xc00000000008d0f4
> >>  sr_mod ibmveth cdrom sg sd_mod crc_t10dif ibmvscsic 
> >> scsi_transport_srp scsi_tgt scsi_mod
> >> NIP: c0000000003ad0ec LR: c00000000064c3b0 CTR: c0000000003a6eb0
> >> REGS: c000000109b4f610 TRAP: 0300   Not tainted  
> >> (2.6.35-rc3-git5-autotest)
> >> MSR: 8000000000009032<EE,ME,IR,DR>   CR: 88004484  XER: 00000001
> >> DAR: 0000000000000028, DSISR: 0000000040010000
> >> TASK = c000000109a98600[7403] 'mkdir' THREAD: c000000109b4c000 CPU: 19
> >> GPR00: 0000000080000013 c000000109b4f890 c000000000d3d798 
> >> 0000000000000028
> >> GPR04: 0000000000000000 0000000000000000 0000000000000000 
> >> 0000000000000001
> >> GPR08: 0000000000000000 0000000000000028 c000000000189f2c 
> >> c000000109a98600
> >> GPR12: 0000000024004424 c00000000f602f80 00000000000041ff 
> >> 0000000000000001
> >> GPR16: 0000000000000002 c00000010d8304c0 c000000109b4fb44 
> >> 0000000000000000
> >> GPR20: c00000010df77908 fffffffffffff000 0000000000010000 
> >> 00000000000041ff
> >> GPR24: c00000010df77758 c000000109fa1800 c00000010df77908 
> >> c0000000ff236600
> >> GPR28: 0000000000000028 0000000000000040 c000000000ca7b38 
> >> c000000000189f2c
> >> NIP [c0000000003ad0ec] .do_raw_spin_trylock+0x10/0x48
> >> LR [c00000000064c3b0] ._raw_spin_lock+0x50/0xa4
> >> Call Trace:
> >> [c000000109b4f890] [c00000000064c3a4] ._raw_spin_lock+0x44/0xa4 
> >> (unreliable)
> >> [c000000109b4f920] [c000000000189f2c] .new_inode+0x4c/0xe4
> >> [c000000109b4f9b0] [c0000000002257fc] .ext3_new_inode+0x84/0xb70
> >> [c000000109b4fad0] [c00000000022f1ec] .ext3_mkdir+0x130/0x438
> >> [c000000109b4fbe0] [c00000000017adb4] .vfs_mkdir+0xb8/0x160
> >> [c000000109b4fc80] [c00000000017e52c] .SyS_mkdirat+0xb0/0x114
> >> [c000000109b4fdc0] [c00000000017a730] .SyS_mkdir+0x1c/0x30
> >> [c000000109b4fe30] [c0000000000085b4] syscall_exit+0x0/0x40
> >> Instruction dump:
> >> eb41ffd0 7c0803a6 eb61ffd8 eb81ffe0 eba1ffe8 ebc1fff0 ebe1fff8 4e800020
> >> 38000000 7c691b78 980d0214 800d0008<7d601829>  2c0b0000 40c20010 7c00192d
> >> Oops: Weird page fault, sig: 11 [#2]
> >>
> >> Pls let me know if this back trace would help in analyzing further.
> >> Meanwhile I shall do a git bisect and send the inputs.

The call stack for Badness at kernel/mutex-debug.c:64 (or whatever
explodes first) would be handy.  This one seems jumbled still. What
spinlock is in the trace? inode_lock?  That would indicate some random
corruption or breakage in the lock debugging.

> >>
> >> Thanks
> >> Divya
> >>
> >>
> >>
> > Hi All,
> > 
> >  From the git bisect, it seems like commit
> >  57439f878afafefad8836ebf5c49da2a0a746105 is the culprit for the above
> >  issue.

Call me blind but I can't see the problem. Are you sure this commit
breaks it?

^ permalink raw reply

* Re: Oops while running fs_racer test on a POWER6 box against latest git
From: Jens Axboe @ 2010-07-09  7:34 UTC (permalink / raw)
  To: divya
  Cc: Latchesar Ionkov, Nick Piggin, LKML, linuxppc-dev@ozlabs.org,
	Alexander Viro, Ron Minnich, hch@lst.de, maciej.rutecki@gmail.com
In-Reply-To: <4C36C876.9090404@linux.vnet.ibm.com>

On 2010-07-09 08:57, divya wrote:
> On Friday 02 July 2010 12:16 PM, divya wrote:
>> On Thursday 01 July 2010 11:55 PM, Maciej Rutecki wrote:
>>> On środa, 30 czerwca 2010 o 13:22:27 divya wrote:
>>>> While running fs_racer test from LTP on a POWER6 box against latest
>>>> git(2.6.35-rc3-git4 - commitid 984bc9601f64fd) came across the 
>>>> following
>>>> warning followed by multiple oops.
>>>>
>>> I created a Bugzilla entry at
>>> https://bugzilla.kernel.org/show_bug.cgi?id=16324
>>> for your bug report, please add your address to the CC list in there, 
>>> thanks!
>>>
>>>
>> Here I find a cleaner back trace while running fs_racer test from LTP 
>> on a POWER6
>> box against the latest git(2.6.35-rc3-git5 - commitid 980019d74e4b242)
>>
>> Badness at kernel/mutex-debug.c:64
>> BUG: key (null) not in .data!
>> NIP: c0000000000be9e8 LR: c0000000000be9cc CTR: 0000000000000000
>> REGS: c00000010bb176f0 TRAP: 0700   Not tainted  
>> (2.6.35-rc3-git5-autotest)
>> BUG: key 00000000000001d8 not in .data!
>> BUG: key 00000000000001e0 not in .data!
>> BUG: key 00000000000001e8 not in .data!
>> MSR: 8000000000029032
>> Unable to handle kernel paging request for data at address 0x00000028
>> Faulting instruction address: 0xc0000000003ad0ec
>> Oops: Kernel access of bad area, sig: 11 [#1]
>> SMP NR_CPUS=1024 NUMA pSeries
>> last sysfs file: 
>> /sys/devices/system/cpu/cpu19/cache/index1/shared_cpu_map
>> Page fault in user mode with in_atomic() = 1 mm = c00000010943e600
>> Modules linked in:
>> NIP = fff9e98fc40  MSR = 800000004001d032
>>  ipv6 fuse loop
>> Unable to handle kernel paging request for unknown fault
>>  dm_mod
>> Faulting instruction address: 0xc00000000008d0f4
>>  sr_mod ibmveth cdrom sg sd_mod crc_t10dif ibmvscsic 
>> scsi_transport_srp scsi_tgt scsi_mod
>> NIP: c0000000003ad0ec LR: c00000000064c3b0 CTR: c0000000003a6eb0
>> REGS: c000000109b4f610 TRAP: 0300   Not tainted  
>> (2.6.35-rc3-git5-autotest)
>> MSR: 8000000000009032<EE,ME,IR,DR>   CR: 88004484  XER: 00000001
>> DAR: 0000000000000028, DSISR: 0000000040010000
>> TASK = c000000109a98600[7403] 'mkdir' THREAD: c000000109b4c000 CPU: 19
>> GPR00: 0000000080000013 c000000109b4f890 c000000000d3d798 
>> 0000000000000028
>> GPR04: 0000000000000000 0000000000000000 0000000000000000 
>> 0000000000000001
>> GPR08: 0000000000000000 0000000000000028 c000000000189f2c 
>> c000000109a98600
>> GPR12: 0000000024004424 c00000000f602f80 00000000000041ff 
>> 0000000000000001
>> GPR16: 0000000000000002 c00000010d8304c0 c000000109b4fb44 
>> 0000000000000000
>> GPR20: c00000010df77908 fffffffffffff000 0000000000010000 
>> 00000000000041ff
>> GPR24: c00000010df77758 c000000109fa1800 c00000010df77908 
>> c0000000ff236600
>> GPR28: 0000000000000028 0000000000000040 c000000000ca7b38 
>> c000000000189f2c
>> NIP [c0000000003ad0ec] .do_raw_spin_trylock+0x10/0x48
>> LR [c00000000064c3b0] ._raw_spin_lock+0x50/0xa4
>> Call Trace:
>> [c000000109b4f890] [c00000000064c3a4] ._raw_spin_lock+0x44/0xa4 
>> (unreliable)
>> [c000000109b4f920] [c000000000189f2c] .new_inode+0x4c/0xe4
>> [c000000109b4f9b0] [c0000000002257fc] .ext3_new_inode+0x84/0xb70
>> [c000000109b4fad0] [c00000000022f1ec] .ext3_mkdir+0x130/0x438
>> [c000000109b4fbe0] [c00000000017adb4] .vfs_mkdir+0xb8/0x160
>> [c000000109b4fc80] [c00000000017e52c] .SyS_mkdirat+0xb0/0x114
>> [c000000109b4fdc0] [c00000000017a730] .SyS_mkdir+0x1c/0x30
>> [c000000109b4fe30] [c0000000000085b4] syscall_exit+0x0/0x40
>> Instruction dump:
>> eb41ffd0 7c0803a6 eb61ffd8 eb81ffe0 eba1ffe8 ebc1fff0 ebe1fff8 4e800020
>> 38000000 7c691b78 980d0214 800d0008<7d601829>  2c0b0000 40c20010 7c00192d
>> Oops: Weird page fault, sig: 11 [#2]
>>
>> Pls let me know if this back trace would help in analyzing further.
>> Meanwhile I shall do a git bisect and send the inputs.
>>
>> Thanks
>> Divya
>>
>>
>>
> Hi All,
> 
>  From the git bisect, it seems like commit
>  57439f878afafefad8836ebf5c49da2a0a746105 is the culprit for the above
>  issue.

CC'ing Nick and Al.

-- 
Jens Axboe

^ permalink raw reply

* Re: kernel boot stuck at udbg_putc_cpm()
From: Shawn Jin @ 2010-07-09  7:35 UTC (permalink / raw)
  To: Scott Wood; +Cc: ppcdev
In-Reply-To: <AANLkTimUxcsTMXNPOFfJxybp8u7Wv7Mt_nhW1EE01RqG@mail.gmail.com>

I changed my toolchain and rebuilt the kernel image. This time all the
messages below magically displayed on the serial port. :-D Are all
these the early debugging messages?

> Here is the kernel log buf dump. Anything suspicious?
>
> <6>Using My MPC870 machine description
> <5>Linux version 2.6.33.5 (shawn@ubuntu) (gcc version 4.3.3 (GCC) )
> #10 Mon Jul 5 22:58:30 PDT 2010
> <7>Top of RAM: 0x8000000, Total RAM: 0x8000000
> <7>Memory hole size: 0MB
> <4>Zone PFN ranges:
...
<snipped>
...
> <7>time_init: decrementer frequency = 3.750000 MHz
> <7>time_init: processor frequency   = 120.000000 MHz
> <6>clocksource: timebase mult[42aaaaab] shift[22] registered
> <7>clockevent: decrementer mult[f5c28f] shift[32] cpu[0]
> <7>  alloc irq_desc for 18 on node 0
> <7>  alloc kstat_irqs on node 0
> <7>irq: irq 4 on host /soc@fa200000/cpm@9c0/interrupt-controller@930
> mapped to virtual irq 18

Now the kernel is stuck in the while loop that waits for the transmitter
FIFO to empty. It seems that the CPM UART stopped working in the middle
of printing a message. I'm using minicom to connect to the serial
port, and I've heard minicom can be problematic. Could that be the cause here?

(gdb) target remote ppcbdi:2001
Remote debugging using ppcbdi:2001
0xc00f348c in cpm_uart_console_write (co=<value optimized out>,
    s=0xc0174df3 "console [ttyCPM0] enabled, bootconsole disabled\n", count=48)
    at /home/rayan/wti/code/wti-linux-2.6.33.5/arch/powerpc/include/asm/io.h:154
154     DEF_MMIO_IN_BE(in_be16, 16, lhz);
(gdb) next
1161                    while ((in_be16(&bdp->cbd_sc) & BD_SC_READY) != 0)
(gdb) next
154     DEF_MMIO_IN_BE(in_be16, 16, lhz);
(gdb) next
1161                    while ((in_be16(&bdp->cbd_sc) & BD_SC_READY) != 0)
(gdb) list
1156            for (i = 0; i < count; i++, s++) {
1157                    /* Wait for transmitter fifo to empty.
1158                     * Ready indicates output is ready, and xmt is doing
1159                     * that, not that it is ready for us to send.
1160                     */
1161                    while ((in_be16(&bdp->cbd_sc) & BD_SC_READY) != 0)
1162                            ;
1163
1164                    /* Send the character out.
1165                     * If the buffer address is in the CPM DPRAM, don't
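
For illustration only (this is not code from the driver): one way to
tell a permanently stuck descriptor from a slow one is to bound the
wait. The sketch below assumes the status word can be read through a
plain volatile pointer, which stands in for in_be16(&bdp->cbd_sc), and
that the ready bit is 0x8000. The names wait_bd_not_ready and
SKETCH_BD_SC_READY are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value of the "ready" bit in the buffer descriptor status word. */
#define SKETCH_BD_SC_READY 0x8000u

/*
 * Poll a status word until the ready bit clears or the poll budget runs
 * out. Returns true if the bit cleared, false on timeout.
 * 'status' stands in for the in_be16(&bdp->cbd_sc) read in the driver.
 */
bool wait_bd_not_ready(const volatile uint16_t *status,
		       unsigned long max_polls)
{
	assert(status != NULL);

	while (max_polls--) {
		if ((*status & SKETCH_BD_SC_READY) == 0)
			return true;
	}
	return false;
}
```

A timeout here would suggest the CPM never consumed the descriptor,
which points at the UART or clock setup rather than at the terminal
program on the other end.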

Thanks,
-Shawn.

^ permalink raw reply

* Re: Oops while running fs_racer test on a POWER6 box against latest git
From: divya @ 2010-07-09  6:57 UTC (permalink / raw)
  To: maciej.rutecki
  Cc: Latchesar Ionkov, jaxboe, LKML, linuxppc-dev, Ron Minnich, hch
In-Reply-To: <4C2D8B63.2030500@linux.vnet.ibm.com>

On Friday 02 July 2010 12:16 PM, divya wrote:
> On Thursday 01 July 2010 11:55 PM, Maciej Rutecki wrote:
>> On środa, 30 czerwca 2010 o 13:22:27 divya wrote:
>>> While running fs_racer test from LTP on a POWER6 box against latest
>>> git(2.6.35-rc3-git4 - commitid 984bc9601f64fd) came across the 
>>> following
>>> warning followed by multiple oops.
>>>
>> I created a Bugzilla entry at
>> https://bugzilla.kernel.org/show_bug.cgi?id=16324
>> for your bug report, please add your address to the CC list in there, 
>> thanks!
>>
>>
> Here I find a cleaner back trace while running fs_racer test from LTP 
> on a POWER6
> box against the latest git(2.6.35-rc3-git5 - commitid 980019d74e4b242)
>
> Badness at kernel/mutex-debug.c:64
> BUG: key (null) not in .data!
> NIP: c0000000000be9e8 LR: c0000000000be9cc CTR: 0000000000000000
> REGS: c00000010bb176f0 TRAP: 0700   Not tainted  
> (2.6.35-rc3-git5-autotest)
> BUG: key 00000000000001d8 not in .data!
> BUG: key 00000000000001e0 not in .data!
> BUG: key 00000000000001e8 not in .data!
> MSR: 8000000000029032
> Unable to handle kernel paging request for data at address 0x00000028
> Faulting instruction address: 0xc0000000003ad0ec
> Oops: Kernel access of bad area, sig: 11 [#1]
> SMP NR_CPUS=1024 NUMA pSeries
> last sysfs file: 
> /sys/devices/system/cpu/cpu19/cache/index1/shared_cpu_map
> Page fault in user mode with in_atomic() = 1 mm = c00000010943e600
> Modules linked in:
> NIP = fff9e98fc40  MSR = 800000004001d032
>  ipv6 fuse loop
> Unable to handle kernel paging request for unknown fault
>  dm_mod
> Faulting instruction address: 0xc00000000008d0f4
>  sr_mod ibmveth cdrom sg sd_mod crc_t10dif ibmvscsic 
> scsi_transport_srp scsi_tgt scsi_mod
> NIP: c0000000003ad0ec LR: c00000000064c3b0 CTR: c0000000003a6eb0
> REGS: c000000109b4f610 TRAP: 0300   Not tainted  
> (2.6.35-rc3-git5-autotest)
> MSR: 8000000000009032<EE,ME,IR,DR>   CR: 88004484  XER: 00000001
> DAR: 0000000000000028, DSISR: 0000000040010000
> TASK = c000000109a98600[7403] 'mkdir' THREAD: c000000109b4c000 CPU: 19
> GPR00: 0000000080000013 c000000109b4f890 c000000000d3d798 
> 0000000000000028
> GPR04: 0000000000000000 0000000000000000 0000000000000000 
> 0000000000000001
> GPR08: 0000000000000000 0000000000000028 c000000000189f2c 
> c000000109a98600
> GPR12: 0000000024004424 c00000000f602f80 00000000000041ff 
> 0000000000000001
> GPR16: 0000000000000002 c00000010d8304c0 c000000109b4fb44 
> 0000000000000000
> GPR20: c00000010df77908 fffffffffffff000 0000000000010000 
> 00000000000041ff
> GPR24: c00000010df77758 c000000109fa1800 c00000010df77908 
> c0000000ff236600
> GPR28: 0000000000000028 0000000000000040 c000000000ca7b38 
> c000000000189f2c
> NIP [c0000000003ad0ec] .do_raw_spin_trylock+0x10/0x48
> LR [c00000000064c3b0] ._raw_spin_lock+0x50/0xa4
> Call Trace:
> [c000000109b4f890] [c00000000064c3a4] ._raw_spin_lock+0x44/0xa4 
> (unreliable)
> [c000000109b4f920] [c000000000189f2c] .new_inode+0x4c/0xe4
> [c000000109b4f9b0] [c0000000002257fc] .ext3_new_inode+0x84/0xb70
> [c000000109b4fad0] [c00000000022f1ec] .ext3_mkdir+0x130/0x438
> [c000000109b4fbe0] [c00000000017adb4] .vfs_mkdir+0xb8/0x160
> [c000000109b4fc80] [c00000000017e52c] .SyS_mkdirat+0xb0/0x114
> [c000000109b4fdc0] [c00000000017a730] .SyS_mkdir+0x1c/0x30
> [c000000109b4fe30] [c0000000000085b4] syscall_exit+0x0/0x40
> Instruction dump:
> eb41ffd0 7c0803a6 eb61ffd8 eb81ffe0 eba1ffe8 ebc1fff0 ebe1fff8 4e800020
> 38000000 7c691b78 980d0214 800d0008<7d601829>  2c0b0000 40c20010 7c00192d
> Oops: Weird page fault, sig: 11 [#2]
>
> Pls let me know if this back trace would help in analyzing further.
> Meanwhile I shall do a git bisect and send the inputs.
>
> Thanks
> Divya
>
>
>
Hi All,

From the git bisect, it seems like commit 57439f878afafefad8836ebf5c49da2a0a746105 is the culprit for the above issue.

Thanks
Divya

^ permalink raw reply

* Re: [PATCH 00/27] KVM PPC PV framework
From: Alexander Graf @ 2010-07-09  6:33 UTC (permalink / raw)
  To: MJ embd; +Cc: linuxppc-dev, KVM list, kvm-ppc
In-Reply-To: <AANLkTilp94NW1GMMew3oI-4czkUEA5W-CFqN9UVs8xcZ@mail.gmail.com>


On 09.07.2010, at 06:57, MJ embd wrote:

> On Thu, Jul 1, 2010 at 4:12 PM, Alexander Graf <agraf@suse.de> wrote:
>> On PPC we run PR=0 (kernel mode) code in PR=1 (user mode) and don't use the
>> hypervisor extensions.
>>
>> While that is all great to show that virtualization is possible, there are
>> quite some cases where the emulation overhead of privileged instructions is
>> killing performance.
>>
>> This patchset tackles exactly that issue. It introduces a paravirtual framework
>> using which KVM and Linux share a page to exchange register state with. That
>
> KVM and Linux or KVM and GuestOS ?

KVM and GuestOS. The first user is of course Linux.

Alex

^ permalink raw reply

* [PATCH 13/13] powerpc/oprofile: Don't build server oprofile drivers on 64-bit BookE
From: Benjamin Herrenschmidt @ 2010-07-09  6:16 UTC (permalink / raw)
  To: linuxppc-dev
In-Reply-To: <1278656215-24705-12-git-send-email-benh@kernel.crashing.org>

They will fail to build due to the lack of mtmsrd, and wouldn't
be useful anyways

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 arch/powerpc/oprofile/Makefile |    2 +-
 arch/powerpc/oprofile/common.c |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/oprofile/Makefile b/arch/powerpc/oprofile/Makefile
index 73e1c2c..e219ca4 100644
--- a/arch/powerpc/oprofile/Makefile
+++ b/arch/powerpc/oprofile/Makefile
@@ -16,6 +16,6 @@ oprofile-y := $(DRIVER_OBJS) common.o backtrace.o
 oprofile-$(CONFIG_OPROFILE_CELL) += op_model_cell.o \
 		cell/spu_profiler.o cell/vma_map.o \
 		cell/spu_task_sync.o
-oprofile-$(CONFIG_PPC64) += op_model_rs64.o op_model_power4.o op_model_pa6t.o
+oprofile-$(CONFIG_PPC_BOOK3S_64) += op_model_rs64.o op_model_power4.o op_model_pa6t.o
 oprofile-$(CONFIG_FSL_EMB_PERFMON) += op_model_fsl_emb.o
 oprofile-$(CONFIG_6xx) += op_model_7450.o
diff --git a/arch/powerpc/oprofile/common.c b/arch/powerpc/oprofile/common.c
index 21f16ed..d65e68f 100644
--- a/arch/powerpc/oprofile/common.c
+++ b/arch/powerpc/oprofile/common.c
@@ -199,7 +199,7 @@ int __init oprofile_arch_init(struct oprofile_operations *ops)
 		return -ENODEV;
 
 	switch (cur_cpu_spec->oprofile_type) {
-#ifdef CONFIG_PPC64
+#ifdef CONFIG_PPC_BOOK3S_64
 #ifdef CONFIG_OPROFILE_CELL
 		case PPC_OPROFILE_CELL:
 			if (firmware_has_feature(FW_FEATURE_LPAR))
-- 
1.6.3.3

^ permalink raw reply related

* [PATCH 12/13] powerpc/book3e: Adjust the page sizes list based on MMU config
From: Benjamin Herrenschmidt @ 2010-07-09  6:16 UTC (permalink / raw)
  To: linuxppc-dev
In-Reply-To: <1278656215-24705-11-git-send-email-benh@kernel.crashing.org>

Use the MMU config registers to scan for available direct and
indirect page sizes and print out the result. Will be needed
for future hugetlbfs implementation.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 arch/powerpc/include/asm/mmu-book3e.h |    4 +
 arch/powerpc/include/asm/reg_booke.h  |    1 +
 arch/powerpc/mm/tlb_nohash.c          |  136 +++++++++++++++++++++++++--------
 3 files changed, 109 insertions(+), 32 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu-book3e.h b/arch/powerpc/include/asm/mmu-book3e.h
index 7469581..87a1d78 100644
--- a/arch/powerpc/include/asm/mmu-book3e.h
+++ b/arch/powerpc/include/asm/mmu-book3e.h
@@ -193,6 +193,10 @@ struct mmu_psize_def
 {
 	unsigned int	shift;	/* number of bits */
 	unsigned int	enc;	/* PTE encoding */
+	unsigned int    ind;    /* Corresponding indirect page size shift */
+	unsigned int	flags;
+#define MMU_PAGE_SIZE_DIRECT	0x1	/* Supported as a direct size */
+#define MMU_PAGE_SIZE_INDIRECT	0x2	/* Supported as an indirect size */
 };
 extern struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT];
 
diff --git a/arch/powerpc/include/asm/reg_booke.h b/arch/powerpc/include/asm/reg_booke.h
index 66dc6f0..667a498 100644
--- a/arch/powerpc/include/asm/reg_booke.h
+++ b/arch/powerpc/include/asm/reg_booke.h
@@ -62,6 +62,7 @@
 #define SPRN_TLB0PS	0x158	/* TLB 0 Page Size Register */
 #define SPRN_MAS5_MAS6	0x15c	/* MMU Assist Register 5 || 6 */
 #define SPRN_MAS8_MAS1	0x15d	/* MMU Assist Register 8 || 1 */
+#define SPRN_EPTCFG	0x15e	/* Embedded Page Table Config */
 #define SPRN_MAS7_MAS3	0x174	/* MMU Assist Register 7 || 3 */
 #define SPRN_MAS0_MAS1	0x175	/* MMU Assist Register 0 || 1 */
 #define SPRN_IVOR0	0x190	/* Interrupt Vector Offset Register 0 */
diff --git a/arch/powerpc/mm/tlb_nohash.c b/arch/powerpc/mm/tlb_nohash.c
index 2ce42bf..3b10f80 100644
--- a/arch/powerpc/mm/tlb_nohash.c
+++ b/arch/powerpc/mm/tlb_nohash.c
@@ -46,6 +46,7 @@
 struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT] = {
 	[MMU_PAGE_4K] = {
 		.shift	= 12,
+		.ind	= 20,
 		.enc	= BOOK3E_PAGESZ_4K,
 	},
 	[MMU_PAGE_16K] = {
@@ -54,6 +55,7 @@ struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT] = {
 	},
 	[MMU_PAGE_64K] = {
 		.shift	= 16,
+		.ind	= 28,
 		.enc	= BOOK3E_PAGESZ_64K,
 	},
 	[MMU_PAGE_1M] = {
@@ -62,6 +64,7 @@ struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT] = {
 	},
 	[MMU_PAGE_16M] = {
 		.shift	= 24,
+		.ind	= 36,
 		.enc	= BOOK3E_PAGESZ_16M,
 	},
 	[MMU_PAGE_256M] = {
@@ -344,16 +347,108 @@ void tlb_flush_pgtable(struct mmu_gather *tlb, unsigned long address)
 	}
 }
 
-/*
- * Early initialization of the MMU TLB code
- */
-static void __early_init_mmu(int boot_cpu)
+static void setup_page_sizes(void)
+{
+	unsigned int tlb0cfg = mfspr(SPRN_TLB0CFG);
+	unsigned int tlb0ps = mfspr(SPRN_TLB0PS);
+	unsigned int eptcfg = mfspr(SPRN_EPTCFG);
+	int i, psize;
+
+	/* Look for supported direct sizes */
+	for (psize = 0; psize < MMU_PAGE_COUNT; ++psize) {
+		struct mmu_psize_def *def = &mmu_psize_defs[psize];
+
+		if (tlb0ps & (1U << (def->shift - 10)))
+			def->flags |= MMU_PAGE_SIZE_DIRECT;
+	}
+
+	/* Indirect page sizes supported ? */
+	if ((tlb0cfg & TLBnCFG_IND) == 0)
+		goto no_indirect;
+
+	/* Now, we only deal with one IND page size for each
+	 * direct size. Hopefully all implementations today are
+	 * unambiguous, but we might want to be careful in the
+	 * future.
+	 */
+	for (i = 0; i < 3; i++) {
+		unsigned int ps, sps;
+
+		sps = eptcfg & 0x1f;
+		eptcfg >>= 5;
+		ps = eptcfg & 0x1f;
+		eptcfg >>= 5;
+		if (!ps || !sps)
+			continue;
+		for (psize = 0; psize < MMU_PAGE_COUNT; psize++) {
+			struct mmu_psize_def *def = &mmu_psize_defs[psize];
+
+			if (ps == (def->shift - 10))
+				def->flags |= MMU_PAGE_SIZE_INDIRECT;
+			if (sps == (def->shift - 10))
+				def->ind = ps + 10;
+		}
+	}
+ no_indirect:
+
+	/* Cleanup array and print summary */
+	pr_info("MMU: Supported page sizes\n");
+	for (psize = 0; psize < MMU_PAGE_COUNT; ++psize) {
+		struct mmu_psize_def *def = &mmu_psize_defs[psize];
+		const char *__page_type_names[] = {
+			"unsupported",
+			"direct",
+			"indirect",
+			"direct & indirect"
+		};
+		if (def->flags == 0) {
+			def->shift = 0;	
+			continue;
+		}
+		pr_info("  %8ld KB as %s\n", 1ul << (def->shift - 10),
+			__page_type_names[def->flags & 0x3]);
+	}
+}
+
+static void setup_mmu_htw(void)
 {
 	extern unsigned int interrupt_base_book3e;
 	extern unsigned int exc_data_tlb_miss_htw_book3e;
 	extern unsigned int exc_instruction_tlb_miss_htw_book3e;
 
 	unsigned int *ibase = &interrupt_base_book3e;
+
+	/* Check if HW tablewalk is present, and if yes, enable it by:
+	 *
+	 * - patching the TLB miss handlers to branch to the
+	 *   one dedicates to it
+	 *
+	 * - setting the global book3e_htw_enabled
+       	 */
+	unsigned int tlb0cfg = mfspr(SPRN_TLB0CFG);
+
+	if ((tlb0cfg & TLBnCFG_IND) &&
+	    (tlb0cfg & TLBnCFG_PT)) {
+		/* Our exceptions vectors start with a NOP and -then- a branch
+		 * to deal with single stepping from userspace which stops on
+		 * the second instruction. Thus we need to patch the second
+		 * instruction of the exception, not the first one
+		 */
+		patch_branch(ibase + (0x1c0 / 4) + 1,
+			     (unsigned long)&exc_data_tlb_miss_htw_book3e, 0);
+		patch_branch(ibase + (0x1e0 / 4) + 1,
+			     (unsigned long)&exc_instruction_tlb_miss_htw_book3e, 0);
+		book3e_htw_enabled = 1;
+	}
+	pr_info("MMU: Book3E Page Tables %s\n",
+		book3e_htw_enabled ? "Enabled" : "Disabled");
+}
+
+/*
+ * Early initialization of the MMU TLB code
+ */
+static void __early_init_mmu(int boot_cpu)
+{
 	unsigned int mas4;
 
 	/* XXX This will have to be decided at runtime, but right
@@ -370,40 +465,17 @@ static void __early_init_mmu(int boot_cpu)
 	 */
 	mmu_vmemmap_psize = MMU_PAGE_16M;
 
-	/* Check if HW tablewalk is present, and if yes, enable it by:
-	 *
-	 * - patching the TLB miss handlers to branch to the
-	 *   one dedicates to it
-	 *
-	 * - setting the global book3e_htw_enabled
-	 *
-	 * - Set MAS4:INDD and default page size
-	 */
-
 	/* XXX This code only checks for TLB 0 capabilities and doesn't
 	 *     check what page size combos are supported by the HW. It
 	 *     also doesn't handle the case where a separate array holds
 	 *     the IND entries from the array loaded by the PT.
 	 */
 	if (boot_cpu) {
-		unsigned int tlb0cfg = mfspr(SPRN_TLB0CFG);
-
-		/* Check if HW loader is supported */
-		if ((tlb0cfg & TLBnCFG_IND) &&
-		    (tlb0cfg & TLBnCFG_PT)) {
-			/* Our exceptions vectors start with a NOP and -then- a branch
-			 * to deal with single stepping from userspace which stops on
-			 * the second instruction. Thus we need to patch the second
-			 * instruction of the exception, not the first one
-			 */
-			patch_branch(ibase + (0x1c0 / 4) + 1,
-				(unsigned long)&exc_data_tlb_miss_htw_book3e, 0);
-			patch_branch(ibase + (0x1e0 / 4) + 1,
-				(unsigned long)&exc_instruction_tlb_miss_htw_book3e, 0);
-			book3e_htw_enabled = 1;
-		}
-		pr_info("MMU: Book3E Page Tables %s\n",
-			book3e_htw_enabled ? "Enabled" : "Disabled");
+		/* Look for supported page sizes */
+		setup_page_sizes();
+
+		/* Look for HW tablewalk support */
+		setup_mmu_htw();
 	}
 
 	/* Set MAS4 based on page table setting */
-- 
1.6.3.3
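
Editorial sketch, not part of the patch: the bit tests used by
setup_page_sizes() above can be tried out in user space. The helpers
below copy the encodings visible in the diff (bit n of TLB0PS means a
2^n KB direct page; EPTCFG holds three 5-bit SPS/PS pairs, SPS in the
lower bits). The function and struct names are hypothetical, and the
register values in the usage note are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One (PS, SPS) pair from EPTCFG, both encoded as log2(size in KB). */
struct ept_pair {
	unsigned int ps;	/* indirect (page table) size */
	unsigned int sps;	/* sub-page size it maps */
};

/* Nonzero if TLB0PS advertises a direct page of 2^shift bytes. */
int page_shift_is_direct(uint32_t tlb0ps, unsigned int shift)
{
	if (shift < 10 || shift - 10 >= 32)
		return 0;
	return (tlb0ps >> (shift - 10)) & 1u;
}

/* Page size in KB for a given byte shift (10 <= shift < 42). */
unsigned long page_size_kb(unsigned int shift)
{
	assert(shift >= 10 && shift - 10 < 32);
	return 1ul << (shift - 10);
}

/* Split EPTCFG into three pairs, lowest bits first: SPS, then PS. */
void decode_eptcfg(uint32_t eptcfg, struct ept_pair out[3])
{
	size_t i;

	assert(out != NULL);
	for (i = 0; i < 3; i++) {
		out[i].sps = eptcfg & 0x1f;
		eptcfg >>= 5;
		out[i].ps = eptcfg & 0x1f;
		eptcfg >>= 5;
	}
}
```

With a made-up TLB0PS of 0x44, page_shift_is_direct() reports 4K and
64K as direct sizes. With a made-up EPTCFG of (10 << 5) | 2, the first
pair decodes to SPS=2 and PS=10, i.e. 4K pages under a 1M indirect
entry, which agrees with the .ind = 20 default for MMU_PAGE_4K.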

^ permalink raw reply related

* [PATCH 10/13] powerpc/book3e: Fix single step when using HW page tables
From: Benjamin Herrenschmidt @ 2010-07-09  6:16 UTC (permalink / raw)
  To: linuxppc-dev
In-Reply-To: <1278656215-24705-9-git-send-email-benh@kernel.crashing.org>

We patch the TLB miss exception vectors to point to alternate
functions when using HW page table on BookE.

However, we were patching in a new branch in the first instruction
of the exception handler instead of the second one, thus overriding
the nop that is in the first instruction.

This causes problems when single stepping as we rely on that nop for
the single step to stop properly within the exception vector range
rather than on the target of the branch.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 arch/powerpc/kernel/exceptions-64e.S |    6 ++++++
 arch/powerpc/mm/tlb_nohash.c         |   13 +++++++++----
 2 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S
index 316465a..5c43063 100644
--- a/arch/powerpc/kernel/exceptions-64e.S
+++ b/arch/powerpc/kernel/exceptions-64e.S
@@ -191,6 +191,12 @@ exc_##n##_bad_stack:							    \
 	sth	r1,PACA_TRAP_SAVE(r13);	/* store trap */		    \
 	b	bad_stack_book3e;	/* bad stack error */
 
+/* WARNING: If you change the layout of this stub, make sure you check
+	*   the debug exception handler which handles single stepping
+	*   into exceptions from userspace, and the MM code in
+	*   arch/powerpc/mm/tlb_nohash.c which patches the branch here
+	*   and would need to be updated if that branch is moved
+	*/
 #define	EXCEPTION_STUB(loc, label)					\
 	. = interrupt_base_book3e + loc;				\
 	nop;	/* To make debug interrupts happy */			\
diff --git a/arch/powerpc/mm/tlb_nohash.c b/arch/powerpc/mm/tlb_nohash.c
index e81d5d6..2ce42bf 100644
--- a/arch/powerpc/mm/tlb_nohash.c
+++ b/arch/powerpc/mm/tlb_nohash.c
@@ -391,10 +391,15 @@ static void __early_init_mmu(int boot_cpu)
 		/* Check if HW loader is supported */
 		if ((tlb0cfg & TLBnCFG_IND) &&
 		    (tlb0cfg & TLBnCFG_PT)) {
-			patch_branch(ibase + (0x1c0 / 4),
-			     (unsigned long)&exc_data_tlb_miss_htw_book3e, 0);
-			patch_branch(ibase + (0x1e0 / 4),
-			     (unsigned long)&exc_instruction_tlb_miss_htw_book3e, 0);
+			/* Our exceptions vectors start with a NOP and -then- a branch
+			 * to deal with single stepping from userspace which stops on
+			 * the second instruction. Thus we need to patch the second
+			 * instruction of the exception, not the first one
+			 */
+			patch_branch(ibase + (0x1c0 / 4) + 1,
+				(unsigned long)&exc_data_tlb_miss_htw_book3e, 0);
+			patch_branch(ibase + (0x1e0 / 4) + 1,
+				(unsigned long)&exc_instruction_tlb_miss_htw_book3e, 0);
 			book3e_htw_enabled = 1;
 		}
 		pr_info("MMU: Book3E Page Tables %s\n",
-- 
1.6.3.3
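
Editorial sketch, not part of the patch: the fix relies on ibase being
a pointer to 32-bit words, so adding 1 moves four bytes past the nop.
The helpers below restate that arithmetic in user space. Their names
are hypothetical, and the array in the usage note merely stands in for
the exception vector area.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offset of the branch slot: the stub's first word is the nop. */
size_t stub_branch_offset(size_t vector_offset)
{
	assert(vector_offset % 4 == 0);
	return vector_offset + sizeof(uint32_t);
}

/* The same computation as word-pointer arithmetic, as in the patch. */
uint32_t *stub_branch_slot(uint32_t *ibase, size_t vector_offset)
{
	assert(ibase != NULL);
	assert(vector_offset % 4 == 0);
	return ibase + (vector_offset / 4) + 1;
}
```

For the data TLB miss vector at 0x1c0 the patched word sits at byte
offset 0x1c4, and for the instruction TLB miss vector at 0x1e0 it sits
at 0x1e4. The old code patched 0x1c0 and 0x1e0 and so overwrote the nop.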

^ permalink raw reply related

* [PATCH 11/13] powerpc/book3e: Add TLB dump in xmon for Book3E
From: Benjamin Herrenschmidt @ 2010-07-09  6:16 UTC (permalink / raw)
  To: linuxppc-dev
In-Reply-To: <1278656215-24705-10-git-send-email-benh@kernel.crashing.org>

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 arch/powerpc/xmon/xmon.c |  152 ++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 152 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index 8bad7d5..0554445 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -155,6 +155,9 @@ static int do_spu_cmd(void);
 #ifdef CONFIG_44x
 static void dump_tlb_44x(void);
 #endif
+#ifdef CONFIG_PPC_BOOK3E
+static void dump_tlb_book3e(void);
+#endif
 
 static int xmon_no_auto_backtrace;
 
@@ -888,6 +891,11 @@ cmds(struct pt_regs *excp)
 			dump_tlb_44x();
 			break;
 #endif
+#ifdef CONFIG_PPC_BOOK3E
+		case 'u':
+			dump_tlb_book3e();
+			break;
+#endif
 		default:
 			printf("Unrecognized command: ");
 		        do {
@@ -2701,6 +2709,150 @@ static void dump_tlb_44x(void)
 }
 #endif /* CONFIG_44x */
 
+#ifdef CONFIG_PPC_BOOK3E
+static void dump_tlb_book3e(void)
+{
+	u32 mmucfg, pidmask, lpidmask;
+	u64 ramask;
+	int i, tlb, ntlbs, pidsz, lpidsz, rasz, lrat = 0;
+	int mmu_version;
+	static const char *pgsz_names[] = {
+		"  1K",
+		"  2K",
+		"  4K",
+		"  8K",
+		" 16K",
+		" 32K",
+		" 64K",
+		"128K",
+		"256K",
+		"512K",
+		"  1M",
+		"  2M",
+		"  4M",
+		"  8M",
+		" 16M",
+		" 32M",
+		" 64M",
+		"128M",
+		"256M",
+		"512M",
+		"  1G",
+		"  2G",
+		"  4G",
+		"  8G",
+		" 16G",
+		" 32G",
+		" 64G",
+		"128G",
+		"256G",
+		"512G",
+		"  1T",
+		"  2T",
+	};
+
+	/* Gather some infos about the MMU */
+	mmucfg = mfspr(SPRN_MMUCFG);
+	mmu_version = (mmucfg & 3) + 1;
+	ntlbs = ((mmucfg >> 2) & 3) + 1;
+	pidsz = ((mmucfg >> 6) & 0x1f) + 1;
+	lpidsz = (mmucfg >> 24) & 0xf;
+	rasz = (mmucfg >> 16) & 0x7f;
+	if ((mmu_version > 1) && (mmucfg & 0x10000))
+		lrat = 1;
+	printf("Book3E MMU MAV=%d.0,%d TLBs,%d-bit PID,%d-bit LPID,%d-bit RA\n",
+	       mmu_version, ntlbs, pidsz, lpidsz, rasz);
+	pidmask = (1ul << pidsz) - 1;
+	lpidmask = (1ul << lpidsz) - 1;
+	ramask = (1ull << rasz) - 1;
+
+	for (tlb = 0; tlb < ntlbs; tlb++) {
+		u32 tlbcfg;
+		int nent, assoc, new_cc = 1;
+		printf("TLB %d:\n------\n", tlb);
+		switch(tlb) {
+		case 0:
+			tlbcfg = mfspr(SPRN_TLB0CFG);
+			break;
+		case 1:
+			tlbcfg = mfspr(SPRN_TLB1CFG);
+			break;
+		case 2:
+			tlbcfg = mfspr(SPRN_TLB2CFG);
+			break;
+		case 3:
+			tlbcfg = mfspr(SPRN_TLB3CFG);
+			break;
+		default:
+			printf("Unsupported TLB number !\n");
+			continue;
+		}
+		nent = tlbcfg & 0xfff;
+		assoc = (tlbcfg >> 24) & 0xff;
+		for (i = 0; i < nent; i++) {
+			u32 mas0 = MAS0_TLBSEL(tlb);
+			u32 mas1 = MAS1_TSIZE(BOOK3E_PAGESZ_4K);
+			u64 mas2 = 0;
+			u64 mas7_mas3;
+			int esel = i, cc = i;
+
+			if (assoc != 0) {
+				cc = i / assoc;
+				esel = i % assoc;
+				mas2 = cc * 0x1000;
+			}
+
+			mas0 |= MAS0_ESEL(esel);
+			mtspr(SPRN_MAS0, mas0);
+			mtspr(SPRN_MAS1, mas1);
+			mtspr(SPRN_MAS2, mas2);
+			asm volatile("tlbre  0,0,0" : : : "memory");
+			mas1 = mfspr(SPRN_MAS1);
+			mas2 = mfspr(SPRN_MAS2);
+			mas7_mas3 = mfspr(SPRN_MAS7_MAS3);
+			if (assoc && (i % assoc) == 0)
+				new_cc = 1;
+			if (!(mas1 & MAS1_VALID))
+				continue;
+			if (assoc == 0)
+				printf("%04x- ", i);
+			else if (new_cc)
+				printf("%04x-%c", cc, 'A' + esel);
+			else
+				printf("    |%c", 'A' + esel);
+			new_cc = 0;
+			printf(" %016llx %04x %s %c%c AS%c",
+			       mas2 & ~0x3ffull,
+			       (mas1 >> 16) & 0x3fff,
+			       pgsz_names[(mas1 >> 7) & 0x1f],
+			       mas1 & MAS1_IND ? 'I' : ' ',
+			       mas1 & MAS1_IPROT ? 'P' : ' ',
+			       mas1 & MAS1_TS ? '1' : '0');
+			printf(" %c%c%c%c%c%c%c",
+			       mas2 & MAS2_X0 ? 'a' : ' ',
+			       mas2 & MAS2_X1 ? 'v' : ' ',
+			       mas2 & MAS2_W  ? 'w' : ' ',
+			       mas2 & MAS2_I  ? 'i' : ' ',
+			       mas2 & MAS2_M  ? 'm' : ' ',
+			       mas2 & MAS2_G  ? 'g' : ' ',
+			       mas2 & MAS2_E  ? 'e' : ' ');
+			printf(" %016llx", mas7_mas3 & ramask & ~0x7ffull);
+			if (mas1 & MAS1_IND)
+				printf(" %s\n",
+				       pgsz_names[(mas7_mas3 >> 1) & 0x1f]);
+			else
+				printf(" U%c%c%c S%c%c%c\n",
+				       mas7_mas3 & MAS3_UX ? 'x' : ' ',
+				       mas7_mas3 & MAS3_UW ? 'w' : ' ',
+				       mas7_mas3 & MAS3_UR ? 'r' : ' ',
+				       mas7_mas3 & MAS3_SX ? 'x' : ' ',
+				       mas7_mas3 & MAS3_SW ? 'w' : ' ',
+				       mas7_mas3 & MAS3_SR ? 'r' : ' ');
+		}
+	}
+}
+#endif /* CONFIG_PPC_BOOK3E */
+
 static void xmon_init(int enable)
 {
 #ifdef CONFIG_PPC_ISERIES
-- 
1.6.3.3
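
Editorial sketch, not part of the patch: the MMUCFG field extraction at
the top of dump_tlb_book3e() can be checked in user space. The shifts
and masks below are copied from the diff. The struct and function names
are hypothetical, and the sample register value in the usage note is
made up.

```c
#include <assert.h>
#include <stdint.h>

/* Fields pulled out of MMUCFG, using the shifts and masks from the patch. */
struct mmucfg_fields {
	int mmu_version;	/* MAV, reported as N.0 */
	int ntlbs;		/* number of TLB arrays */
	int pidsz;		/* PID width in bits */
	int lpidsz;		/* LPID width in bits */
	int rasz;		/* real address width in bits */
};

struct mmucfg_fields decode_mmucfg(uint32_t mmucfg)
{
	struct mmucfg_fields f;

	f.mmu_version = (mmucfg & 3) + 1;
	f.ntlbs = ((mmucfg >> 2) & 3) + 1;
	f.pidsz = ((mmucfg >> 6) & 0x1f) + 1;
	f.lpidsz = (mmucfg >> 24) & 0xf;
	f.rasz = (mmucfg >> 16) & 0x7f;
	return f;
}

/* Mask covering the low 'rasz' bits; all ones when rasz is 64 or more. */
uint64_t ra_mask(int rasz)
{
	assert(rasz >= 0);
	if (rasz >= 64)
		return ~0ull;
	return (1ull << rasz) - 1;
}
```

A made-up MMUCFG of 0x062C0345 decodes to MAV 2.0, 2 TLBs, a 14-bit
PID, a 6-bit LPID and a 44-bit real address, so ra_mask(44) keeps the
low 44 bits of MAS7_MAS3 when the dump prints the real page number.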

^ permalink raw reply related

* [PATCH 08/13] powerpc/book3e: Resend doorbell exceptions to ourself
From: Benjamin Herrenschmidt @ 2010-07-09  6:16 UTC (permalink / raw)
  To: linuxppc-dev
In-Reply-To: <1278656215-24705-7-git-send-email-benh@kernel.crashing.org>

From: Michael Ellerman <michael@ellerman.id.au>

If we are soft disabled and receive a doorbell exception we don't process
it immediately. This means we need to check on the way out of irq restore
if there are any doorbell exceptions to process.

The problem is at that point we don't know what our regs are, and that
in turn makes xmon unhappy. To workaround the problem, instead of checking
for and processing doorbells, we check for any doorbells and if there were
any we send ourselves another.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 arch/powerpc/include/asm/dbell.h |    1 +
 arch/powerpc/kernel/dbell.c      |   10 ++++++++++
 arch/powerpc/kernel/irq.c        |    4 ++--
 3 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/dbell.h b/arch/powerpc/include/asm/dbell.h
index ced7e48..0893ab9 100644
--- a/arch/powerpc/include/asm/dbell.h
+++ b/arch/powerpc/include/asm/dbell.h
@@ -29,6 +29,7 @@ enum ppc_dbell {
 
 extern void doorbell_message_pass(int target, int msg);
 extern void doorbell_exception(struct pt_regs *regs);
+extern void doorbell_check_self(void);
 extern void doorbell_setup_this_cpu(void);
 
 static inline void ppc_msgsnd(enum ppc_dbell type, u32 flags, u32 tag)
diff --git a/arch/powerpc/kernel/dbell.c b/arch/powerpc/kernel/dbell.c
index f7b5188..3307a52 100644
--- a/arch/powerpc/kernel/dbell.c
+++ b/arch/powerpc/kernel/dbell.c
@@ -81,6 +81,16 @@ out:
 	set_irq_regs(old_regs);
 }
 
+void doorbell_check_self(void)
+{
+	struct doorbell_cpu_info *info = &__get_cpu_var(doorbell_cpu_info);
+
+	if (!info->messages)
+		return;
+
+	ppc_msgsnd(PPC_DBELL, 0, info->tag);
+}
+
 #else /* CONFIG_SMP */
 void doorbell_exception(struct pt_regs *regs)
 {
diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index 2f6dc7f..8f96d31 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -156,8 +156,8 @@ notrace void raw_local_irq_restore(unsigned long en)
 		return;
 
 #if defined(CONFIG_BOOKE) && defined(CONFIG_SMP)
-	/* Check for pending doorbell interrupts on SMP */
-	doorbell_exception(NULL);
+	/* Check for pending doorbell interrupts and resend to ourself */
+	doorbell_check_self();
 #endif
 
 	/*
-- 
1.6.3.3
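
Editorial sketch, not part of the patch: the "check and resend" idea in
doorbell_check_self() can be modelled without any hardware. Here the
msgsnd instruction is replaced by a callback, and a small log records
what would have been sent. All names below are hypothetical stand-ins
for the per-CPU doorbell state and ppc_msgsnd().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the per-CPU doorbell state used by the patch. */
struct dbell_info {
	unsigned long messages;	/* pending message bits */
	uint32_t tag;		/* tag identifying this CPU */
};

/* Records what a fake "msgsnd" was asked to send. */
struct send_log {
	unsigned int count;
	uint32_t last_tag;
};

typedef void (*dbell_send_fn)(uint32_t tag, void *ctx);

/* Example sender: counts calls instead of ringing a real doorbell. */
void record_send(uint32_t tag, void *ctx)
{
	struct send_log *log = ctx;

	log->count++;
	log->last_tag = tag;
}

/*
 * If anything is pending, ring our own doorbell again so the messages
 * are handled later from a real exception with valid registers.
 * Returns 1 if a doorbell was resent, 0 otherwise.
 */
int doorbell_check_self_sketch(const struct dbell_info *info,
			       dbell_send_fn send, void *ctx)
{
	assert(info != NULL && send != NULL);

	if (!info->messages)
		return 0;

	send(info->tag, ctx);
	return 1;
}
```

The point of the design is that the restore path never processes the
messages itself. It only schedules another exception, so the handler
always runs with a proper pt_regs.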

^ permalink raw reply related

* [PATCH 09/13] powerpc/book3e: Add generic 64-bit idle powersave support
From: Benjamin Herrenschmidt @ 2010-07-09  6:16 UTC (permalink / raw)
  To: linuxppc-dev
In-Reply-To: <1278656215-24705-8-git-send-email-benh@kernel.crashing.org>

We use a similar technique to ppc32: We set a thread local flag
to indicate that we are about to enter or have entered the stop
state, and have fixup code in the async interrupt entry code that
reacts to this flag to make us return to a different location
(sets NIP to LINK in our case).
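
The fixup described above can be modelled in plain C. This is only a
rough user-space sketch, not the kernel code: the struct names, the
TLF_NAPPING value and check_napping() are hypothetical stand-ins for
thread_info->local_flags, _TLF_NAPPING and the CHECK_NAPPING() assembly
macro added by this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag value; the real _TLF_NAPPING lives in thread_info.h. */
#define TLF_NAPPING 0x1UL

/* Stand-in for the thread-local flags word in thread_info. */
struct thread_flags_model {
	unsigned long local_flags;
};

/* Stand-in for the two saved registers the fixup touches. */
struct saved_regs_model {
	uint64_t nip;	/* where the interrupt would normally return */
	uint64_t link;	/* LR at interrupt time, i.e. the idle caller */
};

/*
 * If the interrupted thread was napping, clear the flag and redirect
 * the return address to the saved link register. Returns 1 when the
 * return address was redirected, 0 otherwise.
 */
static int check_napping(struct thread_flags_model *ti,
			 struct saved_regs_model *regs)
{
	assert(ti && regs);

	if (!(ti->local_flags & TLF_NAPPING))
		return 0;

	regs->nip = regs->link;
	ti->local_flags &= ~TLF_NAPPING;
	return 1;
}
```

Clearing the flag in the same step as the redirect means a nested
interrupt taken later does not redirect a second time.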

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 arch/powerpc/include/asm/machdep.h   |    1 +
 arch/powerpc/kernel/Makefile         |    2 +-
 arch/powerpc/kernel/exceptions-64e.S |   23 ++++++++++
 arch/powerpc/kernel/idle_book3e.S    |   81 ++++++++++++++++++++++++++++++++++
 4 files changed, 106 insertions(+), 1 deletions(-)
 create mode 100644 arch/powerpc/kernel/idle_book3e.S

diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
index 2bad6e5..adc8e6c 100644
--- a/arch/powerpc/include/asm/machdep.h
+++ b/arch/powerpc/include/asm/machdep.h
@@ -278,6 +278,7 @@ extern void e500_idle(void);
 extern void power4_idle(void);
 extern void power4_cpu_offline_powersave(void);
 extern void ppc6xx_idle(void);
+extern void book3e_idle(void);
 
 /*
  * ppc_md contains a copy of the machine description structure for the
diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index 8a33318..77d831a 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -37,7 +37,7 @@ obj-$(CONFIG_PPC64)		+= setup_64.o sys_ppc32.o \
 obj-$(CONFIG_HAVE_HW_BREAKPOINT)	+= hw_breakpoint.o
 obj-$(CONFIG_PPC_BOOK3S_64)	+= cpu_setup_ppc970.o cpu_setup_pa6t.o
 obj64-$(CONFIG_RELOCATABLE)	+= reloc_64.o
-obj-$(CONFIG_PPC_BOOK3E_64)	+= exceptions-64e.o
+obj-$(CONFIG_PPC_BOOK3E_64)	+= exceptions-64e.o idle_book3e.o
 obj-$(CONFIG_PPC64)		+= vdso64/
 obj-$(CONFIG_ALTIVEC)		+= vecemu.o
 obj-$(CONFIG_PPC_970_NAP)	+= idle_power4.o
diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S
index a42637c..316465a 100644
--- a/arch/powerpc/kernel/exceptions-64e.S
+++ b/arch/powerpc/kernel/exceptions-64e.S
@@ -204,11 +204,30 @@ exc_##n##_bad_stack:							    \
 	lis	r,TSR_FIS@h;						\
 	mtspr	SPRN_TSR,r
 
+/* Used by asynchronous interrupts that may happen in the idle loop.
+ *
+ * This checks if the thread was in the idle loop, and if yes, returns
+ * to the caller rather than the PC. This is to avoid a race if
+ * interrupts happen before the wait instruction.
+ */
+#define CHECK_NAPPING()							\
+	clrrdi	r11,r1,THREAD_SHIFT;					\
+	ld	r10,TI_LOCAL_FLAGS(r11);				\
+	andi.	r9,r10,_TLF_NAPPING;					\
+	beq+	1f;							\
+	ld	r8,_LINK(r1);						\
+	rlwinm	r7,r10,0,~_TLF_NAPPING;					\
+	std	r8,_NIP(r1);						\
+	std	r7,TI_LOCAL_FLAGS(r11);					\
+1:
+
+
 #define MASKABLE_EXCEPTION(trapnum, label, hdlr, ack)			\
 	START_EXCEPTION(label);						\
 	NORMAL_EXCEPTION_PROLOG(trapnum, PROLOG_ADDITION_MASKABLE)	\
 	EXCEPTION_COMMON(trapnum, PACA_EXGEN, INTS_DISABLE_ALL)		\
 	ack(r8);							\
+	CHECK_NAPPING();						\
 	addi	r3,r1,STACK_FRAME_OVERHEAD;				\
 	bl	hdlr;							\
 	b	.ret_from_except_lite;
@@ -257,6 +276,7 @@ interrupt_end_book3e:
 	CRIT_EXCEPTION_PROLOG(0x100, PROLOG_ADDITION_NONE)
 //	EXCEPTION_COMMON(0x100, PACA_EXCRIT, INTS_DISABLE_ALL)
 //	bl	special_reg_save_crit
+//	CHECK_NAPPING();
 //	addi	r3,r1,STACK_FRAME_OVERHEAD
 //	bl	.critical_exception
 //	b	ret_from_crit_except
@@ -268,6 +288,7 @@ interrupt_end_book3e:
 //	EXCEPTION_COMMON(0x200, PACA_EXMC, INTS_DISABLE_ALL)
 //	bl	special_reg_save_mc
 //	addi	r3,r1,STACK_FRAME_OVERHEAD
+//	CHECK_NAPPING();
 //	bl	.machine_check_exception
 //	b	ret_from_mc_except
 	b	.
@@ -338,6 +359,7 @@ interrupt_end_book3e:
 	CRIT_EXCEPTION_PROLOG(0x9f0, PROLOG_ADDITION_NONE)
 //	EXCEPTION_COMMON(0x9f0, PACA_EXCRIT, INTS_DISABLE_ALL)
 //	bl	special_reg_save_crit
+//	CHECK_NAPPING();
 //	addi	r3,r1,STACK_FRAME_OVERHEAD
 //	bl	.unknown_exception
 //	b	ret_from_crit_except
@@ -434,6 +456,7 @@ kernel_dbg_exc:
 	CRIT_EXCEPTION_PROLOG(0x2080, PROLOG_ADDITION_NONE)
 //	EXCEPTION_COMMON(0x2080, PACA_EXCRIT, INTS_DISABLE_ALL)
 //	bl	special_reg_save_crit
+//	CHECK_NAPPING();
 //	addi	r3,r1,STACK_FRAME_OVERHEAD
 //	bl	.doorbell_critical_exception
 //	b	ret_from_crit_except
diff --git a/arch/powerpc/kernel/idle_book3e.S b/arch/powerpc/kernel/idle_book3e.S
new file mode 100644
index 0000000..3150804
--- /dev/null
+++ b/arch/powerpc/kernel/idle_book3e.S
@@ -0,0 +1,81 @@
+/*
+ * Copyright 2010 IBM Corp, Benjamin Herrenschmidt <benh@kernel.crashing.org>
+ *
+ * Generic idle routine for Book3E processors
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#include <linux/threads.h>
+#include <asm/reg.h>
+#include <asm/ppc_asm.h>
+#include <asm/asm-offsets.h>
+#include <asm/ppc-opcode.h>
+#include <asm/processor.h>
+#include <asm/thread_info.h>
+
+/* 64-bit version only for now */
+#ifdef CONFIG_PPC64
+
+_GLOBAL(book3e_idle)
+	/* Save LR for later */
+	mflr	r0
+	std	r0,16(r1)
+
+	/* Hard disable interrupts */
+	wrteei	0
+
+	/* Now check if an interrupt came in while we were soft disabled
+	 * since we may otherwise lose it (doorbells etc...). We know
+	 * that since PACAHARDIRQEN will have been cleared in that case.
+	 */
+	lbz	r3,PACAHARDIRQEN(r13)
+	cmpwi	cr0,r3,0
+	beqlr
+
+	/* Now we are going to mark ourselves as soft and hard enabled in
+	 * order to be able to take interrupts while asleep. We inform lockdep
+	 * of that. We don't actually turn interrupts on just yet tho.
+	 */
+#ifdef CONFIG_TRACE_IRQFLAGS
+	bl	.trace_hardirqs_on
+#endif
+	li	r0,1
+	stb	r0,PACASOFTIRQEN(r13)
+	stb	r0,PACAHARDIRQEN(r13)
+	
+	/* Interrupts will make us return to LR, so get something we want
+	 * in there
+	 */
+	bl	1f
+
+	/* We are back from the interrupt, the caller will local_irq_enable()
+	 * so to avoid stupid warning, let's turn them off here if irqtrace
+	 * is enabled.
+	 */
+#ifdef CONFIG_TRACE_IRQFLAGS
+	li	r0,0
+	stb	r0,PACASOFTIRQEN(r13)
+	bl	.trace_hardirqs_off
+#endif
+	ld	r0,16(r1)
+	mtlr	r0
+	blr
+
+1:	/* Let's set the _TLF_NAPPING flag so interrupts make us return
+	 * to the right spot
+	*/
+	clrrdi	r11,r1,THREAD_SHIFT
+	ld	r10,TI_LOCAL_FLAGS(r11)
+	ori	r10,r10,_TLF_NAPPING
+	std	r10,TI_LOCAL_FLAGS(r11)
+
+	/* We can now re-enable hard interrupts and go to sleep */
+	wrteei	1
+1:	PPC_WAIT(0)
+	b	1b
+
+#endif /* CONFIG_PPC64 */
\ No newline at end of file
-- 
1.6.3.3


* [PATCH 06/13] powerpc/book3e: Hookup doorbells exceptions on 64-bit Book3E
From: Benjamin Herrenschmidt @ 2010-07-09  6:16 UTC (permalink / raw)
  To: linuxppc-dev
In-Reply-To: <1278656215-24705-5-git-send-email-benh@kernel.crashing.org>

Note that critical doorbells are an unimplemented stub just like
other critical or machine check handlers, since we haven't implemented
support for "levelled" exceptions yet.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 arch/powerpc/kernel/Makefile         |    1 +
 arch/powerpc/kernel/exceptions-64e.S |   21 +++++++++++++++++----
 arch/powerpc/kernel/irq.c            |    7 +++++++
 3 files changed, 25 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index 0100604..8a33318 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -68,6 +68,7 @@ obj64-$(CONFIG_HIBERNATION)	+= swsusp_asm64.o
 obj-$(CONFIG_MODULES)		+= module.o module_$(CONFIG_WORD_SIZE).o
 obj-$(CONFIG_44x)		+= cpu_setup_44x.o
 obj-$(CONFIG_FSL_BOOKE)		+= cpu_setup_fsl_booke.o dbell.o
+obj-$(CONFIG_PPC_BOOK3E_64)	+= dbell.o
 
 extra-y				:= head_$(CONFIG_WORD_SIZE).o
 extra-$(CONFIG_PPC_BOOK3E_32)	:= head_new_booke.o
diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S
index 24dcc0e..a42637c 100644
--- a/arch/powerpc/kernel/exceptions-64e.S
+++ b/arch/powerpc/kernel/exceptions-64e.S
@@ -246,11 +246,9 @@ interrupt_base_book3e:					/* fake trap */
 	EXCEPTION_STUB(0x1a0, watchdog)			/* 0x09f0 */
 	EXCEPTION_STUB(0x1c0, data_tlb_miss)
 	EXCEPTION_STUB(0x1e0, instruction_tlb_miss)
+	EXCEPTION_STUB(0x280, doorbell)
+	EXCEPTION_STUB(0x2a0, doorbell_crit)
 
-#if 0
-	EXCEPTION_STUB(0x280, processor_doorbell)
-	EXCEPTION_STUB(0x220, processor_doorbell_crit)
-#endif
 	.globl interrupt_end_book3e
 interrupt_end_book3e:
 
@@ -428,6 +426,19 @@ interrupt_end_book3e:
 kernel_dbg_exc:
 	b	.	/* NYI */
 
+/* Doorbell interrupt */
+	MASKABLE_EXCEPTION(0x2070, doorbell, .doorbell_exception, ACK_NONE)
+
+/* Doorbell critical Interrupt */
+	START_EXCEPTION(doorbell_crit);
+	CRIT_EXCEPTION_PROLOG(0x2080, PROLOG_ADDITION_NONE)
+//	EXCEPTION_COMMON(0x2080, PACA_EXCRIT, INTS_DISABLE_ALL)
+//	bl	special_reg_save_crit
+//	addi	r3,r1,STACK_FRAME_OVERHEAD
+//	bl	.doorbell_critical_exception
+//	b	ret_from_crit_except
+	b	.
+
 
 /*
  * An interrupt came in while soft-disabled; clear EE in SRR1,
@@ -563,6 +574,8 @@ BAD_STACK_TRAMPOLINE(0xd00)
 BAD_STACK_TRAMPOLINE(0xe00)
 BAD_STACK_TRAMPOLINE(0xf00)
 BAD_STACK_TRAMPOLINE(0xf20)
+BAD_STACK_TRAMPOLINE(0x2070)
+BAD_STACK_TRAMPOLINE(0x2080)
 
 	.globl	bad_stack_book3e
 bad_stack_book3e:
diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index fa6f385..2f6dc7f 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -64,6 +64,8 @@
 #include <asm/ptrace.h>
 #include <asm/machdep.h>
 #include <asm/udbg.h>
+#include <asm/dbell.h>
+
 #ifdef CONFIG_PPC64
 #include <asm/paca.h>
 #include <asm/firmware.h>
@@ -153,6 +155,11 @@ notrace void raw_local_irq_restore(unsigned long en)
 	if (get_hard_enabled())
 		return;
 
+#if defined(CONFIG_BOOKE) && defined(CONFIG_SMP)
+	/* Check for pending doorbell interrupts on SMP */
+	doorbell_exception(NULL);
+#endif
+
 	/*
 	 * Need to hard-enable interrupts here.  Since currently disabled,
 	 * no need to take further asm precautions against preemption; but
-- 
1.6.3.3


* [PATCH 07/13] powerpc/book3e: Use set_irq_regs() in the msgsnd/msgrcv IPI path
From: Benjamin Herrenschmidt @ 2010-07-09  6:16 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: David Gibson
In-Reply-To: <1278656215-24705-6-git-send-email-benh@kernel.crashing.org>

From: David Gibson <david@gibson.dropbear.id.au>

include/asm-generic/irq_regs.h declares per-cpu irq_regs variables and
get_irq_regs() and set_irq_regs() helper functions to maintain them.
These can be used to access the proper pt_regs structure related to the
current interrupt entry (if any).

In the powerpc arch code, this is used to maintain irq regs on
decrementer and external interrupt exceptions.  However, for the
doorbell exceptions used by the msgsnd/msgrcv IPI mechanism of newer
BookE CPUs, the irq_regs are not kept up to date.

In particular this means that xmon will not work properly on SMP,
because the secondary xmon instances started by IPI will blow up when
they cannot retrieve the irq regs.

This patch fixes the problem by adding calls to maintain the irq regs
across doorbell exceptions.
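
The save/restore pattern this relies on can be sketched in stand-alone
C. The names below (regs_model, irq_regs_model and the *_model helpers)
are hypothetical stand-ins for struct pt_regs, the per-cpu irq_regs
pointer and the asm-generic helpers; a single global replaces the
per-cpu variable to keep the sketch self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for struct pt_regs. */
struct regs_model {
	unsigned long nip;
};

/* Stand-in for the per-cpu irq_regs pointer (one CPU modelled). */
static struct regs_model *irq_regs_model;

/* Install new_regs as the current irq regs and return the previous value. */
static struct regs_model *set_irq_regs_model(struct regs_model *new_regs)
{
	struct regs_model *old_regs = irq_regs_model;

	irq_regs_model = new_regs;
	return old_regs;
}

static struct regs_model *get_irq_regs_model(void)
{
	return irq_regs_model;
}

/* Records what code running inside the handler would observe. */
static struct regs_model *seen_by_handler;

/* Same shape as the patched handler: save, do the work, restore. */
static void doorbell_exception_model(struct regs_model *regs)
{
	struct regs_model *old_regs = set_irq_regs_model(regs);

	/* Anything called from here (e.g. a debugger) sees the new regs. */
	assert(get_irq_regs_model() == regs);
	seen_by_handler = get_irq_regs_model();

	set_irq_regs_model(old_regs);
}
```

Restoring the previous pointer on exit keeps nested interrupt entries
consistent, which is why the early return in the real handler becomes
a "goto out".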

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 arch/powerpc/kernel/dbell.c |    7 ++++++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kernel/dbell.c b/arch/powerpc/kernel/dbell.c
index 1c7a945..f7b5188 100644
--- a/arch/powerpc/kernel/dbell.c
+++ b/arch/powerpc/kernel/dbell.c
@@ -16,6 +16,7 @@
 #include <linux/percpu.h>
 
 #include <asm/dbell.h>
+#include <asm/irq_regs.h>
 
 #ifdef CONFIG_SMP
 struct doorbell_cpu_info {
@@ -63,17 +64,21 @@ void doorbell_message_pass(int target, int msg)
 
 void doorbell_exception(struct pt_regs *regs)
 {
+	struct pt_regs *old_regs = set_irq_regs(regs);
 	struct doorbell_cpu_info *info = &__get_cpu_var(doorbell_cpu_info);
 	int msg;
 
 	/* Warning: regs can be NULL when called from irq enable */
 
 	if (!info->messages || (num_online_cpus() < 2))
-		return;
+		goto out;
 
 	for (msg = 0; msg < 4; msg++)
 		if (test_and_clear_bit(msg, &info->messages))
 			smp_message_recv(msg);
+
+out:
+	set_irq_regs(old_regs);
 }
 
 #else /* CONFIG_SMP */
-- 
1.6.3.3


* [PATCH 05/13] powerpc/book3e: Don't re-trigger decrementer on lazy irq restore
From: Benjamin Herrenschmidt @ 2010-07-09  6:16 UTC (permalink / raw)
  To: linuxppc-dev
In-Reply-To: <1278656215-24705-4-git-send-email-benh@kernel.crashing.org>

The decrementer on BookE acts as a level interrupt and doesn't
need to be re-triggered when going negative. It doesn't go
negative anyway (unless programmed to auto-reload with a
negative value) as it stops when reaching 0.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 arch/powerpc/kernel/irq.c |    9 +++++++++
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index 77be3d0..fa6f385 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -159,8 +159,17 @@ notrace void raw_local_irq_restore(unsigned long en)
 	 * use local_paca instead of get_paca() to avoid preemption checking.
 	 */
 	local_paca->hard_enabled = en;
+
+#ifndef CONFIG_BOOKE
+	/* On server, re-trigger the decrementer if it went negative since
+	 * some processors only trigger on edge transitions of the sign bit.
+	 *
+	 * BookE has a level sensitive decrementer (latches in TSR) so we
+	 * don't need that
+	 */
 	if ((int)mfspr(SPRN_DEC) < 0)
 		mtspr(SPRN_DEC, 1);
+#endif /* CONFIG_BOOKE */
 
 	/*
 	 * Force the delivery of pending soft-disabled interrupts on PS3.
-- 
1.6.3.3


* [PATCH 03/13] powerpc/book3e: Move doorbell_exception from traps.c to dbell.c
From: Benjamin Herrenschmidt @ 2010-07-09  6:16 UTC (permalink / raw)
  To: linuxppc-dev
In-Reply-To: <1278656215-24705-2-git-send-email-benh@kernel.crashing.org>

... where it belongs

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 arch/powerpc/kernel/dbell.c |   22 +++++++++++++++++++++-
 arch/powerpc/kernel/traps.c |   21 ---------------------
 2 files changed, 21 insertions(+), 22 deletions(-)

diff --git a/arch/powerpc/kernel/dbell.c b/arch/powerpc/kernel/dbell.c
index 1493734..e3a7177 100644
--- a/arch/powerpc/kernel/dbell.c
+++ b/arch/powerpc/kernel/dbell.c
@@ -41,4 +41,24 @@ void smp_dbell_message_pass(int target, int msg)
 		ppc_msgsnd(PPC_DBELL, PPC_DBELL_MSG_BRDCAST, 0);
 	}
 }
-#endif
+
+void doorbell_exception(struct pt_regs *regs)
+{
+	int cpu = smp_processor_id();
+	int msg;
+
+	if (num_online_cpus() < 2)
+		return;
+
+	for (msg = 0; msg < 4; msg++)
+		if (test_and_clear_bit(msg, &dbell_smp_message[cpu]))
+			smp_message_recv(msg);
+}
+
+#else /* CONFIG_SMP */
+void doorbell_exception(struct pt_regs *regs)
+{
+	printk(KERN_WARNING "Received doorbell on non-smp system\n");
+}
+#endif /* CONFIG_SMP */
+
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index e5fe5a8..a45a63c 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -55,9 +55,6 @@
 #endif
 #include <asm/kexec.h>
 #include <asm/ppc-opcode.h>
-#ifdef CONFIG_FSL_BOOKE
-#include <asm/dbell.h>
-#endif
 
 #if defined(CONFIG_DEBUGGER) || defined(CONFIG_KEXEC)
 int (*__debugger)(struct pt_regs *regs) __read_mostly;
@@ -1342,24 +1339,6 @@ void vsx_assist_exception(struct pt_regs *regs)
 #endif /* CONFIG_VSX */
 
 #ifdef CONFIG_FSL_BOOKE
-
-void doorbell_exception(struct pt_regs *regs)
-{
-#ifdef CONFIG_SMP
-	int cpu = smp_processor_id();
-	int msg;
-
-	if (num_online_cpus() < 2)
-		return;
-
-	for (msg = 0; msg < 4; msg++)
-		if (test_and_clear_bit(msg, &dbell_smp_message[cpu]))
-			smp_message_recv(msg);
-#else
-	printk(KERN_WARNING "Received doorbell on non-smp system\n");
-#endif
-}
-
 void CacheLockingException(struct pt_regs *regs, unsigned long address,
 			   unsigned long error_code)
 {
-- 
1.6.3.3


* [PATCH 04/13] powerpc/book3e: More doorbell cleanups. Sample the PIR register
From: Benjamin Herrenschmidt @ 2010-07-09  6:16 UTC (permalink / raw)
  To: linuxppc-dev
In-Reply-To: <1278656215-24705-3-git-send-email-benh@kernel.crashing.org>

The doorbells use the content of the PIR register to match messages
from other CPUs. This may or may not be the same as our Linux CPU
number, so using that as the "target" is not right.

Instead, we sample the PIR register at boot on every processor
and use that value subsequently when sending IPIs.

We also use a per-cpu message mask rather than a global array which
should limit cache line contention.

Note: We could use the CPU number in the device-tree instead of
the PIR register, as they are supposed to be equivalent. This
might prove useful if doorbells are to be used to kick CPUs out
of FW at boot time, thus before we can sample the PIR. This is
however not the case now and using the PIR just works.
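
The difference between addressing by Linux CPU number and by sampled
PIR tag can be shown with a small stand-alone model. Everything here is
a hypothetical sketch: the *_model names, the four-CPU array and
last_tag_sent (which stands in for the tag passed to ppc_msgsnd()) are
assumptions made for illustration; only the 0x3fff mask and the
four-message limit come from the patch.

```c
#include <assert.h>

#define NR_CPUS_MODEL	4
#define NR_MSGS_MODEL	4

/* Mirrors the shape of struct doorbell_cpu_info in the patch. */
struct doorbell_cpu_info_model {
	unsigned long	messages;	/* pending message bits */
	unsigned int	tag;		/* PIR value sampled at boot */
};

/* A plain array stands in for the per-cpu variable. */
static struct doorbell_cpu_info_model cpu_info_model[NR_CPUS_MODEL];

/* Records the tag that would have been handed to ppc_msgsnd(). */
static unsigned int last_tag_sent;

/* Each CPU samples its own PIR once; pir is passed in for the model. */
static void doorbell_setup_cpu_model(int cpu, unsigned int pir)
{
	assert(cpu >= 0 && cpu < NR_CPUS_MODEL);

	cpu_info_model[cpu].messages = 0;
	cpu_info_model[cpu].tag = pir & 0x3fff;
}

/* Send to one CPU: index by Linux CPU number, address by sampled tag. */
static void doorbell_message_pass_model(int target, int msg)
{
	assert(target >= 0 && target < NR_CPUS_MODEL);
	assert(msg >= 0 && msg < NR_MSGS_MODEL);

	cpu_info_model[target].messages |= 1UL << msg;
	last_tag_sent = cpu_info_model[target].tag;
}
```

With Linux CPU 1 set up from a PIR of 0x21, a message for CPU 1 is
tagged 0x21 rather than 1, which is the mismatch the patch corrects.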

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 arch/powerpc/include/asm/dbell.h  |    7 ++---
 arch/powerpc/kernel/dbell.c       |   47 ++++++++++++++++++++++++++----------
 arch/powerpc/platforms/85xx/smp.c |    4 ++-
 3 files changed, 40 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/include/asm/dbell.h b/arch/powerpc/include/asm/dbell.h
index 501189a..ced7e48 100644
--- a/arch/powerpc/include/asm/dbell.h
+++ b/arch/powerpc/include/asm/dbell.h
@@ -27,10 +27,9 @@ enum ppc_dbell {
 	PPC_G_DBELL_MC = 4,	/* guest mcheck doorbell */
 };
 
-#ifdef CONFIG_SMP
-extern unsigned long dbell_smp_message[NR_CPUS];
-extern void smp_dbell_message_pass(int target, int msg);
-#endif
+extern void doorbell_message_pass(int target, int msg);
+extern void doorbell_exception(struct pt_regs *regs);
+extern void doorbell_setup_this_cpu(void);
 
 static inline void ppc_msgsnd(enum ppc_dbell type, u32 flags, u32 tag)
 {
diff --git a/arch/powerpc/kernel/dbell.c b/arch/powerpc/kernel/dbell.c
index e3a7177..1c7a945 100644
--- a/arch/powerpc/kernel/dbell.c
+++ b/arch/powerpc/kernel/dbell.c
@@ -13,45 +13,66 @@
 #include <linux/kernel.h>
 #include <linux/smp.h>
 #include <linux/threads.h>
+#include <linux/percpu.h>
 
 #include <asm/dbell.h>
 
 #ifdef CONFIG_SMP
-unsigned long dbell_smp_message[NR_CPUS];
+struct doorbell_cpu_info {
+	unsigned long	messages;	/* current messages bits */
+	unsigned int	tag;		/* tag value */
+};
 
-void smp_dbell_message_pass(int target, int msg)
+static DEFINE_PER_CPU(struct doorbell_cpu_info, doorbell_cpu_info);
+
+void doorbell_setup_this_cpu(void)
+{
+	struct doorbell_cpu_info *info = &__get_cpu_var(doorbell_cpu_info);
+
+	info->messages = 0;
+	info->tag = mfspr(SPRN_PIR) & 0x3fff;
+}
+
+void doorbell_message_pass(int target, int msg)
 {
+	struct doorbell_cpu_info *info;
 	int i;
 
-	if(target < NR_CPUS) {
-		set_bit(msg, &dbell_smp_message[target]);
-		ppc_msgsnd(PPC_DBELL, 0, target);
+	if (target < NR_CPUS) {
+		info = &per_cpu(doorbell_cpu_info, target);
+		set_bit(msg, &info->messages);
+		ppc_msgsnd(PPC_DBELL, 0, info->tag);
 	}
-	else if(target == MSG_ALL_BUT_SELF) {
+	else if (target == MSG_ALL_BUT_SELF) {
 		for_each_online_cpu(i) {
 			if (i == smp_processor_id())
 				continue;
-			set_bit(msg, &dbell_smp_message[i]);
-			ppc_msgsnd(PPC_DBELL, 0, i);
+			info = &per_cpu(doorbell_cpu_info, i);
+			set_bit(msg, &info->messages);
+			ppc_msgsnd(PPC_DBELL, 0, info->tag);
 		}
 	}
 	else { /* target == MSG_ALL */
-		for_each_online_cpu(i)
-			set_bit(msg, &dbell_smp_message[i]);
+		for_each_online_cpu(i) {
+			info = &per_cpu(doorbell_cpu_info, i);
+			set_bit(msg, &info->messages);
+		}
 		ppc_msgsnd(PPC_DBELL, PPC_DBELL_MSG_BRDCAST, 0);
 	}
 }
 
 void doorbell_exception(struct pt_regs *regs)
 {
-	int cpu = smp_processor_id();
+	struct doorbell_cpu_info *info = &__get_cpu_var(doorbell_cpu_info);
 	int msg;
 
-	if (num_online_cpus() < 2)
+	/* Warning: regs can be NULL when called from irq enable */
+
+	if (!info->messages || (num_online_cpus() < 2))
 		return;
 
 	for (msg = 0; msg < 4; msg++)
-		if (test_and_clear_bit(msg, &dbell_smp_message[cpu]))
+		if (test_and_clear_bit(msg, &info->messages))
 			smp_message_recv(msg);
 }
 
diff --git a/arch/powerpc/platforms/85xx/smp.c b/arch/powerpc/platforms/85xx/smp.c
index a15f582..4c3cde9 100644
--- a/arch/powerpc/platforms/85xx/smp.c
+++ b/arch/powerpc/platforms/85xx/smp.c
@@ -99,6 +99,8 @@ static void __init
 smp_85xx_setup_cpu(int cpu_nr)
 {
 	mpic_setup_this_cpu();
+	if (cpu_has_feature(CPU_FTR_DBELL))
+		doorbell_setup_this_cpu();
 }
 
 struct smp_ops_t smp_85xx_ops = {
@@ -117,7 +119,7 @@ void __init mpc85xx_smp_init(void)
 	}
 
 	if (cpu_has_feature(CPU_FTR_DBELL))
-		smp_85xx_ops.message_pass = smp_dbell_message_pass;
+		smp_85xx_ops.message_pass = doorbell_message_pass;
 
 	BUG_ON(!smp_85xx_ops.message_pass);
 
-- 
1.6.3.3


* [PATCH 01/13] powerpc/book3e: mtmsr should not be mtmsrd on book3e 64-bit
From: Benjamin Herrenschmidt @ 2010-07-09  6:16 UTC (permalink / raw)
  To: linuxppc-dev

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 arch/powerpc/include/asm/reg.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index d62fdf4..d8be016 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -890,7 +890,7 @@
 #ifndef __ASSEMBLY__
 #define mfmsr()		({unsigned long rval; \
 			asm volatile("mfmsr %0" : "=r" (rval)); rval;})
-#ifdef CONFIG_PPC64
+#ifdef CONFIG_PPC_BOOK3S_64
 #define __mtmsrd(v, l)	asm volatile("mtmsrd %0," __stringify(l) \
 				     : : "r" (v) : "memory")
 #define mtmsrd(v)	__mtmsrd((v), 0)
-- 
1.6.3.3
