LinuxPPC-Dev Archive on lore.kernel.org


* Re: [PATCH v2 3/3] KVM: PPC: epapr: install ev_idle hcall for e500 guest
From: Scott Wood @ 2012-01-09 21:15 UTC (permalink / raw)
  To: Liu Yu; +Cc: kvm, kvm-ppc, agraf, linuxppc-dev, timur
In-Reply-To: <1325754412-29963-2-git-send-email-yu.liu@freescale.com>

On 01/05/2012 03:06 AM, Liu Yu wrote:
> diff --git a/arch/powerpc/kernel/idle_e500.S b/arch/powerpc/kernel/idle_e500.S
> index 3e2b95c..6ea95f0 100644
> --- a/arch/powerpc/kernel/idle_e500.S
> +++ b/arch/powerpc/kernel/idle_e500.S
> @@ -85,6 +85,23 @@ END_FTR_SECTION_IFSET(CPU_FTR_L2CSR|CPU_FTR_CAN_NAP)
>  2:	b	2b
>  #endif /* !E500MC */
>  
> +#ifdef CONFIG_KVM_GUEST
> +/*
> + * r3 contains the pointer to in[8]
> + * r4 contains the pointer to out[8]
> + * r5 contains the hcall vendor and nr
> + * r6 contains the handler which send hcall
> + */
> +_GLOBAL(e500_ev_idle)
> +	rlwinm	r7,r1,0,0,31-THREAD_SHIFT	/* current thread_info */
> +	lwz	r8,TI_LOCAL_FLAGS(r7)	/* set napping bit */
> +	ori	r8,r8,_TLF_NAPPING	/* so when we take an exception */
> +	stw	r8,TI_LOCAL_FLAGS(r7)	/* it will return to our caller */
> +	wrteei	1
> +	mtctr	r6
> +	bctr
> +#endif /* KVM_GUEST */

You'll need to branch back to the hcall invocation in an infinite loop
-- the only way we should leave is via an interrupt.

> +static void kvm_hcall_idle(void)
> +{
> +#ifdef CONFIG_KVM_E500
> +	ulong in[8];
> +	ulong out[8];
> +
> +	e500_ev_idle(in, out, HC_VENDOR_EPAPR | HC_EV_IDLE, kvm_hypercall);
> +#endif
> +}

kvm_hypercall is C code.  As stated before, you cannot use C code while
_TLF_NAPPING is set.

> +static bool kvm_para_has_idle(void)
> +{
> +#ifdef CONFIG_BOOKE
> +	return epapr_hcall_has_idle;
> +#else
> +	return false;
> +#endif
> +}
> +
>  static int __init kvm_guest_init(void)
>  {
>  	if (!kvm_para_available())
> @@ -594,6 +614,10 @@ static int __init kvm_guest_init(void)
>  	powersave_nap = 1;
>  #endif
>  
> +	/* Install hcall based power_save for guest kernel */
> +	if (kvm_para_has_idle())
> +		ppc_md.power_save = kvm_hcall_idle;

Why did you only move it halfway out of KVM code?  ePAPR features such
as idle hcall should work on any ePAPR hypervisor, even with all KVM
code disabled.

Plus everything Alex said. :-)

-Scott

* Re: [RFC PATCH 15/16] KVM: PPC: booke: standard PPC floating point support
From: Scott Wood @ 2012-01-09 21:48 UTC (permalink / raw)
  To: Alexander Graf; +Cc: linuxppc-dev, kvm, kvm-ppc
In-Reply-To: <F0432B7A-E742-48B0-8016-3F2D0A800553@suse.de>

On 01/09/2012 11:48 AM, Alexander Graf wrote:
> 
> On 21.12.2011, at 02:34, Scott Wood wrote:
>> +#ifdef CONFIG_PPC_FPU
>> +	/* Save userspace FPU state in stack */
>> +	enable_kernel_fp();
>> +	memcpy(fpr, current->thread.fpr, sizeof(current->thread.fpr));
>> +	fpscr = current->thread.fpscr.val;
>> +	fpexc_mode = current->thread.fpexc_mode;
>> +
>> +	/* Restore guest FPU state to thread */
>> +	memcpy(current->thread.fpr, vcpu->arch.fpr, sizeof(vcpu->arch.fpr));
>> +	current->thread.fpscr.val = vcpu->arch.fpscr;
>> +
>> +	/*
>> +	 * Since we can't trap on MSR_FP in GS-mode, we consider the guest
>> +	 * as always using the FPU.  Kernel usage of FP (via
>> +	 * enable_kernel_fp()) in this thread must not occur while
>> +	 * vcpu->fpu_active is set.
>> +	 */
>> +	vcpu->fpu_active = 1;
>> +
>> +	kvmppc_load_guest_fp(vcpu);
>> +#endif
> 
> Do you think it's possible to combine this with the book3s_pr code, so we don't duplicate too much here?

book3s_pr is a bit different in that it can trap when the guest sets
MSR[FP].

Maybe a few lines could be factored out (the first memcpy, fpscr,
fpexc_mode).  I'm not sure that it makes sense given the lack of
isolation between what it's doing and what the rest of the code is doing.

>> +/*
>> + * Load up guest vcpu FP state if it's needed.
>> + * It also set the MSR_FP in thread so that host know
>> + * we're holding FPU, and then host can help to save
>> + * guest vcpu FP state if other threads require to use FPU.
>> + * This simulates an FP unavailable fault.
>> + *
>> + * It requires to be called with preemption disabled.
>> + */
>> +static inline void kvmppc_load_guest_fp(struct kvm_vcpu *vcpu)
>> +{
>> +#ifdef CONFIG_PPC_FPU
>> +	if (vcpu->fpu_active && !(current->thread.regs->msr & MSR_FP)) {
>> +		load_up_fpu();
>> +		current->thread.regs->msr |= MSR_FP;
> 
> I'm having a hard time to grasp when shared->msr, shadow_msr and regs->msr is used in your code :).

shadow_msr is the real MSR.

shared->msr is the guest's view of MSR.

current->thread.regs->msr is nominally userspace's MSR.  In this case we
use it to tell host Linux that FP is in use and must be saved on context
switch.  The actual userspace MSR_FP is known to be clear at this point
because we called enable_kernel_fp().  It will be clear again when we
return to userspace because we'll call giveup_fpu().
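
The flag handling described above can be modelled in user space.  This is
only a sketch, not the kernel code: the *_model structs and functions are
hypothetical stand-ins, and load_up_fpu_model() merely counts how often FP
state would be loaded.  Only the MSR_FP test-and-set mirrors the quoted
patch.

```c
#include <assert.h>
#include <stdbool.h>

#define MSR_FP 0x2000UL	/* FP-available bit in the PowerPC MSR */

/* Hypothetical stand-ins for the kernel structures; only the used fields. */
struct regs_model   { unsigned long msr; };
struct thread_model { struct regs_model regs; int fpu_loads; };
struct vcpu_model   { bool fpu_active; };

/* Models load_up_fpu(): here it only counts how often state is loaded. */
static void load_up_fpu_model(struct thread_model *t)
{
	t->fpu_loads++;
}

/* Same shape as kvmppc_load_guest_fp() in the quoted patch. */
static void load_guest_fp_model(struct vcpu_model *vcpu, struct thread_model *t)
{
	assert(vcpu && t);
	if (vcpu->fpu_active && !(t->regs.msr & MSR_FP)) {
		load_up_fpu_model(t);
		t->regs.msr |= MSR_FP;	/* tell the host that FP state is live */
	}
}

/* Models the host taking the FPU away, e.g. on a context switch. */
static void giveup_fpu_model(struct thread_model *t)
{
	assert(t);
	t->regs.msr &= ~MSR_FP;
}
```

In this model a second call while MSR_FP is still set does nothing, and a
call after the host has given up the FPU reloads the state once.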

-Scott

* [git pull] Please pull powerpc.git merge branch
From: Kumar Gala @ 2012-01-09 21:47 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: linuxppc-dev

The following changes since commit a0e86bd4252519321b0d102dc4ed90557aa7bee9:

  audit: always follow va_copy() with va_end() (2012-01-08 14:15:21 -0800)

are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/galak/powerpc.git merge

Michael Neuling (1):
      powerpc: fix compile error with 85xx/p1022_ds.c

 arch/powerpc/platforms/85xx/p1022_ds.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

* Re: [RFC PATCH 15/16] KVM: PPC: booke: standard PPC floating point support
From: Alexander Graf @ 2012-01-09 22:17 UTC (permalink / raw)
  To: Scott Wood; +Cc: linuxppc-dev, kvm, kvm-ppc
In-Reply-To: <4F0B60A5.1040208@freescale.com>


On 09.01.2012, at 22:48, Scott Wood wrote:

> On 01/09/2012 11:48 AM, Alexander Graf wrote:
>>
>> On 21.12.2011, at 02:34, Scott Wood wrote:
>>> +#ifdef CONFIG_PPC_FPU
>>> +	/* Save userspace FPU state in stack */
>>> +	enable_kernel_fp();
>>> +	memcpy(fpr, current->thread.fpr, sizeof(current->thread.fpr));
>>> +	fpscr = current->thread.fpscr.val;
>>> +	fpexc_mode = current->thread.fpexc_mode;
>>> +
>>> +	/* Restore guest FPU state to thread */
>>> +	memcpy(current->thread.fpr, vcpu->arch.fpr, sizeof(vcpu->arch.fpr));
>>> +	current->thread.fpscr.val = vcpu->arch.fpscr;
>>> +
>>> +	/*
>>> +	 * Since we can't trap on MSR_FP in GS-mode, we consider the guest
>>> +	 * as always using the FPU.  Kernel usage of FP (via
>>> +	 * enable_kernel_fp()) in this thread must not occur while
>>> +	 * vcpu->fpu_active is set.
>>> +	 */
>>> +	vcpu->fpu_active = 1;
>>> +
>>> +	kvmppc_load_guest_fp(vcpu);
>>> +#endif
>>
>> Do you think it's possible to combine this with the book3s_pr code, so we don't duplicate too much here?
>
> book3s_pr is a bit different in that it can trap when the guest sets
> MSR[FP].

Ah, there's no doorbell? So you always have to swap fpu registers? You still have to enable it manually when preempting in, right? IIRC ppc32 does lazy fpu activation.

> Maybe a few lines could be factored out (the first memcpy, fpscr,
> fpexc_mode).  I'm not sure that it makes sense given the lack of
> isolation between what it's doing and what the rest of the code is doing.

Yeah, looking at the code it does look pretty different. Too bad - I would've hoped to throw the vmx code in as well so we could get vmx/vsx/whatever for free later.

>
>>> +/*
>>> + * Load up guest vcpu FP state if it's needed.
>>> + * It also set the MSR_FP in thread so that host know
>>> + * we're holding FPU, and then host can help to save
>>> + * guest vcpu FP state if other threads require to use FPU.
>>> + * This simulates an FP unavailable fault.
>>> + *
>>> + * It requires to be called with preemption disabled.
>>> + */
>>> +static inline void kvmppc_load_guest_fp(struct kvm_vcpu *vcpu)
>>> +{
>>> +#ifdef CONFIG_PPC_FPU
>>> +	if (vcpu->fpu_active && !(current->thread.regs->msr & MSR_FP)) {
>>> +		load_up_fpu();
>>> +		current->thread.regs->msr |= MSR_FP;
>>
>> I'm having a hard time to grasp when shared->msr, shadow_msr and regs->msr is used in your code :).
>
> shadow_msr is the real MSR.
>
> shared->msr is the guest's view of MSR.
>
> current->thread.regs->msr is nominally userspace's MSR.  In this case we
> use it to tell host Linux that FP is in use and must be saved on context
> switch.  The actual userspace MSR_FP is known to be clear at this point
> because we called enable_kernel_fp().  It will be clear again when we
> return to userspace because we'll call giveup_fpu().

Ah, this is thread.regs, not vcpu.regs. Sorry, I misread that part. This way it obviously makes a lot more sense.


Alex

* Re: [RFC PATCH 15/16] KVM: PPC: booke: standard PPC floating point support
From: Scott Wood @ 2012-01-09 22:39 UTC (permalink / raw)
  To: Alexander Graf; +Cc: linuxppc-dev, kvm, kvm-ppc
In-Reply-To: <08DCCAC0-82FA-46FC-974C-E7CCA53474A3@suse.de>

On 01/09/2012 04:17 PM, Alexander Graf wrote:
> 
> On 09.01.2012, at 22:48, Scott Wood wrote:
> 
>> On 01/09/2012 11:48 AM, Alexander Graf wrote:
>>>
>>> Do you think it's possible to combine this with the book3s_pr code, so we don't duplicate too much here?
>>
>> book3s_pr is a bit different in that it can trap when the guest sets
>> MSR[FP].
> 
> Ah, there's no doorbell? So you always have to swap fpu registers? You still have to enable it manually when preempting in, right? IIRC ppc32 does lazy fpu activation.

Right.

Preempting in is handled by calling kvmppc_load_guest_fp() (which should
be renamed to be booke-specific, since the semantics are tied to
booke.c) from kvmppc_core_vcpu_load() in e500mc.c.

>>> I'm having a hard time to grasp when shared->msr, shadow_msr and regs->msr is used in your code :).
>>
>> shadow_msr is the real MSR.
>>
>> shared->msr is the guest's view of MSR.

Correction -- this applies to PR-mode (e500v2).

In GS-mode, shadow_msr is not used.  The guest sees the real MSR (hw
silently prevents it from modifying certain bits), which gets saved on
exit into shared->msr.

-Scott

* Re: Mac address in the DT
From: Wolfgang Denk @ 2012-01-09 22:40 UTC (permalink / raw)
  To: smitha.vanga; +Cc: scottwood, linuxppc-dev
In-Reply-To: <40631E9A2581F14BA60888C87A76A1FE01D360@HYD-MKD-MBX4.wipro.com>

Dear smitha.vanga@wipro.com,

In message <40631E9A2581F14BA60888C87A76A1FE01D360@HYD-MKD-MBX4.wipro.com> you wrote:
>  
> >Setenv set_mac 'cp 0xffec0000 0x100000 1024;fdt addr 0xc00000 8192;fdt set /soc8272@f0000000/ethernet@24000 mac-address "[00 44 00 55 00 66]";erase 0xffec0000 0xffec4000;cp 0xc00000 0xffec0000 1024;bootm 0xfe060000 - 0xffec0000'
> 
> > run set_mac
> 
> For the above command I want to replace the mac address with the ethaddr. How
> do I do that? I tried $ethaddr but I get extra : characters.

Why would you do that at all?  U-Boot will run fdt_fixup_ethernet()
for all (at least AFAICT) supported architectures, which will replace 
"mac-address" and "local-mac-address" for all "ethernet?" interfaces
in the device tree for which a corresponding "ethaddr" / "eth?addr"
environment variable is set.

No additional actions are needed.

Best regards,

Wolfgang Denk

-- 
DENX Software Engineering GmbH,     MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@denx.de
"One day," said a dull voice from down below, "I'm going to  be  back
in  form again and you're going to be very sorry you said that. For a
very long time. I might even go so far as to make even more Time just
for you to be sorry in."              - Terry Pratchett, _Small Gods_

* Re: [RFC PATCH 15/16] KVM: PPC: booke: standard PPC floating point support
From: Alexander Graf @ 2012-01-09 22:47 UTC (permalink / raw)
  To: Scott Wood; +Cc: linuxppc-dev, kvm, kvm-ppc
In-Reply-To: <4F0B6C91.1090102@freescale.com>


On 09.01.2012, at 23:39, Scott Wood wrote:

> On 01/09/2012 04:17 PM, Alexander Graf wrote:
>>
>> On 09.01.2012, at 22:48, Scott Wood wrote:
>>
>>> On 01/09/2012 11:48 AM, Alexander Graf wrote:
>>>>
>>>> Do you think it's possible to combine this with the book3s_pr code, so we don't duplicate too much here?
>>>
>>> book3s_pr is a bit different in that it can trap when the guest sets
>>> MSR[FP].
>>
>> Ah, there's no doorbell? So you always have to swap fpu registers? You still have to enable it manually when preempting in, right? IIRC ppc32 does lazy fpu activation.
>
> Right.
>
> Preempting in is handled by calling kvmppc_load_guest_fp() (which should
> be renamed to be booke-specific, since the semantics are tied to
> booke.c) from kvmppc_core_vcpu_load() in e500mc.c.

Ah, and that one's called on sched_in. All is well then :).

>
>>>> I'm having a hard time to grasp when shared->msr, shadow_msr and regs->msr is used in your code :).
>>>
>>> shadow_msr is the real MSR.
>>>
>>> shared->msr is the guest's view of MSR.
>
> Correction -- this applies to PR-mode (e500v2).
>
> In GS-mode, shadow_msr is not used.  The guest sees the real MSR (hw
> silently prevents it from modifying certain bits), which gets saved on
> exit into shared->msr.

Hrm. Can we maybe #ifdef out shadow_msr on HV then? I'm really getting confused with having 3 potential msr variables in the vcpu struct.


Alex

* Re: [RFC PATCH 15/16] KVM: PPC: booke: standard PPC floating point support
From: Scott Wood @ 2012-01-09 22:54 UTC (permalink / raw)
  To: Alexander Graf; +Cc: linuxppc-dev, kvm, kvm-ppc
In-Reply-To: <96605829-1BD3-4668-AEB8-7D27775C2868@suse.de>

On 01/09/2012 04:47 PM, Alexander Graf wrote:
> 
> On 09.01.2012, at 23:39, Scott Wood wrote:
> 
>> On 01/09/2012 04:17 PM, Alexander Graf wrote:
>>>
>>> On 09.01.2012, at 22:48, Scott Wood wrote:
>>>
>>>> On 01/09/2012 11:48 AM, Alexander Graf wrote:
>>>>> I'm having a hard time to grasp when shared->msr, shadow_msr and regs->msr is used in your code :).
>>>>
>>>> shadow_msr is the real MSR.
>>>>
>>>> shared->msr is the guest's view of MSR.
>>
>> Correction -- this applies to PR-mode (e500v2).
>>
>> In GS-mode, shadow_msr is not used.  The guest sees the real MSR (hw
>> silently prevents it from modifying certain bits), which gets saved on
>> exit into shared->msr.
> 
> Hrm. Can we maybe #ifdef out shadow_msr on HV then? I'm really getting confused with having 3 potential msr variables in the vcpu struct.

An ifdef would take us further down the road of not being able to
support both in the same kernel image (not sure whether that's a
long-term goal -- probably won't happen any time soon with e500v2+e500mc
even disregarding KVM, but maybe it'll be relevant on some other chips),
and in general increase the mess in the struct definition.  How about a
comment?

-Scott

* Re: [RFC PATCH 15/16] KVM: PPC: booke: standard PPC floating point support
From: Alexander Graf @ 2012-01-09 22:56 UTC (permalink / raw)
  To: Scott Wood; +Cc: linuxppc-dev, kvm, kvm-ppc
In-Reply-To: <4F0B700B.7020006@freescale.com>


On 09.01.2012, at 23:54, Scott Wood wrote:

> On 01/09/2012 04:47 PM, Alexander Graf wrote:
>>
>> On 09.01.2012, at 23:39, Scott Wood wrote:
>>
>>> On 01/09/2012 04:17 PM, Alexander Graf wrote:
>>>>
>>>> On 09.01.2012, at 22:48, Scott Wood wrote:
>>>>
>>>>> On 01/09/2012 11:48 AM, Alexander Graf wrote:
>>>>>> I'm having a hard time to grasp when shared->msr, shadow_msr and regs->msr is used in your code :).
>>>>>
>>>>> shadow_msr is the real MSR.
>>>>>
>>>>> shared->msr is the guest's view of MSR.
>>>
>>> Correction -- this applies to PR-mode (e500v2).
>>>
>>> In GS-mode, shadow_msr is not used.  The guest sees the real MSR (hw
>>> silently prevents it from modifying certain bits), which gets saved on
>>> exit into shared->msr.
>>
>> Hrm. Can we maybe #ifdef out shadow_msr on HV then? I'm really getting confused with having 3 potential msr variables in the vcpu struct.
>
> An ifdef would take us further down the road of not being able to
> support both in the same kernel image (not sure whether that's a
> long-term goal -- probably won't happen any time soon with e500v2+e500mc
> even disregarding KVM, but maybe it'll be relevant on some other chips),
> and in general increase the mess in the struct definition.  How about a
> comment?

Well, I'd like to make sure we don't accidentally access the wrong field. But yes, a comment should be ok.
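
One possible shape for such a comment, as a user-space sketch.  The field
names shadow_msr and shared->msr come from the thread; the struct layout,
the mode enum and both accessors are hypothetical, and the checked accessor
is just one way to catch a wrong-field access without an #ifdef.

```c
#include <assert.h>

enum guest_mode { PR_MODE, GS_MODE };	/* e500v2 vs. e500mc, per the thread */

struct shared_model {
	unsigned long msr;
};

struct vcpu_model {
	enum guest_mode mode;

	/*
	 * PR-mode only: the MSR the hardware really runs the guest with.
	 * Not used in GS-mode; do not read it there.
	 */
	unsigned long shadow_msr;

	/*
	 * shared->msr:
	 *   PR-mode: the guest's view of its MSR.
	 *   GS-mode: the real MSR, saved here on guest exit.
	 */
	struct shared_model *shared;
};

/* Checked accessor: a GS-mode read of shadow_msr trips the assertion. */
static unsigned long vcpu_shadow_msr(const struct vcpu_model *vcpu)
{
	assert(vcpu->mode == PR_MODE);
	return vcpu->shadow_msr;
}

/* The MSR the hardware runs (or last ran) the guest with, in either mode. */
static unsigned long vcpu_real_msr(const struct vcpu_model *vcpu)
{
	return vcpu->mode == PR_MODE ? vcpu->shadow_msr : vcpu->shared->msr;
}
```

The check costs nothing when assertions are compiled out, and both modes
can still live in the same image.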

Alex

* [PATCH] powerpc: Fix RCU idle and hcall tracing
From: Anton Blanchard @ 2012-01-10  0:29 UTC (permalink / raw)
  To: benh, paulus, paulmck; +Cc: linuxppc-dev


Tracepoints should not be called inside an rcu_idle_enter/rcu_idle_exit
region. Since pSeries calls H_CEDE in the idle loop, we were violating
this rule.

commit a7b152d5342c (powerpc: Tell RCU about idle after hcall tracing)
tried to work around it by delaying the rcu_idle_enter until after we
called the hcall tracepoint, but there are a number of issues with it.

The hcall tracepoint trampoline code is called conditionally when the
tracepoint is enabled. If the tracepoint is not enabled we never call
rcu_idle_enter. The idle_uses_rcu check was also done at compile time
which breaks multiplatform builds.

The simple fix is to avoid tracing H_CEDE and rely on other tracepoints
and the hypervisor dispatch trace log to work out if we called H_CEDE.

This fixes a hang during boot on pSeries.

Signed-off-by: Anton Blanchard <anton@samba.org>
---

Index: linux-build/arch/powerpc/kernel/idle.c
===================================================================
--- linux-build.orig/arch/powerpc/kernel/idle.c	2012-01-10 11:07:22.091615183 +1100
+++ linux-build/arch/powerpc/kernel/idle.c	2012-01-10 11:07:57.172264229 +1100
@@ -50,12 +50,6 @@ static int __init powersave_off(char *ar
 }
 __setup("powersave=off", powersave_off);
 
-#if defined(CONFIG_PPC_PSERIES) && defined(CONFIG_TRACEPOINTS)
-static const bool idle_uses_rcu = 1;
-#else
-static const bool idle_uses_rcu;
-#endif
-
 /*
  * The body of the idle task.
  */
@@ -67,8 +61,7 @@ void cpu_idle(void)
 	set_thread_flag(TIF_POLLING_NRFLAG);
 	while (1) {
 		tick_nohz_idle_enter();
-		if (!idle_uses_rcu)
-			rcu_idle_enter();
+		rcu_idle_enter();
 
 		while (!need_resched() && !cpu_should_die()) {
 			ppc64_runlatch_off();
@@ -106,8 +99,7 @@ void cpu_idle(void)
 
 		HMT_medium();
 		ppc64_runlatch_on();
-		if (!idle_uses_rcu)
-			rcu_idle_exit();
+		rcu_idle_exit();
 		tick_nohz_idle_exit();
 		preempt_enable_no_resched();
 		if (cpu_should_die())
Index: linux-build/arch/powerpc/platforms/pseries/lpar.c
===================================================================
--- linux-build.orig/arch/powerpc/platforms/pseries/lpar.c	2012-01-10 11:07:22.079614961 +1100
+++ linux-build/arch/powerpc/platforms/pseries/lpar.c	2012-01-10 11:16:55.710226236 +1100
@@ -546,6 +546,13 @@ void __trace_hcall_entry(unsigned long o
 	unsigned long flags;
 	unsigned int *depth;
 
+	/*
+	 * We cannot call tracepoints inside RCU idle regions which
+	 * means we must not trace H_CEDE.
+	 */
+	if (opcode == H_CEDE)
+		return;
+
 	local_irq_save(flags);
 
 	depth = &__get_cpu_var(hcall_trace_depth);
@@ -556,8 +563,6 @@ void __trace_hcall_entry(unsigned long o
 	(*depth)++;
 	preempt_disable();
 	trace_hcall_entry(opcode, args);
-	if (opcode == H_CEDE)
-		rcu_idle_enter();
 	(*depth)--;
 
 out:
@@ -570,6 +575,9 @@ void __trace_hcall_exit(long opcode, uns
 	unsigned long flags;
 	unsigned int *depth;
 
+	if (opcode == H_CEDE)
+		return;
+
 	local_irq_save(flags);
 
 	depth = &__get_cpu_var(hcall_trace_depth);
@@ -578,8 +586,6 @@ void __trace_hcall_exit(long opcode, uns
 		goto out;
 
 	(*depth)++;
-	if (opcode == H_CEDE)
-		rcu_idle_exit();
 	trace_hcall_exit(opcode, retval, retbuf);
 	preempt_enable();
 	(*depth)--;
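
The imbalance described in the changelog can be reproduced with a small
user-space model.  This is a sketch only: idle_model and both idle_pass_*()
functions are hypothetical, and each function stands for one pass through
the idle loop on a pSeries kernel built with tracepoints.

```c
#include <assert.h>
#include <stdbool.h>

struct idle_model {
	int rcu_idle_enters;	/* calls to rcu_idle_enter() */
	int cede_traces;	/* H_CEDE tracepoint hits */
};

/*
 * Old scheme (idle_uses_rcu == 1): cpu_idle() skips rcu_idle_enter()
 * and relies on the hcall trace wrapper, which only runs while the
 * tracepoint is enabled.
 */
static void idle_pass_old(struct idle_model *m, bool tracepoint_enabled)
{
	assert(m);
	if (tracepoint_enabled) {
		m->cede_traces++;
		m->rcu_idle_enters++;
	}
}

/* New scheme: cpu_idle() always enters RCU idle; H_CEDE is never traced. */
static void idle_pass_new(struct idle_model *m, bool tracepoint_enabled)
{
	assert(m);
	(void)tracepoint_enabled;
	m->rcu_idle_enters++;
}
```

With the tracepoint disabled, the old pass never counts an
rcu_idle_enter(), while the new pass counts exactly one regardless of the
tracepoint state.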

* Re: [PATCH] powerpc: Fix RCU idle and hcall tracing
From: Paul E. McKenney @ 2012-01-10  0:43 UTC (permalink / raw)
  To: Anton Blanchard; +Cc: paulus, linuxppc-dev
In-Reply-To: <20120110112915.387e8b1c@kryten>

On Tue, Jan 10, 2012 at 11:29:15AM +1100, Anton Blanchard wrote:
> 
> Tracepoints should not be called inside an rcu_idle_enter/rcu_idle_exit
> region. Since pSeries calls H_CEDE in the idle loop, we were violating
> this rule.
> 
> commit a7b152d5342c (powerpc: Tell RCU about idle after hcall tracing)
> tried to work around it by delaying the rcu_idle_enter until after we
> called the hcall tracepoint, but there are a number of issues with it.
> 
> The hcall tracepoint trampoline code is called conditionally when the
> tracepoint is enabled. If the tracepoint is not enabled we never call
> rcu_idle_enter. The idle_uses_rcu check was also done at compile time
> which breaks multiplatform builds.
> 
> The simple fix is to avoid tracing H_CEDE and rely on other tracepoints
> and the hypervisor dispatch trace log to work out if we called H_CEDE.
> 
> This fixes a hang during boot on pSeries.
> 
> Signed-off-by: Anton Blanchard <anton@samba.org>

Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

> ---
> 
> Index: linux-build/arch/powerpc/kernel/idle.c
> ===================================================================
> --- linux-build.orig/arch/powerpc/kernel/idle.c	2012-01-10 11:07:22.091615183 +1100
> +++ linux-build/arch/powerpc/kernel/idle.c	2012-01-10 11:07:57.172264229 +1100
> @@ -50,12 +50,6 @@ static int __init powersave_off(char *ar
>  }
>  __setup("powersave=off", powersave_off);
> 
> -#if defined(CONFIG_PPC_PSERIES) && defined(CONFIG_TRACEPOINTS)
> -static const bool idle_uses_rcu = 1;
> -#else
> -static const bool idle_uses_rcu;
> -#endif
> -
>  /*
>   * The body of the idle task.
>   */
> @@ -67,8 +61,7 @@ void cpu_idle(void)
>  	set_thread_flag(TIF_POLLING_NRFLAG);
>  	while (1) {
>  		tick_nohz_idle_enter();
> -		if (!idle_uses_rcu)
> -			rcu_idle_enter();
> +		rcu_idle_enter();
> 
>  		while (!need_resched() && !cpu_should_die()) {
>  			ppc64_runlatch_off();
> @@ -106,8 +99,7 @@ void cpu_idle(void)
> 
>  		HMT_medium();
>  		ppc64_runlatch_on();
> -		if (!idle_uses_rcu)
> -			rcu_idle_exit();
> +		rcu_idle_exit();
>  		tick_nohz_idle_exit();
>  		preempt_enable_no_resched();
>  		if (cpu_should_die())
> Index: linux-build/arch/powerpc/platforms/pseries/lpar.c
> ===================================================================
> --- linux-build.orig/arch/powerpc/platforms/pseries/lpar.c	2012-01-10 11:07:22.079614961 +1100
> +++ linux-build/arch/powerpc/platforms/pseries/lpar.c	2012-01-10 11:16:55.710226236 +1100
> @@ -546,6 +546,13 @@ void __trace_hcall_entry(unsigned long o
>  	unsigned long flags;
>  	unsigned int *depth;
> 
> +	/*
> +	 * We cannot call tracepoints inside RCU idle regions which
> +	 * means we must not trace H_CEDE.
> +	 */
> +	if (opcode == H_CEDE)
> +		return;
> +
>  	local_irq_save(flags);
> 
>  	depth = &__get_cpu_var(hcall_trace_depth);
> @@ -556,8 +563,6 @@ void __trace_hcall_entry(unsigned long o
>  	(*depth)++;
>  	preempt_disable();
>  	trace_hcall_entry(opcode, args);
> -	if (opcode == H_CEDE)
> -		rcu_idle_enter();
>  	(*depth)--;
> 
>  out:
> @@ -570,6 +575,9 @@ void __trace_hcall_exit(long opcode, uns
>  	unsigned long flags;
>  	unsigned int *depth;
> 
> +	if (opcode == H_CEDE)
> +		return;
> +
>  	local_irq_save(flags);
> 
>  	depth = &__get_cpu_var(hcall_trace_depth);
> @@ -578,8 +586,6 @@ void __trace_hcall_exit(long opcode, uns
>  		goto out;
> 
>  	(*depth)++;
> -	if (opcode == H_CEDE)
> -		rcu_idle_exit();
>  	trace_hcall_exit(opcode, retval, retbuf);
>  	preempt_enable();
>  	(*depth)--;
> 

* Re: [RFC PATCH 14/16] KVM: PPC: booke: category E.HV (GS-mode) support
From: Scott Wood @ 2012-01-10  0:51 UTC (permalink / raw)
  To: Alexander Graf; +Cc: linuxppc-dev, kvm, kvm-ppc
In-Reply-To: <1A656B06-E290-4B75-8F98-F8FAB8B817F1@suse.de>

On 01/09/2012 11:46 AM, Alexander Graf wrote:
> 
> On 21.12.2011, at 02:34, Scott Wood wrote:
> 
>> Chips such as e500mc that implement category E.HV in Power ISA 2.06
>> provide hardware virtualization features, including a new MSR mode for
>> guest state.  The guest OS can perform many operations without trapping
>> into the hypervisor, including transitions to and from guest userspace.
>>
>> Since we can use SRR1[GS] to reliably tell whether an exception came from
>> guest state, instead of messing around with IVPR, we use DO_KVM similarly
>> to book3s.
> 
> Is there any benefit of using DO_KVM? I would assume that messing with IVPR is faster.

Using the GS bit to decide which handler to run means we won't get
confused if a machine check or critical interrupt happens between
entering/exiting the guest and updating IVPR (we could use the IS bit
similarly in PR-mode).

This could be supplemented with IVPR (though that will add a few cycles
to guest entry/exit) or some sort of runtime patching (would be more
coarse-grained, active when any KVM guest exists) to avoid adding
overhead to traps when KVM is not used, but I'd like to quantify that
overhead first.  It should be much lower than what happens on book3s.
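
In C terms, the dispatch decision looks roughly like the sketch below.  The
real DO_KVM is an assembly macro; saved_regs_model and pick_handler() are
hypothetical, and the MSR_GS bit position is an assumption made for
illustration.

```c
#include <assert.h>

#define MSR_GS (1UL << 28)	/* guest-state bit; position assumed here */

enum handler { HOST_HANDLER, KVM_HANDLER };

/* Hypothetical view of what the exception entry code has saved. */
struct saved_regs_model {
	unsigned long srr1;	/* MSR at the time of the exception */
};

/*
 * Choose the handler from the saved MSR rather than from whichever
 * vector base (IVPR) happened to be active when the exception hit.
 */
static enum handler pick_handler(const struct saved_regs_model *regs)
{
	assert(regs);
	return (regs->srr1 & MSR_GS) ? KVM_HANDLER : HOST_HANDLER;
}
```

Because the decision depends only on the saved state, an exception taken
halfway through a guest entry or exit still reaches the right handler.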

>> Current issues include:
>> - Machine checks from guest state are not routed to the host handler.
>> - The guest can cause a host oops by executing an emulated instruction
>>   in a page that lacks read permission.  Existing e500/4xx support has
>>   the same problem.
> 
> We solve that in book3s pr by doing
> 
>   LAST_INST = <known bad value>;
>   PACA->kvm_mode = <recover at next inst>;
>   lwz(guest pc);
>   do_more_stuff();
> 
> That way when an exception occurs at lwz() the DO_KVM handler checks that we're in kvm mode "recover" which does basically srr0+=4; rfi;.

I was thinking we'd check ESR[EPID] or SRR1[IS] as appropriate, and
treat it as a kernel fault (search exception table) -- but this works
too and is a bit cleaner (could be other uses of external pid), at the
expense of a couple extra instructions in the emulation path (but
probably a slightly faster host TLB handler).

The check wouldn't go in DO_KVM, though, since on bookehv that only
deals with diverting flow when xSRR1[GS] is set, which wouldn't be the
case here.
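
The recover-at-next-instruction scheme quoted above can be modelled as
follows.  This is a sketch under stated assumptions: cpu_model,
guest_load(), fault_handler() and the INST_FETCH_FAILED marker are all
hypothetical, and the "fault" is simulated with a flag rather than a real
exception.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define INST_FETCH_FAILED 0xffffffffu	/* hypothetical "known bad" value */

enum kvm_mode { MODE_NORMAL, MODE_RECOVER };

struct cpu_model {
	enum kvm_mode mode;
	unsigned long pc;	/* address of the host instruction doing the load */
};

/* Models the exception path: in recover mode, skip the faulting load. */
static bool fault_handler(struct cpu_model *cpu)
{
	if (cpu->mode != MODE_RECOVER)
		return false;	/* a genuine host fault, not ours to hide */
	cpu->pc += 4;		/* srr0 += 4; rfi */
	return true;
}

/* Models lwz from guest memory; returns false when the access faults. */
static bool guest_load(const uint32_t *page, bool readable, uint32_t *out)
{
	if (!readable)
		return false;
	*out = *page;
	return true;
}

static uint32_t fetch_guest_inst(struct cpu_model *cpu, const uint32_t *page,
				 bool readable)
{
	uint32_t last_inst = INST_FETCH_FAILED;	/* LAST_INST = <known bad> */

	cpu->mode = MODE_RECOVER;
	if (guest_load(page, readable, &last_inst)) {
		cpu->pc += 4;			/* load completed normally */
	} else {
		bool handled = fault_handler(cpu);
		assert(handled);
		(void)handled;
	}
	cpu->mode = MODE_NORMAL;
	return last_inst;	/* still the marker if the load faulted */
}
```

The caller tells the two outcomes apart by checking for the marker value.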

>> @@ -243,16 +324,20 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu,
>> 	case BOOKE_IRQPRIO_AP_UNAVAIL:
>> 	case BOOKE_IRQPRIO_ALIGNMENT:
>> 		allowed = 1;
>> -		msr_mask = MSR_CE|MSR_ME|MSR_DE;
>> +		msr_mask = MSR_GS | MSR_CE | MSR_ME | MSR_DE;
> 
> No need to do this. You already force MSR_GS in set_msr();

OK.  This was here since before set_msr() started doing that. :-)

>> +	if (!current->thread.kvm_vcpu) {
>> +		WARN(1, "no vcpu\n");
>> +		return -EPERM;
>> +	}
> 
> Huh?

Oops, leftover debugging.

>> +static int emulation_exit(struct kvm_run *run, struct kvm_vcpu *vcpu)
>> +{
>> +	enum emulation_result er;
>> +
>> +	er = kvmppc_emulate_instruction(run, vcpu);
>> +	switch (er) {
>> +	case EMULATE_DONE:
>> +		/* don't overwrite subtypes, just account kvm_stats */
>> +		kvmppc_account_exit_stat(vcpu, EMULATED_INST_EXITS);
>> +		/* Future optimization: only reload non-volatiles if
>> +		 * they were actually modified by emulation. */
>> +		return RESUME_GUEST_NV;
>> +
>> +	case EMULATE_DO_DCR:
>> +		run->exit_reason = KVM_EXIT_DCR;
>> +		return RESUME_HOST;
>> +
>> +	case EMULATE_FAIL:
>> +		/* XXX Deliver Program interrupt to guest. */
>> +		printk(KERN_CRIT "%s: emulation at %lx failed (%08x)\n",
>> +		       __func__, vcpu->arch.regs.nip, vcpu->arch.last_inst);
> 
> This should be throttled, otherwise the guest can spam our logs.

Yes it should, but I'm just moving the code here.

>> +		/* For debugging, encode the failing instruction and
>> +		 * report it to userspace. */
>> +		run->hw.hardware_exit_reason = ~0ULL << 32;
>> +		run->hw.hardware_exit_reason |= vcpu->arch.last_inst;
> 
> 
> I'm fairly sure you want to fix this :)

Likewise, that's what booke.c already does.  What should it do instead?
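
For reference, this is what the existing encoding produces.  The sketch
only restates the two quoted assignments; the function names are
hypothetical, and it does not try to answer what the code should do
instead.

```c
#include <assert.h>
#include <stdint.h>

/* Same arithmetic as the quoted hunk: all-ones upper half, instruction below. */
static uint64_t encode_exit_reason(uint32_t last_inst)
{
	uint64_t reason = ~0ULL << 32;

	reason |= last_inst;
	return reason;
}

/* What userspace would do to get the failing instruction back. */
static uint32_t decode_failed_inst(uint64_t reason)
{
	assert((reason >> 32) == 0xffffffffu);	/* marker must be present */
	return (uint32_t)reason;
}
```

For example, a failing instruction word of 0x7c0004ac is reported as
0xffffffff7c0004ac.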

> /**
>>  * kvmppc_handle_exit
>>  *
>> @@ -374,12 +530,39 @@ out:
>> int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
>>                        unsigned int exit_nr)
>> {
>> -	enum emulation_result er;
>> 	int r = RESUME_HOST;
>>
>> 	/* update before a new last_exit_type is rewritten */
>> 	kvmppc_update_timing_stats(vcpu);
>>
>> +	/*
>> +	 * If we actually care, we could copy MSR, DEAR, and ESR to regs,
>> +	 * insert an appropriate trap number, etc.
>> +	 *
>> +	 * Seems like a waste of cycles for something that should only matter
>> +	 * to someone using sysrq-t/p or similar host kernel debug facility.
>> +	 * We have other debug facilities to get that information from a
>> +	 * guest through userspace.
>> +	 */
>> +	switch (exit_nr) {
>> +	case BOOKE_INTERRUPT_EXTERNAL:
>> +		do_IRQ(&vcpu->arch.regs);
> 
> Ah, so that's what you want to use regs for. So is having a pt_regs
> struct that only contains useful register values in half its fields
> any useful here? Or could we keep control of the registers ourselves,
> enabling us to maybe one day optimize things more.

I think it contains enough to be useful for debugging code such as sysrq
and tracers, and as noted in the comment we could copy the rest if we
care enough.  MSR might be worth copying.

It will eventually be used for machine checks as well, which I'd like to
hand reasonable register state to, at least for GPRs, LR, and PC.

If there's a good enough performance reason, we could just copy
everything over for machine checks and pass NULL to do_IRQ (I think it
can take this -- a dummy regs struct if not), but it seems premature at
the moment unless the switch already causes measured performance loss
(cache utilization?).

>> @@ -387,30 +570,56 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
>>
>> 	switch (exit_nr) {
>> 	case BOOKE_INTERRUPT_MACHINE_CHECK:
>> -		printk("MACHINE CHECK: %lx\n", mfspr(SPRN_MCSR));
>> -		kvmppc_dump_vcpu(vcpu);
>> -		r = RESUME_HOST;
>> +		kvm_resched(vcpu);
>> +		r = RESUME_GUEST;
> 
> huh?

Patch shuffling accident -- this belongs with a later patch that invokes
the host machine check handler similar to what is done with do_IRQ().
The host machine check handler needs some work first, though.

>> 		break;
>>
>> 	case BOOKE_INTERRUPT_EXTERNAL:
>> 		kvmppc_account_exit(vcpu, EXT_INTR_EXITS);
>> -		if (need_resched())
>> -			cond_resched();
>> +		kvm_resched(vcpu);
> 
> Why are we explicit about the resched? On book3s I just call kvm_resched(vcpu) before the switch().

There are a few exit types where we don't currently do the resched -- if
they're all bugs or don't-cares, we could move it out of the switch.

We probably should defer the check until after we've disabled
interrupts, similar to signals -- even if we didn't exit for an
interrupt, we could have received one after enabling them.

>> +		if (kvm_is_visible_gfn(vcpu->kvm, gfn)) {
>> +			/* The guest TLB had a mapping, but the shadow TLB
>> +			 * didn't. This could be because:
>> +			 * a) the entry is mapping the host kernel, or
>> +			 * b) the guest used a large mapping which we're faking
>> +			 * Either way, we need to satisfy the fault without
>> +			 * invoking the guest. */
>> +			kvmppc_mmu_map(vcpu, eaddr, gpaddr, gtlb_index);
>> +		} else {
>> +			/* Guest mapped and leaped at non-RAM! */
>> +			kvmppc_booke_queue_irqprio(vcpu,
>> +						   BOOKE_IRQPRIO_MACHINE_CHECK);
> 
> Are you sure? Couldn't this also be MMIO? That doesn't really improve the situation as executing from MMIO is tricky with the KVM model, but it's not necessarily bad. Oh well, I guess we'll have to do something and throwing an #MC isn't all that ugly.

I think I asked you about executing from MMIO once, and you said it
wasn't supported even in straight QEMU.  Have things changed?

>> diff --git a/arch/powerpc/kvm/booke.h b/arch/powerpc/kvm/booke.h
>> index 05d1d99..d53bcf2 100644
>> --- a/arch/powerpc/kvm/booke.h
>> +++ b/arch/powerpc/kvm/booke.h
>> @@ -48,7 +48,20 @@
>> #define BOOKE_IRQPRIO_PERFORMANCE_MONITOR 19
>> /* Internal pseudo-irqprio for level triggered externals */
>> #define BOOKE_IRQPRIO_EXTERNAL_LEVEL 20
>> -#define BOOKE_IRQPRIO_MAX 20
>> +#define BOOKE_IRQPRIO_DBELL 21
>> +#define BOOKE_IRQPRIO_DBELL_CRIT 22
>> +#define BOOKE_IRQPRIO_MAX 23
> 
> So was MAX wrong before or is it too big now?

MAX is just a marker for how many IRQPRIOs we have, not any sort of
external limit.  This patch adds new IRQPRIOs, so MAX goes up.

The actual limit is the number of bits in a long.
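
To make the two limits concrete, here is a minimal sketch in C of a pending-priority bitmask kept in one unsigned long. The priority numbers are the ones in the quoted hunk; pending_set(), pending_first() and the compile-time check are hypothetical names for illustration, not the kernel's actual helpers, and the scan order is an assumption.

```c
#include <assert.h>
#include <limits.h>

/* Priority numbers taken from the quoted booke.h hunk. */
#define IRQPRIO_EXTERNAL_LEVEL 20
#define IRQPRIO_DBELL          21
#define IRQPRIO_DBELL_CRIT     22
#define IRQPRIO_MAX            23	/* count of priorities, not the highest one */

#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/* The real ceiling: every priority needs its own bit in one unsigned long. */
_Static_assert(IRQPRIO_MAX <= BITS_PER_ULONG, "too many IRQPRIOs for one long");

/* Mark a priority as pending. */
static void pending_set(unsigned long *pending, unsigned int prio)
{
	assert(prio < IRQPRIO_MAX);
	*pending |= 1UL << prio;
}

/* Return the lowest-numbered pending priority, or -1 if nothing is pending. */
static int pending_first(unsigned long pending)
{
	int prio;

	for (prio = 0; prio < IRQPRIO_MAX; prio++)
		if (pending & (1UL << prio))
			return prio;
	return -1;
}
```

With MAX used as a count, the loop bound `prio < IRQPRIO_MAX` covers priority 22; adding a priority only requires bumping MAX until the static assertion fires.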

>> +	.if	\flags & NEED_EMU
>> +	lwz	r9, VCPU_KVM(r4)
> 
> writing r9
> 
>> +	.endif
>> +
>> +#ifdef CONFIG_KVM_EXIT_TIMING
>> +	/* save exit time */
>> +1:	mfspr	r7, SPRN_TBRU
>> +	mfspr	r8, SPRN_TBRL
>> +	mfspr	r9, SPRN_TBRU
> 
> overwriting r9 again?

Oops.  It's RFC for a reason. :-)
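
For reference, the upper/lower/upper read in the quoted hunk is the usual way to read a 64-bit counter through two 32-bit registers, and it needs three scratch values that are all free at that point. A minimal userspace sketch, assuming a simulated timebase (struct fake_timebase and the read_tbu()/read_tbl() helpers are hypothetical stand-ins for mfspr):

```c
#include <stdint.h>

/* Simulated 64-bit timebase that advances by `step` on every register read. */
struct fake_timebase {
	uint64_t value;
	uint64_t step;
};

/* Stand-in for "mfspr rX, SPRN_TBRU": upper 32 bits. */
static uint32_t read_tbu(struct fake_timebase *tb)
{
	uint32_t upper = (uint32_t)(tb->value >> 32);

	tb->value += tb->step;
	return upper;
}

/* Stand-in for "mfspr rX, SPRN_TBRL": lower 32 bits. */
static uint32_t read_tbl(struct fake_timebase *tb)
{
	uint32_t lower = (uint32_t)tb->value;

	tb->value += tb->step;
	return lower;
}

/* Read both halves; retry if the upper half changed in between. */
static uint64_t read_timebase(struct fake_timebase *tb)
{
	uint32_t hi, lo, hi_again;

	do {
		hi = read_tbu(tb);
		lo = read_tbl(tb);
		hi_again = read_tbu(tb);	/* must not clobber a live value */
	} while (hi != hi_again);

	return ((uint64_t)hi << 32) | lo;
}
```

In the assembly version, hi_again is the register that collided with the earlier load into r9.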

>> +#ifndef CONFIG_64BIT
> 
> Double negation is always hard to read. Please reverse the ifdef :)

OK.

>> +lightweight_exit:
>> +	PPC_STL	r2, HOST_R2(r1)
>> +
>> +	mfspr	r3, SPRN_PID
>> +	stw	r3, VCPU_HOST_PID(r4)
>> +	lwz	r3, VCPU_GUEST_PID(r4)
>> +	mtspr	SPRN_PID, r3
>> +
>> +	/* Save vcpu pointer for the exception handlers
>> +	 * must be done before loading guest r2.
>> +	 */
>> +//	SET_VCPU(r4)
> 
> hm?

Can just be removed, it's handled in booke's vcpu load/put.

>> +	lwz	r6, (VCPU_SHARED_MAS2 + 4)(r11)
>> +#else
>> +	ld	r6, (VCPU_SHARED_MAS2)(r11)
>> +#endif
>> +	lwz	r7, VCPU_SHARED_MAS7_3+4(r11)
>> +	lwz	r8, VCPU_SHARED_MAS4(r11)
>> +	mtspr	SPRN_MAS0, r3
>> +	mtspr	SPRN_MAS1, r5
>> +	mtspr	SPRN_MAS2, r6
>> +	mtspr	SPRN_MAS3, r7
>> +	mtspr	SPRN_MAS4, r8
>> +	lwz	r3, VCPU_SHARED_MAS6(r11)
>> +	lwz	r5, VCPU_SHARED_MAS7_3+0(r11)
>> +	mtspr	SPRN_MAS6, r3
>> +	mtspr	SPRN_MAS7, r5
>> +	/* Disable MAS register updates via exception */
>> +	mfspr	r3, SPRN_EPCR
>> +	oris	r3, r3, SPRN_EPCR_DMIUH@h
>> +	mtspr	SPRN_EPCR, r3
> 
> Shouldn't this happen before you set the MAS registers? :)

Yes (though we really shouldn't be getting a TLB miss here, at least on
e500mc).

>> +	/* Load some guest volatiles. */
>> +	PPC_LL	r3, VCPU_LR(r4)
>> +	PPC_LL	r5, VCPU_XER(r4)
>> +	PPC_LL	r6, VCPU_CTR(r4)
>> +	PPC_LL	r7, VCPU_CR(r4)
>> +	PPC_LL	r8, VCPU_PC(r4)
>> +#ifndef CONFIG_64BIT
>> +	lwz	r9, (VCPU_SHARED_MSR + 4)(r11)
>> +#else
>> +	ld	r9, (VCPU_SHARED_MSR)(r11)
>> +#endif
>> +	PPC_LL	r0, VCPU_GPR(r0)(r4)
>> +	PPC_LL	r1, VCPU_GPR(r1)(r4)
>> +	PPC_LL	r2, VCPU_GPR(r2)(r4)
>> +	PPC_LL	r10, VCPU_GPR(r10)(r4)
>> +	PPC_LL	r11, VCPU_GPR(r11)(r4)
>> +	PPC_LL	r12, VCPU_GPR(r12)(r4)
>> +	PPC_LL	r13, VCPU_GPR(r13)(r4)
>> +	mtlr	r3
>> +	mtxer	r5
>> +	mtctr	r6
>> +	mtcr	r7
>> +	mtsrr0	r8
>> +	mtsrr1	r9
> 
> Are you sure this should be shared->msr, not shadow_msr?

Yes, we don't use shadow_msr on bookehv.  I'll add a comment in the
struct definition as discussed in the other thread, as well as other
areas where there are subtle differences between PR-mode and GS-mode.

-Scott


* Re: [PATCH 1/2][v2] mtd/nand:Fix wrong address read in is_blank()
From: Scott Wood @ 2012-01-10  1:07 UTC (permalink / raw)
  To: Prabhakar Kushwaha; +Cc: linux-mtd, linuxppc-dev, Poonam Aggrwal
In-Reply-To: <1326111847-13085-1-git-send-email-prabhakar@freescale.com>

On 01/09/2012 06:24 AM, Prabhakar Kushwaha wrote:
> @@ -215,12 +215,15 @@ static int is_blank(struct mtd_info *mtd, unsigned int bufnum)
>  static int check_read_ecc(struct mtd_info *mtd, struct fsl_ifc_ctrl *ctrl,
>  			  u32 *eccstat, unsigned int bufnum)
>  {
> +	struct nand_chip *chip = mtd->priv;
> +	int bufperpage = mtd->writesize / chip->ecc.size;
> +	int eccbuf_num = bufnum + (bufnum / bufperpage) * bufperpage;

This is unnecessarily complicated (and introduces two more, dependent,
runtime divisions).  I don't think there are any changes required in
this function.  You're awkwardly compensating for the fact that the
caller hasn't been updated for the new definition of bufnum.

bufperpage used in fsl_ifc_run_command should be doubled to account for
the OOB buffers.  We should probably rename it from "buf" to something
else (chunk? subpage?) to avoid confusion with bufnum_mask, which refers
to page-sized buffers.

>  	u32 reg = eccstat[bufnum / 4];
>  	int errors = (reg >> ((3 - bufnum % 4) * 8)) & 15;
>  
>  	if (errors == 15) { /* uncorrectable */
>  		/* Blank pages fail hw ECC checks */
> -		if (is_blank(mtd, bufnum))
> +		if (is_blank(mtd, eccbuf_num))
>  			return 1;
>  
>  		/*
> @@ -273,7 +276,7 @@ static void fsl_ifc_run_command(struct mtd_info *mtd)
>  		dev_err(priv->dev, "NAND Flash Write Protect Error\n");
>  
>  	if (nctrl->eccread) {
> -		int bufperpage = mtd->writesize / 512;
> +		int bufperpage = mtd->writesize / chip->ecc.size;
>  		int bufnum = (nctrl->page & priv->bufnum_mask) * bufperpage;
>  		int bufnum_end = bufnum + bufperpage - 1;
>  

Again, please calculate bufperpage (chunksperpage? subsperpage? perbuf?)
at driver init, as is done with bufnum_mask.
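
A minimal sketch of what computing this once at init could look like, in plain C. struct ifc_geometry, ifc_geometry_init() and first_chunk() are hypothetical names, and the sketch does not model the doubling for OOB buffers mentioned in the review of patch 1/2.

```c
#include <assert.h>

/* Hypothetical per-chip state; field names are illustrative only. */
struct ifc_geometry {
	unsigned int writesize;		/* NAND page size in bytes */
	unsigned int ecc_size;		/* bytes covered by one ECC chunk */
	unsigned int chunks_per_page;	/* computed once at init */
};

/* Do the division once, when the chip geometry becomes known. */
static void ifc_geometry_init(struct ifc_geometry *g,
			      unsigned int writesize, unsigned int ecc_size)
{
	assert(ecc_size != 0 && writesize % ecc_size == 0);
	g->writesize = writesize;
	g->ecc_size = ecc_size;
	g->chunks_per_page = writesize / ecc_size;
}

/* Hot path: first ECC chunk index for the internal buffer holding `page`. */
static unsigned int first_chunk(const struct ifc_geometry *g,
				unsigned int page, unsigned int bufnum_mask)
{
	return (page & bufnum_mask) * g->chunks_per_page;
}
```

The per-command path then only multiplies by a stored value, which is the same pattern bufnum_mask already follows.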

-Scott


* Re: [PATCH 2/2][v2] mtd/nand: Fix IFC driver to support 2K NAND page
From: Scott Wood @ 2012-01-10  1:10 UTC (permalink / raw)
  To: Prabhakar Kushwaha; +Cc: linux-mtd, linuxppc-dev, Poonam Aggrwal
In-Reply-To: <1326111866-13120-1-git-send-email-prabhakar@freescale.com>

On 01/09/2012 06:24 AM, Prabhakar Kushwaha wrote:
> 1) OOB area should be updated irrespective of NAND page size. Earlier it was
> updated only for 512byte NAND page.
> 
> 2) During OOB update fbcr should be equal to OOB size.
> 
> Signed-off-by: Poonam Aggrwal <poonam.aggrwal@freescale.com>
> Signed-off-by: Prabhakar Kushwaha <prabhakar@freescale.com>
> ---
>  git://git.kernel.org/pub/scm/linux/kernel/git/galak/powerpc.git (branch next)
> 
>  This patch is created on top of IFC driver patch (already floated in mailing
>  list). Please find their link:
>  http://patchwork.ozlabs.org/patch/133315/
>  http://patchwork.ozlabs.org/patch/133316/

Looks good.

-Scott


* Re: [RFC PATCH 14/16] KVM: PPC: booke: category E.HV (GS-mode) support
From: Alexander Graf @ 2012-01-10  3:11 UTC (permalink / raw)
  To: Scott Wood; +Cc: linuxppc-dev, kvm, kvm-ppc
In-Reply-To: <4F0B8B7D.4@freescale.com>


On 10.01.2012, at 01:51, Scott Wood wrote:

> On 01/09/2012 11:46 AM, Alexander Graf wrote:
>>
>> On 21.12.2011, at 02:34, Scott Wood wrote:
>>
>>> Chips such as e500mc that implement category E.HV in Power ISA 2.06
>>> provide hardware virtualization features, including a new MSR mode for
>>> guest state.  The guest OS can perform many operations without trapping
>>> into the hypervisor, including transitions to and from guest userspace.
>>>
>>> Since we can use SRR1[GS] to reliably tell whether an exception came from
>>> guest state, instead of messing around with IVPR, we use DO_KVM similarly
>>> to book3s.
>>
>> Is there any benefit of using DO_KVM? I would assume that messing with IVPR is faster.
>
> Using the GS bit to decide which handler to run means we won't get
> confused if a machine check or critical interrupt happens between
> entering/exiting the guest and updating IVPR (we could use the IS bit
> similarly in PR-mode).
>
> This could be supplemented with IVPR (though that will add a few cycles
> to guest entry/exit) or some sort of runtime patching (would be more
> coarse-grained, active when any KVM guest exists) to avoid adding
> overhead to traps when KVM is not used, but I'd like to quantify that
> overhead first.  It should be much lower than what happens on book3s.

Hrm. Yeah, given that your DO_KVM handler is so much simpler, it might
make sense to stick with that method. Benchmarks would be useful in the
long run though.
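
As a rough illustration of the GS-bit approach discussed above, the decision depends only on the saved MSR image and not on what IVPR currently holds. This is a userspace sketch in C; the bit position used for MSR[GS] and the names pick_handler(), HOST_HANDLER and KVM_HANDLER are assumptions for the example, not code from the patch.

```c
#include <stdint.h>

/* Assumed bit position of MSR[GS] for this sketch; check the core manual. */
#define SKETCH_MSR_GS (UINT32_C(1) << 28)

enum handler { HOST_HANDLER, KVM_HANDLER };

/* Decide from the saved MSR image alone, independent of IVPR contents. */
static enum handler pick_handler(uint32_t saved_srr1)
{
	return (saved_srr1 & SKETCH_MSR_GS) ? KVM_HANDLER : HOST_HANDLER;
}
```

Because the test reads the interrupted context's own state, an exception taken in the window between guest entry/exit and an IVPR update is still routed correctly.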

>
>>> Current issues include:
>>> - Machine checks from guest state are not routed to the host handler.
>>> - The guest can cause a host oops by executing an emulated instruction
>>>  in a page that lacks read permission.  Existing e500/4xx support has
>>>  the same problem.
>>
>> We solve that in book3s pr by doing
>>
>>  LAST_INST = <known bad value>;
>>  PACA->kvm_mode = <recover at next inst>;
>>  lwz(guest pc);
>>  do_more_stuff();
>>
>> That way when an exception occurs at lwz() the DO_KVM handler checks that we're in kvm mode "recover" which does basically srr0+=4; rfi;.
>
> I was thinking we'd check ESR[EPID] or SRR1[IS] as appropriate, and
> treat it as a kernel fault (search exception table) -- but this works
> too and is a bit cleaner (could be other uses of external pid), at the
> expense of a couple extra instructions in the emulation path (but
> probably a slightly faster host TLB handler).
>
> The check wouldn't go in DO_KVM, though, since on bookehv that only
> deals with diverting flow when xSRR1[GS] is set, which wouldn't be the
> case here.

Yup, not sure where you'd put the check, as it'd slow down normal
operation too. Hrm.

>
>>> @@ -243,16 +324,20 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu,
>>> 	case BOOKE_IRQPRIO_AP_UNAVAIL:
>>> 	case BOOKE_IRQPRIO_ALIGNMENT:
>>> 		allowed = 1;
>>> -		msr_mask = MSR_CE|MSR_ME|MSR_DE;
>>> +		msr_mask = MSR_GS | MSR_CE | MSR_ME | MSR_DE;
>>
>> No need to do this. You already force MSR_GS in set_msr();
>
> OK.  This was here since before set_msr() started doing that. :-)
>
>>> +	if (!current->thread.kvm_vcpu) {
>>> +		WARN(1, "no vcpu\n");
>>> +		return -EPERM;
>>> +	}
>>
>> Huh?
>
> Oops, leftover debugging.
>
>>> +static int emulation_exit(struct kvm_run *run, struct kvm_vcpu *vcpu)
>>> +{
>>> +	enum emulation_result er;
>>> +
>>> +	er = kvmppc_emulate_instruction(run, vcpu);
>>> +	switch (er) {
>>> +	case EMULATE_DONE:
>>> +		/* don't overwrite subtypes, just account kvm_stats */
>>> +		kvmppc_account_exit_stat(vcpu, EMULATED_INST_EXITS);
>>> +		/* Future optimization: only reload non-volatiles if
>>> +		 * they were actually modified by emulation. */
>>> +		return RESUME_GUEST_NV;
>>> +
>>> +	case EMULATE_DO_DCR:
>>> +		run->exit_reason = KVM_EXIT_DCR;
>>> +		return RESUME_HOST;
>>> +
>>> +	case EMULATE_FAIL:
>>> +		/* XXX Deliver Program interrupt to guest. */
>>> +		printk(KERN_CRIT "%s: emulation at %lx failed (%08x)\n",
>>> +		       __func__, vcpu->arch.regs.nip, vcpu->arch.last_inst);
>>
>> This should be throttled, otherwise the guest can spam our logs.
>
> Yes it should, but I'm just moving the code here.

Yeah, only realized this later. Maybe next time (not for this patch set,
next time you're sending something) just extract these mechanical parts,
so it's easier to review the pieces where code actually changes :).

>
>>> +		/* For debugging, encode the failing instruction and
>>> +		 * report it to userspace. */
>>> +		run->hw.hardware_exit_reason = ~0ULL << 32;
>>> +		run->hw.hardware_exit_reason |= vcpu->arch.last_inst;
>>
>>
>> I'm fairly sure you want to fix this :)
>
> Likewise, that's what booke.c already does.  What should it do instead?

This is what book3s does:

                case EMULATE_FAIL:
                        printk(KERN_CRIT "%s: emulation at %lx failed (%08x)\n",
                               __func__, kvmppc_get_pc(vcpu), kvmppc_get_last_inst(vcpu));
                        kvmppc_core_queue_program(vcpu, flags);
                        r = RESUME_GUEST;

which also doesn't throttle the printk, but I think injecting a program
fault into the guest is the most sensible thing to do if we don't know
what the instruction is supposed to do. Best case we get an oops inside
the guest telling us what broke :).
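
Combining the two points (throttle the message, inject a program fault, resume the guest) might look roughly like the userspace sketch below. Everything here is a stand-in: struct log_limit, log_allowed(), struct sketch_vcpu and on_emulate_fail() are hypothetical, and in the kernel the throttling would more likely come from an existing helper such as printk_ratelimited() than from a hand-rolled limiter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limiter: at most `burst` messages per `interval` time units. */
struct log_limit {
	uint64_t window_start;
	uint64_t interval;
	unsigned int burst;
	unsigned int printed;
};

/* Return true if a message may be logged at time `now`. */
static bool log_allowed(struct log_limit *l, uint64_t now)
{
	if (now - l->window_start >= l->interval) {
		l->window_start = now;	/* start a new window */
		l->printed = 0;
	}
	if (l->printed >= l->burst)
		return false;
	l->printed++;
	return true;
}

/* Stand-ins for the vcpu state and the resume codes used in the thread. */
struct sketch_vcpu {
	bool program_irq_queued;
	unsigned int messages_logged;
};

enum sketch_resume { SKETCH_RESUME_GUEST, SKETCH_RESUME_HOST };

/* On emulation failure: log (throttled), queue a program interrupt, go on. */
static enum sketch_resume on_emulate_fail(struct sketch_vcpu *vcpu,
					  struct log_limit *limit,
					  uint64_t now)
{
	if (log_allowed(limit, now))
		vcpu->messages_logged++;	/* where the printk would go */
	vcpu->program_irq_queued = true;
	return SKETCH_RESUME_GUEST;
}
```

The guest still sees every failure as a program interrupt; only the host log is bounded.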

>
>> /**
>>> * kvmppc_handle_exit
>>> *
>>> @@ -374,12 +530,39 @@ out:
>>> int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
>>>                       unsigned int exit_nr)
>>> {
>>> -	enum emulation_result er;
>>> 	int r = RESUME_HOST;
>>>
>>> 	/* update before a new last_exit_type is rewritten */
>>> 	kvmppc_update_timing_stats(vcpu);
>>>
>>> +	/*
>>> +	 * If we actually care, we could copy MSR, DEAR, and ESR to regs,
>>> +	 * insert an appropriate trap number, etc.
>>> +	 *
>>> +	 * Seems like a waste of cycles for something that should only matter
>>> +	 * to someone using sysrq-t/p or similar host kernel debug facility.
>>> +	 * We have other debug facilities to get that information from a
>>> +	 * guest through userspace.
>>> +	 */
>>> +	switch (exit_nr) {
>>> +	case BOOKE_INTERRUPT_EXTERNAL:
>>> +		do_IRQ(&vcpu->arch.regs);
>>
>> Ah, so that's what you want to use regs for. So is having a pt_regs
>> struct that only contains useful register values in half its fields
>> any useful here? Or could we keep control of the registers ourselves,
>> enabling us to maybe one day optimize things more.
>
> I think it contains enough to be useful for debugging code such as sysrq
> and tracers, and as noted in the comment we could copy the rest if we
> care enough.  MSR might be worth copying.
>
> It will eventually be used for machine checks as well, which I'd like to
> hand reasonable register state to, at least for GPRs, LR, and PC.
>
> If there's a good enough performance reason, we could just copy
> everything over for machine checks and pass NULL to do_IRQ (I think it
> can take this -- a dummy regs struct if not), but it seems premature at
> the moment unless the switch already causes measured performance loss
> (cache utilization?).

I'm definitely not concerned about performance, but complexity and
uniqueness. With the pt_regs struct, we have a bunch of fields in the
vcpu that are there, but unused. I find that situation pretty confusing.

So yes, I would definitely prefer to copy registers during MC and keep
the registers where they are today - unless there are SPRs for them of
course.

Imagine we'd one day want to share GPRs with user space through the
kvm_run structure (see the s390 patches on the ML for this). I really
wouldn't want to make pt_regs part of our userspace ABI.
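
The copy-on-machine-check alternative could be sketched like this in plain C. Both structs are simplified, hypothetical stand-ins (the real vcpu and pt_regs layouts differ), and fill_regs_for_mcheck() is an invented name.

```c
#include <string.h>

/* Simplified stand-in for where the vcpu keeps guest registers today. */
struct sketch_vcpu_regs {
	unsigned long gpr[32];
	unsigned long pc;
	unsigned long lr;
};

/* Simplified stand-in for the pt_regs fields a host handler would read. */
struct sketch_pt_regs {
	unsigned long gpr[32];
	unsigned long nip;
	unsigned long link;
	unsigned long msr;
};

/* Build a temporary register frame only when a machine check is handled. */
static void fill_regs_for_mcheck(struct sketch_pt_regs *regs,
				 const struct sketch_vcpu_regs *vcpu,
				 unsigned long guest_msr)
{
	memset(regs, 0, sizeof(*regs));
	memcpy(regs->gpr, vcpu->gpr, sizeof(regs->gpr));
	regs->nip = vcpu->pc;
	regs->link = vcpu->lr;
	regs->msr = guest_msr;
}
```

The cost is paid only on the rare machine-check path, and the vcpu layout stays free to change or to be shared with userspace later.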

>
>>> @@ -387,30 +570,56 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
>>>
>>> 	switch (exit_nr) {
>>> 	case BOOKE_INTERRUPT_MACHINE_CHECK:
>>> -		printk("MACHINE CHECK: %lx\n", mfspr(SPRN_MCSR));
>>> -		kvmppc_dump_vcpu(vcpu);
>>> -		r = RESUME_HOST;
>>> +		kvm_resched(vcpu);
>>> +		r = RESUME_GUEST;
>>
>> huh?
>
> Patch shuffling accident -- this belongs with a later patch that invokes
> the host machine check handler similar to what is done with do_IRQ().
> The host machine check handler needs some work first, though.
>
>>> 		break;
>>>
>>> 	case BOOKE_INTERRUPT_EXTERNAL:
>>> 		kvmppc_account_exit(vcpu, EXT_INTR_EXITS);
>>> -		if (need_resched())
>>> -			cond_resched();
>>> +		kvm_resched(vcpu);
>>
>> Why are we explicit about the resched? On book3s I just call kvm_resched(vcpu) before the switch().
>
> There are a few exit types where we don't currently do the resched -- if
> they're all bugs or don't-cares, we could move it out of the switch.
>
> We probably should defer the check until after we've disabled
> interrupts, similar to signals -- even if we didn't exit for an
> interrupt, we could have received one after enabling them.

Yup. I just don't think you can call resched() with interrupts disabled,
so a bit of cleverness is probably required here.

>
>>> +		if (kvm_is_visible_gfn(vcpu->kvm, gfn)) {
>>> +			/* The guest TLB had a mapping, but the shadow TLB
>>> +			 * didn't. This could be because:
>>> +			 * a) the entry is mapping the host kernel, or
>>> +			 * b) the guest used a large mapping which we're faking
>>> +			 * Either way, we need to satisfy the fault without
>>> +			 * invoking the guest. */
>>> +			kvmppc_mmu_map(vcpu, eaddr, gpaddr, gtlb_index);
>>> +		} else {
>>> +			/* Guest mapped and leaped at non-RAM! */
>>> +			kvmppc_booke_queue_irqprio(vcpu,
>>> +						   BOOKE_IRQPRIO_MACHINE_CHECK);
>>
>> Are you sure? Couldn't this also be MMIO? That doesn't really improve the situation as executing from MMIO is tricky with the KVM model, but it's not necessarily bad. Oh well, I guess we'll have to do something and throwing an #MC isn't all that ugly.
>
> I think I asked you about executing from MMIO once, and you said it
> wasn't supported even in straight QEMU.  Have things changed?

Yeah, I talked to Anthony about that part and apparently the QEMU design
does support execution from MMIO. But don't worry about it for now. I
don't think we'll really have guest OSs doing this. And if they do, we
can worry about it then.

>
>>> diff --git a/arch/powerpc/kvm/booke.h b/arch/powerpc/kvm/booke.h
>>> index 05d1d99..d53bcf2 100644
>>> --- a/arch/powerpc/kvm/booke.h
>>> +++ b/arch/powerpc/kvm/booke.h
>>> @@ -48,7 +48,20 @@
>>> #define BOOKE_IRQPRIO_PERFORMANCE_MONITOR 19
>>> /* Internal pseudo-irqprio for level triggered externals */
>>> #define BOOKE_IRQPRIO_EXTERNAL_LEVEL 20
>>> -#define BOOKE_IRQPRIO_MAX 20
>>> +#define BOOKE_IRQPRIO_DBELL 21
>>> +#define BOOKE_IRQPRIO_DBELL_CRIT 22
>>> +#define BOOKE_IRQPRIO_MAX 23
>>
>> So was MAX wrong before or is it too big now?
>
> MAX is just a marker for how many IRQPRIOs we have, not any sort of
> external limit.  This patch adds new IRQPRIOs, so MAX goes up.
>
> The actual limit is the number of bits in a long.

Yes, and before the highest value was 20 with MAX being 20, now the
highest value is 22 with MAX being 23. Either MAX == highest number
or MAX == highest number + 1, but you're changing the semantics of
MAX here. Maybe it was wrong before, I don't know, hence I'm asking :).

>
>>> +	.if	\flags & NEED_EMU
>>> +	lwz	r9, VCPU_KVM(r4)
>>
>> writing r9
>>
>>> +	.endif
>>> +
>>> +#ifdef CONFIG_KVM_EXIT_TIMING
>>> +	/* save exit time */
>>> +1:	mfspr	r7, SPRN_TBRU
>>> +	mfspr	r8, SPRN_TBRL
>>> +	mfspr	r9, SPRN_TBRU
>>
>> overwriting r9 again?
>
> Oops.  It's RFC for a reason. :-)
>
>>> +#ifndef CONFIG_64BIT
>>
>> Double negation is always hard to read. Please reverse the ifdef :)
>
> OK.
>
>>> +lightweight_exit:
>>> +	PPC_STL	r2, HOST_R2(r1)
>>> +
>>> +	mfspr	r3, SPRN_PID
>>> +	stw	r3, VCPU_HOST_PID(r4)
>>> +	lwz	r3, VCPU_GUEST_PID(r4)
>>> +	mtspr	SPRN_PID, r3
>>> +
>>> +	/* Save vcpu pointer for the exception handlers
>>> +	 * must be done before loading guest r2.
>>> +	 */
>>> +//	SET_VCPU(r4)
>>
>> hm?
>
> Can just be removed, it's handled in booke's vcpu load/put.
>
>>> +	lwz	r6, (VCPU_SHARED_MAS2 + 4)(r11)
>>> +#else
>>> +	ld	r6, (VCPU_SHARED_MAS2)(r11)
>>> +#endif
>>> +	lwz	r7, VCPU_SHARED_MAS7_3+4(r11)
>>> +	lwz	r8, VCPU_SHARED_MAS4(r11)
>>> +	mtspr	SPRN_MAS0, r3
>>> +	mtspr	SPRN_MAS1, r5
>>> +	mtspr	SPRN_MAS2, r6
>>> +	mtspr	SPRN_MAS3, r7
>>> +	mtspr	SPRN_MAS4, r8
>>> +	lwz	r3, VCPU_SHARED_MAS6(r11)
>>> +	lwz	r5, VCPU_SHARED_MAS7_3+0(r11)
>>> +	mtspr	SPRN_MAS6, r3
>>> +	mtspr	SPRN_MAS7, r5
>>> +	/* Disable MAS register updates via exception */
>>> +	mfspr	r3, SPRN_EPCR
>>> +	oris	r3, r3, SPRN_EPCR_DMIUH@h
>>> +	mtspr	SPRN_EPCR, r3
>>
>> Shouldn't this happen before you set the MAS registers? :)
>
> Yes (though we really shouldn't be getting a TLB miss here, at least on
> e500mc).

Yeah, but the way it is now, it gives you a false sense of security :)

>
>>> +	/* Load some guest volatiles. */
>>> +	PPC_LL	r3, VCPU_LR(r4)
>>> +	PPC_LL	r5, VCPU_XER(r4)
>>> +	PPC_LL	r6, VCPU_CTR(r4)
>>> +	PPC_LL	r7, VCPU_CR(r4)
>>> +	PPC_LL	r8, VCPU_PC(r4)
>>> +#ifndef CONFIG_64BIT
>>> +	lwz	r9, (VCPU_SHARED_MSR + 4)(r11)
>>> +#else
>>> +	ld	r9, (VCPU_SHARED_MSR)(r11)
>>> +#endif
>>> +	PPC_LL	r0, VCPU_GPR(r0)(r4)
>>> +	PPC_LL	r1, VCPU_GPR(r1)(r4)
>>> +	PPC_LL	r2, VCPU_GPR(r2)(r4)
>>> +	PPC_LL	r10, VCPU_GPR(r10)(r4)
>>> +	PPC_LL	r11, VCPU_GPR(r11)(r4)
>>> +	PPC_LL	r12, VCPU_GPR(r12)(r4)
>>> +	PPC_LL	r13, VCPU_GPR(r13)(r4)
>>> +	mtlr	r3
>>> +	mtxer	r5
>>> +	mtctr	r6
>>> +	mtcr	r7
>>> +	mtsrr0	r8
>>> +	mtsrr1	r9
>>
>> Are you sure this should be shared->msr, not shadow_msr?
>
> Yes, we don't use shadow_msr on bookehv.  I'll add a comment in the
> struct definition as discussed in the other thread, as well as other
> areas where there are subtle differences between PR-mode and GS-mode.

Thanks!


Alex


* Re: [PATCH 2/2] Kbuild: Use dtc's -d (dependency) option
From: Shawn Guo @ 2012-01-10  3:12 UTC (permalink / raw)
  To: Stephen Warren
  Cc: Michal Marek, Jon Loeliger, Russell King, linux-c6x-dev,
	Aurelien Jacquiot, microblaze-uclinux, Devicetree Discuss,
	linux-kbuild, linux-kernel, Rob Herring, linux, linuxppc-dev,
	linux-arm-kernel, David Gibson
In-Reply-To: <1326134295-15547-2-git-send-email-swarren@nvidia.com>

On Mon, Jan 09, 2012 at 11:38:15AM -0700, Stephen Warren wrote:
> This hooks dtc into Kbuild's dependency system.
> 
> Thus, for example, "make dtbs" will rebuild tegra-harmony.dtb if only
> tegra20.dtsi has changed yet tegra-harmony.dts has not. The previous
> lack of this feature recently caused me to have very confusing "git
> bisect" results.
> 
> For ARM, it's obvious what to add to $(targets). I'm not familiar enough
> with other architectures to know what to add there. Powerpc appears to
> already add various .dtb files into $(targets), but the other archs may
> need something added to $(targets) to work.
> 
> Signed-off-by: Stephen Warren <swarren@nvidia.com>

Though I did not look into the patches deeply, I know the problem very
well and it annoys me a lot.  It's great that we have patches to fix
it, so

Acked-by: Shawn Guo <shawn.guo@linaro.org>

-- 
Regards,
Shawn


* RE: [PATCH] powerpc/85xx: Add P1024rdb dts support
From: Tang Yuantian-B29983 @ 2012-01-10  3:45 UTC (permalink / raw)
  To: Wood Scott-B07421; +Cc: linuxppc-dev@lists.ozlabs.org
In-Reply-To: <4F0B561A.7060903@freescale.com>


> On 01/09/2012 02:37 AM, b29983@freescale.com wrote:
> > +/include/ "p1024rdb.dtsi"
> > +/include/ "fsl/p1020si-post.dtsi"
>
> Is p1024 100% software-compatible with p1020?
>
> They have different manuals...
>
> -Scott

The P1020RDB has a vitesse-7385 switch.
fsl/p1020si-post.dtsi can be used for both boards.

Regards,
Yuantian


* Re: [SDK v1.2][PATCH 1/2 v3] powerpc/85xx: Add dts for P1021RDB-PC board
From: Xu Jiucheng @ 2012-01-10  7:17 UTC (permalink / raw)
  To: Scott Wood; +Cc: Matthew McClintock, linuxppc-dev
In-Reply-To: <4F0B541C.1050001@freescale.com>

On Mon, 2012-01-09 at 14:54 -0600, Scott Wood wrote:
> On 01/09/2012 12:53 AM, Xu Jiucheng wrote:
> > +	nand@1,0 {
> > +		#address-cells = <1>;
> > +		#size-cells = <1>;
> > +		compatible = "fsl,p1020-fcm-nand",
> > +			     "fsl,elbc-fcm-nand";
>
> s/p1020/p1021/
>
> -Scott

Ok.

Thanks & Best Regards
Jiucheng


* Re: [SDK v1.2][PATCH 1/2 v3] powerpc/85xx: Add dts for P1021RDB-PC board
From: Xu Jiucheng @ 2012-01-10  7:22 UTC (permalink / raw)
  To: linuxppc-dev, galak; +Cc: Matthew McClintock
In-Reply-To: <1326092022-10085-1-git-send-email-B37781@freescale.com>

I'm sorry, please ignore this email.

Thanks & Best Regards
Jiucheng

On Mon, 2012-01-09 at 14:53 +0800, Xu Jiucheng wrote:
> P1021RDB-PC Overview
> -----------------
> 1Gbyte DDR3 (on board DDR)
> 16Mbyte NOR flash
> 32Mbyte eSLC NAND Flash
> 256 Kbit M24256 I2C EEPROM
> 128 Mbit SPI Flash memory
> Real-time clock on I2C bus
> SD/MMC connector to interface with the SD memory card
> PCIex
>     - x1 PCIe slot or x1 PCIe to dual SATA controller
>     - x1 mini-PCIe slot
> USB 2.0
>     - ULPI PHY interface: SMSC USB3300 USB PHY and Genesys Logic’s GL850A
>     - Two USB2.0 Type A receptacles
>     - One USB2.0 signal to Mini PCIe slot
> eTSEC1: Connected to RGMII PHY VSC7385
> eTSEC2: Connected to SGMII PHY VSC8221
> eTSEC3: Connected to SGMII PHY AR8021
> DUART interface: supports two UARTs up to 115200 bps for console display
>
> Signed-off-by: Matthew McClintock <msm@freescale.com>
> Signed-off-by: Xu Jiucheng <B37781@freescale.com>
> ---
>  arch/powerpc/boot/dts/fsl/p1021si-post.dtsi |    4 +
>  arch/powerpc/boot/dts/p1021rdb.dts          |   96 +++++++++++
>  arch/powerpc/boot/dts/p1021rdb.dtsi         |  236 +++++++++++++++++++++++++++
>  arch/powerpc/boot/dts/p1021rdb_36b.dts      |   96 +++++++++++
>  4 files changed, 432 insertions(+), 0 deletions(-)
>  create mode 100644 arch/powerpc/boot/dts/p1021rdb.dts
>  create mode 100644 arch/powerpc/boot/dts/p1021rdb.dtsi
>  create mode 100644 arch/powerpc/boot/dts/p1021rdb_36b.dts
>
> diff --git a/arch/powerpc/boot/dts/fsl/p1021si-post.dtsi b/arch/powerpc/boot/dts/fsl/p1021si-post.dtsi
> index 38ba54d..b7929c9 100644
> --- a/arch/powerpc/boot/dts/fsl/p1021si-post.dtsi
> +++ b/arch/powerpc/boot/dts/fsl/p1021si-post.dtsi
> @@ -144,6 +144,10 @@
>  /include/ "pq3-usb2-dr-0.dtsi"
> 
>  /include/ "pq3-esdhc-0.dtsi"
> +	sdhc@2e000 {
> +		sdhci,auto-cmd12;
> +	};
> +
>  /include/ "pq3-sec3.3-0.dtsi"
> 
>  /include/ "pq3-mpic.dtsi"
> diff --git a/arch/powerpc/boot/dts/p1021rdb.dts b/arch/powerpc/boot/dts/p1021rdb.dts
> new file mode 100644
> index 0000000..90b6b4c
> --- /dev/null
> +++ b/arch/powerpc/boot/dts/p1021rdb.dts
> @@ -0,0 +1,96 @@
> +/*
> + * P1021 RDB Device Tree Source
> + *
> + * Copyright 2011 Freescale Semiconductor Inc.
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions are met:
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in the
> + *       documentation and/or other materials provided with the distribution.
> + *     * Neither the name of Freescale Semiconductor nor the
> + *       names of its contributors may be used to endorse or promote products
> + *       derived from this software without specific prior written permission.
> + *
> + *
> + * ALTERNATIVELY, this software may be distributed under the terms of the
> + * GNU General Public License ("GPL") as published by the Free Software
> + * Foundation, either version 2 of that License or (at your option) any
> + * later version.
> + *
> + * THIS SOFTWARE IS PROVIDED BY Freescale Semiconductor "AS IS" AND ANY
> + * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
> + * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
> + * DISCLAIMED. IN NO EVENT SHALL Freescale Semiconductor BE LIABLE FOR ANY
> + * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
> + * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
> + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
> + * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
> + * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +/include/ "fsl/p1021si-pre.dtsi"
> +/ {
> +	model = "fsl,P1021RDB";
> +	compatible = "fsl,P1021RDB-PC";
> +
> +	memory {
> +		device_type = "memory";
> +	};
> +
> +	lbc: localbus@ffe05000 {
> +		reg = <0 0xffe05000 0 0x1000>;
> +
> +		/* NOR, NAND Flashes and Vitesse 5 port L2 switch */
> +		ranges = <0x0 0x0 0x0 0xef000000 0x01000000
> +			  0x1 0x0 0x0 0xff800000 0x00040000
> +			  0x2 0x0 0x0 0xffb00000 0x00020000>;
> +	};
> +
> +	soc: soc@ffe00000 {
> +		ranges = <0x0 0x0 0xffe00000 0x100000>;
> +	};
> +
> +	pci0: pcie@ffe09000 {
> +		ranges = <0x2000000 0x0 0xa0000000 0 0xa0000000 0x0 0x20000000
> +			  0x1000000 0x0 0x00000000 0 0xffc10000 0x0 0x10000>;
> +		reg = <0 0xffe09000 0 0x1000>;
> +		pcie@0 {
> +			ranges = <0x2000000 0x0 0xa0000000
> +				  0x2000000 0x0 0xa0000000
> +				  0x0 0x20000000
> +
> +				  0x1000000 0x0 0x0
> +				  0x1000000 0x0 0x0
> +				  0x0 0x100000>;
> +		};
> +	};
> +
> +	pci1: pcie@ffe0a000 {
> +		reg = <0 0xffe0a000 0 0x1000>;
> +		ranges = <0x2000000 0x0 0x80000000 0 0x80000000 0x0 0x20000000
> +			  0x1000000 0x0 0x00000000 0 0xffc00000 0x0 0x10000>;
> +		pcie@0 {
> +			ranges = <0x2000000 0x0 0x80000000
> +				  0x2000000 0x0 0x80000000
> +				  0x0 0x20000000
> +
> +				  0x1000000 0x0 0x0
> +				  0x1000000 0x0 0x0
> +				  0x0 0x100000>;
> +		};
> +	};
> +
> +	qe: qe@ffe80000 {
> +                ranges = <0x0 0x0 0xffe80000 0x40000>;
> +                reg = <0 0xffe80000 0 0x480>;
> +                brg-frequency = <0>;
> +                bus-frequency = <0>;
> +        };
> +};
> +
> +/include/ "p1021rdb.dtsi"
> +/include/ "fsl/p1021si-post.dtsi"
> diff --git a/arch/powerpc/boot/dts/p1021rdb.dtsi b/arch/powerpc/boot/dts/p1021rdb.dtsi
> new file mode 100644
> index 0000000..22ecb6e
> --- /dev/null
> +++ b/arch/powerpc/boot/dts/p1021rdb.dtsi
> @@ -0,0 +1,236 @@
> +/*
> + * P1021 RDB Device Tree Source stub (no addresses or top-level ranges)
> + *
> + * Copyright 2011 Freescale Semiconductor Inc.
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions are met:
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in the
> + *       documentation and/or other materials provided with the distribution.
> + *     * Neither the name of Freescale Semiconductor nor the
> + *       names of its contributors may be used to endorse or promote products
> + *       derived from this software without specific prior written permission.
> + *
> + *
> + * ALTERNATIVELY, this software may be distributed under the terms of the
> + * GNU General Public License ("GPL") as published by the Free Software
> + * Foundation, either version 2 of that License or (at your option) any
> + * later version.
> + *
> + * THIS SOFTWARE IS PROVIDED BY Freescale Semiconductor ``AS IS'' AND ANY
> + * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
> + * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
> + * DISCLAIMED. IN NO EVENT SHALL Freescale Semiconductor BE LIABLE FOR ANY
> + * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
> + * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
> + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
> + * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
> + * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +&lbc {
> +	nor@0,0 {
> +		#address-cells = <1>;
> +		#size-cells = <1>;
> +		compatible = "cfi-flash";
> +		reg = <0x0 0x0 0x1000000>;
> +		bank-width = <2>;
> +		device-width = <1>;
> +
> +		partition@0 {
> +			/* This location must not be altered  */
> +			/* 256KB for Vitesse 7385 Switch firmware */
> +			reg = <0x0 0x00040000>;
> +			label = "NOR Vitesse-7385 Firmware";
> +			read-only;
> +		};
> +
> +		partition@40000 {
> +			/* 256KB for DTB Image */
> +			reg = <0x00040000 0x00040000>;
> +			label = "NOR DTB Image";
> +		};
> +
> +		partition@80000 {
> +			/* 3.5 MB for Linux Kernel Image */
> +			reg = <0x00080000 0x00380000>;
> +			label = "NOR Linux Kernel Image";
> +		};
> +
> +		partition@400000 {
> +			/* 11MB for JFFS2 based Root file System */
> +			reg = <0x00400000 0x00b00000>;
> +			label = "NOR JFFS2 Root File System";
> +		};
> +
> +		partition@f00000 {
> +			/* This location must not be altered  */
> +			/* 512KB for u-boot Bootloader Image */
> +			/* 512KB for u-boot Environment Variables */
> +			reg = <0x00f00000 0x00100000>;
> +			label = "NOR U-Boot Image";
> +		};
> +	};
> +
> +	nand@1,0 {
> +		#address-cells = <1>;
> +		#size-cells = <1>;
> +		compatible = "fsl,p1020-fcm-nand",
> +			     "fsl,elbc-fcm-nand";
> +		reg = <0x1 0x0 0x40000>;
> +
> +		partition@0 {
> +			/* This location must not be altered  */
> +			/* 1MB for u-boot Bootloader Image */
> +			reg = <0x0 0x00100000>;
> +			label = "NAND U-Boot Image";
> +			read-only;
> +		};
> +
> +		partition@100000 {
> +			/* 1MB for DTB Image */
> +			reg = <0x00100000 0x00100000>;
> +			label = "NAND DTB Image";
> +		};
> +
> +		partition@200000 {
> +			/* 4MB for Linux Kernel Image */
> +			reg = <0x00200000 0x00400000>;
> +			label = "NAND Linux Kernel Image";
> +		};
> +
> +		partition@600000 {
> +			/* 4MB for Compressed Root file System Image */
> +			reg = <0x00600000 0x00400000>;
> +			label = "NAND Compressed RFS Image";
> +		};
> +
> +		partition@a00000 {
> +			/* 7MB for JFFS2 based Root file System */
> +			reg = <0x00a00000 0x00700000>;
> +			label = "NAND JFFS2 Root File System";
> +		};
> +
> +		partition@1100000 {
> +			/* 15MB for User Writable Area  */
> +			reg = <0x01100000 0x00f00000>;
> +			label = "NAND Writable User area";
> +		};
> +	};
> +
> +	L2switch@2,0 {
> +		#address-cells = <1>;
> +		#size-cells = <1>;
> +		compatible = "vitesse-7385";
> +		reg = <0x2 0x0 0x20000>;
> +	};
> +};
> +
> +&soc {
> +	i2c@3000 {
> +		rtc@68 {
> +			compatible = "pericom,pt7c4338";
> +			reg = <0x68>;
> +		};
> +	};
> +
> +	spi@7000 {
> +		flash@0 {
> +			#address-cells = <1>;
> +			#size-cells = <1>;
> +			compatible = "spansion,s25sl12801";
> +			reg = <0>;
> +			spi-max-frequency = <40000000>; /* input clock */
> +
> +			partition@u-boot {
> +				/* 512KB for u-boot Bootloader Image */
> +				reg = <0x0 0x00080000>;
> +				label = "SPI Flash U-Boot Image";
> +				read-only;
> +			};
> +
> +			partition@dtb {
> +				/* 512KB for DTB Image */
> +				reg = <0x00080000 0x00080000>;
> +				label = "SPI Flash DTB Image";
> +			};
> +
> +			partition@kernel {
> +				/* 4MB for Linux Kernel Image */
> +				reg = <0x00100000 0x00400000>;
> +				label = "SPI Flash Linux Kernel Image";
> +			};
> +
> +			partition@fs {
> +				/* 4MB for Compressed RFS Image */
> +				reg = <0x00500000 0x00400000>;
> +				label = "SPI Flash Compressed RFSImage";
> +			};
> +
> +			partition@jffs-fs {
> +				/* 7MB for JFFS2 based RFS */
> +				reg = <0x00900000 0x00700000>;
> +				label = "SPI Flash JFFS2 RFS";
> +			};
> +		};
> +	};
> +
> +	usb@22000 {
> +		phy_type = "ulpi";
> +	};
> +
> +	mdio@24000 {
> +		phy0: ethernet-phy@0 {
> +			interrupt-parent = <&mpic>;
> +			interrupts = <3 1>;
> +			reg = <0x0>;
> +		};
> +
> +		phy1: ethernet-phy@1 {
> +			interrupt-parent = <&mpic>;
> +			interrupts = <2 1>;
> +			reg = <0x1>;
> +		};
> +
> +		tbi0: tbi-phy@11 {
> +			reg = <0x11>;
> +			device_type = "tbi-phy";
> +		};
> +	};
> +
> +	mdio@25000 {
> +		tbi1: tbi-phy@11 {
> +			reg = <0x11>;
> +			device_type = "tbi-phy";
> +		};
> +	};
> +
> +	mdio@26000 {
> +		tbi2: tbi-phy@11 {
> +			reg = <0x11>;
> +			device_type = "tbi-phy";
> +		};
> +	};
> +
> +	enet0: ethernet@b0000 {
> +		fixed-link = <1 1 1000 0 0>;
> +		phy-connection-type = "rgmii-id";
> +
> +	};
> +
> +	enet1: ethernet@b1000 {
> +		phy-handle = <&phy0>;
> +		tbi-handle = <&tbi1>;
> +		phy-connection-type = "sgmii";
> +	};
> +
> +	enet2: ethernet@b2000 {
> +		phy-handle = <&phy1>;
> +		tbi-handle = <&tbi2>;
> +		phy-connection-type = "rgmii-id";
> +	};
> +};
> diff --git a/arch/powerpc/boot/dts/p1021rdb_36b.dts b/arch/powerpc/boot/dts/p1021rdb_36b.dts
> new file mode 100644
> index 0000000..ea6d8b5
> --- /dev/null
> +++ b/arch/powerpc/boot/dts/p1021rdb_36b.dts
> @@ -0,0 +1,96 @@
> +/*
> + * P1021 RDB Device Tree Source (36-bit address map)
> + *
> + * Copyright 2011 Freescale Semiconductor Inc.
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions are met:
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in the
> + *       documentation and/or other materials provided with the distribution.
> + *     * Neither the name of Freescale Semiconductor nor the
> + *       names of its contributors may be used to endorse or promote products
> + *       derived from this software without specific prior written permission.
> + *
> + *
> + * ALTERNATIVELY, this software may be distributed under the terms of the
> + * GNU General Public License ("GPL") as published by the Free Software
> + * Foundation, either version 2 of that License or (at your option) any
> + * later version.
> + *
> + * THIS SOFTWARE IS PROVIDED BY Freescale Semiconductor "AS IS" AND ANY
> + * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
> + * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
> + * DISCLAIMED. IN NO EVENT SHALL Freescale Semiconductor BE LIABLE FOR ANY
> + * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
> + * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
> + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
> + * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
> + * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +/include/ "fsl/p1021si-pre.dtsi"
> +/ {
> +	model = "fsl,P1021RDB";
> +	compatible = "fsl,P1021RDB-PC";
> +
> +	memory {
> +		device_type = "memory";
> +	};
> +
> +	lbc: localbus@fffe05000 {
> +		reg = <0xf 0xffe05000 0 0x1000>;
> +
> +		/* NOR, NAND Flashes and Vitesse 5 port L2 switch */
> +		ranges = <0x0 0x0 0xf 0xef000000 0x01000000
> +			  0x1 0x0 0xf 0xff800000 0x00040000
> +			  0x2 0x0 0xf 0xffb00000 0x00020000>;
> +	};
> +
> +	soc: soc@fffe00000 {
> +		ranges = <0x0 0xf 0xffe00000 0x100000>;
> +	};
> +
> +	pci0: pcie@fffe09000 {
> +		ranges = <0x2000000 0x0 0xc0000000 0xc 0x20000000 0x0 0x20000000
> +			  0x1000000 0x0 0x00000000 0xf 0xffc10000 0x0 0x10000>;
> +		reg = <0xf 0xffe09000 0 0x1000>;
> +		pcie@0 {
> +			ranges = <0x2000000 0x0 0xa0000000
> +				  0x2000000 0x0 0xa0000000
> +				  0x0 0x20000000
> +
> +				  0x1000000 0x0 0x0
> +				  0x1000000 0x0 0x0
> +				  0x0 0x100000>;
> +		};
> +	};
> +
> +	pci1: pcie@fffe0a000 {
> +		reg = <0xf 0xffe0a000 0 0x1000>;
> +		ranges = <0x2000000 0x0 0x80000000 0xc 0x00000000 0x0 0x20000000
> +			  0x1000000 0x0 0x00000000 0xf 0xffc00000 0x0 0x10000>;
> +		pcie@0 {
> +			ranges = <0x2000000 0x0 0xc0000000
> +				  0x2000000 0x0 0xc0000000
> +				  0x0 0x20000000
> +
> +				  0x1000000 0x0 0x0
> +				  0x1000000 0x0 0x0
> +				  0x0 0x100000>;
> +		};
> +	};
> +
> +	qe: qe@fffe80000 {
> +                ranges = <0x0 0xf 0xffe80000 0x40000>;
> +                reg = <0xf 0xffe80000 0 0x480>;
> +                brg-frequency = <0>;
> +                bus-frequency = <0>;
> +        };
> +};
> +
> +/include/ "p1021rdb.dtsi"
> +/include/ "fsl/p1021si-post.dtsi"
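
The NOR partition offsets and sizes in the nor@0,0 node quoted above can be checked with a little arithmetic. The following is a minimal sketch in plain C, not kernel code: the `flash_part` struct and the `parts_tile_flash()` helper are hypothetical names introduced for illustration, and the table simply copies the `reg` values from the quoted node. It checks that the partitions follow each other without gaps and end at the 0x1000000 (16MB) size given in the node's own `reg` property.

```c
#include <stddef.h>
#include <stdint.h>

/* One flash partition as given by a `reg = <offset size>` property. */
struct flash_part {
	uint32_t offset;
	uint32_t size;
};

/* NOR partitions copied from the nor@0,0 node in the quoted patch. */
static const struct flash_part nor_parts[] = {
	{ 0x00000000, 0x00040000 },	/* Vitesse-7385 firmware, 256KB */
	{ 0x00040000, 0x00040000 },	/* DTB image, 256KB */
	{ 0x00080000, 0x00380000 },	/* kernel image, 3.5MB */
	{ 0x00400000, 0x00b00000 },	/* JFFS2 root file system, 11MB */
	{ 0x00f00000, 0x00100000 },	/* U-Boot image and environment, 1MB */
};

/*
 * Hypothetical helper: returns 1 if the partitions start at offset 0,
 * follow each other without gaps or overlap, and end exactly at
 * total_size; returns 0 otherwise.
 */
static int parts_tile_flash(const struct flash_part *parts, size_t n,
			    uint32_t total_size)
{
	uint64_t next = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (parts[i].offset != next)
			return 0;
		next += parts[i].size;
	}
	return next == total_size;
}
```

Calling `parts_tile_flash(nor_parts, 5, 0x01000000)` on this table should report a gap-free layout. The same check does not carry over directly to the NAND node, whose `reg` size describes the controller window rather than the chip.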


* Re: [SDK v1.2][PATCH 2/2 v3] powerpc/85xx: Added P1021RDB-PC Platform support
From: Xu Jiucheng @ 2012-01-10  7:23 UTC (permalink / raw)
  To: galak, linuxppc-dev
In-Reply-To: <1326092022-10085-2-git-send-email-B37781@freescale.com>

I'm sorry, please ignore this email.

Thanks & Best Regards
Jiucheng

On Mon, 2012-01-09 at 14:53 +0800, Xu Jiucheng wrote:
> Signed-off-by: Xu Jiucheng <B37781@freescale.com>
> ---
>  arch/powerpc/platforms/85xx/mpc85xx_rdb.c |   25 +++++++++++++++++++++++++
>  1 files changed, 25 insertions(+), 0 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/85xx/mpc85xx_rdb.c b/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
> index 9feccbb..0c32668 100644
> --- a/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
> +++ b/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
> @@ -113,6 +113,7 @@ static void __init mpc85xx_rdb_setup_arch(void)
>  
>  machine_device_initcall(p2020_rdb, mpc85xx_common_publish_devices);
>  machine_device_initcall(p1020_rdb, mpc85xx_common_publish_devices);
> +machine_device_initcall(p1021_rdb_pc, mpc85xx_common_publish_devices);
>  
>  /*
>   * Called very early, device-tree isn't unflattened
> @@ -135,6 +136,15 @@ static int __init p1020_rdb_probe(void)
>  	return 0;
>  }
>  
> +static int __init p1021_rdb_pc_probe(void)
> +{
> +	unsigned long root = of_get_flat_dt_root();
> +
> +	if (of_flat_dt_is_compatible(root, "fsl,P1021RDB-PC"))
> +		return 1;
> +	return 0;
> +}
> +
>  define_machine(p2020_rdb) {
>  	.name			= "P2020 RDB",
>  	.probe			= p2020_rdb_probe,
> @@ -162,3 +172,18 @@ define_machine(p1020_rdb) {
>  	.calibrate_decr		= generic_calibrate_decr,
>  	.progress		= udbg_progress,
>  };
> +
> +define_machine(p1021_rdb_pc) {
> +	.name			= "P1021 RDB-PC",
> +	.probe			= p1021_rdb_pc_probe,
> +	.setup_arch		= mpc85xx_rdb_setup_arch,
> +	.init_IRQ		= mpc85xx_rdb_pic_init,
> +#ifdef CONFIG_PCI
> +	.pcibios_fixup_bus	= fsl_pcibios_fixup_bus,
> +#endif
> +	.get_irq		= mpic_get_irq,
> +	.restart		= fsl_rstcr_restart,
> +	.calibrate_decr		= generic_calibrate_decr,
> +	.progress		= udbg_progress,
> +};
> +


* RE: [PATCH SDK1.2 1/3] powerpc/fsl-pci: Unify pci/pcie initialization code
From: Jia Hongtao-B38951 @ 2012-01-10  7:31 UTC (permalink / raw)
  To: Gala Kumar-B11780
  Cc: linuxppc-dev@lists.ozlabs.org, Li Yang-R58472, Jia Hongtao-B38951
In-Reply-To: <1324451458-4644-2-git-send-email-B38951@freescale.com>

Hi Kumar,
Do you have any comments on this series of patches?
Looking forward to your answer.
Thanks.

--Jia Hongtao.

-----Original Message-----
From: Jia Hongtao-B38951
Sent: Wednesday, December 21, 2011 3:11 PM
To: linuxppc-dev@lists.ozlabs.org
Cc: Li Yang-R58472; Gala Kumar-B11780; Jia Hongtao-B38951
Subject: [PATCH SDK1.2 1/3] powerpc/fsl-pci: Unify pci/pcie initialization code

We unified the Freescale PCI/PCIe initialization by changing fsl_pci to
a platform driver.

In the previous version, PCI/PCIe initialization lived in platform code,
which initialized the PCI bridge based on EP/RC or host/agent settings.

Signed-off-by: Jia Hongtao <B38951@freescale.com>
Signed-off-by: Li Yang <leoli@freescale.com>
---
 arch/powerpc/platforms/85xx/p1022_ds.c |   39 +++++++----------------
 arch/powerpc/sysdev/fsl_pci.c          |   53 ++++++++++++++++++++++++++++++++
 2 files changed, 65 insertions(+), 27 deletions(-)

diff --git a/arch/powerpc/platforms/85xx/p1022_ds.c b/arch/powerpc/platforms/85xx/p1022_ds.c
index 2bf4342..41de2c1 100644
--- a/arch/powerpc/platforms/85xx/p1022_ds.c
+++ b/arch/powerpc/platforms/85xx/p1022_ds.c
@@ -277,32 +277,9 @@ void __init mpc85xx_smp_init(void);
  */
 static void __init p1022_ds_setup_arch(void)
 {
-#ifdef CONFIG_PCI
-	struct device_node *np;
-#endif
-	dma_addr_t max = 0xffffffff;
-
 	if (ppc_md.progress)
 		ppc_md.progress("p1022_ds_setup_arch()", 0);
 
-#ifdef CONFIG_PCI
-	for_each_compatible_node(np, "pci", "fsl,p1022-pcie") {
-		struct resource rsrc;
-		struct pci_controller *hose;
-
-		of_address_to_resource(np, 0, &rsrc);
-
-		if ((rsrc.start & 0xfffff) == 0x8000)
-			fsl_add_bridge(np, 1);
-		else
-			fsl_add_bridge(np, 0);
-
-		hose = pci_find_hose_for_OF_device(np);
-		max = min(max, hose->dma_window_base_cur +
-			  hose->dma_window_size);
-	}
-#endif
-
 #if defined(CONFIG_FB_FSL_DIU) || defined(CONFIG_FB_FSL_DIU_MODULE)
 	diu_ops.get_pixel_format	= p1022ds_get_pixel_format;
 	diu_ops.set_gamma_table		= p1022ds_set_gamma_table;
@@ -316,11 +293,8 @@ static void __init p1022_ds_setup_arch(void)
 #endif
 
 #ifdef CONFIG_SWIOTLB
-	if (memblock_end_of_DRAM() > max) {
+	if (memblock_end_of_DRAM() > 0xffffffff)
 		ppc_swiotlb_enable = 1;
-		set_pci_dma_ops(&swiotlb_dma_ops);
-		ppc_md.pci_dma_dev_setup = pci_dma_dev_setup_swiotlb;
-	}
 #endif
 
 	pr_info("Freescale P1022 DS reference board\n");
@@ -339,6 +313,17 @@ static int __init p1022_ds_publish_devices(void)
 }
 machine_device_initcall(p1022_ds, p1022_ds_publish_devices);
 
+static struct of_device_id __initdata p1022_pci_ids[] = {
+	{ .compatible = "fsl,p1022-pcie", },
+	{},
+};
+
+static int __init p1022_ds_publish_pci_device(void)
+{
+	return of_platform_bus_probe(NULL, p1022_pci_ids, NULL);
+}
+machine_arch_initcall(p1022_ds, p1022_ds_publish_pci_device);
+
 machine_arch_initcall(p1022_ds, swiotlb_setup_bus_notifier);
 
 /*
diff --git a/arch/powerpc/sysdev/fsl_pci.c b/arch/powerpc/sysdev/fsl_pci.c
index 4ce547e..a0f305d 100644
--- a/arch/powerpc/sysdev/fsl_pci.c
+++ b/arch/powerpc/sysdev/fsl_pci.c
@@ -712,3 +712,56 @@ u64 fsl_pci_immrbar_base(struct pci_controller *hose)
 
 	return 0;
 }
+
+static const struct of_device_id pci_ids[] = {
+	{ .compatible = "fsl,mpc8540-pci", },
+	{ .compatible = "fsl,mpc8548-pcie", },
+	{ .compatible = "fsl,p1022-pcie", },
+	{},
+};
+
+static int __devinit fsl_pci_probe(struct platform_device *pdev)
+{
+	struct pci_controller *hose;
+
+	if (of_match_node(pci_ids, pdev->dev.of_node)) {
+		struct resource rsrc;
+		of_address_to_resource(pdev->dev.of_node, 0, &rsrc);
+		if ((rsrc.start & 0xfffff) == 0x8000)
+			fsl_add_bridge(pdev->dev.of_node, 1);
+		else
+			fsl_add_bridge(pdev->dev.of_node, 0);
+
+#ifdef CONFIG_SWIOTLB
+		hose = pci_find_hose_for_OF_device(pdev->dev.of_node);
+		/*
+		 * if we couldn't map all of DRAM via the dma windows
+		 * we need SWIOTLB to handle buffers located outside of
+		 * dma capable memory region
+		 */
+		if (memblock_end_of_DRAM() > hose->dma_window_base_cur
+				+ hose->dma_window_size) {
+			ppc_swiotlb_enable = 1;
+			set_pci_dma_ops(&swiotlb_dma_ops);
+			ppc_md.pci_dma_dev_setup = pci_dma_dev_setup_swiotlb;
+		}
+#endif
+
+	}
+
+	return 0;
+}
+
+static struct platform_driver fsl_pci_driver = {
+	.driver = {
+		.name = "fsl-pci",
+		.of_match_table = pci_ids,
+	},
+	.probe = fsl_pci_probe,
+};
+
+static int __init fsl_pci_init(void)
+{
+	return platform_driver_register(&fsl_pci_driver);
+}
+arch_initcall(fsl_pci_init);
--
1.7.5.1
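
The two decisions made in fsl_pci_probe() in the patch above can be looked at in isolation. This is a minimal standalone sketch, not kernel code: the helper names, the macros and the plain 64-bit parameters are assumptions for illustration. The primary-bridge offset is the hexadecimal 0x8000 used by the platform code the patch removes, and the SWIOTLB condition mirrors the comparison against the end of the inbound DMA window.

```c
#include <stdint.h>

/* Low 20 bits of the controller's register base, as masked in the patch. */
#define PCI_REG_OFFSET_MASK	0xfffffULL
/* Register offset that the removed platform code treated as primary. */
#define PRIMARY_PCI_OFFSET	0x8000ULL

/* Returns 1 if the controller at reg_start would be added as primary. */
static int is_primary_bridge(uint64_t reg_start)
{
	return (reg_start & PCI_REG_OFFSET_MASK) == PRIMARY_PCI_OFFSET;
}

/*
 * Returns 1 if bounce buffering would be needed, i.e. some DRAM lies
 * above the end of the controller's inbound DMA window.
 */
static int needs_swiotlb(uint64_t end_of_dram, uint64_t dma_window_base,
			 uint64_t dma_window_size)
{
	return end_of_dram > dma_window_base + dma_window_size;
}
```

For example, a controller whose registers sit at an address ending in 0x08000 would be treated as primary, while one ending in 0x09000 would not. With 4GB of DRAM and a 2GB window starting at 0, the second helper would ask for SWIOTLB.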


* Re: [RFC PATCH 16/16] KVM: PPC: e500mc support
From: Avi Kivity @ 2012-01-10  8:37 UTC (permalink / raw)
  To: Scott Wood; +Cc: Liu Yu, kvm, agraf, kvm-ppc, Varun Sethi, linuxppc-dev
In-Reply-To: <4F0B4002.4050407@freescale.com>

On 01/09/2012 09:29 PM, Scott Wood wrote:
> > 
> > Best to include their signoffs, if possible.
>
> These patches are based in part on a bunch of different patches from
> these people (for which I did receive signoffs).  I was reluctant to put
> their signoff directly on the new patches, since I didn't want to make
> it look like they had submitted the patch in anything resembling its
> current form.  I wanted to give them credit for what they did, but not
> blame for what I did with their code.
>

Signoffs are for assigning neither credit nor blame, but for attributing
authorship and affirming that a contributor has the right to contribute
code or pass it along.  Please read the DCO at
https://lwn.net/Articles/437739/.

It's okay to miss them from time to time, especially for established
contributors, but avoid it whenever possible.

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.


* RE: [PATCH 00/14] DMA-mapping framework redesign preparation
From: Marek Szyprowski @ 2012-01-10  8:42 UTC (permalink / raw)
  To: Marek Szyprowski, linux-kernel
  Cc: linux-arch, 'Stephen Rothwell', linux-alpha, linux-ia64,
	'Arnd Bergmann', linux-sh, x86, Andrzej Pietrasiewicz,
	'Jonathan Corbet', linaro-mm-sig, linux-mm,
	'Kyungmin Park', linux-mips, microblaze-uclinux,
	sparclinux, 'Thomas Gleixner', linuxppc-dev,
	'Andrew Morton', linux-arm-kernel, discuss
In-Reply-To: <1324643253-3024-1-git-send-email-m.szyprowski@samsung.com>

Hello,

To help everyone in testing and adapting our patches for his hardware 
platform I've rebased our patches onto the latest v3.2 Linux kernel and
prepared a few GIT branches in our public repository. These branches
contain our memory management related patches posted in the following
threads:

"[PATCHv18 0/11] Contiguous Memory Allocator":
http://www.spinics.net/lists/linux-mm/msg28125.html
later called CMAv18,

"[PATCH 00/14] DMA-mapping framework redesign preparation":
http://www.spinics.net/lists/linux-sh/msg09777.html
and
"[PATCH 0/8 v4] ARM: DMA-mapping framework redesign":
http://www.spinics.net/lists/arm-kernel/msg151147.html
with the following update:
http://www.spinics.net/lists/arm-kernel/msg154889.html
later called DMAv5.

These branches are available in our public GIT repository:

git://git.infradead.org/users/kmpark/linux-samsung
http://git.infradead.org/users/kmpark/linux-samsung/

The following branches are available:

1) 3.2-cma-v18
Vanilla Linux v3.2 with fixed CMA v18 patches (first patch replaced
with the one from v17 to fix SMP issues, see the respective thread).

2) 3.2-dma-v5
Vanilla Linux v3.2 + iommu/next (IOMMU maintainer's patches) branch
with DMA-preparation and DMA-mapping framework redesign patches.

3) 3.2-cma-v18-dma-v5
Previous two branches merged together (DMA-mapping on top of CMA)

4) 3.2-cma-v18-dma-v5-exynos
Previous branch rebased on top of iommu/next + kgene/for-next (Samsung
SoC platform maintainer's patches) with new Exynos4 IOMMU driver by 
KyongHo Cho and relevant glue code.

5) 3.2-dma-v5-exynos
Branch from point 2 rebased on top of iommu/next + kgene/for-next 
(Samsung SoC maintainer's patches) with new Exynos4 IOMMU driver by 
KyongHo Cho and relevant glue code.

I hope everyone will find a branch that suits his needs. :)

Best regards
-- 
Marek Szyprowski
Samsung Poland R&D Center


* RE: OpenPIC warnings on P1022RDK AMP configuration
From: Arshad, Farrukh @ 2012-01-10  8:43 UTC (permalink / raw)
  To: linuxppc-dev@lists.ozlabs.org

[-- Attachment #1: Type: text/plain, Size: 2242 bytes --]

Change the kernel configuration as follows to fix this warning.

Disable Kernel Options -> Support for enabling/disabling CPUs
Disable Kernel Options -> Distribute interrupts on all CPUs by default

Regards,
Farrukh Arshad

From: Arshad, Farrukh
Sent: Friday, December 23, 2011 11:18 AM
To: linuxppc-dev@lists.ozlabs.org
Subject: OpenPIC warnings on P1022RDK AMP configuration

Greetings All

I am running dual Linux on the P1022RDK in an AMP configuration. My memory partitioning is as follows:

---------------------------------------------------------------------
Core                      | Base Address  |    Size                 |
--------------------------|---------------|-------------------------|
Core 0 (MEL RT Kernel)    | 0x0000,0000   | 0x0C00,0000 - 192 (MB)  |
Core 1 (LTIB Kernel)      | 0x0C00,0000   | 0x1000,0000 - 256 (MB)  |
MCAPI - Shared Mem        | 0x1C00,0000   | 0x0400,0000 - 64  (MB)  |
---------------------------------------------------------------------

My kernel command lines for both cores are:

setenv core0bootargs root=/dev/mmcblk0p3 rootdelay=5 rw mem=192M
setenv core1bootargs root=/dev/nfs nfsroot=<serverip>:/<rootfs> ip=dhcp mem=256M

I have also partitioned my hardware between the two cores and supplied appropriate "protected-sources" in the mpic node of both DTS files. The problem is that I continuously receive the following warning on the Core 0 console only, while Core 1 runs smoothly. The source IRQs in these warnings are mostly ones I have assigned to Core 1, but they are present in the "protected-sources" list of the Core 0 mpic node, so why am I getting these messages?

[    6.219204] __ratelimit: 5796 callbacks suppressed
[    6.223996]  OpenPIC  : Got protected source 30 !
[    7.005809]  OpenPIC  : Got protected source 30 !
[    8.005906]  OpenPIC  : Got protected source 30 !
[    8.433053]  OpenPIC  : Got protected source 29 !
[    8.437942]  OpenPIC  : Got protected source 30 !
[    8.443101]  OpenPIC  : Got protected source 30 !
[    8.448428]  OpenPIC  : Got protected source 30 !

Any thoughts ?

Best Regards

Farrukh Arshad
Sr. Software Development Engineer
Mentor Graphics Pakistan
Ph:   +92 - 423 - 609 - 92 - 09
Cell: +92 - 303 - 444 - 77 - 05


[-- Attachment #2: Type: text/html, Size: 8677 bytes --]


* Re: [PATCH v2] of: Change logic to overwrite cmd_line with CONFIG_CMDLINE
From: Benjamin Herrenschmidt @ 2012-01-10  9:10 UTC (permalink / raw)
  To: Doug Anderson; +Cc: devicetree-discuss, linuxppc-dev
In-Reply-To: <CAD=FV=UVo-vCe-s9tdXq_Od3jJnJ58Nx=4f-S3XfWBfDhxoXMg@mail.gmail.com>

On Fri, 2012-01-06 at 16:48 -0800, Doug Anderson wrote:
> I know this is a long-dead thread, but I was a little curious about
> the motivation here.

Hi ! Sorry, I planned to reply earlier and then forgot about it...

> I'm looking at trying to support CONFIG_CMDLINE_EXTEND (an ARM
> Kconfig) in this function and don't know in which cases I should look
> at the CONFIG_CMDLINE and in which cases I should use whatever
> happened to be in data before the function was called.

I'll have a look later (gotta run soon) but basically, the reason I did
that logic change is that in some specific circumstances and firmware
versions, I end up writing the user-specified command line in the global
prior to actually booting the kernel :-)

So in that case, I really don't want CONFIG_CMDLINE to take over because
there's nothing in the device-tree, the user -did- specify something but
not via the device-tree.

Now, that's a bit of an oddball scenario but I felt that the logic
change was harmless.
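
A minimal sketch of the fallback order described above, written as ordinary userspace C rather than the kernel function itself. The name `pick_cmdline()`, the `force` parameter (modelling CONFIG_CMDLINE_FORCE), the buffer size and the default string are all assumptions for illustration. It uses `strncpy` with explicit termination because the kernel's `strlcpy` is not part of standard C, and it treats an empty bootargs string as absent, which is a simplification.

```c
#include <string.h>

#define COMMAND_LINE_SIZE 512		/* assumed; the real value is per-arch */
#define DEFAULT_CMDLINE "console=ttyS0"	/* stand-in for CONFIG_CMDLINE */

/*
 * data:     command line buffer of COMMAND_LINE_SIZE bytes; it may already
 *           hold a string placed there before the device tree is parsed
 * bootargs: /chosen bootargs, or NULL if the property is missing
 * force:    nonzero models CONFIG_CMDLINE_FORCE
 */
static void pick_cmdline(char *data, const char *bootargs, int force)
{
	/* Device-tree bootargs, when present, replace the buffer. */
	if (bootargs != NULL && bootargs[0] != '\0') {
		strncpy(data, bootargs, COMMAND_LINE_SIZE - 1);
		data[COMMAND_LINE_SIZE - 1] = '\0';
	}

	/* Use the built-in default only if nothing has set the buffer. */
	if (force || data[0] == '\0') {
		strncpy(data, DEFAULT_CMDLINE, COMMAND_LINE_SIZE - 1);
		data[COMMAND_LINE_SIZE - 1] = '\0';
	}
}
```

With this ordering, a command line written into the buffer before the device tree is scanned survives when /chosen has no bootargs, which is the oddball case described here. Only the forced mode overrides it.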

For those interested in the gory details, it's when doing the OPAL takeover
on some machines with version 1 of the OPAL firmware. We basically boot
under pHyp normally (pSeries hypervisor) and do a magic hcall which
relocates the kernel to contiguous physical memory and re-starts it with
a flat device-tree.

The takeover mechanism didn't provide me with a way for passing a
command line in that fdt. So I had to do it from prom_init (still
running under pHyp context), change the kernel global before it gets
relocated.

Cheers,
Ben.


> Here's the definition in the KConfig:
> <http://git.kernel.org/?p=linux/kernel/git/next/linux-next.git;a=blob;f=arch/arm/Kconfig;h=24626b0419ee97e963e68329a8eb6769360b46ea;hb=HEAD#l1984>
> 
> Which case do you have CONFIG_CMDLINE defined but not CMDLINE_FORCE?
> In those cases, do you happen to have CONFIG_CMDLINE_EXTEND or
> CMDLINE_FROM_BOOTLOADER defined?
> 
> Thanks much!
> 
> -Doug
> 
> ---
> 
> On Mon, Sep 19, 2011 at 9:55 PM, Grant Likely <grant.likely@secretlab.ca> wrote:
> >
> > On Tue, Sep 20, 2011 at 02:50:15PM +1000, Benjamin Herrenschmidt wrote:
> > > We used to overwrite with CONFIG_CMDLINE if we found a chosen
> > > node but failed to get bootargs out of it or they were empty,
> > > unless CONFIG_CMDLINE_FORCE is set.
> > >
> > > Instead change that to overwrite if "data" is non empty after
> > > the bootargs check. It allows arch code to have other mechanisms
> > > to retrieve the command line prior to parsing the device-tree.
> > >
> > > Note: CONFIG_CMDLINE_FORCE case should ideally be handled elsewhere
> > > as it won't work as it-is if the device-tree has no /chosen node
> > >
> > > Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> > > CC: devicetree-discuss@lists-ozlabs.org
> > > CC: Grant Likely <grant.likely@secretlab.ca>
> >
> > Looks okay to me.
> >
> > Acked-by: Grant Likely <grant.likely@secretlab.ca>
> >
> > > ---
> > >  drivers/of/fdt.c |    7 ++++++-
> > >  1 files changed, 6 insertions(+), 1 deletions(-)
> > >
> > > v2. Use "data" instead of "cmd_line" so it works on archs like
> > > mips who don't pass cmd_line to that function to start with, also
> > > add a comment explaining the mechanism.
> > >
> > > (resent with right list address as well while at it)
> > >
> > > diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
> > > index 65200af..323b722 100644
> > > --- a/drivers/of/fdt.c
> > > +++ b/drivers/of/fdt.c
> > > @@ -681,9 +681,14 @@ int __init early_init_dt_scan_chosen(unsigned long node, const char *uname,
> > >       if (p != NULL && l > 0)
> > >               strlcpy(data, p, min((int)l, COMMAND_LINE_SIZE));
> > >
> > > +     /*
> > > +      * CONFIG_CMDLINE is meant to be a default in case nothing else
> > > +      * managed to set the command line, unless CONFIG_CMDLINE_FORCE
> > > +      * is set in which case we override whatever was found earlier.
> > > +      */
> > >  #ifdef CONFIG_CMDLINE
> > >  #ifndef CONFIG_CMDLINE_FORCE
> > > -     if (p == NULL || l == 0 || (l == 1 && (*p) == 0))
> > > +     if (!data[0])
> > >  #endif
> > >               strlcpy(data, CONFIG_CMDLINE, COMMAND_LINE_SIZE);
> > >  #endif /* CONFIG_CMDLINE */
> > > --
> > > 1.7.4.1
> > >
> > >
> > >
> > >

