LinuxPPC-Dev Archive on lore.kernel.org

LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed

* Re: [PATCH] powerpc/85xx: don't init the mpic ipi for the SoC which has doorbell support
From: Scott Wood @ 2013-11-07 17:34 UTC (permalink / raw)
  To: Kevin Hao; +Cc: linuxppc
In-Reply-To: <1383808637-26769-1-git-send-email-haokexin@gmail.com>

On Thu, 2013-11-07 at 15:17 +0800, Kevin Hao wrote:
> It makes no sense to initialize the mpic ipi for the SoC which has
> doorbell support. So set the smp_85xx_ops.probe to NULL for this
> case. Since the smp_85xx_ops.probe is also used in function
> smp_85xx_setup_cpu() to check if we need to invoke
> mpic_setup_this_cpu(), we introduce a new setup_cpu function
> smp_85xx_basic_setup() to remove this dependency.

Is there any harm caused by setting up the IPIs?

What about other MPIC setup, such as setting the current task priority
register?


> Signed-off-by: Kevin Hao <haokexin@gmail.com>
> ---
> 
> Boot test on p2020rdb and p5020ds.
> 
>  arch/powerpc/platforms/85xx/smp.c | 17 +++++++++++------
>  1 file changed, 11 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/85xx/smp.c b/arch/powerpc/platforms/85xx/smp.c
> index 281b7f01df63..d3b310f87ce9 100644
> --- a/arch/powerpc/platforms/85xx/smp.c
> +++ b/arch/powerpc/platforms/85xx/smp.c
> @@ -388,15 +388,18 @@ static void mpc85xx_smp_machine_kexec(struct kimage *image)
>  }
>  #endif /* CONFIG_KEXEC */
>  
> -static void smp_85xx_setup_cpu(int cpu_nr)
> +static void smp_85xx_basic_setup(int cpu_nr)
>  {
> -	if (smp_85xx_ops.probe == smp_mpic_probe)
> -		mpic_setup_this_cpu();
> -
>  	if (cpu_has_feature(CPU_FTR_DBELL))
>  		doorbell_setup_this_cpu();
>  }
>  
> +static void smp_85xx_setup_cpu(int cpu_nr)
> +{
> +	mpic_setup_this_cpu();
> +	smp_85xx_basic_setup(cpu_nr);
> +}
> +
>  static const struct of_device_id mpc85xx_smp_guts_ids[] = {
>  	{ .compatible = "fsl,mpc8572-guts", },
>  	{ .compatible = "fsl,p1020-guts", },
> @@ -411,13 +414,14 @@ void __init mpc85xx_smp_init(void)
>  {
>  	struct device_node *np;
>  
> -	smp_85xx_ops.setup_cpu = smp_85xx_setup_cpu;
>  
>  	np = of_find_node_by_type(NULL, "open-pic");
>  	if (np) {
>  		smp_85xx_ops.probe = smp_mpic_probe;
> +		smp_85xx_ops.setup_cpu = smp_85xx_setup_cpu;
>  		smp_85xx_ops.message_pass = smp_mpic_message_pass;
> -	}
> +	} else
> +		smp_85xx_ops.setup_cpu = smp_85xx_basic_setup;
>  
>  	if (cpu_has_feature(CPU_FTR_DBELL)) {
>  		/*
> @@ -426,6 +430,7 @@ void __init mpc85xx_smp_init(void)
>  		 */
>  		smp_85xx_ops.message_pass = NULL;
>  		smp_85xx_ops.cause_ipi = doorbell_cause_ipi;
> +		smp_85xx_ops.probe = NULL;
>  	}

BTW, what exactly is probe() supposed to be doing?  It looks like its
main effect (with smp_mpic_probe) is to request IPIs, but the caller
seems to treat it mainly as a way to determine CPU count.

I looked at the caller of .probe() (which is smp_prepare_cpus()) to see
what happens when probe is NULL, and the handling of max_cpus doesn't
make much sense.  At first I was concerned by the gratuitous difference
between smp_mpic_probe() using cpu_possible_mask versus
smp_prepare_cpus() using NR_CPUS, but the value isn't even used (all the
code that consumed max_cpus after setting it has been removed), and the
value passed in to smp_prepare_cpus() is ignored.

-Scott

^ permalink raw reply

* Re: [PATCHv2 5/8] ASoC: SGTL5000: Enhance the SGTL5000 codec driver about regulator.
From: Mark Brown @ 2013-11-07 20:38 UTC (permalink / raw)
  To: Li Xiubo
  Cc: mark.rutland@arm.com, alsa-devel@alsa-project.org,
	linux-doc@vger.kernel.org, tiwai@suse.de, timur@tabi.org,
	perex@perex.cz, Huan Wang, LW@KARO-electronics.de,
	linux@arm.linux.org.uk, Shawn Guo, grant.likely@linaro.org,
	linux-arm-kernel@lists.infradead.org, devicetree@vger.kernel.org,
	ian.campbell@citrix.com, pawel.moll@arm.com,
	swarren@wwwdotorg.org, rob.herring@calxeda.com, Zhengxiong Jin,
	oskar@scara.com, Fabio Estevam, lgirdwood@gmail.com,
	linux-kernel@vger.kernel.org, rob@landley.net, Guangyu Chen,
	shawn.guo@linaro.org, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <1DD289F6464F0949A2FCA5AA6DC23F828783E6@039-SN2MPN1-013.039d.mgd.msft.net>

[-- Attachment #1: Type: text/plain, Size: 908 bytes --]

On Thu, Nov 07, 2013 at 03:01:02AM +0000, Li Xiubo wrote:
> > > The SGTL5000 is based on regulators and when it is disabled, there
> > > will be an error returns directly while the SGTL5000 codec is probing.

> > What makes you say this?  

> static int ldo_regulator_register(struct snd_soc_codec *codec,
>                                 struct regulator_init_data *init_data,
>                                 int voltage)

If the regulator is not used in the system then why is the driver
getting as far as trying to register it?  Surely this is a system
configuration error?  This all sounds like some problem with either the
system integration or the driver which is causing it to try to register
the regulator needlessly - you should be fixing that problem, not adding
ifdefs.  I'm still unclear on exactly what the issue is so it's hard to
say exactly what the best way forwards is.

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply

* Re: [PATCH] powerpc/85xx: don't init the mpic ipi for the SoC which has doorbell support
From: Kevin Hao @ 2013-11-08  1:54 UTC (permalink / raw)
  To: Scott Wood; +Cc: linuxppc
In-Reply-To: <1383845691.23598.125.camel@snotra.buserror.net>

[-- Attachment #1: Type: text/plain, Size: 2217 bytes --]

On Thu, Nov 07, 2013 at 11:34:51AM -0600, Scott Wood wrote:
> On Thu, 2013-11-07 at 15:17 +0800, Kevin Hao wrote:
> > It makes no sense to initialize the mpic ipi for the SoC which has
> > doorbell support. So set the smp_85xx_ops.probe to NULL for this
> > case. Since the smp_85xx_ops.probe is also used in function
> > smp_85xx_setup_cpu() to check if we need to invoke
> > mpic_setup_this_cpu(), we introduce a new setup_cpu function
> > smp_85xx_basic_setup() to remove this dependency.
> 
> Is there any harm caused by setting up the IPIs?

No real harm. Just make no sense to do so and it does cause confusion
when you cat /proc/interrupts and get something like this:
  507:          0          0   OpenPIC   2043 Edge      ipi call function
  508:          0          0   OpenPIC   2044 Edge      ipi reschedule
  509:          0          0   OpenPIC   2045 Edge      ipi call function single
  DBL:       7053      10137   Doorbell interrupts

> 
> What about other MPIC setup, such as setting the current task priority
> register?

This is done by the invoking of function mpic_setup_this_cpu() in
smp_85xx_setup_cpu().

> BTW, what exactly is probe() supposed to be doing?

It is supposed to do the platform specific smp initialization and then return
the CPU count.

>  It looks like its
> main effect (with smp_mpic_probe) is to request IPIs, but the caller
> seems to treat it mainly as a way to determine CPU count.
> 
> I looked at the caller of .probe() (which is smp_prepare_cpus()) to see
> what happens when probe is NULL, and the handling of max_cpus doesn't
> make much sense.  At first I was concerned by the gratuitous difference
> between smp_mpic_probe() using cpu_possible_mask versus
> smp_prepare_cpus() using NR_CPUS, but the value isn't even used (all the
> code that consumed max_cpus after setting it has been removed), and the
> value passed in to smp_prepare_cpus() is ignored.

Yes, the return value of .probe makes no sense now. Actually there is already
a patch to remove the check of the return value of .probe in function
smp_prepare_cpus().
  http://patchwork.ozlabs.org/patch/260574/

Thanks,
Kevin
> 
> -Scott
> 
> 
> 

[-- Attachment #2: Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply

* Re: [PATCH v2 1/2] powerpc/kvm: fix rare but potential deadlock scene
From: Liu ping fan @ 2013-11-08  2:07 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Paul Mackerras, linuxppc-dev, kvm-ppc
In-Reply-To: <7536FFB1-D186-42AC-9DF5-A46A72CB8F1D@suse.de>

On Thu, Nov 7, 2013 at 5:54 PM, Alexander Graf <agraf@suse.de> wrote:
>
> On 07.11.2013, at 07:22, Liu Ping Fan <kernelfans@gmail.com> wrote:
>
>> Since kvmppc_hv_find_lock_hpte() is called from both virtmode and
>> realmode, so it can trigger the deadlock.
>>
>> Suppose the following scene:
>>
>> Two physical cpuM, cpuN, two VM instances A, B, each VM has a group of v=
cpus.
>>
>> If on cpuM, vcpu_A_1 holds bitlock X (HPTE_V_HVLOCK), then is switched o=
ut,
>> and on cpuN, vcpu_A_2 try to lock X in realmode, then cpuN will be caugh=
t in
>> realmode for a long time.
>>
>> What makes things even worse if the following happens,
>>  On cpuM, bitlockX is hold, on cpuN, Y is hold.
>>  vcpu_B_2 try to lock Y on cpuM in realmode
>>  vcpu_A_2 try to lock X on cpuN in realmode
>>
>> Oops! deadlock happens
>>
>> Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
>
> Very nice catch :).
>
> I think it makes a lot of sense to document the fact that kvmppc_hv_find_=
lock_hpte() should be called with preemption disabled in a comment above th=
e function, so that next time when someone potentially calls it, he knows t=
hat he needs to put preempt_disable() around it.
>
Ok, I will document them in v3

> Thanks a lot for finding this pretty subtle issue. May I ask how you got =
there? Did you actually see systems deadlock because of this?
>
Intuition :). then I begin try to model a scene which causes the
deadlock. And fortunately, I find a case to verify my suspension.

Regards,
Pingfan
>
> Alex
>
>> ---
>> arch/powerpc/kvm/book3s_64_mmu_hv.c | 6 +++++-
>> 1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book=
3s_64_mmu_hv.c
>> index 043eec8..dbc1478 100644
>> --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
>> +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
>> @@ -474,10 +474,13 @@ static int kvmppc_mmu_book3s_64_hv_xlate(struct kv=
m_vcpu *vcpu, gva_t eaddr,
>>       }
>>
>>       /* Find the HPTE in the hash table */
>> +     preempt_disable();
>>       index =3D kvmppc_hv_find_lock_hpte(kvm, eaddr, slb_v,
>>                                        HPTE_V_VALID | HPTE_V_ABSENT);
>> -     if (index < 0)
>> +     if (index < 0) {
>> +             preempt_enable();
>>               return -ENOENT;
>> +     }
>>       hptep =3D (unsigned long *)(kvm->arch.hpt_virt + (index << 4));
>>       v =3D hptep[0] & ~HPTE_V_HVLOCK;
>>       gr =3D kvm->arch.revmap[index].guest_rpte;
>> @@ -485,6 +488,7 @@ static int kvmppc_mmu_book3s_64_hv_xlate(struct kvm_=
vcpu *vcpu, gva_t eaddr,
>>       /* Unlock the HPTE */
>>       asm volatile("lwsync" : : : "memory");
>>       hptep[0] =3D v;
>> +     preempt_enable();
>>
>>       gpte->eaddr =3D eaddr;
>>       gpte->vpage =3D ((v & HPTE_V_AVPN) << 4) | ((eaddr >> 12) & 0xfff)=
;
>> --
>> 1.8.1.4
>>
>

^ permalink raw reply

* Re: [PATCH v2 2/2] powerpc/kvm: remove redundant assignment
From: Liu ping fan @ 2013-11-08  2:11 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Paul Mackerras, linuxppc-dev, kvm-ppc
In-Reply-To: <CF6D863C-3E0A-4BEF-B985-77B336BA92F1@suse.de>

On Thu, Nov 7, 2013 at 6:06 PM, Alexander Graf <agraf@suse.de> wrote:
>
> On 07.11.2013, at 07:22, Liu Ping Fan <kernelfans@gmail.com> wrote:
>
>> ret is assigned twice with the same value, so remove the later one.
>>
>> Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
>> Acked-by: Paul Mackerras <paulus@samba.org>
>
> I suppose my last request for a patch description was slightly too abbrev=
iated :). Sorry about that.
>
> Imagine you are a Linux-stable maintainer. You have about 5000 patches in=
 front of you and you want to figure out whether a patch should get backpor=
ted into a stable tree or not.
>
> It's very easy to read through the patch description.
> It's reasonably easy to do a git show on the patch.
> It's very hard to look at the actual surrounding code that was changed.
>
> If I open a text editor on the file, I immediately see what you're saying=
:
>
>>         ret =3D RESUME_GUEST;
>>         preempt_disable();
>>         while (!try_lock_hpte(hptep, HPTE_V_HVLOCK))
>>                 cpu_relax();
>>         if ((hptep[0] & ~HPTE_V_HVLOCK) !=3D hpte[0] || hptep[1] !=3D hp=
te[1] ||
>>             rev->guest_rpte !=3D hpte[2])
>>                 /* HPTE has been changed under us; let the guest retry *=
/
>>                 goto out_unlock;
>>         hpte[0] =3D (hpte[0] & ~HPTE_V_ABSENT) | HPTE_V_VALID;
>>
>>         rmap =3D &memslot->arch.rmap[gfn - memslot->base_gfn];
>>         lock_rmap(rmap);
>>
>>         /* Check if we might have been invalidated; let the guest retry =
if so */
>>         ret =3D RESUME_GUEST;
>
> However, that scope is not given in the actual patch itself. If you look =
at the diff below, you have no idea whether the patch is fixing a bug or ju=
st removes duplication and doesn't actually have any effect. In fact, the c=
ompiled assembly should be the same with this patch and without. But you ca=
n't tell from the diff below.
>
> So what I would like to see in the patch description is something that ma=
kes it easy to understand what's going on without the need to check out the=
 source file. Something like
>
>> We redundantly set ret to RESUME_GUEST twice without changing it in betw=
een. Only do it once:
>>
>>         ret =3D RESUME_GUEST;
>>         preempt_disable();
>>         while (!try_lock_hpte(hptep, HPTE_V_HVLOCK))
>>                 cpu_relax();
>>         if ((hptep[0] & ~HPTE_V_HVLOCK) !=3D hpte[0] || hptep[1] !=3D hp=
te[1] ||
>>             rev->guest_rpte !=3D hpte[2])
>>                 /* HPTE has been changed under us; let the guest retry *=
/
>>                 goto out_unlock;
>>         hpte[0] =3D (hpte[0] & ~HPTE_V_ABSENT) | HPTE_V_VALID;
>>
>>         rmap =3D &memslot->arch.rmap[gfn - memslot->base_gfn];
>>         lock_rmap(rmap);
>>
>>         /* Check if we might have been invalidated; let the guest retry =
if so */
>>         ret =3D RESUME_GUEST;
>
>
> If I look at that patch description it immediately tells me "Ah, no need =
to worry, it's not a critical bug I need to backport". If you have a better=
 idea how to express that I'm more than happy to take that too. Otherwise j=
ust let me know whether you like the description above and I'll modify it t=
o the one that includes the code snippet when applying the patch.
>
Oh, yes. Thank you very much..
And I had a better understanding of the heavy work of maintainers :)
Will keep this in mind.

Best regards,
Pingfan

^ permalink raw reply

* [PATCH] powerpc: kvm: optimize "sc 0" as fast return
From: Liu Ping Fan @ 2013-11-08  2:44 UTC (permalink / raw)
  To: linuxppc-dev, kvm-ppc; +Cc: Paul Mackerras, Alexander Graf

syscall is a very common behavior inside guest, and this patch
optimizes the path for the emulation of BOOK3S_INTERRUPT_SYSCALL,
so hypervisor can return to guest without heavy exit, i.e, no need
to swap TLB, HTAB,.. etc

Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
---
Compiled, but lack of bare metal, I have not tested it yet.
---
 arch/powerpc/kvm/book3s_hv.c            |  6 ------
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 13 ++++++++++++-
 2 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 62a2b5a..73dc852 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -628,12 +628,6 @@ static int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
 		/* hcall - punt to userspace */
 		int i;
 
-		if (vcpu->arch.shregs.msr & MSR_PR) {
-			/* sc 1 from userspace - reflect to guest syscall */
-			kvmppc_book3s_queue_irqprio(vcpu, BOOK3S_INTERRUPT_SYSCALL);
-			r = RESUME_GUEST;
-			break;
-		}
 		run->papr_hcall.nr = kvmppc_get_gpr(vcpu, 3);
 		for (i = 0; i < 9; ++i)
 			run->papr_hcall.args[i] = kvmppc_get_gpr(vcpu, 4 + i);
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index c71103b..9f626c3 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -1388,7 +1388,8 @@ kvmppc_hisi:
 hcall_try_real_mode:
 	ld	r3,VCPU_GPR(R3)(r9)
 	andi.	r0,r11,MSR_PR
-	bne	guest_exit_cont
+	/* sc 1 from userspace - reflect to guest syscall */
+	bne	sc_0_fast_return
 	clrrdi	r3,r3,2
 	cmpldi	r3,hcall_real_table_end - hcall_real_table
 	bge	guest_exit_cont
@@ -1409,6 +1410,16 @@ hcall_try_real_mode:
 	ld	r11,VCPU_MSR(r4)
 	b	fast_guest_return
 
+sc_0_fast_return:
+	ld	r10,VCPU_PC(r9)
+	ld	r11,VCPU_MSR(r9)
+	mtspr	SPRN_SRR0,r10
+	mtspr	SPRN_SRR1,r11
+	li	r10, BOOK3S_INTERRUPT_SYSCALL
+	LOAD_REG_IMMEDIATE(r3,0xffffffff87a0ffff)	/* zero 33:36,42:47 */
+	and	r11,r11,r3
+	b	fast_guest_return
+
 	/* We've attempted a real mode hcall, but it's punted it back
 	 * to userspace.  We need to restore some clobbered volatiles
 	 * before resuming the pass-it-to-qemu path */
-- 
1.8.1.4

^ permalink raw reply related

* Re: [PATCH v11 3/3] DMA: Freescale: update driver to support 8-channel DMA engine
From: Dan Williams @ 2013-11-08  2:45 UTC (permalink / raw)
  To: Hongbo Zhang
  Cc: mark.rutland, devicetree, ian.campbell, pawel.moll,
	Stephen Warren, Koul, Vinod, Linux Kernel Mailing List,
	rob.herring, dmaengine, linuxppc-dev
In-Reply-To: <5278587F.7090500@freescale.com>

On Mon, Nov 4, 2013 at 6:31 PM, Hongbo Zhang <hongbo.zhang@freescale.com> wrote:
> Hi Vinod Koul and Dan Williams,
> Ping?
>

Not much to review from the dmaengine side, just one question below.
It would be helpful if you can send these to the new dmaengine
patchwork at dmaengine@vger.kernel.org with the Acks you have already
collected.

>
>
> On 10/17/2013 01:56 PM, Hongbo Zhang wrote:
>>
>> Hi Vinod,
>> I have gotten ACK from Mark for both the 1/3 and 2/3 patches.
>> Thanks.
>>
>>
>> On 09/26/2013 05:33 PM, hongbo.zhang@freescale.com wrote:
>>>
>>> From: Hongbo Zhang <hongbo.zhang@freescale.com>
>>>
>>> This patch adds support to 8-channel DMA engine, thus the driver works
>>> for both
>>> the new 8-channel and the legacy 4-channel DMA engines.
>>>
>>> Signed-off-by: Hongbo Zhang <hongbo.zhang@freescale.com>
>>> ---
>>>   drivers/dma/Kconfig  |    9 +++++----
>>>   drivers/dma/fsldma.c |    9 ++++++---
>>>   drivers/dma/fsldma.h |    2 +-
>>>   3 files changed, 12 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
>>> index 6825957..3979c65 100644
>>> --- a/drivers/dma/Kconfig
>>> +++ b/drivers/dma/Kconfig
>>> @@ -89,14 +89,15 @@ config AT_HDMAC
>>>         Support the Atmel AHB DMA controller.
>>>     config FSL_DMA
>>> -    tristate "Freescale Elo and Elo Plus DMA support"
>>> +    tristate "Freescale Elo series DMA support"
>>>       depends on FSL_SOC
>>>       select DMA_ENGINE
>>>       select ASYNC_TX_ENABLE_CHANNEL_SWITCH
>>>       ---help---
>>> -      Enable support for the Freescale Elo and Elo Plus DMA controllers.
>>> -      The Elo is the DMA controller on some 82xx and 83xx parts, and the
>>> -      Elo Plus is the DMA controller on 85xx and 86xx parts.
>>> +      Enable support for the Freescale Elo series DMA controllers.
>>> +      The Elo is the DMA controller on some mpc82xx and mpc83xx parts,
>>> the
>>> +      EloPlus is on mpc85xx and mpc86xx and Pxxx parts, and the Elo3 is
>>> on
>>> +      some Txxx and Bxxx parts.
>>>     config MPC512X_DMA
>>>       tristate "Freescale MPC512x built-in DMA engine support"
>>> diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
>>> index 49e8fbd..16a9a48 100644
>>> --- a/drivers/dma/fsldma.c
>>> +++ b/drivers/dma/fsldma.c
>>> @@ -1261,7 +1261,9 @@ static int fsl_dma_chan_probe(struct fsldma_device
>>> *fdev,
>>>       WARN_ON(fdev->feature != chan->feature);
>>>         chan->dev = fdev->dev;
>>> -    chan->id = ((res.start - 0x100) & 0xfff) >> 7;
>>> +    chan->id = (res.start & 0xfff) < 0x300 ?
>>> +           ((res.start - 0x100) & 0xfff) >> 7 :
>>> +           ((res.start - 0x200) & 0xfff) >> 7;
>>>       if (chan->id >= FSL_DMA_MAX_CHANS_PER_DEVICE) {

Isn't it a bit fragile to have this based on the resource address?
Can't device tree tell you the channel id directly by an index into
the "dma0: dma@100300" node?

--
Dan

^ permalink raw reply

* Re: [PATCH] powerpc: kvm: optimize "sc 0" as fast return
From: Alexander Graf @ 2013-11-08  3:10 UTC (permalink / raw)
  To: Liu Ping Fan; +Cc: Paul Mackerras, linuxppc-dev, kvm-ppc
In-Reply-To: <1383878656-4196-1-git-send-email-pingfank@linux.vnet.ibm.com>


On 08.11.2013, at 03:44, Liu Ping Fan <kernelfans@gmail.com> wrote:

> syscall is a very common behavior inside guest, and this patch
> optimizes the path for the emulation of BOOK3S_INTERRUPT_SYSCALL,
> so hypervisor can return to guest without heavy exit, i.e, no need
> to swap TLB, HTAB,.. etc

The syscall exit you touch here only happens when you do an sc > 0 with =
MSR_PR set inside the guest. The only case you realistically see this is =
when you run PR KVM inside of an HV KVM guest.

I don't think we should optimize for that case. Instead, we should =
rather try to not bounce to the 1st hypervisor in the first place in =
that scenario :).


Alex

>=20
> Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
> ---
> Compiled, but lack of bare metal, I have not tested it yet.
> ---
> arch/powerpc/kvm/book3s_hv.c            |  6 ------
> arch/powerpc/kvm/book3s_hv_rmhandlers.S | 13 ++++++++++++-
> 2 files changed, 12 insertions(+), 7 deletions(-)
>=20
> diff --git a/arch/powerpc/kvm/book3s_hv.c =
b/arch/powerpc/kvm/book3s_hv.c
> index 62a2b5a..73dc852 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -628,12 +628,6 @@ static int kvmppc_handle_exit(struct kvm_run =
*run, struct kvm_vcpu *vcpu,
> 		/* hcall - punt to userspace */
> 		int i;
>=20
> -		if (vcpu->arch.shregs.msr & MSR_PR) {
> -			/* sc 1 from userspace - reflect to guest =
syscall */
> -			kvmppc_book3s_queue_irqprio(vcpu, =
BOOK3S_INTERRUPT_SYSCALL);
> -			r =3D RESUME_GUEST;
> -			break;
> -		}
> 		run->papr_hcall.nr =3D kvmppc_get_gpr(vcpu, 3);
> 		for (i =3D 0; i < 9; ++i)
> 			run->papr_hcall.args[i] =3D kvmppc_get_gpr(vcpu, =
4 + i);
> diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S =
b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> index c71103b..9f626c3 100644
> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> @@ -1388,7 +1388,8 @@ kvmppc_hisi:
> hcall_try_real_mode:
> 	ld	r3,VCPU_GPR(R3)(r9)
> 	andi.	r0,r11,MSR_PR
> -	bne	guest_exit_cont
> +	/* sc 1 from userspace - reflect to guest syscall */
> +	bne	sc_0_fast_return
> 	clrrdi	r3,r3,2
> 	cmpldi	r3,hcall_real_table_end - hcall_real_table
> 	bge	guest_exit_cont
> @@ -1409,6 +1410,16 @@ hcall_try_real_mode:
> 	ld	r11,VCPU_MSR(r4)
> 	b	fast_guest_return
>=20
> +sc_0_fast_return:
> +	ld	r10,VCPU_PC(r9)
> +	ld	r11,VCPU_MSR(r9)
> +	mtspr	SPRN_SRR0,r10
> +	mtspr	SPRN_SRR1,r11
> +	li	r10, BOOK3S_INTERRUPT_SYSCALL
> +	LOAD_REG_IMMEDIATE(r3,0xffffffff87a0ffff)	/* zero =
33:36,42:47 */
> +	and	r11,r11,r3
> +	b	fast_guest_return
> +
> 	/* We've attempted a real mode hcall, but it's punted it back
> 	 * to userspace.  We need to restore some clobbered volatiles
> 	 * before resuming the pass-it-to-qemu path */
> --=20
> 1.8.1.4
>=20

^ permalink raw reply

* Re: [RFC] arch: Introduce new TSO memory barrier smp_tmb()
From: Mathieu Desnoyers @ 2013-11-07 23:50 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Michael Neuling, Oleg Nesterov, LKML, Linux PPC dev,
	Anton Blanchard, Frederic Weisbecker, Victor Kaplansky,
	Paul E. McKenney, Linus Torvalds
In-Reply-To: <20131104112254.GK28601@twins.programming.kicks-ass.net>

* Peter Zijlstra (peterz@infradead.org) wrote:

[...]

Hi Peter,

Looking at this simplified version of perf's ring buffer
synchronization, I get concerned about the following issue:

> /*
>  * One important detail is that the kbuf part and the kbuf_writer() are
>  * strictly per cpu and we can thus rely on program order for those.
>  *
>  * Only the userspace consumer can possibly run on another cpu, and thus we
>  * need to ensure data consistency for those.
>  */
> 
> struct buffer {
>         u64 size;
>         u64 tail;
>         u64 head;
>         void *data;
> };
> 
> struct buffer *kbuf, *ubuf;
> 
> /*
>  * If there's space in the buffer; store the data @buf; otherwise
>  * discard it.
>  */
> void kbuf_write(int sz, void *buf)
> {
> 	u64 tail, head, offset;
> 
> 	do {
> 		tail = ACCESS_ONCE(ubuf->tail);
> 		offset = head = kbuf->head;
> 		if (CIRC_SPACE(head, tail, kbuf->size) < sz) {
> 			/* discard @buf */
> 			return;
> 		}
> 		head += sz;
> 	} while (local_cmpxchg(&kbuf->head, offset, head) != offset)
> 

Let's suppose we have a thread executing kbuf_write(), interrupted by an
IRQ or NMI right after a successful local_cmpxchg() (space reservation
in the buffer). If the nested execution context also calls kbuf_write(),
it will therefore update ubuf->head (below) with the second reserved
space, and only after that will it return to the original thread context
and continue executing kbuf_write(), thus overwriting ubuf->head with
the prior-to-last reserved offset.

All this probably works OK most of the times, when we have an event flow
guaranteeing that a following event will fix things up, but there
appears to be a risk of losing events near the end of the trace when
those are in nested execution contexts.

Thoughts ?

Thanks,

Mathieu

>         /*
>          * Ensure that if we see the userspace tail (ubuf->tail) such
>          * that there is space to write @buf without overwriting data
>          * userspace hasn't seen yet, we won't in fact store data before
>          * that read completes.
>          */
> 
>         smp_mb(); /* A, matches with D */
> 
>         memcpy(kbuf->data + offset, buf, sz);
> 
>         /*
>          * Ensure that we write all the @buf data before we update the
>          * userspace visible ubuf->head pointer.
>          */
>         smp_wmb(); /* B, matches with C */
> 
>         ubuf->head = kbuf->head;
> }
> 
> /*
>  * Consume the buffer data and update the tail pointer to indicate to
>  * kernel space there's 'free' space.
>  */
> void ubuf_read(void)
> {
>         u64 head, tail;
> 
>         tail = ACCESS_ONCE(ubuf->tail);
>         head = ACCESS_ONCE(ubuf->head);
> 
>         /*
>          * Ensure we read the buffer boundaries before the actual buffer
>          * data...
>          */
>         smp_rmb(); /* C, matches with B */
> 
>         while (tail != head) {
>                 obj = ubuf->data + tail;
>                 /* process obj */
>                 tail += obj->size;
>                 tail %= ubuf->size;
>         }
> 
>         /*
>          * Ensure all data reads are complete before we issue the
>          * ubuf->tail update; once that update hits, kbuf_write() can
>          * observe and overwrite data.
>          */
>         smp_mb(); /* D, matches with A */
> 
>         ubuf->tail = tail;
> }

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply

* Re: [PATCH] powerpc: kvm: optimize "sc 0" as fast return
From: Benjamin Herrenschmidt @ 2013-11-08  4:05 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Paul Mackerras, linuxppc-dev, kvm-ppc, Liu Ping Fan
In-Reply-To: <470E7CC0-6397-4343-89DA-6818531E6775@suse.de>

On Fri, 2013-11-08 at 04:10 +0100, Alexander Graf wrote:
> On 08.11.2013, at 03:44, Liu Ping Fan <kernelfans@gmail.com> wrote:
> 
> > syscall is a very common behavior inside guest, and this patch
> > optimizes the path for the emulation of BOOK3S_INTERRUPT_SYSCALL,
> > so hypervisor can return to guest without heavy exit, i.e, no need
> > to swap TLB, HTAB,.. etc
> 
> The syscall exit you touch here only happens when you do an sc > 0
> with MSR_PR set inside the guest. The only case you realistically see
> this is when you run PR KVM inside of an HV KVM guest.
> 
> I don't think we should optimize for that case. Instead, we should
> rather try to not bounce to the 1st hypervisor in the first place in
> that scenario :).

Well, so unfortunately openstack CI uses PR inside HV pretty
heavily .... it *might* be worthwhile optimizing that path if the patch
is simple enough... I'd make that Paul's call.

Cheers,
Ben.

> 
> Alex
> 
> > 
> > Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
> > ---
> > Compiled, but lack of bare metal, I have not tested it yet.
> > ---
> > arch/powerpc/kvm/book3s_hv.c            |  6 ------
> > arch/powerpc/kvm/book3s_hv_rmhandlers.S | 13 ++++++++++++-
> > 2 files changed, 12 insertions(+), 7 deletions(-)
> > 
> > diff --git a/arch/powerpc/kvm/book3s_hv.c
> b/arch/powerpc/kvm/book3s_hv.c
> > index 62a2b5a..73dc852 100644
> > --- a/arch/powerpc/kvm/book3s_hv.c
> > +++ b/arch/powerpc/kvm/book3s_hv.c
> > @@ -628,12 +628,6 @@ static int kvmppc_handle_exit(struct kvm_run
> *run, struct kvm_vcpu *vcpu,
> > 		/* hcall - punt to userspace */
> > 		int i;
> > 
> > -		if (vcpu->arch.shregs.msr & MSR_PR) {
> > -			/* sc 1 from userspace - reflect to guest syscall */
> > -			kvmppc_book3s_queue_irqprio(vcpu, BOOK3S_INTERRUPT_SYSCALL);
> > -			r = RESUME_GUEST;
> > -			break;
> > -		}
> > 		run->papr_hcall.nr = kvmppc_get_gpr(vcpu, 3);
> > 		for (i = 0; i < 9; ++i)
> > 			run->papr_hcall.args[i] = kvmppc_get_gpr(vcpu, 4 + i);
> > diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> > index c71103b..9f626c3 100644
> > --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> > +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> > @@ -1388,7 +1388,8 @@ kvmppc_hisi:
> > hcall_try_real_mode:
> > 	ld	r3,VCPU_GPR(R3)(r9)
> > 	andi.	r0,r11,MSR_PR
> > -	bne	guest_exit_cont
> > +	/* sc 1 from userspace - reflect to guest syscall */
> > +	bne	sc_0_fast_return
> > 	clrrdi	r3,r3,2
> > 	cmpldi	r3,hcall_real_table_end - hcall_real_table
> > 	bge	guest_exit_cont
> > @@ -1409,6 +1410,16 @@ hcall_try_real_mode:
> > 	ld	r11,VCPU_MSR(r4)
> > 	b	fast_guest_return
> > 
> > +sc_0_fast_return:
> > +	ld	r10,VCPU_PC(r9)
> > +	ld	r11,VCPU_MSR(r9)
> > +	mtspr	SPRN_SRR0,r10
> > +	mtspr	SPRN_SRR1,r11
> > +	li	r10, BOOK3S_INTERRUPT_SYSCALL
> > +	LOAD_REG_IMMEDIATE(r3,0xffffffff87a0ffff)	/* zero 33:36,42:47 */
> > +	and	r11,r11,r3
> > +	b	fast_guest_return
> > +
> > 	/* We've attempted a real mode hcall, but it's punted it back
> > 	 * to userspace.  We need to restore some clobbered volatiles
> > 	 * before resuming the pass-it-to-qemu path */
> > -- 
> > 1.8.1.4
> > 

^ permalink raw reply

* Re: [PATCH] powerpc: kvm: optimize "sc 0" as fast return
From: Benjamin Herrenschmidt @ 2013-11-08  4:11 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Paul Mackerras, linuxppc-dev, kvm-ppc, Liu Ping Fan
In-Reply-To: <1383883503.4776.214.camel@pasglop>

On Fri, 2013-11-08 at 15:05 +1100, Benjamin Herrenschmidt wrote:
> On Fri, 2013-11-08 at 04:10 +0100, Alexander Graf wrote:
> > On 08.11.2013, at 03:44, Liu Ping Fan <kernelfans@gmail.com> wrote:
> > 
> > > syscall is a very common behavior inside guest, and this patch
> > > optimizes the path for the emulation of BOOK3S_INTERRUPT_SYSCALL,
> > > so hypervisor can return to guest without heavy exit, i.e, no need
> > > to swap TLB, HTAB,.. etc
> > 
> > The syscall exit you touch here only happens when you do an sc > 0
> > with MSR_PR set inside the guest. The only case you realistically see
> > this is when you run PR KVM inside of an HV KVM guest.
> > 
> > I don't think we should optimize for that case. Instead, we should
> > rather try to not bounce to the 1st hypervisor in the first place in
> > that scenario :).
> 
> Well, so unfortunately openstack CI uses PR inside HV pretty
> heavily .... it *might* be worthwhile optimizing that path if the patch
> is simple enough... I'd make that Paul's call.

Note that this is a statement of value for the idea ... not the
implementation ;-) From a quick look with Paulus, the patch is quite
broken. I'll let Paul comment in details.

Cheers,
Ben.

> Cheers,
> Ben.
> 
> > 
> > Alex
> > 
> > > 
> > > Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
> > > ---
> > > Compiled, but lack of bare metal, I have not tested it yet.
> > > ---
> > > arch/powerpc/kvm/book3s_hv.c            |  6 ------
> > > arch/powerpc/kvm/book3s_hv_rmhandlers.S | 13 ++++++++++++-
> > > 2 files changed, 12 insertions(+), 7 deletions(-)
> > > 
> > > diff --git a/arch/powerpc/kvm/book3s_hv.c
> > b/arch/powerpc/kvm/book3s_hv.c
> > > index 62a2b5a..73dc852 100644
> > > --- a/arch/powerpc/kvm/book3s_hv.c
> > > +++ b/arch/powerpc/kvm/book3s_hv.c
> > > @@ -628,12 +628,6 @@ static int kvmppc_handle_exit(struct kvm_run
> > *run, struct kvm_vcpu *vcpu,
> > > 		/* hcall - punt to userspace */
> > > 		int i;
> > > 
> > > -		if (vcpu->arch.shregs.msr & MSR_PR) {
> > > -			/* sc 1 from userspace - reflect to guest syscall */
> > > -			kvmppc_book3s_queue_irqprio(vcpu, BOOK3S_INTERRUPT_SYSCALL);
> > > -			r = RESUME_GUEST;
> > > -			break;
> > > -		}
> > > 		run->papr_hcall.nr = kvmppc_get_gpr(vcpu, 3);
> > > 		for (i = 0; i < 9; ++i)
> > > 			run->papr_hcall.args[i] = kvmppc_get_gpr(vcpu, 4 + i);
> > > diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> > b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> > > index c71103b..9f626c3 100644
> > > --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> > > +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> > > @@ -1388,7 +1388,8 @@ kvmppc_hisi:
> > > hcall_try_real_mode:
> > > 	ld	r3,VCPU_GPR(R3)(r9)
> > > 	andi.	r0,r11,MSR_PR
> > > -	bne	guest_exit_cont
> > > +	/* sc 1 from userspace - reflect to guest syscall */
> > > +	bne	sc_0_fast_return
> > > 	clrrdi	r3,r3,2
> > > 	cmpldi	r3,hcall_real_table_end - hcall_real_table
> > > 	bge	guest_exit_cont
> > > @@ -1409,6 +1410,16 @@ hcall_try_real_mode:
> > > 	ld	r11,VCPU_MSR(r4)
> > > 	b	fast_guest_return
> > > 
> > > +sc_0_fast_return:
> > > +	ld	r10,VCPU_PC(r9)
> > > +	ld	r11,VCPU_MSR(r9)
> > > +	mtspr	SPRN_SRR0,r10
> > > +	mtspr	SPRN_SRR1,r11
> > > +	li	r10, BOOK3S_INTERRUPT_SYSCALL
> > > +	LOAD_REG_IMMEDIATE(r3,0xffffffff87a0ffff)	/* zero 33:36,42:47 */
> > > +	and	r11,r11,r3
> > > +	b	fast_guest_return
> > > +
> > > 	/* We've attempted a real mode hcall, but it's punted it back
> > > 	 * to userspace.  We need to restore some clobbered volatiles
> > > 	 * before resuming the pass-it-to-qemu path */
> > > -- 
> > > 1.8.1.4
> > > 
> 

^ permalink raw reply

* Re: [PATCH] powerpc: kvm: optimize "sc 0" as fast return
From: Liu ping fan @ 2013-11-08  4:19 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Paul Mackerras, linuxppc-dev, kvm-ppc
In-Reply-To: <470E7CC0-6397-4343-89DA-6818531E6775@suse.de>

On Fri, Nov 8, 2013 at 11:10 AM, Alexander Graf <agraf@suse.de> wrote:
>
> On 08.11.2013, at 03:44, Liu Ping Fan <kernelfans@gmail.com> wrote:
>
>> syscall is a very common behavior inside guest, and this patch
>> optimizes the path for the emulation of BOOK3S_INTERRUPT_SYSCALL,
>> so hypervisor can return to guest without heavy exit, i.e, no need
>> to swap TLB, HTAB,.. etc
>
> The syscall exit you touch here only happens when you do an sc > 0 with MSR_PR set inside the guest. The only case you realistically see this is when you run PR KVM inside of an HV KVM guest.
>
Maybe I misunderstood the ISA spec, but refer for "6.5.14 System Call
Interrupt", no description about the MSR_PR when sc trigger a syscall
interrupt. So I think, guest application "sc 0" will also fall to the
kernel who owns hypervisor mode.  Am I right?

> I don't think we should optimize for that case. Instead, we should rather try to not bounce to the 1st hypervisor in the first place in that scenario :).
>
Sorry, but just want to make clear about the idiom:  0 -> kernel run
with NV, and 1st -> kernel run on HV-KVM and provide PR-KVM to up
layer? Right?

When you say "try to not bounce to the 1st hypervisor ", what is the
exact meaning and how can we achieve this?  I am a quite newer on
powerpc, and hope that I can get more clear figure about it  :)

Thanks

Pingfan
>
> Alex
>
>>
>> Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
>> ---
>> Compiled, but lack of bare metal, I have not tested it yet.
>> ---
>> arch/powerpc/kvm/book3s_hv.c            |  6 ------
>> arch/powerpc/kvm/book3s_hv_rmhandlers.S | 13 ++++++++++++-
>> 2 files changed, 12 insertions(+), 7 deletions(-)
>>
>> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
>> index 62a2b5a..73dc852 100644
>> --- a/arch/powerpc/kvm/book3s_hv.c
>> +++ b/arch/powerpc/kvm/book3s_hv.c
>> @@ -628,12 +628,6 @@ static int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
>>               /* hcall - punt to userspace */
>>               int i;
>>
>> -             if (vcpu->arch.shregs.msr & MSR_PR) {
>> -                     /* sc 1 from userspace - reflect to guest syscall */
>> -                     kvmppc_book3s_queue_irqprio(vcpu, BOOK3S_INTERRUPT_SYSCALL);
>> -                     r = RESUME_GUEST;
>> -                     break;
>> -             }
>>               run->papr_hcall.nr = kvmppc_get_gpr(vcpu, 3);
>>               for (i = 0; i < 9; ++i)
>>                       run->papr_hcall.args[i] = kvmppc_get_gpr(vcpu, 4 + i);
>> diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
>> index c71103b..9f626c3 100644
>> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
>> +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
>> @@ -1388,7 +1388,8 @@ kvmppc_hisi:
>> hcall_try_real_mode:
>>       ld      r3,VCPU_GPR(R3)(r9)
>>       andi.   r0,r11,MSR_PR
>> -     bne     guest_exit_cont
>> +     /* sc 1 from userspace - reflect to guest syscall */
>> +     bne     sc_0_fast_return
>>       clrrdi  r3,r3,2
>>       cmpldi  r3,hcall_real_table_end - hcall_real_table
>>       bge     guest_exit_cont
>> @@ -1409,6 +1410,16 @@ hcall_try_real_mode:
>>       ld      r11,VCPU_MSR(r4)
>>       b       fast_guest_return
>>
>> +sc_0_fast_return:
>> +     ld      r10,VCPU_PC(r9)
>> +     ld      r11,VCPU_MSR(r9)
>> +     mtspr   SPRN_SRR0,r10
>> +     mtspr   SPRN_SRR1,r11
>> +     li      r10, BOOK3S_INTERRUPT_SYSCALL
>> +     LOAD_REG_IMMEDIATE(r3,0xffffffff87a0ffff)       /* zero 33:36,42:47 */
>> +     and     r11,r11,r3
>> +     b       fast_guest_return
>> +
>>       /* We've attempted a real mode hcall, but it's punted it back
>>        * to userspace.  We need to restore some clobbered volatiles
>>        * before resuming the pass-it-to-qemu path */
>> --
>> 1.8.1.4
>>
>

^ permalink raw reply

* Re: [PATCH] powerpc: kvm: optimize "sc 0" as fast return
From: Liu ping fan @ 2013-11-08  4:20 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Paul Mackerras, linuxppc-dev, Alexander Graf, kvm-ppc
In-Reply-To: <1383883907.4776.215.camel@pasglop>

On Fri, Nov 8, 2013 at 12:11 PM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
> On Fri, 2013-11-08 at 15:05 +1100, Benjamin Herrenschmidt wrote:
>> On Fri, 2013-11-08 at 04:10 +0100, Alexander Graf wrote:
>> > On 08.11.2013, at 03:44, Liu Ping Fan <kernelfans@gmail.com> wrote:
>> >
>> > > syscall is a very common behavior inside guest, and this patch
>> > > optimizes the path for the emulation of BOOK3S_INTERRUPT_SYSCALL,
>> > > so hypervisor can return to guest without heavy exit, i.e, no need
>> > > to swap TLB, HTAB,.. etc
>> >
>> > The syscall exit you touch here only happens when you do an sc > 0
>> > with MSR_PR set inside the guest. The only case you realistically see
>> > this is when you run PR KVM inside of an HV KVM guest.
>> >
>> > I don't think we should optimize for that case. Instead, we should
>> > rather try to not bounce to the 1st hypervisor in the first place in
>> > that scenario :).
>>
>> Well, so unfortunately openstack CI uses PR inside HV pretty
>> heavily .... it *might* be worthwhile optimizing that path if the patch
>> is simple enough... I'd make that Paul's call.
>
> Note that this is a statement of value for the idea ... not the
> implementation ;-) From a quick look with Paulus, the patch is quite
> broken. I'll let Paul comment in details.
>
Thank you very much,

Regards,
Pingfan

^ permalink raw reply

* [PATCH v3 1/2] powerpc: kvm: pair kvmppc_hv_find_lock_hpte with _unlock_hpte
From: Liu Ping Fan @ 2013-11-08  7:29 UTC (permalink / raw)
  To: linuxppc-dev, kvm-ppc; +Cc: Paul Mackerras, Alexander Graf
In-Reply-To: <1383895794-16164-1-git-send-email-pingfank@linux.vnet.ibm.com>

Highlight the lock pair for the reader. (and later it will the
place to hide the detail about preemption disable)

Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/kvm_book3s.h |  1 +
 arch/powerpc/kvm/book3s_64_mmu_hv.c   |  7 ++-----
 arch/powerpc/kvm/book3s_hv_rm_mmu.c   | 13 ++++++++++---
 3 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
index fa19e2f..a818932 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -129,6 +129,7 @@ extern void kvmppc_mmu_flush_segments(struct kvm_vcpu *vcpu);
 extern int kvmppc_book3s_hv_page_fault(struct kvm_run *run,
 			struct kvm_vcpu *vcpu, unsigned long addr,
 			unsigned long status);
+extern void kvmppc_hv_unlock_hpte(ulong *hptep, ulong *hpte_val);
 extern long kvmppc_hv_find_lock_hpte(struct kvm *kvm, gva_t eaddr,
 			unsigned long slb_v, unsigned long valid);
 
diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index 842f081..97685e7 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -479,12 +479,9 @@ static int kvmppc_mmu_book3s_64_hv_xlate(struct kvm_vcpu *vcpu, gva_t eaddr,
 	if (index < 0)
 		return -ENOENT;
 	hptep = (unsigned long *)(kvm->arch.hpt_virt + (index << 4));
-	v = hptep[0] & ~HPTE_V_HVLOCK;
+	v = hptep[0];
 	gr = kvm->arch.revmap[index].guest_rpte;
-
-	/* Unlock the HPTE */
-	asm volatile("lwsync" : : : "memory");
-	hptep[0] = v;
+	kvmppc_hv_unlock_hpte(hptep, &v);
 
 	gpte->eaddr = eaddr;
 	gpte->vpage = ((v & HPTE_V_AVPN) << 4) | ((eaddr >> 12) & 0xfff);
diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
index 9c51544..0ff9e91 100644
--- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c
+++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
@@ -749,6 +749,14 @@ static int slb_base_page_shift[4] = {
 	20,	/* 1M, unsupported */
 };
 
+void kvmppc_hv_unlock_hpte(unsigned long *hptep, unsigned long *hpte_val)
+{
+	*hpte_val = *hpte_val & ~HPTE_V_HVLOCK;
+	asm volatile("lwsync" : : : "memory");
+	*hptep = *hpte_val;
+}
+EXPORT_SYMBOL(kvmppc_hv_unlock_hpte);
+
 long kvmppc_hv_find_lock_hpte(struct kvm *kvm, gva_t eaddr, unsigned long slb_v,
 			      unsigned long valid)
 {
@@ -863,12 +871,11 @@ long kvmppc_hpte_hv_fault(struct kvm_vcpu *vcpu, unsigned long addr,
 		return 0;		/* for prot fault, HPTE disappeared */
 	}
 	hpte = (unsigned long *)(kvm->arch.hpt_virt + (index << 4));
-	v = hpte[0] & ~HPTE_V_HVLOCK;
+	v = hpte[0];
 	r = hpte[1];
 	rev = real_vmalloc_addr(&kvm->arch.revmap[index]);
 	gr = rev->guest_rpte;
-
-	unlock_hpte(hpte, v);
+	kvmppc_hv_unlock_hpte(hpte, &v);
 
 	/* For not found, if the HPTE is valid by now, retry the instruction */
 	if ((status & DSISR_NOHPTE) && (v & HPTE_V_VALID))
-- 
1.8.1.4

^ permalink raw reply related

* [PATCH v3 0/2] powerpc kvm: fix deadlock scene
From: Liu Ping Fan @ 2013-11-08  7:29 UTC (permalink / raw)
  To: linuxppc-dev, kvm-ppc; +Cc: Paul Mackerras, Alexander Graf

v2->v3:
  introduce kvmppc_hv_unlock_hpte() to pair with kvmppc_hv_find_lock_hpte()
  and hide the preemption detail inside this pair from the callers

Liu Ping Fan (2):
  powerpc: kvm: pair kvmppc_hv_find_lock_hpte with _unlock_hpte
  powerpc: kvm: fix rare but potential deadlock scene

 arch/powerpc/include/asm/kvm_book3s.h |  3 ++-
 arch/powerpc/kvm/book3s_64_mmu_hv.c   | 10 ++++------
 arch/powerpc/kvm/book3s_hv_rm_mmu.c   | 29 ++++++++++++++++++++++++-----
 3 files changed, 30 insertions(+), 12 deletions(-)

-- 
1.8.1.4

^ permalink raw reply

* [PATCH v3 2/2] powerpc: kvm: fix rare but potential deadlock scene
From: Liu Ping Fan @ 2013-11-08  7:29 UTC (permalink / raw)
  To: linuxppc-dev, kvm-ppc; +Cc: Paul Mackerras, Alexander Graf
In-Reply-To: <1383895794-16164-1-git-send-email-pingfank@linux.vnet.ibm.com>

Since kvmppc_hv_find_lock_hpte() is called from both virtmode and
realmode, so it can trigger the deadlock.

Suppose the following scene:

Two physical cpuM, cpuN, two VM instances A, B, each VM has a group of vcpus.

If on cpuM, vcpu_A_1 holds bitlock X (HPTE_V_HVLOCK), then is switched out,
and on cpuN, vcpu_A_2 try to lock X in realmode, then cpuN will be caught in
realmode for a long time.

What makes things even worse if the following happens,
  On cpuM, bitlockX is hold, on cpuN, Y is hold.
  vcpu_B_2 try to lock Y on cpuM in realmode
  vcpu_A_2 try to lock X on cpuN in realmode

Oops! deadlock happens

Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/kvm_book3s.h |  4 ++--
 arch/powerpc/kvm/book3s_64_mmu_hv.c   |  5 +++--
 arch/powerpc/kvm/book3s_hv_rm_mmu.c   | 20 ++++++++++++++++----
 3 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
index a818932..3d710ba 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -129,9 +129,9 @@ extern void kvmppc_mmu_flush_segments(struct kvm_vcpu *vcpu);
 extern int kvmppc_book3s_hv_page_fault(struct kvm_run *run,
 			struct kvm_vcpu *vcpu, unsigned long addr,
 			unsigned long status);
-extern void kvmppc_hv_unlock_hpte(ulong *hptep, ulong *hpte_val);
+extern void kvmppc_hv_unlock_hpte(ulong *hptep, ulong *hpte_val, bool vmode);
 extern long kvmppc_hv_find_lock_hpte(struct kvm *kvm, gva_t eaddr,
-			unsigned long slb_v, unsigned long valid);
+			unsigned long slb_v, unsigned long valid, bool vmode);
 
 extern void kvmppc_mmu_hpte_cache_map(struct kvm_vcpu *vcpu, struct hpte_cache *pte);
 extern struct hpte_cache *kvmppc_mmu_hpte_cache_next(struct kvm_vcpu *vcpu);
diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index 97685e7..12d9635 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -475,13 +475,14 @@ static int kvmppc_mmu_book3s_64_hv_xlate(struct kvm_vcpu *vcpu, gva_t eaddr,
 
 	/* Find the HPTE in the hash table */
 	index = kvmppc_hv_find_lock_hpte(kvm, eaddr, slb_v,
-					 HPTE_V_VALID | HPTE_V_ABSENT);
+					 HPTE_V_VALID | HPTE_V_ABSENT,
+					 true);
 	if (index < 0)
 		return -ENOENT;
 	hptep = (unsigned long *)(kvm->arch.hpt_virt + (index << 4));
 	v = hptep[0];
 	gr = kvm->arch.revmap[index].guest_rpte;
-	kvmppc_hv_unlock_hpte(hptep, &v);
+	kvmppc_hv_unlock_hpte(hptep, &v, true);
 
 	gpte->eaddr = eaddr;
 	gpte->vpage = ((v & HPTE_V_AVPN) << 4) | ((eaddr >> 12) & 0xfff);
diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
index 0ff9e91..18a9425 100644
--- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c
+++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
@@ -749,16 +749,22 @@ static int slb_base_page_shift[4] = {
 	20,	/* 1M, unsupported */
 };
 
-void kvmppc_hv_unlock_hpte(unsigned long *hptep, unsigned long *hpte_val)
+void kvmppc_hv_unlock_hpte(unsigned long *hptep, unsigned long *hpte_val,
+	bool vmode)
 {
 	*hpte_val = *hpte_val & ~HPTE_V_HVLOCK;
 	asm volatile("lwsync" : : : "memory");
 	*hptep = *hpte_val;
+	if (unlikely(vmode))
+		preempt_enable();
 }
 EXPORT_SYMBOL(kvmppc_hv_unlock_hpte);
 
+/* If called from virtmode and success to lock, then the context will be set
+ * as preemption disabled
+ */
 long kvmppc_hv_find_lock_hpte(struct kvm *kvm, gva_t eaddr, unsigned long slb_v,
-			      unsigned long valid)
+			      unsigned long valid, bool vmode)
 {
 	unsigned int i;
 	unsigned int pshift;
@@ -796,6 +802,9 @@ long kvmppc_hv_find_lock_hpte(struct kvm *kvm, gva_t eaddr, unsigned long slb_v,
 		avpn &= ~0x7fUL;
 	val |= avpn;
 
+	if (unlikely(vmode))
+		preempt_disable();
+
 	for (;;) {
 		hpte = (unsigned long *)(kvm->arch.hpt_virt + (hash << 7));
 
@@ -833,6 +842,9 @@ long kvmppc_hv_find_lock_hpte(struct kvm *kvm, gva_t eaddr, unsigned long slb_v,
 		val |= HPTE_V_SECONDARY;
 		hash = hash ^ kvm->arch.hpt_mask;
 	}
+
+	if (unlikely(vmode))
+		preempt_enable();
 	return -1;
 }
 EXPORT_SYMBOL(kvmppc_hv_find_lock_hpte);
@@ -864,7 +876,7 @@ long kvmppc_hpte_hv_fault(struct kvm_vcpu *vcpu, unsigned long addr,
 	if (status & DSISR_NOHPTE)
 		valid |= HPTE_V_ABSENT;
 
-	index = kvmppc_hv_find_lock_hpte(kvm, addr, slb_v, valid);
+	index = kvmppc_hv_find_lock_hpte(kvm, addr, slb_v, valid, false);
 	if (index < 0) {
 		if (status & DSISR_NOHPTE)
 			return status;	/* there really was no HPTE */
@@ -875,7 +887,7 @@ long kvmppc_hpte_hv_fault(struct kvm_vcpu *vcpu, unsigned long addr,
 	r = hpte[1];
 	rev = real_vmalloc_addr(&kvm->arch.revmap[index]);
 	gr = rev->guest_rpte;
-	kvmppc_hv_unlock_hpte(hpte, &v);
+	kvmppc_hv_unlock_hpte(hpte, &v, false);
 
 	/* For not found, if the HPTE is valid by now, retry the instruction */
 	if ((status & DSISR_NOHPTE) && (v & HPTE_V_VALID))
-- 
1.8.1.4

^ permalink raw reply related

* Re: [PATCH] powerpc: kvm: optimize "sc 0" as fast return
From: Liu ping fan @ 2013-11-08  8:38 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Paul Mackerras, linuxppc-dev, kvm-ppc
In-Reply-To: <CAFgQCTvWAAr83nq9MmZm44Om36kYC3tOnkUPumfRAVCeYXjLAg@mail.gmail.com>

On Fri, Nov 8, 2013 at 12:19 PM, Liu ping fan <kernelfans@gmail.com> wrote:
> On Fri, Nov 8, 2013 at 11:10 AM, Alexander Graf <agraf@suse.de> wrote:
>>
>> On 08.11.2013, at 03:44, Liu Ping Fan <kernelfans@gmail.com> wrote:
>>
>>> syscall is a very common behavior inside guest, and this patch
>>> optimizes the path for the emulation of BOOK3S_INTERRUPT_SYSCALL,
>>> so hypervisor can return to guest without heavy exit, i.e, no need
>>> to swap TLB, HTAB,.. etc
>>
>> The syscall exit you touch here only happens when you do an sc > 0 with MSR_PR set inside the guest. The only case you realistically see this is when you run PR KVM inside of an HV KVM guest.
>>
> Maybe I misunderstood the ISA spec, but refer for "6.5.14 System Call
> Interrupt", no description about the MSR_PR when sc trigger a syscall
> interrupt. So I think, guest application "sc 0" will also fall to the
> kernel who owns hypervisor mode.  Am I right?
>
Some further comment: I think the essential of the problem is whether
we switch RMA from guest to HV when interrupts raise.
DSI/ISI will be redirected to HDSI and RMA switch.  But what about
SYSCALL, and DEC, external interrupt, ...etc?

>> I don't think we should optimize for that case. Instead, we should rather try to not bounce to the 1st hypervisor in the first place in that scenario :).
>>
> Sorry, but just want to make clear about the idiom:  0 -> kernel run
> with NV, and 1st -> kernel run on HV-KVM and provide PR-KVM to up
> layer? Right?
>
> When you say "try to not bounce to the 1st hypervisor ", what is the
> exact meaning and how can we achieve this?  I am a quite newer on
> powerpc, and hope that I can get more clear figure about it  :)
>

Thanks
Pingfan

^ permalink raw reply

* BookE "branch taken" behavior vis-a-vis updating the NIP register
From: pegasus @ 2013-11-08 10:46 UTC (permalink / raw)
  To: linuxppc-dev

Hello.

I was reading the source code for the debug exception under powerpc. I saw
that BookE processors stop before (actually) taking a branch. Hence in order
to force it to take that branch and then stop, the source code for it had to
be "hacked'  to (temporarily) enable single step until the branch
instruction has been taken, thereby mimicing the BookS behavior. 

By doing this, I believe we would want the exception to be triggered after
the branch has been successfully taken. Hence I put a printk to print the
value of the instruction that actually caused the exception. I was assuming
that initially a debug breakpoint would be hit on the branch instruction
(assuming the branch is supposed to be taken). Now since at this point in
time, the branch instruction has NOT finished, the kernel, after merely
disabling the BT bit in DBCR0 and enabling IC bit in DBCR0, returns. At this
point I was assuming we will see another exception at the very same
instruction in userspace. However, when printing the NIP it becomes clear
that when it gets to the debug exception handler after being (temporarily)
set to single step, NIP points to the instruction after the branch
instruction. To me, it appears that, after disabling BT (branch taken) debug
event monitoring (and enabling single stepping), it does not catch an
exception at that very same branch instruction, instead it catches an
exception for the subsequent instruction. Sorry for the repetition but I
wanted to clarify what I am seeing here.

May be this IS the way it is supposed to behave (which means my thinking
about it is flawed). I am a bit confused here. 

Please let me know more..Im keen to hear from you folks.

--
View this message in context: http://linuxppc.10917.n7.nabble.com/BookE-branch-taken-behavior-vis-a-vis-updating-the-NIP-register-tp77960.html
Sent from the linuxppc-dev mailing list archive at Nabble.com.

^ permalink raw reply

* [PATCH v4 0/4] Add dual-fifo mode support of i.MX ssi
From: Nicolin Chen @ 2013-11-08 10:49 UTC (permalink / raw)
  To: vinod.koul, dan.j.williams, s.hauer, timur, shawn.guo, broonie
  Cc: mark.rutland, devicetree, alsa-devel, pawel.moll, linux-doc,
	swarren, linux-kernel, rob.herring, dmaengine, ijc+devicetree,
	linuxppc-dev, linux-arm-kernel

 * ! This series of patches has a direct dependency between them. When
 * ! applying them, we need to apply to one single branch. Otherwise,
 * ! it would break currect branches.

Changelog
v4:
 * PATCH-3: Add period size constraint when using dual fifo mode
 *
 * Nothing changes for PATCH-1, PATCH-2 and PATCH-4
v3:
 * PATCH-1: Add comments to indicate the end of v1 and v2 array.
 * PATCH-3: Use better way to keep watermark as even number.
 *
 * Nothing changes for PATCH-2 and PATCH-4
v2:
 * Instead of adding rogue scripts to current SDMA driver based on firmware
 * V1, we define the new SDMA firmware as version 2 and bisect the PATCH-1
 * to two patches: The first is to add version check code to the SDMA driver;
 * And the second is to add SSI dual FIFO DMATYPE.
 *
 * Nothing changes for the last two patches.
v1:
 * SSI can reduce hardware overrun/underrun possibility when using dual
 * fifo mode. To support this mode, we need to first update sdma sciprt
 * list, and then enable dual fifo BIT in SSI driver, and last update DT
 * bindings of i.MX series.

Nicolin Chen (4):
  dma: imx-sdma: Add sdma firmware version 2 support
  dma: imx-sdma: Add new dma type for ssi dual fifo script
  ASoC: fsl_ssi: Add dual fifo mode support
  ARM: dts: imx: use dual-fifo sdma script for ssi

 .../devicetree/bindings/dma/fsl-imx-sdma.txt       |  1 +
 arch/arm/boot/dts/imx51.dtsi                       |  4 ++--
 arch/arm/boot/dts/imx53.dtsi                       |  4 ++--
 arch/arm/boot/dts/imx6qdl.dtsi                     | 12 +++++-----
 arch/arm/boot/dts/imx6sl.dtsi                      | 12 +++++-----
 drivers/dma/imx-sdma.c                             | 19 ++++++++++++++-
 include/linux/platform_data/dma-imx-sdma.h         |  5 ++++
 include/linux/platform_data/dma-imx.h              |  1 +
 sound/soc/fsl/fsl_ssi.c                            | 27 +++++++++++++++++++++-
 9 files changed, 67 insertions(+), 18 deletions(-)

-- 
1.8.4

^ permalink raw reply

* [PATCH v4 1/4] dma: imx-sdma: Add sdma firmware version 2 support
From: Nicolin Chen @ 2013-11-08 10:49 UTC (permalink / raw)
  To: vinod.koul, dan.j.williams, s.hauer, timur, shawn.guo, broonie
  Cc: mark.rutland, devicetree, alsa-devel, pawel.moll, linux-doc,
	swarren, linux-kernel, rob.herring, dmaengine, ijc+devicetree,
	linuxppc-dev, linux-arm-kernel
In-Reply-To: <cover.1383907685.git.b42378@freescale.com>

On i.MX5/6 series, SDMA is using new version firmware to support SSI
dual FIFO feature and HDMI Audio (i.MX6Q/DL only). Thus add it.

Signed-off-by: Nicolin Chen <b42378@freescale.com>
---
 drivers/dma/imx-sdma.c                     | 15 ++++++++++++++-
 include/linux/platform_data/dma-imx-sdma.h |  5 +++++
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/dma/imx-sdma.c b/drivers/dma/imx-sdma.c
index fc43603..c7ece8d 100644
--- a/drivers/dma/imx-sdma.c
+++ b/drivers/dma/imx-sdma.c
@@ -323,6 +323,7 @@ struct sdma_engine {
 	struct clk			*clk_ipg;
 	struct clk			*clk_ahb;
 	spinlock_t			channel_0_lock;
+	u32				script_number;
 	struct sdma_script_start_addrs	*script_addrs;
 	const struct sdma_driver_data	*drvdata;
 };
@@ -1238,6 +1239,7 @@ static void sdma_issue_pending(struct dma_chan *chan)
 }
 
 #define SDMA_SCRIPT_ADDRS_ARRAY_SIZE_V1	34
+#define SDMA_SCRIPT_ADDRS_ARRAY_SIZE_V2	38
 
 static void sdma_add_scripts(struct sdma_engine *sdma,
 		const struct sdma_script_start_addrs *addr)
@@ -1246,7 +1248,7 @@ static void sdma_add_scripts(struct sdma_engine *sdma,
 	s32 *saddr_arr = (u32 *)sdma->script_addrs;
 	int i;
 
-	for (i = 0; i < SDMA_SCRIPT_ADDRS_ARRAY_SIZE_V1; i++)
+	for (i = 0; i < sdma->script_number; i++)
 		if (addr_arr[i] > 0)
 			saddr_arr[i] = addr_arr[i];
 }
@@ -1272,6 +1274,17 @@ static void sdma_load_firmware(const struct firmware *fw, void *context)
 		goto err_firmware;
 	if (header->ram_code_start + header->ram_code_size > fw->size)
 		goto err_firmware;
+	switch (header->version_major) {
+		case 1:
+			sdma->script_number = SDMA_SCRIPT_ADDRS_ARRAY_SIZE_V1;
+			break;
+		case 2:
+			sdma->script_number = SDMA_SCRIPT_ADDRS_ARRAY_SIZE_V2;
+			break;
+		default:
+			dev_err(sdma->dev, "unknown firmware version\n");
+			return;
+	}
 
 	addr = (void *)header + header->script_addrs_start;
 	ram_code = (void *)header + header->ram_code_start;
diff --git a/include/linux/platform_data/dma-imx-sdma.h b/include/linux/platform_data/dma-imx-sdma.h
index 3a39428..eabac4e 100644
--- a/include/linux/platform_data/dma-imx-sdma.h
+++ b/include/linux/platform_data/dma-imx-sdma.h
@@ -43,6 +43,11 @@ struct sdma_script_start_addrs {
 	s32 dptc_dvfs_addr;
 	s32 utra_addr;
 	s32 ram_code_start_addr;
+	/* End of v1 array */
+	s32 mcu_2_ssish_addr;
+	s32 ssish_2_mcu_addr;
+	s32 hdmi_dma_addr;
+	/* End of v2 array */
 };
 
 /**
-- 
1.8.4

^ permalink raw reply related

* [PATCH v4 2/4] dma: imx-sdma: Add new dma type for ssi dual fifo script
From: Nicolin Chen @ 2013-11-08 10:49 UTC (permalink / raw)
  To: vinod.koul, dan.j.williams, s.hauer, timur, shawn.guo, broonie
  Cc: mark.rutland, devicetree, alsa-devel, pawel.moll, linux-doc,
	swarren, linux-kernel, rob.herring, dmaengine, ijc+devicetree,
	linuxppc-dev, linux-arm-kernel
In-Reply-To: <cover.1383907685.git.b42378@freescale.com>

This patch adds a new DMA_TYPE for SSI dual FIFO script, included
in SDMA firmware version 2. This script would allow SSI use dual
fifo mode to transimit/receive data without occasional hardware
underrun/overrun.

Signed-off-by: Nicolin Chen <b42378@freescale.com>
---
 Documentation/devicetree/bindings/dma/fsl-imx-sdma.txt | 1 +
 drivers/dma/imx-sdma.c                                 | 4 ++++
 include/linux/platform_data/dma-imx.h                  | 1 +
 3 files changed, 6 insertions(+)

diff --git a/Documentation/devicetree/bindings/dma/fsl-imx-sdma.txt b/Documentation/devicetree/bindings/dma/fsl-imx-sdma.txt
index 4fa814d..68b83ec 100644
--- a/Documentation/devicetree/bindings/dma/fsl-imx-sdma.txt
+++ b/Documentation/devicetree/bindings/dma/fsl-imx-sdma.txt
@@ -42,6 +42,7 @@ The full ID of peripheral types can be found below.
 	19	IPU Memory
 	20	ASRC
 	21	ESAI
+	22	SSI Dual FIFO	(needs firmware ver >= 2)
 
 The third cell specifies the transfer priority as below.
 
diff --git a/drivers/dma/imx-sdma.c b/drivers/dma/imx-sdma.c
index c7ece8d..efaa9a9 100644
--- a/drivers/dma/imx-sdma.c
+++ b/drivers/dma/imx-sdma.c
@@ -725,6 +725,10 @@ static void sdma_get_pc(struct sdma_channel *sdmac,
 		per_2_emi = sdma->script_addrs->app_2_mcu_addr;
 		emi_2_per = sdma->script_addrs->mcu_2_app_addr;
 		break;
+	case IMX_DMATYPE_SSI_DUAL:
+		per_2_emi = sdma->script_addrs->ssish_2_mcu_addr;
+		emi_2_per = sdma->script_addrs->mcu_2_ssish_addr;
+		break;
 	case IMX_DMATYPE_SSI_SP:
 	case IMX_DMATYPE_MMC:
 	case IMX_DMATYPE_SDHC:
diff --git a/include/linux/platform_data/dma-imx.h b/include/linux/platform_data/dma-imx.h
index beac6b8..bcbc6c3 100644
--- a/include/linux/platform_data/dma-imx.h
+++ b/include/linux/platform_data/dma-imx.h
@@ -39,6 +39,7 @@ enum sdma_peripheral_type {
 	IMX_DMATYPE_IPU_MEMORY,	/* IPU Memory */
 	IMX_DMATYPE_ASRC,	/* ASRC */
 	IMX_DMATYPE_ESAI,	/* ESAI */
+	IMX_DMATYPE_SSI_DUAL,	/* SSI Dual FIFO */
 };
 
 enum imx_dma_prio {
-- 
1.8.4

^ permalink raw reply related

* [PATCH v4 3/4] ASoC: fsl_ssi: Add dual fifo mode support
From: Nicolin Chen @ 2013-11-08 10:49 UTC (permalink / raw)
  To: vinod.koul, dan.j.williams, s.hauer, timur, shawn.guo, broonie
  Cc: mark.rutland, devicetree, alsa-devel, pawel.moll, linux-doc,
	swarren, linux-kernel, rob.herring, dmaengine, ijc+devicetree,
	linuxppc-dev, linux-arm-kernel
In-Reply-To: <cover.1383907685.git.b42378@freescale.com>

By enabling dual fifo mode, it would allow SSI enter a better performance
to transimit/receive data without occasional hardware underrun/overrun.

Signed-off-by: Nicolin Chen <b42378@freescale.com>
---
 sound/soc/fsl/fsl_ssi.c | 27 ++++++++++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/sound/soc/fsl/fsl_ssi.c b/sound/soc/fsl/fsl_ssi.c
index 6b81d0c..af6640c 100644
--- a/sound/soc/fsl/fsl_ssi.c
+++ b/sound/soc/fsl/fsl_ssi.c
@@ -143,6 +143,7 @@ struct fsl_ssi_private {
 	bool ssi_on_imx;
 	bool imx_ac97;
 	bool use_dma;
+	bool use_dual_fifo;
 	struct clk *clk;
 	struct snd_dmaengine_dai_dma_data dma_params_tx;
 	struct snd_dmaengine_dai_dma_data dma_params_rx;
@@ -413,6 +414,12 @@ static int fsl_ssi_setup(struct fsl_ssi_private *ssi_private)
 		write_ssi(CCSR_SSI_SOR_WAIT(3), &ssi->sor);
 	}
 
+	if (ssi_private->use_dual_fifo) {
+		write_ssi_mask(&ssi->srcr, 0, CCSR_SSI_SRCR_RFEN1);
+		write_ssi_mask(&ssi->stcr, 0, CCSR_SSI_STCR_TFEN1);
+		write_ssi_mask(&ssi->scr, 0, CCSR_SSI_SCR_TCH_EN);
+	}
+
 	return 0;
 }
 
@@ -487,6 +494,15 @@ static int fsl_ssi_startup(struct snd_pcm_substream *substream,
 		ssi_private->second_stream = substream;
 	}
 
+	/* When using dual fifo mode, it is safer to ensure an even period
+	 * size. If appearing to an odd number while DMA always starts its
+	 * task from fifo0, fifo1 would be neglected at the end of each
+	 * period. But SSI would still access fifo1 with an invalid data.
+	 */
+	if (ssi_private->use_dual_fifo)
+		snd_pcm_hw_constraint_step(substream->runtime, 0,
+				SNDRV_PCM_HW_PARAM_PERIOD_SIZE, 2);
+
 	return 0;
 }
 
@@ -954,7 +970,7 @@ static int fsl_ssi_probe(struct platform_device *pdev)
 		ssi_private->fifo_depth = 8;
 
 	if (of_device_is_compatible(pdev->dev.of_node, "fsl,imx21-ssi")) {
-		u32 dma_events[2];
+		u32 dma_events[2], dmas[4];
 		ssi_private->ssi_on_imx = true;
 
 		ssi_private->clk = devm_clk_get(&pdev->dev, NULL);
@@ -1008,6 +1024,15 @@ static int fsl_ssi_probe(struct platform_device *pdev)
 			dma_events[0], shared ? IMX_DMATYPE_SSI_SP : IMX_DMATYPE_SSI);
 		imx_pcm_dma_params_init_data(&ssi_private->filter_data_rx,
 			dma_events[1], shared ? IMX_DMATYPE_SSI_SP : IMX_DMATYPE_SSI);
+		if (!of_property_read_u32_array(pdev->dev.of_node, "dmas", dmas, 4)
+				&& dmas[2] == IMX_DMATYPE_SSI_DUAL) {
+			ssi_private->use_dual_fifo = true;
+			/* When using dual fifo mode, we need to keep watermark
+			 * as even numbers due to dma script limitation.
+			 */
+			ssi_private->dma_params_tx.maxburst &= ~0x1;
+			ssi_private->dma_params_rx.maxburst &= ~0x1;
+		}
 	} else if (ssi_private->use_dma) {
 		/* The 'name' should not have any slashes in it. */
 		ret = devm_request_irq(&pdev->dev, ssi_private->irq,
-- 
1.8.4

^ permalink raw reply related

* [PATCH v4 4/4] ARM: dts: imx: use dual-fifo sdma script for ssi
From: Nicolin Chen @ 2013-11-08 10:49 UTC (permalink / raw)
  To: vinod.koul, dan.j.williams, s.hauer, timur, shawn.guo, broonie
  Cc: mark.rutland, devicetree, alsa-devel, pawel.moll, linux-doc,
	swarren, linux-kernel, rob.herring, dmaengine, ijc+devicetree,
	linuxppc-dev, linux-arm-kernel
In-Reply-To: <cover.1383907685.git.b42378@freescale.com>

Use dual-fifo sdma scripts instead of shared scripts for ssi on i.MX series.

Signed-off-by: Nicolin Chen <b42378@freescale.com>
---
 arch/arm/boot/dts/imx51.dtsi   |  4 ++--
 arch/arm/boot/dts/imx53.dtsi   |  4 ++--
 arch/arm/boot/dts/imx6qdl.dtsi | 12 ++++++------
 arch/arm/boot/dts/imx6sl.dtsi  | 12 ++++++------
 4 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/arch/arm/boot/dts/imx51.dtsi b/arch/arm/boot/dts/imx51.dtsi
index 54cee65..1a71eac 100644
--- a/arch/arm/boot/dts/imx51.dtsi
+++ b/arch/arm/boot/dts/imx51.dtsi
@@ -154,8 +154,8 @@
 					reg = <0x70014000 0x4000>;
 					interrupts = <30>;
 					clocks = <&clks 49>;
-					dmas = <&sdma 24 1 0>,
-					       <&sdma 25 1 0>;
+					dmas = <&sdma 24 22 0>,
+					       <&sdma 25 22 0>;
 					dma-names = "rx", "tx";
 					fsl,fifo-depth = <15>;
 					fsl,ssi-dma-events = <25 24 23 22>; /* TX0 RX0 TX1 RX1 */
diff --git a/arch/arm/boot/dts/imx53.dtsi b/arch/arm/boot/dts/imx53.dtsi
index 4307e80..7208fde 100644
--- a/arch/arm/boot/dts/imx53.dtsi
+++ b/arch/arm/boot/dts/imx53.dtsi
@@ -153,8 +153,8 @@
 					reg = <0x50014000 0x4000>;
 					interrupts = <30>;
 					clocks = <&clks 49>;
-					dmas = <&sdma 24 1 0>,
-					       <&sdma 25 1 0>;
+					dmas = <&sdma 24 22 0>,
+					       <&sdma 25 22 0>;
 					dma-names = "rx", "tx";
 					fsl,fifo-depth = <15>;
 					fsl,ssi-dma-events = <25 24 23 22>; /* TX0 RX0 TX1 RX1 */
diff --git a/arch/arm/boot/dts/imx6qdl.dtsi b/arch/arm/boot/dts/imx6qdl.dtsi
index 57e9c38..6e096ca 100644
--- a/arch/arm/boot/dts/imx6qdl.dtsi
+++ b/arch/arm/boot/dts/imx6qdl.dtsi
@@ -223,8 +223,8 @@
 					reg = <0x02028000 0x4000>;
 					interrupts = <0 46 0x04>;
 					clocks = <&clks 178>;
-					dmas = <&sdma 37 1 0>,
-					       <&sdma 38 1 0>;
+					dmas = <&sdma 37 22 0>,
+					       <&sdma 38 22 0>;
 					dma-names = "rx", "tx";
 					fsl,fifo-depth = <15>;
 					fsl,ssi-dma-events = <38 37>;
@@ -236,8 +236,8 @@
 					reg = <0x0202c000 0x4000>;
 					interrupts = <0 47 0x04>;
 					clocks = <&clks 179>;
-					dmas = <&sdma 41 1 0>,
-					       <&sdma 42 1 0>;
+					dmas = <&sdma 41 22 0>,
+					       <&sdma 42 22 0>;
 					dma-names = "rx", "tx";
 					fsl,fifo-depth = <15>;
 					fsl,ssi-dma-events = <42 41>;
@@ -249,8 +249,8 @@
 					reg = <0x02030000 0x4000>;
 					interrupts = <0 48 0x04>;
 					clocks = <&clks 180>;
-					dmas = <&sdma 45 1 0>,
-					       <&sdma 46 1 0>;
+					dmas = <&sdma 45 22 0>,
+					       <&sdma 46 22 0>;
 					dma-names = "rx", "tx";
 					fsl,fifo-depth = <15>;
 					fsl,ssi-dma-events = <46 45>;
diff --git a/arch/arm/boot/dts/imx6sl.dtsi b/arch/arm/boot/dts/imx6sl.dtsi
index c46651e..b32ba99 100644
--- a/arch/arm/boot/dts/imx6sl.dtsi
+++ b/arch/arm/boot/dts/imx6sl.dtsi
@@ -195,8 +195,8 @@
 					reg = <0x02028000 0x4000>;
 					interrupts = <0 46 0x04>;
 					clocks = <&clks IMX6SL_CLK_SSI1>;
-					dmas = <&sdma 37 1 0>,
-					       <&sdma 38 1 0>;
+					dmas = <&sdma 37 22 0>,
+					       <&sdma 38 22 0>;
 					dma-names = "rx", "tx";
 					fsl,fifo-depth = <15>;
 					status = "disabled";
@@ -207,8 +207,8 @@
 					reg = <0x0202c000 0x4000>;
 					interrupts = <0 47 0x04>;
 					clocks = <&clks IMX6SL_CLK_SSI2>;
-					dmas = <&sdma 41 1 0>,
-					       <&sdma 42 1 0>;
+					dmas = <&sdma 41 22 0>,
+					       <&sdma 42 22 0>;
 					dma-names = "rx", "tx";
 					fsl,fifo-depth = <15>;
 					status = "disabled";
@@ -219,8 +219,8 @@
 					reg = <0x02030000 0x4000>;
 					interrupts = <0 48 0x04>;
 					clocks = <&clks IMX6SL_CLK_SSI3>;
-					dmas = <&sdma 45 1 0>,
-					       <&sdma 46 1 0>;
+					dmas = <&sdma 45 22 0>,
+					       <&sdma 46 22 0>;
 					dma-names = "rx", "tx";
 					fsl,fifo-depth = <15>;
 					status = "disabled";
-- 
1.8.4

^ permalink raw reply related

* Re: [PATCH v3 0/2] powerpc kvm: fix deadlock scene
From: Paul Mackerras @ 2013-11-08 10:58 UTC (permalink / raw)
  To: Liu Ping Fan; +Cc: linuxppc-dev, Alexander Graf, kvm-ppc
In-Reply-To: <1383895794-16164-1-git-send-email-pingfank@linux.vnet.ibm.com>

On Fri, Nov 08, 2013 at 03:29:52PM +0800, Liu Ping Fan wrote:
> v2->v3:
>   introduce kvmppc_hv_unlock_hpte() to pair with kvmppc_hv_find_lock_hpte()
>   and hide the preemption detail inside this pair from the callers

Actually, I preferred v2.  This version seems a bit over-engineered.

Making a kvmppc_hv_unlock_hpte() is not such a bad idea, though I
would make it identical to the existing unlock_hpte() from
book3s_hv_rm_mmu.c, just in a header.  I'm really not convinced about
putting the preempt_disable/enable inside the lock/unlock functions,
with the consequent need to pass in a 'vmode' parameter, given that
there is just one caller that needs to do the preempt_disable/enable.

Regards,
Paul.

^ permalink raw reply

* Re: [PATCH] powerpc: kvm: optimize "sc 0" as fast return
From: Paul Mackerras @ 2013-11-08 11:12 UTC (permalink / raw)
  To: Liu Ping Fan; +Cc: linuxppc-dev, Alexander Graf, kvm-ppc
In-Reply-To: <1383878656-4196-1-git-send-email-pingfank@linux.vnet.ibm.com>

On Fri, Nov 08, 2013 at 10:44:16AM +0800, Liu Ping Fan wrote:
> syscall is a very common behavior inside guest, and this patch
> optimizes the path for the emulation of BOOK3S_INTERRUPT_SYSCALL,
> so hypervisor can return to guest without heavy exit, i.e, no need
> to swap TLB, HTAB,.. etc

Many interrupts that are caused by guest code go directly to the guest
and don't come to the hypervisor at all.  That includes system call
(sc 0), alignment interrupts, program interrupts, SLB miss interrupts,
etc.  See section 6.5 of Book 3S of the Power ISA specification; all
the interrupts with '-' in the 'HV' column of the table there get
delivered directly to the guest when they occur inside a guest.

> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> @@ -1388,7 +1388,8 @@ kvmppc_hisi:
>  hcall_try_real_mode:
>  	ld	r3,VCPU_GPR(R3)(r9)
>  	andi.	r0,r11,MSR_PR
> -	bne	guest_exit_cont
> +	/* sc 1 from userspace - reflect to guest syscall */
> +	bne	sc_0_fast_return

Discrepancy between comment and code here.  In fact we would only take
the branch for a sc 1 instruction in userspace, which occurs when a PR
KVM guest nested inside a HV KVM guest does a hypercall (i.e., not for
normal system calls).  It is probably worthwhile to speed those up.

> +sc_0_fast_return:
> +	ld	r10,VCPU_PC(r9)
> +	ld	r11,VCPU_MSR(r9)

r11 must already contain this since you just did andi. r0,r11,MSR_PR.
In fact r10 already contains VCPU_PC(r9) at this point also, though
that is not so obvious.

> +	mtspr	SPRN_SRR0,r10
> +	mtspr	SPRN_SRR1,r11
> +	li	r10, BOOK3S_INTERRUPT_SYSCALL
> +	LOAD_REG_IMMEDIATE(r3,0xffffffff87a0ffff)	/* zero 33:36,42:47 */
> +	and	r11,r11,r3

This is not correct, since you don't even clear PR.  In fact what you
need is to load up MSR_SF | MSR_ME, though that value changes with
little-endian support and changes again with transactional memory
support for POWER8.  There is an idiom for loading that MSR value,
which is:

	li	r11, (MSR_ME << 1) | 1	/* synthesize MSR_SF | MSR_ME */
	rotldi	r11, r11, 63

which you could use for now, but it will need to be changed when
Anton's LE patch gets accepted.

Paul.

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox