* Re: [PATCH 1/1] powerpc: mpc512x_dma: add support for data transfers between memory and i/o memory
From: Anatolij Gustschin @ 2013-05-03 13:52 UTC (permalink / raw)
To: Alexander Popov; +Cc: Vinod Koul, linuxppc-dev, linux-kernel, Dan Williams
In-Reply-To: <CAF0T0X5cRMvoZKQQUkQHTLjbzz1xi_1pvUiywsKb5BGvyMGvpg@mail.gmail.com>
Hello Alexander,
On Fri, 3 May 2013 14:43:23 +0400
Alexander Popov <a13xp0p0v88@gmail.com> wrote:
> Hello Anatolij,
>
> > Note that there is a patch for .device_prep_slave_sg() operation
> > for this driver as part of this series:
> > https://patchwork.kernel.org/patch/2368581/
> > https://patchwork.kernel.org/patch/2368591/
> Thanks, I haven't seen that patch.
> It's certainly what my SCLPC device driver needs
> (http://patchwork.ozlabs.org/patch/241010/).
> I will send the second version of it which uses .device_prep_slave_sg().
>
> > maybe you can reuse and improve it.
> > Anatolij
> Should I propose my additions at https://patchwork.kernel.org/patch/2368591/ ?
Yes, I think so. I only used drivers new .device_prep_slave_sg()
for SDHC DMA channel transfers up to now. Adding support for other
peripherals would be good. With generic DMA DT bindings patch for
this driver you can use
dmas = <&dma0 26>;
dma-names = "rx-tx";
in your sclpc@10100 DT node and then
dma_request_slave_channel(&pdev->dev, "rx-tx") in the lpbfifo
driver.
Thanks,
Anatolij
^ permalink raw reply
* Re: [PATCH net-next] af_unix: fix a fatal race with bit fields
From: Eric Dumazet @ 2013-05-03 14:14 UTC (permalink / raw)
To: Alan Modra
Cc: netdev, Ambrose Feinstein, Paul Mackerras, Anton Blanchard,
linuxppc-dev, David Miller
In-Reply-To: <20130503013136.GN5221@bubble.grove.modra.org>
On Fri, 2013-05-03 at 11:01 +0930, Alan Modra wrote:
> On Tue, Apr 30, 2013 at 10:04:32PM -0700, Eric Dumazet wrote:
> > These kind of errors are pretty hard to find, its a pity to spend time
> > on them.
>
> Well, yes. From the first comment in gcc PR52080. "For the following
> testcase we generate a 8 byte RMW cycle on IA64 which causes locking
> problems in the linux kernel btrfs filesystem."
>
> Did someone fix btrfs, but not check other kernel locks? Having now
> hit the same problem again, have you checked that other kernel locks
> don't have adjacent bit fields in the same 64-bit word? And comment
> the struct to ensure someone doesn't optimize those unsigned chars
> back to bit fields.
Not only spinlock, but atomic_t followed by bit fields.
BTW, if a spinlock is followed by bit fields, but bit fields
only changed when this spinlock is held, there is no problem, unless
spinlock is a ticket spinlock.
In af_unix, bug happens because the bit fields were changed without
spinlock being held (another global spinlock is used instead)
(ppc64 doesnt use ticket spinlocks yet)
^ permalink raw reply
* RE: [PATCH net-next] af_unix: fix a fatal race with bit fields
From: David Laight @ 2013-05-03 14:29 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Eric Dumazet
Cc: netdev, linuxppc-dev, Paul Mackerras, David Miller,
Ambrose Feinstein
In-Reply-To: <1367372393.22115.6.camel@pasglop>
PiA+IENvdWxkIHBwYzY0IGV4cGVydHMgY29uZmlybSB1c2luZyBieXRlIGlzIHNhZmUsIG9yIHNo
b3VsZCB3ZSByZWFsbHkgYWRkDQo+ID4gYSAzMmJpdCBob2xlIGFmdGVyIHRoZSBzcGlubG9jayA/
IElmIHNvLCBJIHdvbmRlciBob3cgbWFueSBvdGhlciBwbGFjZXMNCj4gPiBuZWVkIGEgY2hhbmdl
Li4uDQouLi4NCj4gQWxzbyBJJ2QgYmUgc3VycHJpc2VkIGlmIHBwYzY0IGlzIHRoZSBvbmx5IG9u
ZSB3aXRoIHRoYXQgcHJvYmxlbS4uLiB3aGF0DQo+IGFib3V0IHNwYXJjNjQgYW5kIGFybTY0ID8N
Cg0KRXZlbiB4ODYgY291bGQgYmUgYWZmZWN0ZWQuDQpUaGUgd2lkdGggb2YgdGhlIG1lbW9yeSBj
eWNsZXMgdXNlZCBieSB0aGUgJ2JpdCBzZXQgYW5kIGJpdCBjbGVhcicNCmluc3RydWN0aW9ucyBp
c24ndCBkb2N1bWVudGVkLiBUaGV5IGFyZSBjZXJ0YWlubHkgYWxsb3dlZCB0byBkbw0KUk1XIG9u
IGFkamFjZW50IGJ5dGVzLg0KSSBkb24ndCByZW1lbWJlciB3aGV0aGVyIHRoZXkgYXJlIGNvbnN0
cmFpbmVkIHRvIG9ubHkgZG8NCjMyYml0IGFjY2Vzc2VzLCBidXQgbm90aGluZyB1c2VkIHRvIHNh
eSB0aGF0IHRoZXkgd291bGRuJ3QNCmRvIDMyYml0IG1pc2FsaWduZWQgb25lcyEgKGFsdGhvdWdo
IEkgc3VzcGVjdCB0aGV5IG5ldmVyIGhhdmUpLg0KDQoJRGF2aWQNCg0K
^ permalink raw reply
* RE: [PATCH net-next] af_unix: fix a fatal race with bit fields
From: Eric Dumazet @ 2013-05-03 15:02 UTC (permalink / raw)
To: David Laight
Cc: Ambrose Feinstein, Paul Mackerras, netdev, linuxppc-dev,
David Miller
In-Reply-To: <AE90C24D6B3A694183C094C60CF0A2F6026B7226@saturn3.aculab.com>
On Fri, 2013-05-03 at 15:29 +0100, David Laight wrote:
> > > Could ppc64 experts confirm using byte is safe, or should we really add
> > > a 32bit hole after the spinlock ? If so, I wonder how many other places
> > > need a change...
> ...
> > Also I'd be surprised if ppc64 is the only one with that problem... what
> > about sparc64 and arm64 ?
>
> Even x86 could be affected.
> The width of the memory cycles used by the 'bit set and bit clear'
> instructions isn't documented. They are certainly allowed to do
> RMW on adjacent bytes.
> I don't remember whether they are constrained to only do
> 32bit accesses, but nothing used to say that they wouldn't
> do 32bit misaligned ones! (although I suspect they never have).
x86 is not affected (or else we would have found the bug much earlier)
Setting 1-bit field to one/zero uses OR/AND instructions.
orb $4,724(%reg)
doesn't load/store 64bits but 8bits.
^ permalink raw reply
* RE: [PATCH net-next] af_unix: fix a fatal race with bit fields
From: David Laight @ 2013-05-03 15:44 UTC (permalink / raw)
To: Eric Dumazet
Cc: Ambrose Feinstein, Paul Mackerras, netdev, linuxppc-dev,
David Miller
In-Reply-To: <1367593356.29805.37.camel@edumazet-glaptop>
PiA+ID4gQWxzbyBJJ2QgYmUgc3VycHJpc2VkIGlmIHBwYzY0IGlzIHRoZSBvbmx5IG9uZSB3aXRo
IHRoYXQgcHJvYmxlbS4uLiB3aGF0DQo+ID4gPiBhYm91dCBzcGFyYzY0IGFuZCBhcm02NCA/DQo+
ID4NCj4gPiBFdmVuIHg4NiBjb3VsZCBiZSBhZmZlY3RlZC4NCj4gPiBUaGUgd2lkdGggb2YgdGhl
IG1lbW9yeSBjeWNsZXMgdXNlZCBieSB0aGUgJ2JpdCBzZXQgYW5kIGJpdCBjbGVhcicNCj4gPiBp
bnN0cnVjdGlvbnMgaXNuJ3QgZG9jdW1lbnRlZC4gVGhleSBhcmUgY2VydGFpbmx5IGFsbG93ZWQg
dG8gZG8NCj4gPiBSTVcgb24gYWRqYWNlbnQgYnl0ZXMuDQo+ID4gSSBkb24ndCByZW1lbWJlciB3
aGV0aGVyIHRoZXkgYXJlIGNvbnN0cmFpbmVkIHRvIG9ubHkgZG8NCj4gPiAzMmJpdCBhY2Nlc3Nl
cywgYnV0IG5vdGhpbmcgdXNlZCB0byBzYXkgdGhhdCB0aGV5IHdvdWxkbid0DQo+ID4gZG8gMzJi
aXQgbWlzYWxpZ25lZCBvbmVzISAoYWx0aG91Z2ggSSBzdXNwZWN0IHRoZXkgbmV2ZXIgaGF2ZSku
DQo+IA0KPiB4ODYgaXMgbm90IGFmZmVjdGVkIChvciBlbHNlIHdlIHdvdWxkIGhhdmUgZm91bmQg
dGhlIGJ1ZyBtdWNoIGVhcmxpZXIpDQo+IA0KPiBTZXR0aW5nIDEtYml0IGZpZWxkIHRvIG9uZS96
ZXJvIHVzZXMgT1IvQU5EIGluc3RydWN0aW9ucy4NCj4gDQo+IG9yYiAgJDQsNzI0KCVyZWcpDQo+
IA0KPiBkb2Vzbid0IGxvYWQvc3RvcmUgNjRiaXRzIGJ1dCA4Yml0cy4NCg0KSSB3YXMgdGhpbmtp
bmcgb2YgY29kZSB0aGF0IG1pZ2h0IGJlIHVzaW5nIEJULCBCVEMsIEJUUiBvciBCVFMuDQpUaGVz
ZSBhcmUgcHJvYmFibHkgdXNlZCB3aXRoIHRoZSAnbG9jaycgcHJlZml4IC0gd2hpY2ggd291bGQN
CihJIHRoaW5rKSBtYWtlIHRoZW0gc2FmZS4NCg0KVGhlIGRvY3VtZW50ZWQgY29uc3RyYWludCBp
cyBtb3JlIHNwZWNpZmljIHRoYW4gaXQgdXNlZCB0byBiZQ0KdGhlIEludGVsIHZlcnNpb24gcmVh
ZHM6DQoNCiAgICBXaGVuIGFjY2Vzc2luZyBhIGJpdCBpbiBtZW1vcnksIHRoZSBwcm9jZXNzb3Ig
bWF5IGFjY2VzcyA0IGJ5dGVzDQogICAgc3RhcnRpbmcgZnJvbSB0aGUgbWVtb3J5IGFkZHJlc3Mg
Zm9yIGEgMzItYml0IG9wZXJhbmQgc2l6ZSwgdXNpbmcNCiAgICBieSB0aGUgZm9sbG93aW5nIHJl
bGF0aW9uc2hpcDoNCiAgICAgICAgRWZmZWN0aXZlIEFkZHJlc3MgKyAoNCDiiJcgKEJpdE9mZnNl
dCBESVYgMzIpKQ0KICAgIE9yLCBpdCBtYXkgYWNjZXNzIDIgYnl0ZXMgc3RhcnRpbmcgZnJvbSB0
aGUgbWVtb3J5IGFkZHJlc3MgZm9yIGENCiAgICAxNi1iaXQgb3BlcmFuZCwgdXNpbmcgdGhpcyBy
ZWxhdGlvbnNoaXA6DQogICAgICAgIEVmZmVjdGl2ZSBBZGRyZXNzICsgKDIg4oiXIChCaXRPZmZz
ZXQgRElWIDE2KSkNCiAgICBJdCBtYXkgZG8gc28gZXZlbiB3aGVuIG9ubHkgYSBzaW5nbGUgYnl0
ZSBuZWVkcyB0byBiZSBhY2Nlc3NlZCB0bw0KICAgIHJlYWNoIHRoZSBnaXZlbiBiaXQuDQogICAg
V2hlbiB1c2luZyB0aGlzIGJpdCBhZGRyZXNzaW5nIG1lY2hhbmlzbSwgc29mdHdhcmUgc2hvdWxk
IGF2b2lkDQogICAgcmVmZXJlbmNpbmcgYXJlYXMgb2YgbWVtb3J5IGNsb3NlIHRvIGFkZHJlc3Mg
c3BhY2UgaG9sZXMuDQogICAgSW4gcGFydGljdWxhciwgaXQgc2hvdWxkIGF2b2lkIHJlZmVyZW5j
ZXMgdG8gbWVtb3J5LW1hcHBlZCBJL08gcmVnaXN0ZXJzLg0KICAgIEluc3RlYWQsIHNvZnR3YXJl
IHNob3VsZCB1c2UgdGhlIE1PViBpbnN0cnVjdGlvbnMgdG8gbG9hZCBmcm9tIG9yIHN0b3JlDQog
ICAgdG8gdGhlc2UgYWRkcmVzc2VzLCBhbmQgdXNlIHRoZSByZWdpc3RlciBmb3JtIG9mIHRoZXNl
IGluc3RydWN0aW9ucyB0bw0KICAgIG1hbmlwdWxhdGUgdGhlIGRhdGEuDQogICAgSW4gNjQtYml0
IG1vZGUsIHRoZSBpbnN0cnVjdGlvbuKAmXMgZGVmYXVsdCBvcGVyYXRpb24gc2l6ZSBpcyAzMiBi
aXRzLg0KICAgIFVzaW5nIGEgUkVYIHByZWZpeCBpbiB0aGUgZm9ybSBvZiBSRVguUiBwZXJtaXRz
IGFjY2VzcyB0byBhZGRpdGlvbmFsDQogICAgcmVnaXN0ZXJzIChSOC1SMTUpLiBVc2luZyBhIFJF
WCBwcmVmaXggaW4gdGhlIGZvcm0gb2YgUkVYLlcgcHJvbW90ZXMNCiAgICBvcGVyYXRpb24gdG8g
NjQgYml0IG9wZXJhbmRzLg0KDQoJRGF2aWQNCg0KDQo=
^ permalink raw reply
* [PATCH] KVM: PPC: Book3E 64: Fix IRQs warnings and hangs
From: Mihai Caraman @ 2013-05-03 16:11 UTC (permalink / raw)
To: kvm-ppc; +Cc: Mihai Caraman, linuxppc-dev, kvm
A change in the generic code highlighted that we were running with IRQs
(soft) enabled on Book3E 64-bit when trying to restart interrupts from
handle_exit(). This is a lesson to configure lockdep often :)
There is no reason to exit guest with soft_enabled == 1, a local_irq_enable()
call will do this for us so get rid of kvmppc_layz_ee() calls. With this fix
we eliminate irqs_disabled() warnings and some guest and host hangs revealed
under stress tests, but guests still exhibit some unresponsiveness.
The unresponsiveness has to do with the fact that arch_local_irq_restore()
does not guarantees to hard enable interrupts. To do so replace exception
function calls like timer_interrupt() with irq_happened flags. The
local_irq_enable() call takes care of replaying them and lets the interrupts
hard enabled.
Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
---
arch/powerpc/kvm/booke.c | 9 +++------
1 files changed, 3 insertions(+), 6 deletions(-)
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 1020119..82f155e 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -673,7 +673,6 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
ret = s;
goto out;
}
- kvmppc_lazy_ee_enable();
kvm_guest_enter();
@@ -789,16 +788,16 @@ static void kvmppc_restart_interrupt(struct kvm_vcpu *vcpu,
switch (exit_nr) {
case BOOKE_INTERRUPT_EXTERNAL:
kvmppc_fill_pt_regs(®s);
- do_IRQ(®s);
+ local_paca->irq_happened |= PACA_IRQ_EE;
break;
case BOOKE_INTERRUPT_DECREMENTER:
kvmppc_fill_pt_regs(®s);
- timer_interrupt(®s);
+ local_paca->irq_happened |= PACA_IRQ_DEC;
break;
#if defined(CONFIG_PPC_FSL_BOOK3E) || defined(CONFIG_PPC_BOOK3E_64)
case BOOKE_INTERRUPT_DOORBELL:
kvmppc_fill_pt_regs(®s);
- doorbell_exception(®s);
+ local_paca->irq_happened |= PACA_IRQ_DBELL;
break;
#endif
case BOOKE_INTERRUPT_MACHINE_CHECK:
@@ -1148,8 +1147,6 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
if (s <= 0) {
local_irq_enable();
r = (s << 2) | RESUME_HOST | (r & RESUME_FLAG_NV);
- } else {
- kvmppc_lazy_ee_enable();
}
}
--
1.7.4.1
^ permalink raw reply related
* Re: [PATCH] KVM: PPC: Book3E 64: Fix IRQs warnings and hangs
From: Alexander Graf @ 2013-05-03 16:24 UTC (permalink / raw)
To: Mihai Caraman; +Cc: linuxppc-dev, kvm@vger.kernel.org VIRTUAL MA..., kvm-ppc
In-Reply-To: <1367597470-22214-1-git-send-email-mihai.caraman@freescale.com>
On 03.05.2013, at 18:11, Mihai Caraman wrote:
> A change in the generic code highlighted that we were running with =
IRQs
> (soft) enabled on Book3E 64-bit when trying to restart interrupts from
> handle_exit(). This is a lesson to configure lockdep often :)
>=20
> There is no reason to exit guest with soft_enabled =3D=3D 1, a =
local_irq_enable()
> call will do this for us so get rid of kvmppc_layz_ee() calls. With =
this fix
> we eliminate irqs_disabled() warnings and some guest and host hangs =
revealed
> under stress tests, but guests still exhibit some unresponsiveness.
>=20
> The unresponsiveness has to do with the fact that =
arch_local_irq_restore()
> does not guarantees to hard enable interrupts. To do so replace =
exception
> function calls like timer_interrupt() with irq_happened flags. The
> local_irq_enable() call takes care of replaying them and lets the =
interrupts
> hard enabled.
>=20
> Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Ben, could you please review?
Alex
> ---
> arch/powerpc/kvm/booke.c | 9 +++------
> 1 files changed, 3 insertions(+), 6 deletions(-)
>=20
> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
> index 1020119..82f155e 100644
> --- a/arch/powerpc/kvm/booke.c
> +++ b/arch/powerpc/kvm/booke.c
> @@ -673,7 +673,6 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, =
struct kvm_vcpu *vcpu)
> ret =3D s;
> goto out;
> }
> - kvmppc_lazy_ee_enable();
>=20
> kvm_guest_enter();
>=20
> @@ -789,16 +788,16 @@ static void kvmppc_restart_interrupt(struct =
kvm_vcpu *vcpu,
> switch (exit_nr) {
> case BOOKE_INTERRUPT_EXTERNAL:
> kvmppc_fill_pt_regs(®s);
> - do_IRQ(®s);
> + local_paca->irq_happened |=3D PACA_IRQ_EE;
> break;
> case BOOKE_INTERRUPT_DECREMENTER:
> kvmppc_fill_pt_regs(®s);
> - timer_interrupt(®s);
> + local_paca->irq_happened |=3D PACA_IRQ_DEC;
> break;
> #if defined(CONFIG_PPC_FSL_BOOK3E) || defined(CONFIG_PPC_BOOK3E_64)
> case BOOKE_INTERRUPT_DOORBELL:
> kvmppc_fill_pt_regs(®s);
> - doorbell_exception(®s);
> + local_paca->irq_happened |=3D PACA_IRQ_DBELL;
> break;
> #endif
> case BOOKE_INTERRUPT_MACHINE_CHECK:
> @@ -1148,8 +1147,6 @@ int kvmppc_handle_exit(struct kvm_run *run, =
struct kvm_vcpu *vcpu,
> if (s <=3D 0) {
> local_irq_enable();
> r =3D (s << 2) | RESUME_HOST | (r & =
RESUME_FLAG_NV);
> - } else {
> - kvmppc_lazy_ee_enable();
> }
> }
>=20
> --=20
> 1.7.4.1
>=20
>=20
> --
> To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: net/eth/ibmveth: Fixup retrieval of MAC address
From: Ben Hutchings @ 2013-05-03 16:30 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: netdev, linuxppc-dev, David Miller, David Gibson
In-Reply-To: <1367458520.4389.6.camel@pasglop>
On Thu, 2013-05-02 at 11:35 +1000, Benjamin Herrenschmidt wrote:
> Some ancient pHyp versions used to create a 8 bytes local-mac-address
> property in the device-tree instead of a 6 bytes one for veth.
>
> The Linux driver code to deal with that is an insane hack which also
> happens to break with some choices of MAC addresses in qemu by testing
> for a bit in the address rather than just looking at the size of the
> property.
>
> Sanitize this by doing the latter instead.
[...]
> @@ -1334,11 +1334,19 @@ static int ibmveth_probe(struct vio_dev *dev, const struct vio_device_id *id)
> dev->unit_address);
>
> mac_addr_p = (unsigned char *)vio_get_attribute(dev, VETH_MAC_ADDR,
> - NULL);
> + &mac_len);
> if (!mac_addr_p) {
> dev_err(&dev->dev, "Can't find VETH_MAC_ADDR attribute\n");
> return -EINVAL;
> }
> + /* Workaround for old/broken pHyp */
> + if (mac_len == 8)
> + mac_addr_p += 2;
> + if (mac_len != 6) {
Missing 'else' before the second if?
> + dev_err(&dev->dev, "VETH_MAC_ADDR attribute wrong len %d\n",
> + mac_len);
> + return -EINVAL;
> + }
[...]
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
^ permalink raw reply
* Re: [PATCH] KVM: PPC: Book3E 64: Fix IRQs warnings and hangs
From: Scott Wood @ 2013-05-03 18:04 UTC (permalink / raw)
To: Mihai Caraman; +Cc: Mihai Caraman, linuxppc-dev, kvm, kvm-ppc
In-Reply-To: <1367597470-22214-1-git-send-email-mihai.caraman@freescale.com>
On 05/03/2013 11:11:10 AM, Mihai Caraman wrote:
> A change in the generic code highlighted that we were running with =20
> IRQs
> (soft) enabled on Book3E 64-bit when trying to restart interrupts from
> handle_exit(). This is a lesson to configure lockdep often :)
>=20
> There is no reason to exit guest with soft_enabled =3D=3D 1, a =20
> local_irq_enable()
> call will do this for us so get rid of kvmppc_layz_ee() calls. With =20
> this fix
> we eliminate irqs_disabled() warnings and some guest and host hangs =20
> revealed
> under stress tests, but guests still exhibit some unresponsiveness.
>=20
> The unresponsiveness has to do with the fact that =20
> arch_local_irq_restore()
> does not guarantees to hard enable interrupts.
Could you elaborate? If the saved IRQ state was "enabled", why =20
wouldn't arch_local_irq_restore() hard-enable IRQs? The last thing it =20
does is __hard_irq_enable().
Where is the arch_local_irq_restore() instance you're talking about?
> To do so replace exception
> function calls like timer_interrupt() with irq_happened flags. The
> local_irq_enable() call takes care of replaying them and lets the =20
> interrupts
> hard enabled.
Not sure what you mean by "lets the interrupts hard enabled"... Do you =20
mean the EE bit in regs->msr, as opposed to the EE bit in the current =20
MSR?
> Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
> ---
> arch/powerpc/kvm/booke.c | 9 +++------
> 1 files changed, 3 insertions(+), 6 deletions(-)
>=20
> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
> index 1020119..82f155e 100644
> --- a/arch/powerpc/kvm/booke.c
> +++ b/arch/powerpc/kvm/booke.c
> @@ -673,7 +673,6 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, =20
> struct kvm_vcpu *vcpu)
> ret =3D s;
> goto out;
> }
> - kvmppc_lazy_ee_enable();
>=20
> kvm_guest_enter();
>=20
> @@ -789,16 +788,16 @@ static void kvmppc_restart_interrupt(struct =20
> kvm_vcpu *vcpu,
> switch (exit_nr) {
> case BOOKE_INTERRUPT_EXTERNAL:
> kvmppc_fill_pt_regs(®s);
> - do_IRQ(®s);
> + local_paca->irq_happened |=3D PACA_IRQ_EE;
> break;
> case BOOKE_INTERRUPT_DECREMENTER:
> kvmppc_fill_pt_regs(®s);
> - timer_interrupt(®s);
> + local_paca->irq_happened |=3D PACA_IRQ_DEC;
> break;
> #if defined(CONFIG_PPC_FSL_BOOK3E) || defined(CONFIG_PPC_BOOK3E_64)
> case BOOKE_INTERRUPT_DOORBELL:
> kvmppc_fill_pt_regs(®s);
> - doorbell_exception(®s);
> + local_paca->irq_happened |=3D PACA_IRQ_DBELL;
> break;
> #endif
Aren't you breaking 32-bit here?
-Scott=
^ permalink raw reply
* Re: [PATCH -V7 08/10] powerpc/THP: Enable THP on PPC64
From: Aneesh Kumar K.V @ 2013-05-03 18:49 UTC (permalink / raw)
To: David Gibson; +Cc: paulus, linuxppc-dev, linux-mm
In-Reply-To: <20130503051528.GT13041@truffula.fritz.box>
David Gibson <dwg@au1.ibm.com> writes:
> On Mon, Apr 29, 2013 at 01:21:49AM +0530, Aneesh Kumar K.V wrote:
>> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
>>
>> We enable only if the we support 16MB page size.
>>
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
>> ---
>> arch/powerpc/include/asm/pgtable-ppc64.h | 3 +--
>> arch/powerpc/mm/pgtable_64.c | 28 ++++++++++++++++++++++++++++
>> 2 files changed, 29 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h b/arch/powerpc/include/asm/pgtable-ppc64.h
>> index 97fc839..d65534b 100644
>> --- a/arch/powerpc/include/asm/pgtable-ppc64.h
>> +++ b/arch/powerpc/include/asm/pgtable-ppc64.h
>> @@ -426,8 +426,7 @@ static inline unsigned long pmd_pfn(pmd_t pmd)
>> return pmd_val(pmd) >> PTE_RPN_SHIFT;
>> }
>>
>> -/* We will enable it in the last patch */
>> -#define has_transparent_hugepage() 0
>> +extern int has_transparent_hugepage(void);
>> #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
>>
>> static inline int pmd_young(pmd_t pmd)
>> diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
>> index 54216c1..b742d6f 100644
>> --- a/arch/powerpc/mm/pgtable_64.c
>> +++ b/arch/powerpc/mm/pgtable_64.c
>> @@ -754,6 +754,34 @@ void update_mmu_cache_pmd(struct vm_area_struct *vma, unsigned long addr,
>> return;
>> }
>>
>> +int has_transparent_hugepage(void)
>> +{
>> + if (!mmu_has_feature(MMU_FTR_16M_PAGE))
>> + return 0;
>> + /*
>> + * We support THP only if HPAGE_SHIFT is 16MB.
>> + */
>> + if (!HPAGE_SHIFT || (HPAGE_SHIFT != mmu_psize_defs[MMU_PAGE_16M].shift))
>> + return 0;
>
> Again, THP should not be dependent on the value of HPAGE_SHIFT. Just
> checking that mmu_psize_defsz[MMU_PAGE_16M].shift == 24 should be
> sufficient (i.e. that 16M hugepages are supported).
done
+ /*
+ * We support THP only if PMD_SIZE is 16MB.
+ */
+ if (mmu_psize_defs[MMU_PAGE_16M].shift != PMD_SHIFT)
+ return 0;
+ /*
>
>> + /*
>> + * We need to make sure that we support 16MB hugepage in a segement
>> + * with base page size 64K or 4K. We only enable THP with a PAGE_SIZE
>> + * of 64K.
>> + */
>> + /*
>> + * If we have 64K HPTE, we will be using that by default
>> + */
>> + if (mmu_psize_defs[MMU_PAGE_64K].shift &&
>> + (mmu_psize_defs[MMU_PAGE_64K].penc[MMU_PAGE_16M] == -1))
>> + return 0;
>> + /*
>> + * Ok we only have 4K HPTE
>> + */
>> + if (mmu_psize_defs[MMU_PAGE_4K].penc[MMU_PAGE_16M] == -1)
>> + return 0;
>
> Except you don't actually support THP on 4K base page size yet.
That is 64K linux page size and 4K HPTE . We do support that. The Linux
page size part is taken care by Kconfig.
>
>> +
>> + return 1;
>> +}
>> #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
>>
>> pmd_t pmdp_get_and_clear(struct mm_struct *mm,
>
-aneesh
^ permalink raw reply
* Re: [PATCH -V7 01/18] mm/THP: HPAGE_SHIFT is not a #define on some arch
From: Aneesh Kumar K.V @ 2013-05-03 18:51 UTC (permalink / raw)
To: David Gibson; +Cc: linuxppc-dev, paulus, linux-mm
In-Reply-To: <20130430050101.GY20202@truffula.fritz.box>
David Gibson <dwg@au1.ibm.com> writes:
> On Tue, Apr 30, 2013 at 09:12:09AM +0530, Aneesh Kumar K.V wrote:
>> David Gibson <dwg@au1.ibm.com> writes:
>>
>> > On Mon, Apr 29, 2013 at 01:07:22AM +0530, Aneesh Kumar K.V wrote:
>> >> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
>> >>
>> >> On archs like powerpc that support different hugepage sizes, HPAGE_SHIFT
>> >> and other derived values like HPAGE_PMD_ORDER are not constants. So move
>> >> that to hugepage_init
>> >
>> > These seems to miss the point. Those variables may be defined in
>> > terms of HPAGE_SHIFT right now, but that is of itself kind of broken.
>> > The transparent hugepage mechanism only works if the hugepage size is
>> > equal to the PMD size - and PMD_SHIFT remains a compile time constant.
>> >
>> > There's no reason having transparent hugepage should force the PMD
>> > size of hugepage to be the default for other purposes - it should be
>> > possible to do THP as long as PMD-sized is a possible hugepage size.
>> >
>>
>> THP code does
>>
>> #define HPAGE_PMD_SHIFT HPAGE_SHIFT
>> #define HPAGE_PMD_MASK HPAGE_MASK
>> #define HPAGE_PMD_SIZE HPAGE_SIZE
>>
>> I had two options, one to move all those in terms of PMD_SHIFT
>
> This is a much better option that you've taken now, and really
> shouldn't be that hard. The THP code is much more strongly tied to
> the fact that it is a PMD than the fact that it's the same size as
> explicit huge pages.
>
Ok I tried the above and that turned out to be much simpler. I will have
to make sure i didn't break other archs that support THP.
-aneesh
^ permalink raw reply
* Re: [PATCH -V7 02/10] powerpc/THP: Implement transparent hugepages for ppc64
From: Aneesh Kumar K.V @ 2013-05-03 18:54 UTC (permalink / raw)
To: David Gibson, Benjamin Herrenschmidt; +Cc: linux-mm, linuxppc-dev, paulus
In-Reply-To: <20130503115428.GW13041@truffula.fritz.box>
David Gibson <dwg@au1.ibm.com> writes:
> On Fri, May 03, 2013 at 06:19:03PM +1000, Benjamin Herrenschmidt wrote:
>> On Fri, 2013-05-03 at 14:52 +1000, David Gibson wrote:
>> > Here, specifically, the fact that PAGE_BUSY is in PAGE_THP_HPTEFLAGS
>> > is likely to be bad. If the page is busy, it's in the middle of
>> > update so can't stably be considered the same as anything.
>>
>> _PAGE_BUSY is more like a read lock. It means it's being hashed, so what
>> is not stable is _PAGE_HASHPTE, slot index, _ACCESSED and _DIRTY. The
>> rest is stable and usually is what pmd_same looks at (though I have a
>> small doubt vs. _ACCESSED and _DIRTY but at least x86 doesn't care since
>> they are updated by HW).
>
> Ok. It still seems very odd to me that _PAGE_BUSY would be in the THP
> version of _PAGE_HASHPTE, but not the normal one.
>
64-4k definition:
/* PTE flags to conserve for HPTE identification */
#define _PAGE_HPTEFLAGS (_PAGE_BUSY | _PAGE_HASHPTE | \
_PAGE_SECONDARY | _PAGE_GROUP_IX)
64-64K definition:
/* PTE flags to conserve for HPTE identification */
#define _PAGE_HPTEFLAGS (_PAGE_BUSY | _PAGE_HASHPTE | _PAGE_COMBO)
BTW I have dropped that change in my current patch. I dropped the
usage of _PAGE_COMBO and instead started using _PAGE_4K_PFN for
identifying THP.That enabled me to use _PAGE_HPTEFLAGS as it is.
-aneesh
^ permalink raw reply
* Re: [PATCH -V7 04/10] powerpc: Update find_linux_pte_or_hugepte to handle transparent hugepages
From: Aneesh Kumar K.V @ 2013-05-03 18:58 UTC (permalink / raw)
To: David Gibson; +Cc: paulus, linuxppc-dev, linux-mm
In-Reply-To: <20130503045323.GP13041@truffula.fritz.box>
David Gibson <dwg@au1.ibm.com> writes:
> On Mon, Apr 29, 2013 at 01:21:45AM +0530, Aneesh Kumar K.V wrote:
>> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
>
> What's the difference in meaning between pmd_huge() and pmd_large()?
>
#ifndef CONFIG_HUGETLB_PAGE
#define pmd_huge(x) 0
#endif
Also pmd_large do check for THP PTE flag, and _PAGE_PRESENT.
-aneesh
^ permalink raw reply
* Re: [PATCH -V7 09/10] powerpc: Optimize hugepage invalidate
From: Aneesh Kumar K.V @ 2013-05-03 19:05 UTC (permalink / raw)
To: David Gibson; +Cc: paulus, linuxppc-dev, linux-mm
In-Reply-To: <20130503052846.GU13041@truffula.fritz.box>
David Gibson <dwg@au1.ibm.com> writes:
> On Mon, Apr 29, 2013 at 01:21:50AM +0530, Aneesh Kumar K.V wrote:
>> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
>>
>> Hugepage invalidate involves invalidating multiple hpte entries.
>> Optimize the operation using H_BULK_REMOVE on lpar platforms.
>> On native, reduce the number of tlb flush.
>>
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
>
> Since this is purely an optimization, have you tried reproducing the
> bugs you're chasing with this patch not included?
That was due to not handling thp split while walking page table. I have
that fixed. Will post the next version soon.
>
>> ---
>> arch/powerpc/include/asm/machdep.h | 3 +
>> arch/powerpc/mm/hash_native_64.c | 78 +++++++++++++++++++++
>> arch/powerpc/mm/pgtable_64.c | 13 +++-
>> arch/powerpc/platforms/pseries/lpar.c | 126 ++++++++++++++++++++++++++++++++--
>> 4 files changed, 210 insertions(+), 10 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
>> index 3f3f691..5d1e7d2 100644
>> --- a/arch/powerpc/include/asm/machdep.h
>> +++ b/arch/powerpc/include/asm/machdep.h
>> @@ -56,6 +56,9 @@ struct machdep_calls {
.....
>>
>> +/*
>> + * Limit iterations holding pSeries_lpar_tlbie_lock to 3. We also need
>> + * to make sure that we avoid bouncing the hypervisor tlbie lock.
>> + */
>> +#define PPC64_HUGE_HPTE_BATCH 12
>> +
>> +static void __pSeries_lpar_hugepage_invalidate(unsigned long *slot,
>> + unsigned long *vpn, int count,
>> + int psize, int ssize)
>> +{
>> + unsigned long param[9];
>
> [9]? I only see 8 elements being used.
cut paste error from pSeries_lpar_flush_hash_range
>
>> + int i = 0, pix = 0, rc;
>> + unsigned long flags = 0;
>> + int lock_tlbie = !mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE);
>> +
>> + if (lock_tlbie)
>> + spin_lock_irqsave(&pSeries_lpar_tlbie_lock, flags);
>
> Why are these hash operations being called with the tlbie lock held?
if the firmware doesn't support lockless TLBIE, we need to do locking
at the guest side. pSeries_lpar_flush_hash_range does that.
>
>> +
>> + for (i = 0; i < count; i++) {
>> +
>> + if (!firmware_has_feature(FW_FEATURE_BULK_REMOVE)) {
>> + pSeries_lpar_hpte_invalidate(slot[i], vpn[i], psize,
>> + ssize, 0);
>
> Couldn't you set the ppc_md hook based on the firmware request to
> avoid this test in the inner loop? I don't see any tlbie operations
> at all.
didn't get that.
>
>> + } else {
>> + param[pix] = HBR_REQUEST | HBR_AVPN | slot[i];
>> + param[pix+1] = hpte_encode_avpn(vpn[i], psize, ssize);
>> + pix += 2;
>> + if (pix == 8) {
>> + rc = plpar_hcall9(H_BULK_REMOVE, param,
>> + param[0], param[1], param[2],
>> + param[3], param[4], param[5],
>> + param[6], param[7]);
>> + BUG_ON(rc != H_SUCCESS);
>> + pix = 0;
>> + }
>> + }
>> + }
>> + if (pix) {
>> + param[pix] = HBR_END;
>> + rc = plpar_hcall9(H_BULK_REMOVE, param, param[0], param[1],
>> + param[2], param[3], param[4], param[5],
>> + param[6], param[7]);
>> + BUG_ON(rc != H_SUCCESS);
>> + }
>> +
>> + if (lock_tlbie)
>> + spin_unlock_irqrestore(&pSeries_lpar_tlbie_lock, flags);
>> +}
>> +
>> +static void pSeries_lpar_hugepage_invalidate(struct mm_struct *mm,
>> + unsigned char *hpte_slot_array,
>> + unsigned long addr, int psize)
>> +{
>> + int ssize = 0, i, index = 0;
>> + unsigned long s_addr = addr;
>> + unsigned int max_hpte_count, valid;
>> + unsigned long vpn_array[PPC64_HUGE_HPTE_BATCH];
>> + unsigned long slot_array[PPC64_HUGE_HPTE_BATCH];
>> + unsigned long shift, hidx, vpn = 0, vsid, hash, slot;
>> +
>> + shift = mmu_psize_defs[psize].shift;
>> + max_hpte_count = HUGE_PAGE_SIZE >> shift;
>> +
>> + for (i = 0; i < max_hpte_count; i++) {
>> + /*
>> + * 8 bits per each hpte entries
>> + * 000| [ secondary group (one bit) | hidx (3 bits) | valid bit]
>> + */
>> + valid = hpte_slot_array[i] & 0x1;
>> + if (!valid)
>> + continue;
>> + hidx = hpte_slot_array[i] >> 1;
>> +
>> + /* get the vpn */
>> + addr = s_addr + (i * (1ul << shift));
>> + if (!is_kernel_addr(addr)) {
>> + ssize = user_segment_size(addr);
>> + vsid = get_vsid(mm->context.id, addr, ssize);
>> + WARN_ON(vsid == 0);
>> + } else {
>> + vsid = get_kernel_vsid(addr, mmu_kernel_ssize);
>> + ssize = mmu_kernel_ssize;
>> + }
>> +
>> + vpn = hpt_vpn(addr, vsid, ssize);
>> + hash = hpt_hash(vpn, shift, ssize);
>> + if (hidx & _PTEIDX_SECONDARY)
>> + hash = ~hash;
>> +
>> + slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
>> + slot += hidx & _PTEIDX_GROUP_IX;
>> +
>> + slot_array[index] = slot;
>> + vpn_array[index] = vpn;
>> + if (index == PPC64_HUGE_HPTE_BATCH - 1) {
>> + /*
>> + * Now do a bluk invalidate
>> + */
>> + __pSeries_lpar_hugepage_invalidate(slot_array,
>> + vpn_array,
>> + PPC64_HUGE_HPTE_BATCH,
>> + psize, ssize);
>
> I don't really understand why you have one loop in this function, then
> another in the __ function.
?? if we didn't accumulate batch size number of entries, we won't call
the above. Hence we will have to do the bulk remove outside the if
loop.
>
>> + index = 0;
>> + } else
>> + index++;
>> + }
>> + if (index)
>> + __pSeries_lpar_hugepage_invalidate(slot_array, vpn_array,
>> + index, psize, ssize);
>> +}
>> +
>> static void pSeries_lpar_hpte_removebolted(unsigned long ea,
>> int psize, int ssize)
>> {
>> @@ -360,13 +478,6 @@ static void pSeries_lpar_hpte_removebolted(unsigned long ea,
>> pSeries_lpar_hpte_invalidate(slot, vpn, psize, ssize, 0);
>> }
>>
>> -/* Flag bits for H_BULK_REMOVE */
>> -#define HBR_REQUEST 0x4000000000000000UL
>> -#define HBR_RESPONSE 0x8000000000000000UL
>> -#define HBR_END 0xc000000000000000UL
>> -#define HBR_AVPN 0x0200000000000000UL
>> -#define HBR_ANDCOND 0x0100000000000000UL
>> -
>> /*
>> * Take a spinlock around flushes to avoid bouncing the hypervisor tlbie
>> * lock.
>> @@ -452,6 +563,7 @@ void __init hpte_init_lpar(void)
>> ppc_md.hpte_removebolted = pSeries_lpar_hpte_removebolted;
>> ppc_md.flush_hash_range = pSeries_lpar_flush_hash_range;
>> ppc_md.hpte_clear_all = pSeries_lpar_hptab_clear;
>> + ppc_md.hugepage_invalidate = pSeries_lpar_hugepage_invalidate;
>> }
>>
>> #ifdef CONFIG_PPC_SMLPAR
>
> --
-aneesh
^ permalink raw reply
* Re: [PATCH -V7 10/10] powerpc: disable assert_pte_locked
From: Aneesh Kumar K.V @ 2013-05-03 19:07 UTC (permalink / raw)
To: David Gibson; +Cc: paulus, linuxppc-dev, linux-mm
In-Reply-To: <20130503053027.GV13041@truffula.fritz.box>
David Gibson <dwg@au1.ibm.com> writes:
> On Mon, Apr 29, 2013 at 01:21:51AM +0530, Aneesh Kumar K.V wrote:
>> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
>>
>> With THP we set pmd to none, before we do pte_clear. Hence we can't
>> walk page table to get the pte lock ptr and verify whether it is locked.
>> THP do take pte lock before calling pte_clear. So we don't change the locking
>> rules here. It is that we can't use page table walking to check whether
>> pte locks are help with THP.
>>
>> NOTE: This needs to be re-written. Not to be merged upstream.
>
> So, rewrite it..
That is something we need to discuss more. We can't do the pte_locked
assert the way we do now. Because as explained above, thp collapse
depend on setting pmd to none before doing pte_clear. So we clearly
cannot walk the page table and fine the ptl to check whether we are
holding that lock. But yes, these asserts are valid. Those function
should be called holding ptl locks. I still haven't found an alternative
way to do those asserts. Any suggestions ?
>
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
>> ---
>> arch/powerpc/mm/pgtable.c | 2 ++
>> 1 file changed, 2 insertions(+)
>>
>> diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c
>> index 214130a..d77f94f 100644
>> --- a/arch/powerpc/mm/pgtable.c
>> +++ b/arch/powerpc/mm/pgtable.c
>> @@ -224,6 +224,7 @@ int ptep_set_access_flags(struct vm_area_struct *vma, unsigned long address,
>> #ifdef CONFIG_DEBUG_VM
>> void assert_pte_locked(struct mm_struct *mm, unsigned long addr)
>> {
>> +#if 0
>> pgd_t *pgd;
>> pud_t *pud;
>> pmd_t *pmd;
>> @@ -237,6 +238,7 @@ void assert_pte_locked(struct mm_struct *mm, unsigned long addr)
>> pmd = pmd_offset(pud, addr);
>> BUG_ON(!pmd_present(*pmd));
>> assert_spin_locked(pte_lockptr(mm, pmd));
>> +#endif
>> }
>> #endif /* CONFIG_DEBUG_VM */
>>
>
-aneesh
^ permalink raw reply
* RE: [PATCH] KVM: PPC: Book3E 64: Fix IRQs warnings and hangs
From: Caraman Mihai Claudiu-B02008 @ 2013-05-03 20:01 UTC (permalink / raw)
To: Wood Scott-B07421
Cc: linuxppc-dev@lists.ozlabs.org, kvm@vger.kernel.org,
kvm-ppc@vger.kernel.org
In-Reply-To: <1367604287.19391.2@snotra>
> -----Original Message-----
> From: Wood Scott-B07421
> Sent: Friday, May 03, 2013 9:05 PM
> To: Caraman Mihai Claudiu-B02008
> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; linuxppc-
> dev@lists.ozlabs.org; Caraman Mihai Claudiu-B02008
> Subject: Re: [PATCH] KVM: PPC: Book3E 64: Fix IRQs warnings and hangs
>=20
> > The unresponsiveness has to do with the fact that
> > arch_local_irq_restore()
> > does not guarantees to hard enable interrupts.
>=20
> Could you elaborate? If the saved IRQ state was "enabled", why
> wouldn't arch_local_irq_restore() hard-enable IRQs? The last thing it
> does is __hard_irq_enable().
if (!irq_happened)
return;
>=20
> Where is the arch_local_irq_restore() instance you're talking about?
./arch/power/kernel/irq.c
>=20
> > To do so replace exception
> > function calls like timer_interrupt() with irq_happened flags. The
> > local_irq_enable() call takes care of replaying them and lets the
> > interrupts
> > hard enabled.
>=20
> Not sure what you mean by "lets the interrupts hard enabled"... Do you
> mean the EE bit in regs->msr, as opposed to the EE bit in the current
> MSR?
If irq_happened "the last thing it does is __hard_irq_enable()".
> > @@ -789,16 +788,16 @@ static void kvmppc_restart_interrupt(struct
> > kvm_vcpu *vcpu,
> > switch (exit_nr) {
> > case BOOKE_INTERRUPT_EXTERNAL:
> > kvmppc_fill_pt_regs(®s);
> > - do_IRQ(®s);
> > + local_paca->irq_happened |=3D PACA_IRQ_EE;
> > break;
>=20
> Aren't you breaking 32-bit here?
I had eyes only for 64-bit hangs :)
-Mike
^ permalink raw reply
* Re: [PATCH] KVM: PPC: Book3E 64: Fix IRQs warnings and hangs
From: Scott Wood @ 2013-05-03 20:15 UTC (permalink / raw)
To: Caraman Mihai Claudiu-B02008
Cc: Wood Scott-B07421, linuxppc-dev@lists.ozlabs.org,
kvm@vger.kernel.org, kvm-ppc@vger.kernel.org
In-Reply-To: <300B73AA675FCE4A93EB4FC1D42459FF3E984C@039-SN2MPN1-013.039d.mgd.msft.net>
On 05/03/2013 03:01:26 PM, Caraman Mihai Claudiu-B02008 wrote:
> > -----Original Message-----
> > From: Wood Scott-B07421
> > Sent: Friday, May 03, 2013 9:05 PM
> > To: Caraman Mihai Claudiu-B02008
> > Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; linuxppc-
> > dev@lists.ozlabs.org; Caraman Mihai Claudiu-B02008
> > Subject: Re: [PATCH] KVM: PPC: Book3E 64: Fix IRQs warnings and =20
> hangs
> >
> > > The unresponsiveness has to do with the fact that
> > > arch_local_irq_restore()
> > > does not guarantees to hard enable interrupts.
> >
> > Could you elaborate? If the saved IRQ state was "enabled", why
> > wouldn't arch_local_irq_restore() hard-enable IRQs? The last thing =20
> it
> > does is __hard_irq_enable().
>=20
> if (!irq_happened)
> return;
OK, so the problem is that we're not setting PACA_IRQ_HARD_DIS when we =20
hard-disable interrupts?
> > Where is the arch_local_irq_restore() instance you're talking about?
>=20
> ./arch/power/kernel/irq.c
I meant the caller. :-P
-Scott=
^ permalink raw reply
* RE: [PATCH] KVM: PPC: Book3E 64: Fix IRQs warnings and hangs
From: Caraman Mihai Claudiu-B02008 @ 2013-05-03 20:56 UTC (permalink / raw)
To: Wood Scott-B07421
Cc: linuxppc-dev@lists.ozlabs.org, kvm@vger.kernel.org,
kvm-ppc@vger.kernel.org
In-Reply-To: <1367612101.19391.8@snotra>
> -----Original Message-----
> From: Wood Scott-B07421
> Sent: Friday, May 03, 2013 11:15 PM
> To: Caraman Mihai Claudiu-B02008
> Cc: Wood Scott-B07421; kvm-ppc@vger.kernel.org; kvm@vger.kernel.org;
> linuxppc-dev@lists.ozlabs.org
> Subject: Re: [PATCH] KVM: PPC: Book3E 64: Fix IRQs warnings and hangs
>=20
> > > > The unresponsiveness has to do with the fact that
> > > > arch_local_irq_restore()
> > > > does not guarantees to hard enable interrupts.
> > >
> > > Could you elaborate? If the saved IRQ state was "enabled", why
> > > wouldn't arch_local_irq_restore() hard-enable IRQs? The last thing
> > it
> > > does is __hard_irq_enable().
> >
> > if (!irq_happened)
> > return;
>=20
> OK, so the problem is that we're not setting PACA_IRQ_HARD_DIS when we
> hard-disable interrupts?
We enter guest with local_irq_disable() which means soft disabled, when
do we hard-disable interrupts? If we follow host exception handlers model
they set PACA_IRQ_EE/DEC/DBELL but not PACA_IRQ_HARD_DIS. Can you give it
a try to see how KVM behaves with PACA_IRQ_HARD_DIS? I can't do it right no=
w.
>=20
> > > Where is the arch_local_irq_restore() instance you're talking about?
> >
> > ./arch/power/kernel/irq.c
>=20
> I meant the caller. :-P
./arch/powerpc/include/asm/hw_irq.h
55static inline unsigned long arch_local_irq_disable(void)
56{
57 unsigned long flags, zero;
58
59 asm volatile(
60 "li %1,0; lbz %0,%2(13); stb %1,%2(13)"
61 : "=3Dr" (flags), "=3D&r" (zero)
62 : "i" (offsetof(struct paca_struct, soft_enabled))
63 : "memory");
64
65 return flags;
66}
67
68extern void arch_local_irq_restore(unsigned long);
69
70static inline void arch_local_irq_enable(void)
71{
72 arch_local_irq_restore(1);
73}
-Mike
^ permalink raw reply
* Re: net/eth/ibmveth: Fixup retrieval of MAC address
From: Benjamin Herrenschmidt @ 2013-05-03 21:17 UTC (permalink / raw)
To: Ben Hutchings; +Cc: netdev, linuxppc-dev, David Miller, David Gibson
In-Reply-To: <1367598632.2756.3.camel@bwh-desktop.uk.solarflarecom.com>
On Fri, 2013-05-03 at 17:30 +0100, Ben Hutchings wrote:
> > + /* Workaround for old/broken pHyp */
> > + if (mac_len == 8)
> > + mac_addr_p += 2;
> > + if (mac_len != 6) {
>
> Missing 'else' before the second if?
Absolutely... oops :-) I couldn't find a version of pHyp with the wrong
property to test with. I suppose I could hack it up in OFW before boot.
I'll fix that and respin, sorry about that.
Cheers,
Ben.
> > + dev_err(&dev->dev, "VETH_MAC_ADDR attribute wrong len %d\n",
> > + mac_len);
> > + return -EINVAL;
> > + }
> [...]
>
^ permalink raw reply
* [PATCH 1/1] powerpc: Force 32 bit MSIs for devices that require it
From: Brian King @ 2013-05-03 21:30 UTC (permalink / raw)
To: benh; +Cc: klebers, brking, linuxppc-dev
The following patch implements a new PAPR change which allows
the OS to force the use of 32 bit MSIs, regardless of what
the PCI capabilities indicate. This is required for some
devices that advertise support for 64 bit MSIs but don't
actually support them.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
---
arch/powerpc/include/asm/pci-bridge.h | 2 ++
arch/powerpc/platforms/pseries/msi.c | 21 ++++++++++++++++++---
2 files changed, 20 insertions(+), 3 deletions(-)
diff -puN arch/powerpc/platforms/pseries/msi.c~powerpc_32_bit_msi_papr arch/powerpc/platforms/pseries/msi.c
--- linux/arch/powerpc/platforms/pseries/msi.c~powerpc_32_bit_msi_papr 2013-05-03 11:15:09.000000000 -0500
+++ linux-bjking1/arch/powerpc/platforms/pseries/msi.c 2013-05-03 12:33:11.000000000 -0500
@@ -24,6 +24,7 @@ static int query_token, change_token;
#define RTAS_RESET_FN 2
#define RTAS_CHANGE_MSI_FN 3
#define RTAS_CHANGE_MSIX_FN 4
+#define RTAS_CHANGE_32MSI_FN 5
static struct pci_dn *get_pdn(struct pci_dev *pdev)
{
@@ -58,7 +59,8 @@ static int rtas_change_msi(struct pci_dn
seq_num = 1;
do {
- if (func == RTAS_CHANGE_MSI_FN || func == RTAS_CHANGE_MSIX_FN)
+ if (func == RTAS_CHANGE_MSI_FN || func == RTAS_CHANGE_MSIX_FN ||
+ func == RTAS_CHANGE_32MSI_FN)
rc = rtas_call(change_token, 6, 4, rtas_ret, addr,
BUID_HI(buid), BUID_LO(buid),
func, num_irqs, seq_num);
@@ -426,9 +428,12 @@ static int rtas_setup_msi_irqs(struct pc
*/
again:
if (type == PCI_CAP_ID_MSI) {
- rc = rtas_change_msi(pdn, RTAS_CHANGE_MSI_FN, nvec);
+ if (pdn->force_32bit_msi)
+ rc = rtas_change_msi(pdn, RTAS_CHANGE_32MSI_FN, nvec);
+ else
+ rc = rtas_change_msi(pdn, RTAS_CHANGE_MSI_FN, nvec);
- if (rc < 0) {
+ if (rc < 0 && !pdn->force_32bit_msi) {
pr_debug("rtas_msi: trying the old firmware call.\n");
rc = rtas_change_msi(pdn, RTAS_CHANGE_FN, nvec);
}
@@ -512,3 +517,13 @@ static int rtas_msi_init(void)
return 0;
}
arch_initcall(rtas_msi_init);
+
+static void quirk_radeon(struct pci_dev *dev)
+{
+ struct pci_dn *pdn = get_pdn(dev);
+
+ if (pdn)
+ pdn->force_32bit_msi = 1;
+}
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x68f2, quirk_radeon);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0xaa68, quirk_radeon);
diff -puN arch/powerpc/include/asm/pci-bridge.h~powerpc_32_bit_msi_papr arch/powerpc/include/asm/pci-bridge.h
--- linux/arch/powerpc/include/asm/pci-bridge.h~powerpc_32_bit_msi_papr 2013-05-03 11:15:09.000000000 -0500
+++ linux-bjking1/arch/powerpc/include/asm/pci-bridge.h 2013-05-03 11:15:09.000000000 -0500
@@ -163,6 +163,8 @@ struct pci_dn {
int pci_ext_config_space; /* for pci devices */
+ int force_32bit_msi:1;
+
struct pci_dev *pcidev; /* back-pointer to the pci device */
#ifdef CONFIG_EEH
struct eeh_dev *edev; /* eeh device */
_
^ permalink raw reply
* Re: [PATCH -V7 09/10] powerpc: Optimize hugepage invalidate
From: Benjamin Herrenschmidt @ 2013-05-03 21:54 UTC (permalink / raw)
To: Aneesh Kumar K.V; +Cc: linux-mm, paulus, linuxppc-dev, David Gibson
In-Reply-To: <87fvy351gc.fsf@linux.vnet.ibm.com>
On Sat, 2013-05-04 at 00:35 +0530, Aneesh Kumar K.V wrote:
>
> if the firmware doesn't support lockless TLBIE, we need to do locking
> at the guest side. pSeries_lpar_flush_hash_range does that.
We don't "need" to ... it's an optimization because by experience the FW
locking was horrible (and the HW locking is too).
Beware however that the hash routines can take a lock too on
"native" (instead of pHyp)...
Ben.
^ permalink raw reply
* Re: [PATCH] KVM: PPC: Book3E 64: Fix IRQs warnings and hangs
From: Benjamin Herrenschmidt @ 2013-05-03 22:03 UTC (permalink / raw)
To: Alexander Graf
Cc: Mihai Caraman, linuxppc-dev, kvm@vger.kernel.org VIRTUAL MA...,
kvm-ppc
In-Reply-To: <981DB739-E50F-488F-B2D8-916FDC2E1749@suse.de>
On Fri, 2013-05-03 at 18:24 +0200, Alexander Graf wrote:
> > There is no reason to exit guest with soft_enabled == 1, a local_irq_enable()
> > call will do this for us so get rid of kvmppc_layz_ee() calls. With this fix
> > we eliminate irqs_disabled() warnings and some guest and host hangs revealed
> > under stress tests, but guests still exhibit some unresponsiveness.
> >
> > The unresponsiveness has to do with the fact that arch_local_irq_restore()
> > does not guarantees to hard enable interrupts. To do so replace exception
> > function calls like timer_interrupt() with irq_happened flags. The
> > local_irq_enable() call takes care of replaying them and lets the interrupts
> > hard enabled.
> >
> > Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
>
> Ben, could you please review?
That does look like the right thing to do indeed.
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cheers,
Ben.
^ permalink raw reply
* Re: [PATCH] KVM: PPC: Book3E 64: Fix IRQs warnings and hangs
From: Scott Wood @ 2013-05-03 22:06 UTC (permalink / raw)
To: Caraman Mihai Claudiu-B02008
Cc: Wood Scott-B07421, linuxppc-dev@lists.ozlabs.org,
kvm@vger.kernel.org, kvm-ppc@vger.kernel.org
In-Reply-To: <300B73AA675FCE4A93EB4FC1D42459FF3E9B37@039-SN2MPN1-013.039d.mgd.msft.net>
On 05/03/2013 03:56:47 PM, Caraman Mihai Claudiu-B02008 wrote:
> > -----Original Message-----
> > From: Wood Scott-B07421
> > Sent: Friday, May 03, 2013 11:15 PM
> > To: Caraman Mihai Claudiu-B02008
> > Cc: Wood Scott-B07421; kvm-ppc@vger.kernel.org; kvm@vger.kernel.org;
> > linuxppc-dev@lists.ozlabs.org
> > Subject: Re: [PATCH] KVM: PPC: Book3E 64: Fix IRQs warnings and =20
> hangs
> >
> > > > > The unresponsiveness has to do with the fact that
> > > > > arch_local_irq_restore()
> > > > > does not guarantees to hard enable interrupts.
> > > >
> > > > Could you elaborate? If the saved IRQ state was "enabled", why
> > > > wouldn't arch_local_irq_restore() hard-enable IRQs? The last =20
> thing
> > > it
> > > > does is __hard_irq_enable().
> > >
> > > if (!irq_happened)
> > > return;
> >
> > OK, so the problem is that we're not setting PACA_IRQ_HARD_DIS when =20
> we
> > hard-disable interrupts?
>=20
> We enter guest with local_irq_disable() which means soft disabled,
Hmm... I don't see any obvious breakage from that, but it makes me =20
nervous. I'd be more comfortable if we just hard-disabled interrupts =20
there.
> when do we hard-disable interrupts?
Interrupts will be hard-disabled when we take an exception to exit =20
guest state.
> If we follow host exception handlers model
> they set PACA_IRQ_EE/DEC/DBELL but not PACA_IRQ_HARD_DIS. Can you =20
> give it
> a try to see how KVM behaves with PACA_IRQ_HARD_DIS? I can't do it =20
> right now.
I replaced the two calls to kvmppc_lazy_ee_enable() with calls to =20
hard_irq_disable(), and it seems to be working fine.
> > > > Where is the arch_local_irq_restore() instance you're talking =20
> about?
> > >
> > > ./arch/power/kernel/irq.c
> >
> > I meant the caller. :-P
>=20
> ./arch/powerpc/include/asm/hw_irq.h
>=20
> 55static inline unsigned long arch_local_irq_disable(void)
> 56{
> 57 unsigned long flags, zero;
> 58
> 59 asm volatile(
> 60 "li %1,0; lbz %0,%2(13); stb %1,%2(13)"
> 61 : "=3Dr" (flags), "=3D&r" (zero)
> 62 : "i" (offsetof(struct paca_struct, soft_enabled))
> 63 : "memory");
> 64
> 65 return flags;
> 66}
> 67
> 68extern void arch_local_irq_restore(unsigned long);
> 69
> 70static inline void arch_local_irq_enable(void)
> 71{
> 72 arch_local_irq_restore(1);
> 73}
Sigh. I meant the real caller, who's calling local_irq_restore().
-Scott=
^ permalink raw reply
* [PATCHv5 0/2] Speed Cap fixes for ppc64
From: Kleber Sacilotto de Souza @ 2013-05-03 22:43 UTC (permalink / raw)
To: linuxppc-dev, dri-devel, Benjamin Herrenschmidt, Bjorn Helgaas,
David Airlie, Michael Ellerman
Cc: Brian King, Alex Deucher, Jerome Glisse,
Thadeu Lima de Souza Cascardo, Kleber Sacilotto de Souza
This v5 of the patch series is based on v4 sent by Lucas Kannebley Tavares
with a few changes:
1. Fix a compilation warning on the code from the first patch, where it was
missing a declaration of struct pci_host_bridge, used on the definition of
the function pointer pcibios_root_bridge_prepare() in
arch/powerpc/include/asm/machdep.h.
2. Incorporate some changes proposed by Tony Breeds in
pseries_root_bridge_prepare().
The following description of the changes was extrated from v4:
This patch series does:
1. max_bus_speed is used to set the device to gen2 speeds
2. on power there's no longer a conflict between the pseries call and other
architectures, because the overwrite is done via a ppc_md hook
3. radeon is using bus->max_bus_speed instead of drm_pcie_get_speed_cap_mask
for gen2 capability detection
The first patch consists of some architecture changes, such as adding a hook on
powerpc for pci_root_bridge_prepare, so that pseries will initialize it to a
function, while all other architectures get a NULL pointer. So that whenever
pci_create_root_bus is called, we'll get max_bus_speed properly setup from
OpenFirmware.
The second patch consists of simple radeon changes not to call
drm_get_pcie_speed_cap_mask anymore. I assume that on x86 machines,
the max_bus_speed property will be properly set already.
Kleber Sacilotto de Souza (2):
ppc64: perform proper max_bus_speed detection
radeon: use max_bus_speed to activate gen2 speeds
arch/powerpc/include/asm/machdep.h | 3 ++
arch/powerpc/kernel/pci-common.c | 8 ++++
arch/powerpc/platforms/pseries/pci.c | 53 ++++++++++++++++++++++++++++++
arch/powerpc/platforms/pseries/pseries.h | 4 ++
arch/powerpc/platforms/pseries/setup.c | 2 +
drivers/gpu/drm/radeon/evergreen.c | 10 ++----
drivers/gpu/drm/radeon/r600.c | 9 +----
drivers/gpu/drm/radeon/rv770.c | 9 +----
8 files changed, 77 insertions(+), 21 deletions(-)
^ permalink raw reply
* [PATCHv5 1/2] ppc64: perform proper max_bus_speed detection
From: Kleber Sacilotto de Souza @ 2013-05-03 22:43 UTC (permalink / raw)
To: linuxppc-dev, dri-devel, Benjamin Herrenschmidt, Bjorn Helgaas,
David Airlie, Michael Ellerman
Cc: Brian King, Alex Deucher, Jerome Glisse,
Thadeu Lima de Souza Cascardo, Kleber Sacilotto de Souza
In-Reply-To: <1367620993-27037-1-git-send-email-klebers@linux.vnet.ibm.com>
On pseries machines the detection for max_bus_speed should be done
through an OpenFirmware property. This patch adds a function to perform
this detection and a hook to perform dynamic adding of the function only
for pseries. This is done by overwriting the weak
pcibios_root_bridge_prepare function which is called by
pci_create_root_bus().
From: Lucas Kannebley Tavares <lucaskt@linux.vnet.ibm.com>
Signed-off-by: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com>
---
arch/powerpc/include/asm/machdep.h | 3 ++
arch/powerpc/kernel/pci-common.c | 8 ++++
arch/powerpc/platforms/pseries/pci.c | 53 ++++++++++++++++++++++++++++++
arch/powerpc/platforms/pseries/pseries.h | 4 ++
arch/powerpc/platforms/pseries/setup.c | 2 +
5 files changed, 70 insertions(+), 0 deletions(-)
diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
index 3f3f691..92386fc 100644
--- a/arch/powerpc/include/asm/machdep.h
+++ b/arch/powerpc/include/asm/machdep.h
@@ -29,6 +29,7 @@ struct rtc_time;
struct file;
struct pci_controller;
struct kimage;
+struct pci_host_bridge;
struct machdep_calls {
char *name;
@@ -108,6 +109,8 @@ struct machdep_calls {
void (*pcibios_fixup)(void);
int (*pci_probe_mode)(struct pci_bus *);
void (*pci_irq_fixup)(struct pci_dev *dev);
+ int (*pcibios_root_bridge_prepare)(struct pci_host_bridge
+ *bridge);
/* To setup PHBs when using automatic OF platform driver for PCI */
int (*pci_setup_phb)(struct pci_controller *host);
diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index f325dc9..d5811d8 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -845,6 +845,14 @@ int pci_proc_domain(struct pci_bus *bus)
return 1;
}
+int pcibios_root_bridge_prepare(struct pci_host_bridge *bridge)
+{
+ if (ppc_md.pcibios_root_bridge_prepare)
+ return ppc_md.pcibios_root_bridge_prepare(bridge);
+
+ return 0;
+}
+
/* This header fixup will do the resource fixup for all devices as they are
* probed, but not for bridge ranges
*/
diff --git a/arch/powerpc/platforms/pseries/pci.c b/arch/powerpc/platforms/pseries/pci.c
index 0b580f4..5f93856 100644
--- a/arch/powerpc/platforms/pseries/pci.c
+++ b/arch/powerpc/platforms/pseries/pci.c
@@ -108,3 +108,56 @@ static void fixup_winbond_82c105(struct pci_dev* dev)
}
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_WINBOND, PCI_DEVICE_ID_WINBOND_82C105,
fixup_winbond_82c105);
+
+int pseries_root_bridge_prepare(struct pci_host_bridge *bridge)
+{
+ struct device_node *dn, *pdn;
+ struct pci_bus *bus;
+ const uint32_t *pcie_link_speed_stats;
+
+ bus = bridge->bus;
+
+ dn = pcibios_get_phb_of_node(bus);
+ if (!dn)
+ return 0;
+
+ for (pdn = dn; pdn != NULL; pdn = of_get_next_parent(pdn)) {
+ pcie_link_speed_stats = (const uint32_t *) of_get_property(pdn,
+ "ibm,pcie-link-speed-stats", NULL);
+ if (pcie_link_speed_stats)
+ break;
+ }
+
+ of_node_put(pdn);
+
+ if (!pcie_link_speed_stats) {
+ pr_err("no ibm,pcie-link-speed-stats property\n");
+ return 0;
+ }
+
+ switch (pcie_link_speed_stats[0]) {
+ case 0x01:
+ bus->max_bus_speed = PCIE_SPEED_2_5GT;
+ break;
+ case 0x02:
+ bus->max_bus_speed = PCIE_SPEED_5_0GT;
+ break;
+ default:
+ bus->max_bus_speed = PCI_SPEED_UNKNOWN;
+ break;
+ }
+
+ switch (pcie_link_speed_stats[1]) {
+ case 0x01:
+ bus->cur_bus_speed = PCIE_SPEED_2_5GT;
+ break;
+ case 0x02:
+ bus->cur_bus_speed = PCIE_SPEED_5_0GT;
+ break;
+ default:
+ bus->cur_bus_speed = PCI_SPEED_UNKNOWN;
+ break;
+ }
+
+ return 0;
+}
diff --git a/arch/powerpc/platforms/pseries/pseries.h b/arch/powerpc/platforms/pseries/pseries.h
index 8af71e4..c2a3a25 100644
--- a/arch/powerpc/platforms/pseries/pseries.h
+++ b/arch/powerpc/platforms/pseries/pseries.h
@@ -63,4 +63,8 @@ extern int dlpar_detach_node(struct device_node *);
/* Snooze Delay, pseries_idle */
DECLARE_PER_CPU(long, smt_snooze_delay);
+/* PCI root bridge prepare function override for pseries */
+struct pci_host_bridge;
+int pseries_root_bridge_prepare(struct pci_host_bridge *bridge);
+
#endif /* _PSERIES_PSERIES_H */
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index ac932a9..c11c823 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -466,6 +466,8 @@ static void __init pSeries_setup_arch(void)
else
ppc_md.enable_pmcs = power4_enable_pmcs;
+ ppc_md.pcibios_root_bridge_prepare = pseries_root_bridge_prepare;
+
if (firmware_has_feature(FW_FEATURE_SET_MODE)) {
long rc;
if ((rc = pSeries_enable_reloc_on_exc()) != H_SUCCESS) {
--
1.7.1
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox