From: Purcareata Bogdan
To: Scott Wood
Cc: Laurentiu Tudor, linux-rt-users@vger.kernel.org, Sebastian Andrzej Siewior,
    Alexander Graf, linux-kernel@vger.kernel.org, Bogdan Purcareata,
    mihai.caraman@freescale.com, Paolo Bonzini, Thomas Gleixner,
    linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH 0/2] powerpc/kvm: Enable running guests on RT Linux
Date: Thu, 23 Apr 2015 15:31:32 +0300
Message-ID: <5538E624.8080904@freescale.com>
In-Reply-To: <1429749001.16357.7.camel@freescale.com>
References: <1424251955-308-1-git-send-email-bogdan.purcareata@freescale.com>
    <54E73A6C.9080500@suse.de> <54E740E7.5090806@redhat.com>
    <54E74A8C.30802@linutronix.de> <1424734051.4698.17.camel@freescale.com>
    <54EF196E.4090805@redhat.com> <54EF2025.80404@linutronix.de>
    <1424999159.4698.78.camel@freescale.com> <55158E6D.40304@freescale.com>
    <1428016310.22867.289.camel@freescale.com> <551E4A41.1080705@freescale.com>
    <1428096375.22867.369.camel@freescale.com> <55262DD3.2050707@freescale.com>
    <1428623611.22867.561.camel@freescale.com> <5534DAA4.3050809@freescale.com>
    <1429577566.4352.68.camel@freescale.com> <55378EC4.2080302@freescale.com>
    <1429749001.16357.7.camel@freescale.com>
List-Id: Linux on PowerPC Developers Mail List

On 23.04.2015 03:30, Scott Wood wrote:
> On Wed, 2015-04-22 at 15:06 +0300, Purcareata Bogdan wrote:
>> On 21.04.2015 03:52, Scott Wood wrote:
>>> On Mon, 2015-04-20 at 13:53 +0300, Purcareata Bogdan wrote:
>>>> There was a weird situation for .kvmppc_mpic_set_epr - its corresponding inner
>>>> function is kvmppc_set_epr, which is a static inline. Removing the static inline
>>>> yields a compiler crash (Segmentation fault (core dumped) -
>>>> scripts/Makefile.build:441: recipe for target 'arch/powerpc/kvm/kvm.o' failed),
>>>> but that's a different story, so I just let it be for now. Point is the time may
>>>> include other work after the lock has been released, but before the function
>>>> actually returned. I noticed this was the case for .kvm_set_msi, which could
>>>> work up to 90 ms, not actually under the lock. This made me change what I'm
>>>> looking at.
>>>
>>> kvm_set_msi does pretty much nothing outside the lock -- I suspect
>>> you're measuring an interrupt that happened as soon as the lock was
>>> released.
>>
>> That's exactly right. I've seen things like a timer interrupt occurring right
>> after the spinlock_irqrestore, but before kvm_set_msi actually returned.
>>
>> [...]
>>
>>>>    Or perhaps a different stress scenario involving a lot of VCPUs
>>>> and external interrupts?
>>>
>>> You could instrument the MPIC code to find out how many loop iterations
>>> you maxed out on, and compare that to the theoretical maximum.
>>
>> Numbers are pretty low, and I'll try to explain based on my observations.
>>
>> The problematic section in openpic_update_irq is this [1], since it loops
>> through all VCPUs, and IRQ_local_pipe further calls IRQ_check, which loops
>> through all pending interrupts for a VCPU [2].
>>
>> The guest interfaces are virtio-vhostnet, which are based on MSI
>> (/proc/interrupts in guest shows they are MSI). For external interrupts to the
>> guest, the irq_source destmask is currently 0, and last_cpu is 0 (uninitialized),
>> so [1] will go on and deliver the interrupt directly and unicast (no VCPUs loop).
>>
>> I activated the pr_debugs in arch/powerpc/kvm/mpic.c, to see how many interrupts
>> are actually pending for the destination VCPU. At most, there were 3 interrupts
>> - n_IRQ = {224,225,226} - even for 24 flows of ping flood. I understand that
>> guest virtio interrupts are cascaded over 1 or a couple of shared MSI interrupts.
>>
>> So worst case, in this scenario, was checking the priorities for 3 pending
>> interrupts for 1 VCPU. Something like this (some of my prints included):
>>
>> [61010.582033] openpic_update_irq: destmask 1 last_cpu 0
>> [61010.582034] openpic_update_irq: Only one CPU is allowed to receive this IRQ
>> [61010.582036] IRQ_local_pipe: IRQ 224 active 0 was 1
>> [61010.582037] IRQ_check: irq 226 set ivpr_pr=8 pr=-1
>> [61010.582038] IRQ_check: irq 225 set ivpr_pr=8 pr=-1
>> [61010.582039] IRQ_check: irq 224 set ivpr_pr=8 pr=-1
>>
>> It would be really helpful to get your comments regarding whether these are
>> realistic numbers for everyday use, or they are relevant only to this
>> particular scenario.
>
> RT isn't about "realistic numbers for everyday use".  It's about worst
> cases.
>
>> - Can these interrupts be used in directed delivery, so that the destination
>> mask can include multiple VCPUs?
>
> The Freescale MPIC does not support multiple destinations for most
> interrupts, but the (non-FSL-specific) emulation code appears to allow
> it.
>
>>   The MPIC manual states that timer and IPI
>> interrupts are supported for directed delivery, although I'm not sure how much of
>> this is used in the emulation. I know that kvmppc uses the decrementer outside
>> of the MPIC.
>>
>> - How are virtio interrupts cascaded over the shared MSI interrupts?
>> /proc/device-tree/soc@e0000000/msi@41600/interrupts in the guest shows 8 values
>> - 224 - 231 - so at most there might be 8 pending interrupts in IRQ_check, is
>> that correct?
>
> It looks like that's currently the case, but actual hardware supports
> more than that, so it's possible (albeit unlikely any time soon) that
> the emulation eventually does as well.
>
> But it's possible to have interrupts other than MSIs...

Right.

So given that the raw spinlock conversion is not suitable for all the scenarios
supported by the OpenPIC emulation, would it be OK for my next step to be a
patch containing both the raw spinlock conversion and a mandatory disable of
the in-kernel MPIC? This is actually the last conclusion we came to some time
ago, but I guess it was good to get some more insight into how things actually
work (at least for me).

Thanks,
Bogdan P.
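
For reference, the comparison between measured loop iterations and the theoretical maximum discussed above can be sketched in user space. This is only a simplified model of the two nested loops described in the thread (one pass over destination VCPUs, and for each a scan of that VCPU's pending interrupts); it is not the code in arch/powerpc/kvm/mpic.c. The names `MAX_IRQ`, `struct irq_queue`, `check_pending` and `worst_case_checks` are made up for illustration, and the bitmap size of 256 is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_IRQ 256  /* assumed size of the per-VCPU pending table */

/* Hypothetical stand-in for a per-VCPU queue of pending interrupts. */
struct irq_queue {
    uint8_t pending[MAX_IRQ];  /* non-zero if the IRQ is pending */
    int     prio[MAX_IRQ];     /* priority of each IRQ */
};

/*
 * Scan the queue for the highest-priority pending IRQ, in the spirit of
 * the IRQ_check step described above. Stores the winning IRQ number in
 * *best_irq (-1 if nothing is pending) and returns how many pending
 * entries had their priority checked, which is the number to compare
 * against the theoretical maximum.
 */
size_t check_pending(const struct irq_queue *q, int *best_irq)
{
    size_t checked = 0;
    int best_prio = -1;

    assert(q != NULL && best_irq != NULL);
    *best_irq = -1;

    for (int irq = 0; irq < MAX_IRQ; irq++) {
        if (!q->pending[irq])
            continue;
        checked++;
        if (q->prio[irq] > best_prio) {
            best_prio = q->prio[irq];
            *best_irq = irq;
        }
    }
    return checked;
}

/*
 * Upper bound on priority checks done for one interrupt update in this
 * model: every destination VCPU scans all of its pending interrupts.
 */
size_t worst_case_checks(size_t nr_dest_vcpus, size_t max_pending_per_vcpu)
{
    return nr_dest_vcpus * max_pending_per_vcpu;
}
```

With the numbers from the thread, the observed case is 1 VCPU with 3 pending interrupts (3 checks), while the 8 MSI sources 224 - 231 give `worst_case_checks(1, 8)`, i.e. 8 checks for unicast delivery. If the destination mask were allowed to include several VCPUs, or interrupts other than MSIs were pending, the bound in this model grows accordingly, which is why the measured figures alone do not settle the worst case.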