Re: [PATCH 0/2] powerpc/kvm: Enable running guests on RT Linux

From: Purcareata Bogdan <b43198@freescale.com>
To: Scott Wood <scottwood@freescale.com>
Cc: Laurentiu Tudor <b10716@freescale.com>,
	linux-rt-users@vger.kernel.org,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	Alexander Graf <agraf@suse.de>,
	linux-kernel@vger.kernel.org,
	Bogdan Purcareata <bogdan.purcareata@freescale.com>,
	mihai.caraman@freescale.com, Paolo Bonzini <pbonzini@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH 0/2] powerpc/kvm: Enable running guests on RT Linux
Date: Mon, 27 Apr 2015 09:45:09 +0300
Message-ID: <553DDAF5.6030005@freescale.com>
In-Reply-To: <1429824418.16357.26.camel@freescale.com>

On 24.04.2015 00:26, Scott Wood wrote:
> On Thu, 2015-04-23 at 15:31 +0300, Purcareata Bogdan wrote:
>> On 23.04.2015 03:30, Scott Wood wrote:
>>> On Wed, 2015-04-22 at 15:06 +0300, Purcareata Bogdan wrote:
>>>> On 21.04.2015 03:52, Scott Wood wrote:
>>>>> On Mon, 2015-04-20 at 13:53 +0300, Purcareata Bogdan wrote:
>>>>>> There was a weird situation for .kvmppc_mpic_set_epr - its corresponding inner
>>>>>> function is kvmppc_set_epr, which is a static inline. Removing the static inline
>>>>>> yields a compiler crash (Segmentation fault (core dumped) -
>>>>>> scripts/Makefile.build:441: recipe for target 'arch/powerpc/kvm/kvm.o' failed),
>>>>>> but that's a different story, so I just let it be for now. Point is the time may
>>>>>> include other work after the lock has been released, but before the function
>>>>>> actually returned. I noticed this was the case for .kvm_set_msi, which could
>>>>>> work up to 90 ms, not actually under the lock. This made me change what I'm
>>>>>> looking at.
>>>>>
>>>>> kvm_set_msi does pretty much nothing outside the lock -- I suspect
>>>>> you're measuring an interrupt that happened as soon as the lock was
>>>>> released.
>>>>
>>>> That's exactly right. I've seen things like a timer interrupt occurring right
>>>> after the spin_unlock_irqrestore, but before kvm_set_msi actually returned.
>>>>
>>>> [...]
>>>>
>>>>>>     Or perhaps a different stress scenario involving a lot of VCPUs
>>>>>> and external interrupts?
>>>>>
>>>>> You could instrument the MPIC code to find out how many loop iterations
>>>>> you maxed out on, and compare that to the theoretical maximum.
>>>>
>>>> Numbers are pretty low, and I'll try to explain based on my observations.
>>>>
>>>> The problematic section in openpic_update_irq is this [1], since it loops
>>>> through all VCPUs, and IRQ_local_pipe further calls IRQ_check, which loops
>>>> through all pending interrupts for a VCPU [2].
>>>>
>>>> The guest interfaces are virtio-vhostnet, which are based on MSI
>>>> (/proc/interrupts in guest shows they are MSI). For external interrupts to the
>>>> guest, the irq_source destmask is currently 0, and last_cpu is 0 (uninitialized),
>>>> so [1] will go on and deliver the interrupt directly and unicast (no VCPUs loop).
>>>>
>>>> I activated the pr_debugs in arch/powerpc/kvm/mpic.c, to see how many interrupts
>>>> are actually pending for the destination VCPU. At most, there were 3 interrupts
>>>> - n_IRQ = {224,225,226} - even for 24 flows of ping flood. I understand that
>>>> guest virtio interrupts are cascaded over 1 or a couple of shared MSI interrupts.
>>>>
>>>> So worst case, in this scenario, was checking the priorities for 3 pending
>>>> interrupts for 1 VCPU. Something like this (some of my prints included):
>>>>
>>>> [61010.582033] openpic_update_irq: destmask 1 last_cpu 0
>>>> [61010.582034] openpic_update_irq: Only one CPU is allowed to receive this IRQ
>>>> [61010.582036] IRQ_local_pipe: IRQ 224 active 0 was 1
>>>> [61010.582037] IRQ_check: irq 226 set ivpr_pr=8 pr=-1
>>>> [61010.582038] IRQ_check: irq 225 set ivpr_pr=8 pr=-1
>>>> [61010.582039] IRQ_check: irq 224 set ivpr_pr=8 pr=-1
>>>>
>>>> It would be really helpful to get your comments on whether these are
>>>> realistic numbers for everyday use, or whether they are relevant only to
>>>> this particular scenario.
>>>
>>> RT isn't about "realistic numbers for everyday use".  It's about worst
>>> cases.
>>>
>>>> - Can these interrupts be used in directed delivery, so that the destination
>>>> mask can include multiple VCPUs?
>>>
>>> The Freescale MPIC does not support multiple destinations for most
>>> interrupts, but the (non-FSL-specific) emulation code appears to allow
>>> it.
>>>
>>>>    The MPIC manual states that timer and IPI
>>>> interrupts are supported for directed delivery, although I'm not sure how much of
>>>> this is used in the emulation. I know that kvmppc uses the decrementer outside
>>>> of the MPIC.
>>>>
>>>> - How are virtio interrupts cascaded over the shared MSI interrupts?
>>>> /proc/device-tree/soc@e0000000/msi@41600/interrupts in the guest shows 8 values
>>>> - 224 - 231 - so at most there might be 8 pending interrupts in IRQ_check, is
>>>> that correct?
>>>
>>> It looks like that's currently the case, but actual hardware supports
>>> more than that, so it's possible (albeit unlikely any time soon) that
>>> the emulation eventually does as well.
>>>
>>> But it's possible to have interrupts other than MSIs...
>>
>> Right.
>>
>> So given that the raw spinlock conversion is not suitable for all the scenarios
>> supported by the OpenPIC emulation, is it ok that my next step would be to send
>> a patch containing both the raw spinlock conversion and a mandatory disable of
>> the in-kernel MPIC? This is actually the last conclusion we came up with some
>> time ago, but I guess it was good to get some more insight on how things
>> actually work (at least for me).
>
> Fine with me.  Have you given any thought to ways to restructure the
> code to eliminate the problem?

My first thought would be to give each VCPU's pending-interrupt queue its own
lock, so that locking in openpic_update_irq becomes more granular. However, this
is only a preliminary idea. Before I can propose anything worth considering, I
need to read the OpenPIC specification and the current KVM OpenPIC emulation
thoroughly. I have other work on my hands at the moment and will come back to
this once I have some time.
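
To make the per-VCPU lock idea concrete, here is a rough userspace sketch. It is
not code from arch/powerpc/kvm/mpic.c or from the patch. The names
(vcpu_irq_queue, queue_set_pending, queue_next_irq), the limits, and the
atomic_flag spin lock standing in for a raw spinlock are all assumptions made
for illustration only.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define MAX_VCPUS 4    /* hypothetical limit for this sketch */
#define MAX_IRQS  256  /* hypothetical number of interrupt sources */
#define WORD_BITS 64

/* Hypothetical per-VCPU pending queue; each queue carries its own lock. */
struct vcpu_irq_queue {
    atomic_flag lock;                        /* stands in for a raw spinlock */
    uint64_t pending[MAX_IRQS / WORD_BITS];  /* one bit per pending IRQ */
    int prio[MAX_IRQS];                      /* priority of each IRQ source */
};

static struct vcpu_irq_queue queues[MAX_VCPUS];

static void q_lock(struct vcpu_irq_queue *q)
{
    /* Busy-wait until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
        ;
}

static void q_unlock(struct vcpu_irq_queue *q)
{
    atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

/* Reset every queue: lock released, nothing pending, all priorities 0. */
static void queues_init(void)
{
    for (int v = 0; v < MAX_VCPUS; v++) {
        atomic_flag_clear(&queues[v].lock);
        for (int w = 0; w < MAX_IRQS / WORD_BITS; w++)
            queues[v].pending[w] = 0;
        for (int i = 0; i < MAX_IRQS; i++)
            queues[v].prio[i] = 0;
    }
}

/* Mark an IRQ pending for one VCPU; only that VCPU's lock is taken. */
static void queue_set_pending(int vcpu, int irq, int prio)
{
    struct vcpu_irq_queue *q = &queues[vcpu];

    q_lock(q);
    q->pending[irq / WORD_BITS] |= (uint64_t)1 << (irq % WORD_BITS);
    q->prio[irq] = prio;
    q_unlock(q);
}

/*
 * Return the highest-priority pending IRQ for one VCPU, or -1 if none.
 * The scan is bounded by MAX_IRQS and holds only this VCPU's lock, so
 * the hold time no longer grows with the number of VCPUs.
 */
static int queue_next_irq(int vcpu)
{
    struct vcpu_irq_queue *q = &queues[vcpu];
    int best = -1, best_prio = -1;

    q_lock(q);
    for (int irq = 0; irq < MAX_IRQS; irq++) {
        if (!(q->pending[irq / WORD_BITS] & ((uint64_t)1 << (irq % WORD_BITS))))
            continue;
        if (q->prio[irq] > best_prio) {
            best = irq;
            best_prio = q->prio[irq];
        }
    }
    q_unlock(q);
    return best;
}
```

In this model, delivering an interrupt to one VCPU never contends with a scan on
another VCPU. Whether the real emulation can be split this way depends on state
shared across VCPUs (for example, the per-source destination mask), which the
sketch deliberately leaves out.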

Meanwhile, I've sent a v2 on the PPC and RT mailing lists for this raw_spinlock 
conversion, alongside disabling the in-kernel MPIC emulation for PREEMPT_RT. I 
would be grateful to hear your feedback on that, so that it can get applied.

Thank you,
Bogdan P.

Thread overview: 37+ messages
2015-02-18  9:32 [PATCH 0/2] powerpc/kvm: Enable running guests on RT Linux Bogdan Purcareata
2015-02-18  9:32 ` [PATCH 1/2] powerpc/kvm: Convert openpic lock to raw_spinlock Bogdan Purcareata
2015-02-23 22:43   ` Scott Wood
2015-02-18  9:32 ` [PATCH 2/2] powerpc/kvm: Limit MAX_VCPUS for guests running on RT Linux Bogdan Purcareata
2015-02-18  9:36   ` Sebastian Andrzej Siewior
2015-02-20 13:45   ` Alexander Graf
2015-02-23 22:48     ` Scott Wood
2015-02-20 13:45 ` [PATCH 0/2] powerpc/kvm: Enable running guests " Alexander Graf
2015-02-20 14:12   ` Paolo Bonzini
2015-02-20 14:16     ` Alexander Graf
2015-02-20 14:54     ` Sebastian Andrzej Siewior
2015-02-20 14:57       ` Paolo Bonzini
2015-02-20 15:06         ` Sebastian Andrzej Siewior
2015-02-20 15:10           ` Paolo Bonzini
2015-02-20 15:17             ` Sebastian Andrzej Siewior
2015-02-23  8:12               ` Purcareata Bogdan
2015-02-23  7:50           ` Purcareata Bogdan
2015-02-23  7:29       ` Purcareata Bogdan
2015-02-23 23:27       ` Scott Wood
2015-02-25 16:36         ` Sebastian Andrzej Siewior
2015-02-26 13:02         ` Paolo Bonzini
2015-02-26 13:31           ` Sebastian Andrzej Siewior
2015-02-27  1:05             ` Scott Wood
2015-02-27 13:06               ` Paolo Bonzini
2015-03-27 17:07               ` Purcareata Bogdan
2015-04-02 23:11                 ` Scott Wood
2015-04-03  8:07                   ` Purcareata Bogdan
2015-04-03 21:26                     ` Scott Wood
2015-04-09  7:44                       ` Purcareata Bogdan
2015-04-09 23:53                         ` Scott Wood
2015-04-20 10:53                           ` Purcareata Bogdan
2015-04-21  0:52                             ` Scott Wood
2015-04-22 12:06                               ` Purcareata Bogdan
2015-04-23  0:30                                 ` Scott Wood
2015-04-23 12:31                                   ` Purcareata Bogdan
2015-04-23 21:26                                     ` Scott Wood
2015-04-27  6:45                                       ` Purcareata Bogdan [this message]
