From: Marcelo Tosatti <mtosatti@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
mingo@redhat.com, kvm@vger.kernel.org,
linux-kernel@vger.kernel.org,
Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [patch 3/3] x86: kvm guest side support for KVM_HC_RT_PRIO hypercall
Date: Sun, 24 Sep 2017 23:57:53 -0300
Message-ID: <20170925025751.GB30813@amt.cnet>
In-Reply-To: <855950672.7912001.1506258344142.JavaMail.zimbra@redhat.com>
On Sun, Sep 24, 2017 at 09:05:44AM -0400, Paolo Bonzini wrote:
>
>
> ----- Original Message -----
> > From: "Peter Zijlstra" <peterz@infradead.org>
> > To: "Paolo Bonzini" <pbonzini@redhat.com>
> > Cc: "Marcelo Tosatti" <mtosatti@redhat.com>, "Konrad Rzeszutek Wilk" <konrad.wilk@oracle.com>, mingo@redhat.com,
> > kvm@vger.kernel.org, linux-kernel@vger.kernel.org, "Thomas Gleixner" <tglx@linutronix.de>
> > Sent: Saturday, September 23, 2017 3:41:14 PM
> > Subject: Re: [patch 3/3] x86: kvm guest side support for KVM_HC_RT_PRIO hypercall
> >
> > On Sat, Sep 23, 2017 at 12:56:12PM +0200, Paolo Bonzini wrote:
> > > On 22/09/2017 14:55, Peter Zijlstra wrote:
> > > > You just explained it yourself. If the thread that needs to complete
> > > > what you're waiting on has lower priority, it will _never_ get to run if
> > > > you're busy waiting on it.
> > > >
> > > > This is _trivial_.
> > > >
> > > > And even for !RT it can be quite costly, because you can end up having
> > > > to burn your entire slot of CPU time before you run the other task.
> > > >
> > > > Userspace spinning is _bad_, do not do this.
> > >
> > > This is not userspace spinning, it is guest spinning---which has
> > > effectively the same effect but you cannot quite avoid.
> >
> > So I'm virt illiterate and have no clue on how all this works; but
> > wasn't this a vmexit ? (that's what marcelo traced). And once you've
> > done a vmexit you're a regular task again, not a vcpu.
>
> His trace simply shows that the timer tick happened and the SCHED_NORMAL
> thread was preempted. Bumping the vCPU thread to SCHED_FIFO drops
> the scheduler tick (the system is NOHZ_FULL) and thus 1) the frequency
> of EXTERNAL_INTERRUPT vmexits drops to 1 second 2) the thread is not
> preempted anymore.
>
> > > But I agree that the solution is properly prioritizing threads that can
> > > interrupt the VCPU, and using PI mutexes.

That's exactly what the patch does: the prioritization is not fixed in
time; it depends on whether or not vcpu-0 is in a spinlock-protected
section.

Are you suggesting a different prioritization? Can you describe it,
please, even if incomplete?

> >
> > Right, if you want to run RT VCPUs the whole emulator/vcpu interaction
> > needs to be designed for RT.
> >
> > > I'm not a priori opposed to paravirt scheduling primitives, but I am not
> > > at all sure that it's required.
> >
> > Problem is that the proposed thing doesn't solve anything. There is
> > nothing that prohibits the guest from triggering a vmexit while holding
> > a spinlock and landing in the self-same problems.
>
> Well, part of configuring virt for RT is (at all levels: host hypervisor+QEMU
> and guest kernel+userspace) is that vmexits while holding a spinlock are either
> confined to one vCPU or are handled in the host hypervisor very quickly, like
> less than 2000 clock cycles.
>
> So I'm not denying that Marcelo's approach solves the problem, but it's very
> heavyweight and it masks an important misconfiguration (as you write above,
> everything needs to be RT and the priorities must be designed carefully).

I think you are missing the following point:

"vcpu0 can be interrupted when it's not in a spinlock-protected section,
otherwise it can't."

So you _have_ to communicate to the host when the guest enters/leaves a
critical section.

So this point of "everything needs to be RT and the priorities must be
designed carefully" is this:

	WHEN in a spinlock-protected section (more specifically, a
	spinlock-protected section _shared with realtime vcpus_),

	priority of vcpu0 > priority of emulator thread

	OTHERWISE

	priority of vcpu0 < priority of emulator thread.

(*)

So the emulator thread can interrupt vcpu0 and inject interrupts into it.
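
Rule (*) can be written down as a small function. This is only a sketch and is not taken from the patch: the name `vcpu0_priority`, the "one step above / one step below" choice, and the example priority 50 are assumptions made for illustration. The only fact relied on is that Linux SCHED_FIFO priorities range from 1 to 99.

```python
# Hypothetical illustration of rule (*); none of these names come from the patch.

FIFO_MIN, FIFO_MAX = 1, 99  # valid SCHED_FIFO priority range on Linux


def vcpu0_priority(in_shared_spinlock_section: bool, emulator_prio: int) -> int:
    """Return the priority vcpu0 should run at, relative to the emulator thread.

    Inside a spinlock-protected section shared with realtime vcpus, vcpu0 must
    outrank the emulator thread; otherwise the emulator thread must outrank
    vcpu0 so that it can interrupt it and inject interrupts.
    """
    if not (FIFO_MIN < emulator_prio < FIFO_MAX):
        # Need room for one priority level above and one below the emulator.
        raise ValueError("emulator_prio must be in 2..98")
    if in_shared_spinlock_section:
        return emulator_prio + 1
    return emulator_prio - 1


if __name__ == "__main__":
    print(vcpu0_priority(True, 50))   # vcpu0 holds a shared spinlock
    print(vcpu0_priority(False, 50))  # vcpu0 outside any shared critical section
```

The function only computes the target priority. Something on the host still has to learn about the enter/leave transitions from the guest and apply the change to the vcpu thread, which is the communication requirement described above.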
>
> _However_, even if you do this, you may want to put the less important vCPUs
> and the emulator threads on the same physical CPU. In that case, the vCPU
> can be placed at SCHED_RR to avoid starvation (while the emulator thread needs
> to stay at SCHED_FIFO and higher priority). Some kind of trick that bumps
> spinlock critical sections in that vCPU to SCHED_FIFO, for a limited time only,
> might still be useful.

Anything that violates (*) above is going to cause excessive latencies
in realtime vcpus, via:

PCPU-0:
	* vcpu-0 grabs spinlock A.
	* event wakes up emulator thread, vcpu-0 sched out, vcpu-0 sched
	  in.

PCPU-1:
	* realtime vcpu grabs spinlock-A, busy spins on emulator thread's
	  completion.

So it's more than useful, it's necessary.
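
The scenario above can be approximated with a toy model. It is a sketch under stated assumptions, not a measurement: it assumes strict fixed-priority preemption on PCPU-0, so the woken emulator thread preempts vcpu-0 only when its priority is strictly higher. It also assumes the emulator thread then runs for a fixed time before vcpu-0 resumes and releases spinlock A. The function name and the microsecond figures are hypothetical.

```python
# Toy model of the PCPU-0 / PCPU-1 scenario; all names and numbers are illustrative.


def extra_spin_us(vcpu0_prio: int, emulator_prio: int, emulator_run_us: int) -> int:
    """Extra time the realtime vcpu on PCPU-1 busy-spins on spinlock A.

    vcpu-0 holds spinlock A on PCPU-0 when an event wakes the emulator thread.
    If the emulator thread outranks vcpu-0, vcpu-0 is scheduled out while still
    holding the lock, and the realtime vcpu spins until the emulator thread is
    done and vcpu-0 runs again.
    """
    preempted_while_holding_lock = emulator_prio > vcpu0_prio
    return emulator_run_us if preempted_while_holding_lock else 0


if __name__ == "__main__":
    # (*) violated: emulator thread outranks vcpu-0 inside the critical section.
    print(extra_spin_us(vcpu0_prio=49, emulator_prio=50, emulator_run_us=500))
    # (*) respected: vcpu-0 is boosted above the emulator thread while holding A.
    print(extra_spin_us(vcpu0_prio=51, emulator_prio=50, emulator_run_us=500))
```

In this model the realtime vcpu's added latency equals the emulator thread's run time whenever (*) is violated, and is zero when vcpu-0 is boosted for the duration of the critical section.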

I'm open to suggestions for better ways to solve this problem
while sharing the emulator thread with vcpu-0 (which is something users
are interested in, for obvious economic reasons), but:

1) I don't get the point of Peter's rejection.
2) I don't get how SCHED_RR can help the situation.