From: Gleb Natapov <gleb@redhat.com>
To: "Zhang, Yang Z" <yang.z.zhang@intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>,
Jan Kiszka <jan.kiszka@web.de>, kvm <kvm@vger.kernel.org>
Subject: Re: [PATCH] KVM: x86: Avoid busy loops over uninjectable pending APIC timers
Date: Sun, 24 Mar 2013 21:03:03 +0200
Message-ID: <20130324190303.GG22179@redhat.com>
In-Reply-To: <A9667DDFB95DB7438FA9D7D576C3D87E099ED01B@SHSMSX101.ccr.corp.intel.com>

On Sun, Mar 24, 2013 at 10:45:53AM +0000, Zhang, Yang Z wrote:
> Gleb Natapov wrote on 2013-03-22:
> > On Fri, Mar 22, 2013 at 07:43:03AM -0300, Marcelo Tosatti wrote:
> >> On Fri, Mar 22, 2013 at 08:53:15AM +0200, Gleb Natapov wrote:
> >>> On Thu, Mar 21, 2013 at 08:06:41PM -0300, Marcelo Tosatti wrote:
> >>>> On Thu, Mar 21, 2013 at 11:13:39PM +0200, Gleb Natapov wrote:
> >>>>> On Thu, Mar 21, 2013 at 05:51:50PM -0300, Marcelo Tosatti wrote:
> >>>>>>>>> But current PI patches do break them, that's my point. So we
> >>>>>>>>> either need to revise them again, or drop LAPIC timer
> >>>>>>>>> reinjection. Making apic_accept_irq semantics "it returns
> >>>>>>>>> coalescing info, but only sometimes" is dubious though.
> >>>>>>>> We may rollback to the initial idea: test both irr and pir to
> >>>>>>>> get coalescing info. In this case, inject LAPIC timer always in
> >>>>>>>> vcpu context. So apic_accept_irq() will return right coalescing
> >>>>>>>> info.
> >>>>>>>> Also, we need to add comments to tell caller, apic_accept_irq()
> >>>>>>>> can ensure the return value is correct only when caller is in
> >>>>>>>> target vcpu context.
> >>>>>>>>
> >>>>>>> We cannot touch irr while vcpu is in non-root operation, so we
> >>>>>>> will have to pass flag to apic_accept_irq() to let it know that it
> >>>>>>> is called synchronously. While all this is possible I want to know
> >>>>>>> which guests exactly will we break if we will not track interrupt
> >>>>>>> coalescing for lapic timer. If only 2.0 smp kernels will break we
> >>>>>>> can probably drop it.
> >>>>>>
> >>>>>> RHEL4 / RHEL5 guests.
> >>>>> RHEL5 has kvmclock no? We should not break RHEL4 though.
> >>>>
> >>>> kvmclock provides no timer interrupt... either LAPIC or PIT must be used
> >>>> with kvmclock.
> >>> I am confused now. If LAPIC is not used for wallclock time keeping, but
> >>> only for scheduling, the reinjection is actually harmful. Reinjecting the
> >>> interrupt will cause needless task rescheduling. So the question is if
> >>> there is a Linux kernel that uses LAPIC for wallclock time keeping and
> >>> relies on accurate number of injected interrupts to not time drift.
> >>
> >> See 4acd47cfea9c18134e0cbf915780892ef0ff433a on RHEL5, RHEL5 kernels
> >> before that commit did not reinject. Which means that all non-RHEL
> >> Linux guests based on that upstream code also suffer from the same
> >> problem.
> >>
> > The commit actually fixes guest, not host. The existence of the commit
> > also means that LAPIC timer reinjection does not solve the problem and
> > all guests without this commit will suffer from the bug regardless of
> > what we will decide to do here. Without LAPIC timer reinjection the
> > effect of the bug will be much more visible and long lasting though.
> >
> >> Also any other algorithm which uses LAPIC timers and compares them with
> >> other clocks (such as NMI watchdog) is potentially vulnerable.
> > They are with or without timer reinjection as commit you pointed to
> > shows.
> >
> >>
> >> Can drop it, and then wait until someone complains (if so).
> >>
> > Yes, tough decision to make. All the complaints will be guest bugs which
> > can be hit without reinjection too, but with less probability. The reason
> > we are so keen on keeping RTC reinjection is that the guests that depend
> > on it cannot be fixed.
> >
> >>> Knowing that Linux tends to disable interrupts, it is likely that it tries
> >>> to detect and compensate for missing interrupts.
> >>
> >> As said above, any algorithm which compares LAPIC timer interrupt with
> >> another clock is vulnerable.
> Any conclusion?
>
Let's not check for coalescing in PI patches for now.
--
Gleb.
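
The coalescing check debated in this thread (test both irr and pir before
delivering a vector) can be sketched in plain C as below. This is an
illustration only, not the KVM implementation: `struct toy_lapic`,
`toy_accept_irq()` and the `use_pi` flag are hypothetical names, the bitmaps
are ordinary arrays rather than the real APIC register page or
posted-interrupt descriptor, and the non-atomic bit operations assume a
single caller, which the kernel cannot assume.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_VECTORS 256

/* Hypothetical stand-in for the per-vCPU interrupt state. */
struct toy_lapic {
	uint32_t irr[NR_VECTORS / 32];	/* vectors requested via the IRR */
	uint32_t pir[NR_VECTORS / 32];	/* vectors posted but not yet synced */
};

static bool toy_test_bit(const uint32_t *map, unsigned int vec)
{
	return (map[vec / 32] >> (vec % 32)) & 1u;
}

static void toy_set_bit(uint32_t *map, unsigned int vec)
{
	map[vec / 32] |= 1u << (vec % 32);
}

/*
 * Mark vector 'vec' pending and report coalescing.
 *
 * Returns true if the vector was not pending before this call, and false
 * if it was already pending in either bitmap, i.e. the new interrupt was
 * coalesced with an earlier one that has not been delivered yet.
 *
 * 'use_pi' (an assumption of this sketch) selects which bitmap receives
 * the request: the posted-interrupt bitmap or the IRR.
 */
static bool toy_accept_irq(struct toy_lapic *apic, unsigned int vec,
			   bool use_pi)
{
	bool pending;

	assert(vec < NR_VECTORS);

	/* Checking only one bitmap would miss a request parked in the other. */
	pending = toy_test_bit(apic->irr, vec) || toy_test_bit(apic->pir, vec);

	toy_set_bit(use_pi ? apic->pir : apic->irr, vec);

	return !pending;
}
```

With a zeroed `struct toy_lapic`, the first `toy_accept_irq(&apic, 0xef, false)`
returns true and an immediate second call returns false, which is the
coalescing signal a timer reinjection scheme would rely on.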