public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
From: Sheng Yang <yasker@gmail.com>
To: Jan Kiszka <jan.kiszka@web.de>
Cc: Sheng Yang <sheng@linux.intel.com>,
	kvm@vger.kernel.org, avi@redhat.com, jiajun.xu@intel.com,
	Jan Kiszka <jan.kiszka@siemens.com>
Subject: Re: [PATCH 1/3] KVM: x86: Relax accept conditions of kvm_apic_accept_pic_intr
Date: Sat, 18 Oct 2008 10:44:16 +0800	[thread overview]
Message-ID: <20081018024416.GA24881@yukikaze> (raw)
In-Reply-To: <48F8D570.5010308@web.de>

On Fri, Oct 17, 2008 at 08:12:00PM +0200, Jan Kiszka wrote:
> Jan Kiszka wrote:
> > Sheng Yang wrote:
> >> On Fri, Oct 17, 2008 at 07:35:01PM +0200, Jan Kiszka wrote:
> >>> Sheng Yang wrote:
> >>>> On Fri, Oct 17, 2008 at 10:10:54AM +0200, Jan Kiszka wrote:
> >>>>> Sheng Yang wrote:
> >>>>>> On Wednesday 15 October 2008 22:27:49 Jan Kiszka wrote:
> >>>>>>> Aligning in-kernel kvm_apic_accept_pic_intr with its user space mate,
> >>>>>>> this patch relaxes the conditions under which PIC IRQs are accepted
> >>>>>>> by LVT0. This reflects reality and allows to reuse the service for the
> >>>>>>> NMI watchdog use case.
> >>>>>>>
> >>>>>>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> >>>>>>> ---
> >>>>>>>  arch/x86/kvm/lapic.c |   13 ++++---------
> >>>>>>>  1 file changed, 4 insertions(+), 9 deletions(-)
> >>>>>>>
> >>>>>>> Index: b/arch/x86/kvm/lapic.c
> >>>>>>> ===================================================================
> >>>>>>> --- a/arch/x86/kvm/lapic.c
> >>>>>>> +++ b/arch/x86/kvm/lapic.c
> >>>>>>> @@ -1072,16 +1072,11 @@ int kvm_apic_has_interrupt(struct kvm_vc
> >>>>>>>  int kvm_apic_accept_pic_intr(struct kvm_vcpu *vcpu)
> >>>>>>>  {
> >>>>>>>  	u32 lvt0 = apic_get_reg(vcpu->arch.apic, APIC_LVT0);
> >>>>>>> -	int r = 0;
> >>>>>>>
> >>>>>>> -	if (vcpu->vcpu_id == 0) {
> >>>>>>> -		if (!apic_hw_enabled(vcpu->arch.apic))
> >>>>>>> -			r = 1;
> >>>>>>> -		if ((lvt0 & APIC_LVT_MASKED) == 0 &&
> >>>>>>> -		    GET_APIC_DELIVERY_MODE(lvt0) == APIC_MODE_EXTINT)
> >>>>>>> -			r = 1;
> >>>>>>> -	}
> >>>>>>> -	return r;
> >>>>>>> +	if (!apic_hw_enabled(vcpu->arch.apic) ||
> >>>>>>> +	    (lvt0 & APIC_LVT_MASKED) == 0)
> >>>>>>> +		return 1;
> >>>>>>> +	return 0;
> >>>>>>>  }
> >>>>>>>
> >>>>>>>  void kvm_inject_apic_timer_irqs(struct kvm_vcpu *vcpu)
> >>>>>>>
> >>>>>> (sorry for late review...)
> >>>>>>
> >>>>>> Thanks to find out the root cause of BSOD!
> >>>>>>
> >>>>>> But I am a little concern about this change. As you know, PIC only connect to 
> >>>>>> cpu0. So I think it's not proper to make it generic. 
> >>>>> I don't think so - and if it were true, qemu would have a bug then, see
> >>>>> its corresponding code.
> >>>> You can refer to Intel MP spec, virtual wire mode. Google
> >>>> "MP spec" can find it.
> >>> Ah, good reference.
> >>>
> >>>> Normally PIC is only used in BSP boot up for SMP guest(PIC can't afford SMP,
> >>>> otherwise we won't need IOAPIC/LAPIC). After that, it should be disabled.
> >>>> And virtual wire mode works with APIC_MODE_EXTINT on LVT0 of BSP lapic, so
> >>>> that's why you see
> >>>>
> >>>>     GET_APIC_DELIVERY_MODE(lvt0) == APIC_MODE_EXTINT
> >>>>
> >>>> KVM follow virtual wire mode exactly.
> >>> According to my understanding of the spec, the virtual wire mode means
> >>> that the PIC signal is delivered via LVT0, and thus can be received by
> >>> _all_ CPUs in the system. However, only the BSP usually enables LVT0,
> >>> thus is receiving the IRQ. When Linux switches to NMI watchdog mode 1,
> >>> it also unmasks the other CPUs (and reprograms all to deliver NMIs
> >>> instead of EXTINTs).
> >>>
> >>> Then there is also the "PIC Mode", ie. direct delivery to the BSP, and
> >>> only the latter. That mode is obviously target by the current
> >>> kvm_apic_accept_pic_intr implementation. But I find no indication in the
> >>> spec yet that both modes cannot exists in the same system. But I also
> >>> fail to understand how one could switch between both modes (via software).
> >> No. If so, we don't need to check LVT0.
> > 
> > OK, there is that IMCR register to enable/disable the PIC Mode - but
> > neither KVM nor QEMU implement it (which may indicate they both want to
> > provide Virtual Wire Mode?). When enabled, Virtual Wire Mode is
> > effectively disabled (for the BSP at least) as the LAPIC is disconnected
> > (from the BSP).
> > 
> > Still, I find not trace in the spec that says only the BSP can receive
> > PIC interrupts in Virtual Wire Mode (the LVT0 line is connected to _all_
> > CPUs).
> 
> Now I checked also the BIOS KVM is shipping, and the MP Feature byte 2,
> bit 7 (IMCRP) is cleared, thus KVM is providing the Virtual Wire mode.
> Looking at Figure 3-3 of the MP spec, one can see that the PIC's output
> is connected to the LVT0 line in this mode, and that this line is
> connected to all CPUs in the system. So I can't help concluding that a)
> QEMU's implementation is correct and b) my patch is correct as well. Or
> please tell me where I'm wrong now...

Frankly speaking, here are two apporoaches. Both are OK to work. You
insisted the QEmu method, emulate that line connect all lapic's LVT0. And I
insisted to follow the current solution, the dot-line of virtual wire mode
in the spec, then make NMI watchdog as a separate thing, impact others as
small as possible.

When I wrote NMI watchdog, I don't want to involve PIC, for it's special
case of PIC usage. So I think it's OK to not emulate the path here, then use
apic_local_deliver() to send the interrupt directly, not through the PIC
path. If PIC involved, that's another path. Current QEmu covered this,
pic_request_irq() send to every vcpu, emulate that whole LVT0 line. Our KVM
choose a different way, we just assume PIC only connect to LVT0 of BSP, for
others should be disabled. That's save a lot when you have a lot of vcpus,
as you said.

So currently, QEmu emulate virtual wire mode well, and KVM do some
simplification, only connect to BSP. Both of them follow this in each's
code. And for KVM, the change to kvm_apic_accept_pic_intr() broke this
assumption. Now we only work PIC with BSP, but check all the vcpus. I don't
think that's a good combination. I think we are not likely do more to
improve our PIC connection method, so NMI watchdog in KVM was designed as a
separate thing, as a special case, and should be the only special case.

kvm_cpu_has_interrupt() called every time before VM entry to check if
there are any intr can be injected. If lapic got none,
it would check kvm_apic_accept_pic_intr(). Check every vcpu or only check
vcpu0, would bring about (vcpu_nr - 1) * ((vm_exit_nr - lapic_has_intr_nr) /
vcpu_nr)(if we assume vmexit on every vcpu is the mostly compatiable) more
times to do the judgment on other vcpus here. And normally, the latter
number would tens of thousand to hundreds of thousands. If you care about
1000 per vcpu's touch in pit, why you don't care about them here?

--
regards
Yang, Sheng
> 
> Jan
> 



  parent reply	other threads:[~2008-10-18  2:44 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-10-15 14:27 [PATCH 0/3] KVM: x86: Fix and optimize in-kernel NMI watchdog support Jan Kiszka
2008-10-15 14:27 ` [PATCH 1/3] KVM: x86: Relax accept conditions of kvm_apic_accept_pic_intr Jan Kiszka
2008-10-17  5:11   ` Sheng Yang
2008-10-17  8:10     ` Jan Kiszka
2008-10-17 16:35       ` Sheng Yang
2008-10-17 17:35         ` Jan Kiszka
2008-10-17 17:47           ` Sheng Yang
2008-10-17 17:56             ` Jan Kiszka
2008-10-17 18:12               ` Jan Kiszka
2008-10-17 18:14                 ` Jan Kiszka
2008-10-18  2:44                 ` Sheng Yang [this message]
2008-10-18  3:02                   ` Sheng Yang
2008-10-18  8:29                   ` Jan Kiszka
2008-10-17 18:15               ` Sheng Yang
2008-10-17 15:31     ` Xu, Jiajun
2008-10-15 14:27 ` [PATCH 2/3] KVM: x86: Dont deliver PIT IRQs to masked LVT0s Jan Kiszka
2008-10-17 15:23   ` Alexander Graf
2008-10-17 15:37     ` Jan Kiszka
2008-10-17 15:44       ` Alexander Graf
2008-10-17 18:14         ` Alexander Graf
2008-10-15 14:27 ` [PATCH 3/3] KVM: x86: Optimize NMI watchdog delivery Jan Kiszka
2008-10-17 17:06   ` Sheng Yang
2008-10-17 17:23     ` Jan Kiszka
2008-10-17 17:34       ` Sheng Yang
2008-10-17 17:40         ` Jan Kiszka
2008-10-17 18:26           ` Sheng Yang
2008-10-17 18:39             ` Jan Kiszka
2008-10-19 11:15           ` Avi Kivity
2008-10-19 11:13 ` [PATCH 0/3] KVM: x86: Fix and optimize in-kernel NMI watchdog support Avi Kivity
2008-10-19 13:03   ` Sheng Yang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20081018024416.GA24881@yukikaze \
    --to=yasker@gmail.com \
    --cc=avi@redhat.com \
    --cc=jan.kiszka@siemens.com \
    --cc=jan.kiszka@web.de \
    --cc=jiajun.xu@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=sheng@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox