kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Gleb Natapov <gleb@redhat.com>
To: "Zhang, Yang Z" <yang.z.zhang@intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"Tian, Kevin" <kevin.tian@intel.com>
Subject: Re: [PATCH v3 3/4] x86, apicv: add virtual interrupt delivery support
Date: Thu, 6 Dec 2012 08:36:52 +0200	[thread overview]
Message-ID: <20121206063652.GW19514@redhat.com> (raw)
In-Reply-To: <A9667DDFB95DB7438FA9D7D576C3D87E2D55F7@SHSMSX101.ccr.corp.intel.com>

On Thu, Dec 06, 2012 at 05:02:15AM +0000, Zhang, Yang Z wrote:
> Zhang, Yang Z wrote on 2012-12-06:
> > Marcelo Tosatti wrote on 2012-12-06:
> >> On Mon, Dec 03, 2012 at 03:01:03PM +0800, Yang Zhang wrote:
> >>> Virtual interrupt delivery avoids KVM to inject vAPIC interrupts
> >>> manually, which is fully taken care of by the hardware. This needs
> >>> some special awareness into existing interrupr injection path:
> >>> 
> >>> - for pending interrupt, instead of direct injection, we may need
> >>>   update architecture specific indicators before resuming to guest. -
> >>>   A pending interrupt, which is masked by ISR, should be also
> >>>   considered in above update action, since hardware will decide when
> >>>   to inject it at right time. Current has_interrupt and get_interrupt
> >>>   only returns a valid vector from injection p.o.v.
> >>> Signed-off-by: Yang Zhang <yang.z.zhang@intel.com>
> >>> Signed-off-by: Kevin Tian <kevin.tian@intel.com>
> >>> ---
> >>>  arch/x86/include/asm/kvm_host.h |    4 + arch/x86/include/asm/vmx.h
> >>>    |   11 +++ arch/x86/kvm/irq.c              |   53 ++++++++++-----
> >>>  arch/x86/kvm/lapic.c            |   56 +++++++++++++---
> >>>  arch/x86/kvm/lapic.h            |    6 ++ arch/x86/kvm/svm.c
> >>>     |   19 +++++ arch/x86/kvm/vmx.c              |  140
> >>>  ++++++++++++++++++++++++++++++++++++++- arch/x86/kvm/x86.c
> >>>   |   34 ++++++++-- virt/kvm/ioapic.c               |    1 + 9 files
> >>>  changed, 291 insertions(+), 33 deletions(-)
> >>> diff --git a/arch/x86/include/asm/kvm_host.h
> >>> b/arch/x86/include/asm/kvm_host.h index dc87b65..e5352c8 100644 ---
> >>> a/arch/x86/include/asm/kvm_host.h +++
> >>> b/arch/x86/include/asm/kvm_host.h @@ -697,6 +697,10 @@ struct
> >>> kvm_x86_ops {
> >>>  	void (*enable_nmi_window)(struct kvm_vcpu *vcpu);
> >>>  	void (*enable_irq_window)(struct kvm_vcpu *vcpu);
> >>>  	void (*update_cr8_intercept)(struct kvm_vcpu *vcpu, int tpr, int irr);
> >>> +	int (*has_virtual_interrupt_delivery)(struct kvm_vcpu *vcpu);
> >>> +	void (*update_irq)(struct kvm_vcpu *vcpu);
> >>> +	void (*set_eoi_exitmap)(struct kvm_vcpu *vcpu, int vector,
> >>> +			int trig_mode, int always_set);
> >>>  	int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
> >>>  	int (*get_tdp_level)(void);
> >>>  	u64 (*get_mt_mask)(struct kvm_vcpu *vcpu, gfn_t gfn, bool
> > is_mmio);
> >>> diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
> >>> index 21101b6..1003341 100644
> >>> --- a/arch/x86/include/asm/vmx.h
> >>> +++ b/arch/x86/include/asm/vmx.h
> >>> @@ -62,6 +62,7 @@
> >>>  #define EXIT_REASON_MCE_DURING_VMENTRY  41 #define
> >>>  EXIT_REASON_TPR_BELOW_THRESHOLD 43 #define
> > EXIT_REASON_APIC_ACCESS
> >>>      44 +#define EXIT_REASON_EOI_INDUCED         45 #define
> >>>  EXIT_REASON_EPT_VIOLATION       48 #define
> > EXIT_REASON_EPT_MISCONFIG
> >>>      49 #define EXIT_REASON_WBINVD              54 @@ -143,6
> > +144,7 @@
> >>>  #define SECONDARY_EXEC_WBINVD_EXITING		0x00000040 #define
> >>>  SECONDARY_EXEC_UNRESTRICTED_GUEST	0x00000080 #define
> >>>  SECONDARY_EXEC_APIC_REGISTER_VIRT       0x00000100 +#define
> >>>  SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY    0x00000200 #define
> >>>  SECONDARY_EXEC_PAUSE_LOOP_EXITING	0x00000400 #define
> >>>  SECONDARY_EXEC_ENABLE_INVPCID		0x00001000 @@ -180,6 +182,7 @@ enum
> >>>  vmcs_field { 	GUEST_GS_SELECTOR               = 0x0000080a,
> > 	GUEST_LDTR_SELECTOR
> >>>           = 0x0000080c, 	GUEST_TR_SELECTOR               =
> > 0x0000080e,
> >>>  +	GUEST_INTR_STATUS               = 0x00000810,
> > 	HOST_ES_SELECTOR
> >>>            = 0x00000c00, 	HOST_CS_SELECTOR                =
> > 0x00000c02,
> >>>  	HOST_SS_SELECTOR                = 0x00000c04, @@ -207,6 +210,14 @@
> >>>  enum vmcs_field { 	APIC_ACCESS_ADDR_HIGH		= 0x00002015,
> > 	EPT_POINTER
> >>>                    = 0x0000201a, 	EPT_POINTER_HIGH
> > =
> >>>  0x0000201b,
> >>> +	EOI_EXIT_BITMAP0                = 0x0000201c,
> >>> +	EOI_EXIT_BITMAP0_HIGH           = 0x0000201d,
> >>> +	EOI_EXIT_BITMAP1                = 0x0000201e,
> >>> +	EOI_EXIT_BITMAP1_HIGH           = 0x0000201f,
> >>> +	EOI_EXIT_BITMAP2                = 0x00002020,
> >>> +	EOI_EXIT_BITMAP2_HIGH           = 0x00002021,
> >>> +	EOI_EXIT_BITMAP3                = 0x00002022,
> >>> +	EOI_EXIT_BITMAP3_HIGH           = 0x00002023,
> >>>  	GUEST_PHYSICAL_ADDRESS          = 0x00002400,
> >>>  	GUEST_PHYSICAL_ADDRESS_HIGH     = 0x00002401,
> >>>  	VMCS_LINK_POINTER               = 0x00002800,
> >>> diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c
> >>> index 7e06ba1..f782788 100644
> >>> --- a/arch/x86/kvm/irq.c
> >>> +++ b/arch/x86/kvm/irq.c
> >>> @@ -43,45 +43,64 @@ EXPORT_SYMBOL(kvm_cpu_has_pending_timer);
> >>>   */
> >>>  int kvm_cpu_has_interrupt(struct kvm_vcpu *v)
> >>>  {
> >>> -	struct kvm_pic *s;
> >>> -
> >>>  	if (!irqchip_in_kernel(v->kvm))
> >>>  		return v->arch.interrupt.pending;
> >>> -	if (kvm_apic_has_interrupt(v) == -1) {	/* LAPIC */
> >>> -		if (kvm_apic_accept_pic_intr(v)) {
> >>> -			s = pic_irqchip(v->kvm);	/* PIC */
> >>> -			return s->output;
> >>> -		} else
> >>> -			return 0;
> >>> -	}
> >>> +	if (kvm_apic_has_interrupt(v) == -1) /* LAPIC */
> >>> +		return kvm_cpu_has_extint(v); /* non-APIC */
> >>>  	return 1;
> >>>  }
> >>>  EXPORT_SYMBOL_GPL(kvm_cpu_has_interrupt);
> >>>  
> >>>  /*
> >>> + * check if there is pending interrupt from
> >>> + * non-APIC source without intack.
> >>> + */
> >>> +int kvm_cpu_has_extint(struct kvm_vcpu *v)
> >>> +{
> >>> +	struct kvm_pic *s;
> >>> +
> >>> +	if (kvm_apic_accept_pic_intr(v)) {
> >>> +		s = pic_irqchip(v->kvm); /* PIC */
> >>> +		return s->output;
> >>> +	} else
> >>> +		return 0;
> >>> +}
> >>> +
> >>> +/*
> >>>   * Read pending interrupt vector and intack.
> >>>   */
> >>>  int kvm_cpu_get_interrupt(struct kvm_vcpu *v) { -	struct kvm_pic *s;
> >>>  	int vector;
> >>>  
> >>>  	if (!irqchip_in_kernel(v->kvm))
> >>>  		return v->arch.interrupt.nr;
> >>>  
> >>>  	vector = kvm_get_apic_interrupt(v);	/* APIC */
> >>> -	if (vector == -1) {
> >>> -		if (kvm_apic_accept_pic_intr(v)) {
> >>> -			s = pic_irqchip(v->kvm);
> >>> -			s->output = 0;		/* PIC */
> >>> -			vector = kvm_pic_read_irq(v->kvm);
> >>> -		}
> >>> -	}
> >>> +	if (vector == -1)
> >>> +		return kvm_cpu_get_extint(v); /* non-APIC */
> >>>  	return vector;
> >>>  }
> >>>  EXPORT_SYMBOL_GPL(kvm_cpu_get_interrupt);
> >>> +/*
> >>> + * Read pending interrupt(from non-APIC source)
> >>> + * vector and intack.
> >>> + */
> >>> +int kvm_cpu_get_extint(struct kvm_vcpu *v)
> >>> +{
> >>> +	struct kvm_pic *s;
> >>> +	int vector = -1;
> >>> +
> >>> +	if (kvm_apic_accept_pic_intr(v)) {
> >>> +		s = pic_irqchip(v->kvm);
> >>> +		s->output = 0;		/* PIC */
> >>> +		vector = kvm_pic_read_irq(v->kvm);
> >>> +	}
> >>> +	return vector;
> >>> +}
> >>> +
> >>>  void kvm_inject_pending_timer_irqs(struct kvm_vcpu *vcpu)
> >>>  {
> >>>  	kvm_inject_apic_timer_irqs(vcpu);
> >>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> >>> index 7c96012..400d3ba 100644
> >>> --- a/arch/x86/kvm/lapic.c
> >>> +++ b/arch/x86/kvm/lapic.c
> >>> @@ -643,6 +643,14 @@ out:
> >>>  	return ret;
> >>>  }
> >>> +void kvm_set_eoi_exitmap(struct kvm_vcpu *vcpu, int vector,
> >>> +		int trig_mode, int always_set)
> >>> +{
> >>> +	if (kvm_x86_ops->set_eoi_exitmap)
> >>> +		kvm_x86_ops->set_eoi_exitmap(vcpu, vector,
> >>> +					trig_mode, always_set);
> >>> +}
> >>> +
> >>>  /*
> >>>   * Add a pending IRQ into lapic.
> >>>   * Return 1 if successfully added and 0 if discarded.
> >>> @@ -661,6 +669,7 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int
> >> delivery_mode,
> >>>  		if (unlikely(!apic_enabled(apic))) 			break;
> >>>  +		kvm_set_eoi_exitmap(vcpu, vector, trig_mode, 0); 		if (trig_mode)
> >>>  { 			apic_debug("level trig mode for vector %d", vector);
> >>>  			apic_set_vector(vector, apic->regs + APIC_TMR);
> >>> @@ -740,6 +749,19 @@ int kvm_apic_compare_prio(struct kvm_vcpu *vcpu1,
> >> struct kvm_vcpu *vcpu2)
> >>>  	return vcpu1->arch.apic_arb_prio - vcpu2->arch.apic_arb_prio;
> >>>  }
> >>> +static void kvm_ioapic_send_eoi(struct kvm_lapic *apic, int vector)
> >>> +{
> >>> +	if (!(kvm_apic_get_reg(apic, APIC_SPIV) & APIC_SPIV_DIRECTED_EOI) &&
> >>> +	    kvm_ioapic_handles_vector(apic->vcpu->kvm, vector)) {
> >>> +		int trigger_mode;
> >>> +		if (apic_test_vector(vector, apic->regs + APIC_TMR))
> >>> +			trigger_mode = IOAPIC_LEVEL_TRIG;
> >>> +		else
> >>> +			trigger_mode = IOAPIC_EDGE_TRIG;
> >>> +		kvm_ioapic_update_eoi(apic->vcpu->kvm, vector, trigger_mode);
> >>> +	}
> >>> +}
> >>> +
> >>>  static int apic_set_eoi(struct kvm_lapic *apic) { 	int vector =
> >>>  apic_find_highest_isr(apic); @@ -756,19 +778,24 @@ static int
> >>>  apic_set_eoi(struct kvm_lapic *apic) 	apic_clear_isr(vector, apic);
> >>>  	apic_update_ppr(apic);
> >>> -	if (!(kvm_apic_get_reg(apic, APIC_SPIV) & APIC_SPIV_DIRECTED_EOI) &&
> >>> -	    kvm_ioapic_handles_vector(apic->vcpu->kvm, vector)) {
> >>> -		int trigger_mode;
> >>> -		if (apic_test_vector(vector, apic->regs + APIC_TMR))
> >>> -			trigger_mode = IOAPIC_LEVEL_TRIG;
> >>> -		else
> >>> -			trigger_mode = IOAPIC_EDGE_TRIG;
> >>> -		kvm_ioapic_update_eoi(apic->vcpu->kvm, vector, trigger_mode);
> >>> -	}
> >>> +	kvm_ioapic_send_eoi(apic, vector);
> >>>  	kvm_make_request(KVM_REQ_EVENT, apic->vcpu);
> >>>  	return vector;
> >>>  }
> >>> +/*
> >>> + * this interface assumes a trap-like exit, which has already finished
> >>> + * desired side effect including vISR and vPPR update.
> >>> + */
> >>> +void kvm_apic_set_eoi_accelerated(struct kvm_vcpu *vcpu, int vector)
> >>> +{
> >>> +	struct kvm_lapic *apic = vcpu->arch.apic;
> >>> +
> >>> +	kvm_ioapic_send_eoi(apic, vector);
> >>> +	kvm_make_request(KVM_REQ_EVENT, apic->vcpu);
> >>> +}
> >>> +EXPORT_SYMBOL_GPL(kvm_apic_set_eoi_accelerated);
> >>> +
> >>>  static void apic_send_ipi(struct kvm_lapic *apic) { 	u32 icr_low =
> >>>  kvm_apic_get_reg(apic, APIC_ICR); @@ -1533,6 +1560,17 @@ int
> >>>  kvm_apic_has_interrupt(struct kvm_vcpu *vcpu) 	return highest_irr; }
> >>> +int kvm_apic_get_highest_irr(struct kvm_vcpu *vcpu)
> >>> +{
> >>> +	struct kvm_lapic *apic = vcpu->arch.apic;
> >>> +
> >>> +	if (!apic || !apic_enabled(apic))
> >>> +		return -1;
> >>> +
> >>> +	return apic_find_highest_irr(apic);
> >>> +}
> >> 
> >> irr_pending of apic_find_highest_irr() is meaningless (stale) if
> >> HW is updating VIRR.
> > I don't think target vcpu is running when we call this function. So it is safe to
> > check irr_pending and read the irr.
> One problem is that when HW clears VIRR, after vmexit, the irr_peding will still true if there are more than one bit is set in VIRR in last vmentry.
It doesn't matter how much bits were set in VIRR before entry, as long
as some bit were set irr_peding will be true.

> How about to recaculate irr_pending according the VIRR on each vmexit?
> 
No need really. Since HW can only clear VIRR the only situation that may
happen is that irr_pending will be true but VIRR is empty and
apic_find_highest_irr() will return correct result in this case.

If we will see a lot of unneeded irr scans because of stale irr_pending
value we can do irr_pending = rvi != 0 on vmexit.

--
			Gleb.

  reply	other threads:[~2012-12-06  6:36 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-12-03  7:01 [PATCH v3 0/4] x86, apicv: Add APIC virtualization support Yang Zhang
2012-12-03  7:01 ` [PATCH v3 1/4] x86: PIT connects to pin 2 of IOAPIC Yang Zhang
2012-12-03  9:42   ` Gleb Natapov
2012-12-04  5:32     ` Zhang, Yang Z
2012-12-04 12:55       ` Gleb Natapov
2012-12-05  1:28         ` Zhang, Yang Z
2012-12-03  7:01 ` [PATCH v3 2/4] x86, apicv: add APICv register virtualization support Yang Zhang
2012-12-05  0:29   ` Marcelo Tosatti
2012-12-05  3:17     ` Zhang, Yang Z
2012-12-05 23:11       ` Marcelo Tosatti
2012-12-06  6:45         ` Gleb Natapov
2012-12-03  7:01 ` [PATCH v3 3/4] x86, apicv: add virtual interrupt delivery support Yang Zhang
2012-12-03 11:37   ` Gleb Natapov
2012-12-04  6:39     ` Zhang, Yang Z
2012-12-04 10:58       ` Gleb Natapov
2012-12-05  1:55         ` Zhang, Yang Z
2012-12-05  5:02           ` Gleb Natapov
2012-12-05  6:02             ` Zhang, Yang Z
2012-12-05 10:18               ` Gleb Natapov
2012-12-05 13:51                 ` Zhang, Yang Z
2012-12-05 14:00                   ` Gleb Natapov
2012-12-06  2:55                     ` Zhang, Yang Z
2012-12-06  7:07                       ` Gleb Natapov
2012-12-06  7:16                         ` Zhang, Yang Z
2012-12-06  7:19                           ` Gleb Natapov
2012-12-06  7:20                             ` Zhang, Yang Z
2012-12-04 16:46   ` Michael S. Tsirkin
2012-12-05  2:00   ` Marcelo Tosatti
2012-12-05  3:43     ` Zhang, Yang Z
2012-12-05 11:14       ` Gleb Natapov
2012-12-05 14:16         ` Zhang, Yang Z
2012-12-05 14:34           ` Gleb Natapov
2012-12-06  2:58             ` Zhang, Yang Z
2012-12-05 22:38         ` Marcelo Tosatti
2012-12-06  9:28           ` Gleb Natapov
2012-12-06 11:35             ` Zhang, Yang Z
2012-12-05 22:31   ` Marcelo Tosatti
2012-12-06  3:31     ` Zhang, Yang Z
2012-12-06  5:02     ` Zhang, Yang Z
2012-12-06  6:36       ` Gleb Natapov [this message]
2012-12-06  6:56         ` Zhang, Yang Z
2012-12-06 20:08         ` Marcelo Tosatti
2012-12-07  1:00           ` Zhang, Yang Z
2012-12-07  7:28             ` Gleb Natapov
2012-12-03  7:01 ` [PATCH v3 4/4] x86, apicv: add virtual x2apic support Yang Zhang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20121206063652.GW19514@redhat.com \
    --to=gleb@redhat.com \
    --cc=kevin.tian@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=mtosatti@redhat.com \
    --cc=yang.z.zhang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).