Re: [PATCH] KVM: x86: Convert INIT and SIPI signals into synchronously handled events

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Jan Kiszka <jan.kiszka@siemens.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>,
	Marcelo Tosatti <mtosatti@redhat.com>, kvm <kvm@vger.kernel.org>
Subject: Re: [PATCH] KVM: x86: Convert INIT and SIPI signals into synchronously handled events
Date: Tue, 12 Mar 2013 14:01:51 +0100	[thread overview]
Message-ID: <513F273F.5020108@siemens.com> (raw)
In-Reply-To: <513F2688.2080902@siemens.com>

On 2013-03-12 13:58, Jan Kiszka wrote:
> On 2013-03-12 13:06, Paolo Bonzini wrote:
>> Il 12/03/2013 12:44, Jan Kiszka ha scritto:
>>> A VCPU sending INIT or SIPI to some other VCPU races for setting the
>>> remote VCPU's mp_state. When we were unlucky, KVM_MP_STATE_INIT_RECEIVED
>>> was overwritten by kvm_emulate_halt and, thus, got lost.
>>>
>>> This introduces APIC events for those two signals, keeping them in
>>> kvm_apic until kvm_apic_accept_events is run over the target vcpu
>>> context. kvm_apic_has_events reports to kvm_arch_vcpu_runnable if there
>>> are pending events, thus if vcpu blocking should end.
>>>
>>> The patch comes with the side effect of effectively obsoleting
>>> KVM_MP_STATE_SIPI_RECEIVED. We still accept it from user space, but
>>> immediately translate it to KVM_MP_STATE_INIT_RECEIVED + KVM_APIC_SIPI.
>>> The vcpu itself will no longer enter the KVM_MP_STATE_SIPI_RECEIVED
>>> state. That also means we no longer exit to user space after receiving a
>>> SIPI event.
>>>
>>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>>> ---
>>>
>>> This doesn't fix the wrong behaviour on INIT for the BSP yet. Leaving
>>> this honor to Paolo.
>>>
>>> I didn't try porting any INIT handling for nested on top yet but I
>>> think it should be feasible - once we know their semantics for sure, at
>>> least on Intel...
>>>
>>>  arch/x86/include/asm/kvm_host.h |    1 +
>>>  arch/x86/kvm/lapic.c            |   41 +++++++++++++++++++++++++++++++-------
>>>  arch/x86/kvm/lapic.h            |    7 ++++++
>>>  arch/x86/kvm/x86.c              |   39 ++++++++++++++++++++----------------
>>>  4 files changed, 63 insertions(+), 25 deletions(-)
>>>
>>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>>> index 348d859..2d28e76 100644
>>> --- a/arch/x86/include/asm/kvm_host.h
>>> +++ b/arch/x86/include/asm/kvm_host.h
>>> @@ -1002,6 +1002,7 @@ int kvm_cpu_has_injectable_intr(struct kvm_vcpu *v);
>>>  int kvm_cpu_has_interrupt(struct kvm_vcpu *vcpu);
>>>  int kvm_arch_interrupt_allowed(struct kvm_vcpu *vcpu);
>>>  int kvm_cpu_get_interrupt(struct kvm_vcpu *v);
>>> +void kvm_vcpu_reset(struct kvm_vcpu *vcpu);
>>>  
>>>  void kvm_define_shared_msr(unsigned index, u32 msr);
>>>  void kvm_set_shared_msr(unsigned index, u64 val, u64 mask);
>>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>>> index 02b51dd..4a21a6b 100644
>>> --- a/arch/x86/kvm/lapic.c
>>> +++ b/arch/x86/kvm/lapic.c
>>> @@ -731,7 +731,7 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
>>>  	case APIC_DM_INIT:
>>>  		if (!trig_mode || level) {
>>>  			result = 1;
>>> -			vcpu->arch.mp_state = KVM_MP_STATE_INIT_RECEIVED;
>>> +			set_bit(KVM_APIC_INIT, &apic->pending_events);
>>
>> I think this should clear pending SIPIs, unless KVM_APIC_INIT was
>> already set in which case it should be a no-op.  Something like:
>>
>> 	e = apic->pending_events;
>> 	while (!(e & KVM_APIC_INIT))
>> 		e = cmpxchg(&apic->pending_events, e,
>> 			    (e | KVM_APIC_INIT) & ~KVM_APIC_SIPI);
>>
>> If you do this, better make pending_events an atomic_t.
> 
> OK (looks ugly but necessary).
> 
>>
>>>  			kvm_make_request(KVM_REQ_EVENT, vcpu);
>>>  			kvm_vcpu_kick(vcpu);
>>>  		} else {
>>> @@ -743,13 +743,13 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
>>>  	case APIC_DM_STARTUP:
>>>  		apic_debug("SIPI to vcpu %d vector 0x%02x\n",
>>>  			   vcpu->vcpu_id, vector);
>>> -		if (vcpu->arch.mp_state == KVM_MP_STATE_INIT_RECEIVED) {
>>> -			result = 1;
>>> -			vcpu->arch.sipi_vector = vector;
>>> -			vcpu->arch.mp_state = KVM_MP_STATE_SIPI_RECEIVED;
>>> -			kvm_make_request(KVM_REQ_EVENT, vcpu);
>>> -			kvm_vcpu_kick(vcpu);
>>> -		}
>>> +		result = 1;
>>> +		apic->sipi_vector = vector;
>>> +		/* make sure sipi_vector is visible for the receiver */
>>> +		smp_wmb();
>>> +		set_bit(KVM_APIC_SIPI, &apic->pending_events);
>>> +		kvm_make_request(KVM_REQ_EVENT, vcpu);
>>> +		kvm_vcpu_kick(vcpu);
>>>  		break;
>>>  
>>>  	case APIC_DM_EXTINT:
>>> @@ -1860,6 +1860,31 @@ int kvm_lapic_enable_pv_eoi(struct kvm_vcpu *vcpu, u64 data)
>>>  					 addr);
>>>  }
>>>  
>>> +bool kvm_apic_has_events(struct kvm_vcpu *vcpu)
>>> +{
>>> +	return kvm_vcpu_has_lapic(vcpu) && vcpu->arch.apic->pending_events;
>>> +}
>>> +
>>> +void kvm_apic_accept_events(struct kvm_vcpu *vcpu)
>>> +{
>>> +	struct kvm_lapic *apic = vcpu->arch.apic;
>>> +
>>> +	if (!kvm_vcpu_has_lapic(vcpu))
>>> +		return;
>>> +
>>> +	if (test_and_clear_bit(KVM_APIC_INIT, &apic->pending_events))
>>> +		vcpu->arch.mp_state = KVM_MP_STATE_INIT_RECEIVED;
>>> +	if (test_and_clear_bit(KVM_APIC_SIPI, &apic->pending_events) &&
>>> +	    vcpu->arch.mp_state == KVM_MP_STATE_INIT_RECEIVED) {
>>> +		vcpu->arch.sipi_vector = apic->sipi_vector;
>>> +		pr_debug("vcpu %d received sipi with vector # %x\n",
>>> +			 vcpu->vcpu_id, vcpu->arch.sipi_vector);
>>> +		kvm_lapic_reset(vcpu);
>>> +		kvm_vcpu_reset(vcpu);
>>> +		vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
>>> +	}
>>> +}
>>> +
>>>  void kvm_lapic_init(void)
>>>  {
>>>  	/* do not patch jump label more than once per second */
>>> diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
>>> index 1676d34..ef3f4ef 100644
>>> --- a/arch/x86/kvm/lapic.h
>>> +++ b/arch/x86/kvm/lapic.h
>>> @@ -5,6 +5,9 @@
>>>  
>>>  #include <linux/kvm_host.h>
>>>  
>>> +#define KVM_APIC_INIT		0
>>> +#define KVM_APIC_SIPI		1
>>> +
>>>  struct kvm_timer {
>>>  	struct hrtimer timer;
>>>  	s64 period; 				/* unit: ns */
>>> @@ -32,13 +35,17 @@ struct kvm_lapic {
>>>  	void *regs;
>>>  	gpa_t vapic_addr;
>>>  	struct page *vapic_page;
>>> +	unsigned long pending_events;
>>> +	unsigned int sipi_vector;
>>>  };
>>>  int kvm_create_lapic(struct kvm_vcpu *vcpu);
>>>  void kvm_free_lapic(struct kvm_vcpu *vcpu);
>>>  
>>>  int kvm_apic_has_interrupt(struct kvm_vcpu *vcpu);
>>> +bool kvm_apic_has_events(struct kvm_vcpu *vcpu);
>>>  int kvm_apic_accept_pic_intr(struct kvm_vcpu *vcpu);
>>>  int kvm_get_apic_interrupt(struct kvm_vcpu *vcpu);
>>> +void kvm_apic_accept_events(struct kvm_vcpu *vcpu);
>>>  void kvm_lapic_reset(struct kvm_vcpu *vcpu);
>>>  u64 kvm_lapic_get_cr8(struct kvm_vcpu *vcpu);
>>>  void kvm_lapic_set_tpr(struct kvm_vcpu *vcpu, unsigned long cr8);
>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>> index b891ac3..a0b8041 100644
>>> --- a/arch/x86/kvm/x86.c
>>> +++ b/arch/x86/kvm/x86.c
>>> @@ -162,8 +162,6 @@ u64 __read_mostly host_xcr0;
>>>  
>>>  static int emulator_fix_hypercall(struct x86_emulate_ctxt *ctxt);
>>>  
>>> -static void kvm_vcpu_reset(struct kvm_vcpu *vcpu);
>>> -
>>>  static inline void kvm_async_pf_hash_reset(struct kvm_vcpu *vcpu)
>>>  {
>>>  	int i;
>>> @@ -5663,6 +5661,11 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>>>  	bool req_immediate_exit = 0;
>>>  
>>>  	if (vcpu->requests) {
>>> +		kvm_apic_accept_events(vcpu);
>>> +		if (vcpu->arch.mp_state == KVM_MP_STATE_INIT_RECEIVED) {
>>> +			r = 1;
>>> +			goto out;
>>> +		}
>>>  		if (kvm_check_request(KVM_REQ_MMU_RELOAD, vcpu))
>>>  			kvm_mmu_unload(vcpu);
>>>  		if (kvm_check_request(KVM_REQ_MIGRATE_TIMER, vcpu))
>>> @@ -5847,14 +5850,6 @@ static int __vcpu_run(struct kvm_vcpu *vcpu)
>>>  	int r;
>>>  	struct kvm *kvm = vcpu->kvm;
>>>  
>>> -	if (unlikely(vcpu->arch.mp_state == KVM_MP_STATE_SIPI_RECEIVED)) {
>>> -		pr_debug("vcpu %d received sipi with vector # %x\n",
>>> -			 vcpu->vcpu_id, vcpu->arch.sipi_vector);
>>> -		kvm_lapic_reset(vcpu);
>>> -		kvm_vcpu_reset(vcpu);
>>> -		vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
>>> -	}
>>> -
>>>  	vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
>>>  	r = vapic_enter(vcpu);
>>>  	if (r) {
>>> @@ -5871,8 +5866,8 @@ static int __vcpu_run(struct kvm_vcpu *vcpu)
>>>  			srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
>>>  			kvm_vcpu_block(vcpu);
>>>  			vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
>>> -			if (kvm_check_request(KVM_REQ_UNHALT, vcpu))
>>> -			{
>>> +			if (kvm_check_request(KVM_REQ_UNHALT, vcpu)) {
>>> +				kvm_apic_accept_events(vcpu);
>>>  				switch(vcpu->arch.mp_state) {
>>>  				case KVM_MP_STATE_HALTED:
>>>  					vcpu->arch.mp_state =
>>> @@ -5880,7 +5875,8 @@ static int __vcpu_run(struct kvm_vcpu *vcpu)
>>>  				case KVM_MP_STATE_RUNNABLE:
>>>  					vcpu->arch.apf.halted = false;
>>>  					break;
>>> -				case KVM_MP_STATE_SIPI_RECEIVED:
>>> +				case KVM_MP_STATE_INIT_RECEIVED:
>>> +					break;
>>>  				default:
>>>  					r = -EINTR;
>>>  					break;
>>> @@ -5901,7 +5897,8 @@ static int __vcpu_run(struct kvm_vcpu *vcpu)
>>>  			++vcpu->stat.request_irq_exits;
>>>  		}
>>>  
>>> -		kvm_check_async_pf_completion(vcpu);
>>> +		if (vcpu->arch.mp_state != KVM_MP_STATE_INIT_RECEIVED)
>>> +			kvm_check_async_pf_completion(vcpu);
>>
>> Can you instead do another change that affects the conditions of
>> kvm_check_async_pf_completion?
>>
>> For example, should kvm_arch_interrupt_allowed return zero if the VCPU
>> is in the INIT_RECEIVED state?
> 
> Yeah, that probably makes sense beyond async_pf.

Wait: If you perform a proper reset on INIT already, we would clear IF
thus prevent also async_pf injections. On the other hand,
kvm_arch_can_inject_async_page_present returns true if apf.msr_val &
KVM_ASYNC_PF_ENABLED is not set - shouldn't that be cleared on reset as
well? Hmm...

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux

next prev parent reply	other threads:[~2013-03-12 13:01 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-03-12 11:44 [PATCH] KVM: x86: Convert INIT and SIPI signals into synchronously handled events Jan Kiszka
2013-03-12 12:06 ` Paolo Bonzini
2013-03-12 12:29   ` Gleb Natapov
2013-03-12 12:29   ` Paolo Bonzini
2013-03-12 12:46     ` Jan Kiszka
2013-03-12 12:49       ` Gleb Natapov
2013-03-12 12:52         ` Jan Kiszka
2013-03-12 12:53           ` Gleb Natapov
2013-03-12 13:09       ` Paolo Bonzini
2013-03-12 13:13         ` Jan Kiszka
2013-03-12 12:58   ` Jan Kiszka
2013-03-12 13:01     ` Jan Kiszka [this message]
2013-03-12 13:13       ` Paolo Bonzini
2013-03-12 13:16         ` Jan Kiszka
2013-03-12 13:25           ` Gleb Natapov
2013-03-12 13:27             ` Jan Kiszka
2013-03-12 13:37               ` Gleb Natapov
2013-03-13  7:39   ` Jan Kiszka
2013-03-13  9:03     ` Paolo Bonzini
2013-03-13  9:08       ` Jan Kiszka
2013-03-12 13:12 ` Gleb Natapov
2013-03-12 13:21   ` Jan Kiszka
2013-03-12 13:41     ` Gleb Natapov
2013-03-12 13:43       ` Paolo Bonzini
2013-03-12 13:52         ` Gleb Natapov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=513F273F.5020108@siemens.com \
    --to=jan.kiszka@siemens.com \
    --cc=gleb@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=mtosatti@redhat.com \
    --cc=pbonzini@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.