linux-kernel.vger.kernel.org archive mirror
From: "Longpeng (Mike)" <longpeng2@huawei.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
	Huangweidong <weidong.huang@huawei.com>,
	Gonglei <arei.gonglei@huawei.com>,
	wangxin <wangxinxin.wang@huawei.com>,
	"Radim Krčmář" <rkrcmar@redhat.com>,
	zhang.zhanghailiang@huawei.com
Subject: Re: [PATCH 3/4] KVM: VMX: simplify and fix vmx_vcpu_pi_load
Date: Fri, 28 Jul 2017 12:22:07 +0800
Message-ID: <597ABBEF.2080802@huawei.com>
In-Reply-To: <20170606105707.23207-4-pbonzini@redhat.com>



On 2017/6/6 18:57, Paolo Bonzini wrote:

> The simplify part: do not touch pi_desc.nv, we can set it when the
> VCPU is first created.  Likewise, pi_desc.sn is only handled by
> vmx_vcpu_pi_load, do not touch it in __pi_post_block.
> 
> The fix part: do not check kvm_arch_has_assigned_device, instead
> check the SN bit to figure out whether vmx_vcpu_pi_put ran before.
> This matches what the previous patch did in pi_post_block.
> 
> Cc: Longpeng (Mike) <longpeng2@huawei.com>
> Cc: Huangweidong <weidong.huang@huawei.com>
> Cc: Gonglei <arei.gonglei@huawei.com>
> Cc: wangxin <wangxinxin.wang@huawei.com>
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/kvm/vmx.c | 68 ++++++++++++++++++++++++++++--------------------------
>  1 file changed, 35 insertions(+), 33 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 0f4714fe4908..81047f373747 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -2184,43 +2184,41 @@ static void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu)
>  	struct pi_desc old, new;
>  	unsigned int dest;
>  
> -	if (!kvm_arch_has_assigned_device(vcpu->kvm) ||
> -		!irq_remapping_cap(IRQ_POSTING_CAP)  ||
> -		!kvm_vcpu_apicv_active(vcpu))
> +	/*
> +	 * In case of hot-plug or hot-unplug, we may have to undo
> +	 * vmx_vcpu_pi_put even if there is no assigned device.  And we
> +	 * always keep PI.NDST up to date for simplicity: it makes the
> +	 * code easier, and CPU migration is not a fast path.
> +	 */
> +	if (!pi_test_sn(pi_desc) && vcpu->cpu == cpu)
> +		return;


Hi Paolo,

I'm confused by the following scenario:

(suppose the VM has an assigned device)
step 1. the running vcpu is preempted
			--> vmx_vcpu_pi_put [ SET pi.sn ]
step 2. the assigned device is hot-unplugged
step 3. the vcpu is scheduled in
			--> vmx_vcpu_pi_load [ CLEAR pi.sn ]
step 4. the running vcpu is preempted again
			--> vmx_vcpu_pi_put [ returns immediately ]
step 5. the vcpu is migrated to another pcpu
step 6. the vcpu is scheduled in
			--> vmx_vcpu_pi_load [ the check above does not match,
			    so the rest of the function is executed ]

I think vmx_vcpu_pi_load should return directly in step 6, because
vmx_vcpu_pi_put did nothing in step 4.
So the check above may have a potential problem.

Please kindly point out if I have misunderstood anything important :)
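
To make the sequence concrete, here is a minimal userspace model of the
steps above, written in C. It is only a sketch: struct model_vcpu, the
model_* helpers and replay_scenario() are hypothetical names, ndst is
simplified to a plain pCPU number, the vector values are placeholders, and
vmx_vcpu_pi_put is assumed to return early when there is no assigned device
(as in step 4). Only the control flow of the quoted vmx_vcpu_pi_load hunk is
mirrored.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder vector numbers; only their inequality matters here. */
#define POSTED_INTR_VECTOR        0xf2
#define POSTED_INTR_WAKEUP_VECTOR 0xf1

/* Hypothetical stand-in for the state the check depends on. */
struct model_vcpu {
	int cpu;                  /* pCPU the vCPU last ran on (vcpu->cpu) */
	bool sn;                  /* models pi_desc.sn */
	int nv;                   /* models pi_desc.nv */
	int ndst;                 /* models pi_desc.ndst as a pCPU number */
	bool has_assigned_device; /* models kvm_arch_has_assigned_device() */
};

enum load_path { LOAD_EARLY_RETURN, LOAD_SIMPLE, LOAD_FULL };

/* Mirrors the control flow of the patched vmx_vcpu_pi_load. */
static enum load_path model_pi_load(struct model_vcpu *v, int cpu)
{
	if (!v->sn && v->cpu == cpu)
		return LOAD_EARLY_RETURN;

	if (v->nv == POSTED_INTR_WAKEUP_VECTOR || v->cpu == cpu) {
		v->sn = false;
		return LOAD_SIMPLE;
	}

	v->ndst = cpu;	/* the real code updates pi_desc->control via cmpxchg */
	v->sn = false;
	return LOAD_FULL;
}

/* Assumed shape of vmx_vcpu_pi_put for a preempted vCPU. */
static void model_pi_put(struct model_vcpu *v)
{
	if (!v->has_assigned_device)
		return;		/* step 4: nothing is touched */
	v->sn = true;
}

/* Schedule-in: run the load hook, then record the new pCPU. */
static enum load_path model_sched_in(struct model_vcpu *v, int cpu)
{
	enum load_path path = model_pi_load(v, cpu);

	v->cpu = cpu;
	return path;
}

/* Replays steps 1-6 and returns the path taken in step 6. */
static enum load_path replay_scenario(void)
{
	struct model_vcpu v = {
		.cpu = 0, .sn = false, .nv = POSTED_INTR_VECTOR,
		.ndst = 0, .has_assigned_device = true,
	};

	model_pi_put(&v);		/* step 1: SN is set */
	assert(v.sn);
	v.has_assigned_device = false;	/* step 2: hot-unplug */
	model_sched_in(&v, 0);		/* step 3: SN is cleared */
	assert(!v.sn);
	model_pi_put(&v);		/* step 4: returns immediately */
	assert(!v.sn);
	return model_sched_in(&v, 1);	/* steps 5-6: scheduled in on pCPU 1 */
}
```

Under these assumptions replay_scenario() returns LOAD_FULL: in step 6 the
early-return condition is false because vcpu->cpu != cpu, so the full path
runs and rewrites ndst even though step 4 left the descriptor untouched.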

--
Regards,
Longpeng(Mike)

> +
> +	/*
> +	 * First handle the simple case where no cmpxchg is necessary; just
> +	 * allow posting non-urgent interrupts.
> +	 *
> +	 * If the 'nv' field is POSTED_INTR_WAKEUP_VECTOR, do not change
> +	 * PI.NDST: pi_post_block will do it for us and the wakeup_handler
> +	 * expects the VCPU to be on the blocked_vcpu_list that matches
> +	 * PI.NDST.
> +	 */
> +	if (pi_desc->nv == POSTED_INTR_WAKEUP_VECTOR ||
> +	    vcpu->cpu == cpu) {
> +		pi_clear_sn(pi_desc);
>  		return;
> +	}
>  
> +	/* The full case.  */
>  	do {
>  		old.control = new.control = pi_desc->control;
>  
> -		/*
> -		 * If 'nv' field is POSTED_INTR_WAKEUP_VECTOR, there
> -		 * are two possible cases:
> -		 * 1. After running 'pre_block', context switch
> -		 *    happened. For this case, 'sn' was set in
> -		 *    vmx_vcpu_put(), so we need to clear it here.
> -		 * 2. After running 'pre_block', we were blocked,
> -		 *    and woken up by some other guy. For this case,
> -		 *    we don't need to do anything, 'pi_post_block'
> -		 *    will do everything for us. However, we cannot
> -		 *    check whether it is case #1 or case #2 here
> -		 *    (maybe, not needed), so we also clear sn here,
> -		 *    I think it is not a big deal.
> -		 */
> -		if (pi_desc->nv != POSTED_INTR_WAKEUP_VECTOR) {
> -			if (vcpu->cpu != cpu) {
> -				dest = cpu_physical_id(cpu);
> -
> -				if (x2apic_enabled())
> -					new.ndst = dest;
> -				else
> -					new.ndst = (dest << 8) & 0xFF00;
> -			}
> +		dest = cpu_physical_id(cpu);
>  
> -			/* set 'NV' to 'notification vector' */
> -			new.nv = POSTED_INTR_VECTOR;
> -		}
> +		if (x2apic_enabled())
> +			new.ndst = dest;
> +		else
> +			new.ndst = (dest << 8) & 0xFF00;
>  
> -		/* Allow posting non-urgent interrupts */
>  		new.sn = 0;
>  	} while (cmpxchg(&pi_desc->control, old.control,
>  			new.control) != old.control);
> @@ -9259,6 +9257,13 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
>  
>  	vmx->msr_ia32_feature_control_valid_bits = FEATURE_CONTROL_LOCKED;
>  
> +	/*
> +	 * Enforce invariant: pi_desc.nv is always either POSTED_INTR_VECTOR
> +	 * or POSTED_INTR_WAKEUP_VECTOR.
> +	 */
> +	vmx->pi_desc.nv = POSTED_INTR_VECTOR;
> +	vmx->pi_desc.sn = 1;
> +
>  	return &vmx->vcpu;
>  
>  free_vmcs:
> @@ -11249,9 +11254,6 @@ static void __pi_post_block(struct kvm_vcpu *vcpu)
>  		else
>  			new.ndst = (dest << 8) & 0xFF00;
>  
> -		/* Allow posting non-urgent interrupts */
> -		new.sn = 0;
> -
>  		/* set 'NV' to 'notification vector' */
>  		new.nv = POSTED_INTR_VECTOR;
>  	} while (cmpxchg(&pi_desc->control, old.control,



