From: "Longpeng (Mike)" <longpeng2@huawei.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
Huangweidong <weidong.huang@huawei.com>,
Gonglei <arei.gonglei@huawei.com>,
wangxin <wangxinxin.wang@huawei.com>,
"Radim Krčmář" <rkrcmar@redhat.com>
Subject: Re: [PATCH 2/4] KVM: VMX: avoid double list add with VT-d posted interrupts
Date: Tue, 6 Jun 2017 20:45:22 +0800 [thread overview]
Message-ID: <5936A3E2.1090509@huawei.com> (raw)
In-Reply-To: <cf0d9d74-772e-e7cf-eab4-05e3daad44d6@redhat.com>
On 2017/6/6 20:35, Paolo Bonzini wrote:
>
>
> On 06/06/2017 14:30, Longpeng (Mike) wrote:
>>
>>
>> On 2017/6/6 18:57, Paolo Bonzini wrote:
>>
>>> In some cases, for example involving hot-unplug of assigned
>>> devices, pi_post_block can forget to remove the vCPU from the
>>> blocked_vcpu_list. When this happens, the next call to
>>> pi_pre_block corrupts the list.
>>>
>>> Fix this in two ways. First, check vcpu->pre_pcpu in pi_pre_block
>>> and WARN instead of adding the element twice in the list. Second,
>>> always do the list removal in pi_post_block if vcpu->pre_pcpu is
>>> set (not -1).
>>>
>>> The new code keeps interrupts disabled for the whole duration of
>>> pi_pre_block/pi_post_block. This is not strictly necessary, but
>>> easier to follow. For the same reason, PI.ON is checked only
>>> after the cmpxchg, and to handle it we just call the post-block
>>> code. This removes duplication of the list removal code.
>>>
>>> Cc: Longpeng (Mike) <longpeng2@huawei.com>
>>> Cc: Huangweidong <weidong.huang@huawei.com>
>>> Cc: Gonglei <arei.gonglei@huawei.com>
>>> Cc: wangxin <wangxinxin.wang@huawei.com>
>>> Cc: Radim Krčmář <rkrcmar@redhat.com>
>>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>>> ---
>>> arch/x86/kvm/vmx.c | 62 ++++++++++++++++++++++--------------------------------
>>> 1 file changed, 25 insertions(+), 37 deletions(-)
>>>
>>
>>
>> [...]
>>
>>
>>> @@ -11256,14 +11257,10 @@ static void __pi_post_block(struct kvm_vcpu *vcpu)
>>> } while (cmpxchg(&pi_desc->control, old.control,
>>> new.control) != old.control);
>>>
>>> - if(vcpu->pre_pcpu != -1) {
>>> - spin_lock_irqsave(
>>> - &per_cpu(blocked_vcpu_on_cpu_lock,
>>> - vcpu->pre_pcpu), flags);
>>> + if (!WARN_ON_ONCE(vcpu->pre_pcpu == -1)) {
>>> + spin_lock(&per_cpu(blocked_vcpu_on_cpu_lock, vcpu->pre_pcpu));
>>> list_del(&vcpu->blocked_vcpu_list);
>>> - spin_unlock_irqrestore(
>>> - &per_cpu(blocked_vcpu_on_cpu_lock,
>>> - vcpu->pre_pcpu), flags);
>>> + spin_unlock(&per_cpu(blocked_vcpu_on_cpu_lock, vcpu->pre_pcpu));
>>
>>
>> Hi Paolo,
>>
>> spin_lock_irqsave() will disable kernel preempt, but spin_lock() won't. is there
>> some potential problems ?
>
> Hi,
>
> This function (and pi_pre_block too's part where it takes the spin lock)
> runs with interrupts disabled now.
>
Oh, yes, please forgive my foolish.
We'll continue to find why the list is corrupt when repeat poweron/shutdown
Thanks.
> Thanks,
>
> Paolo
>
>> Regards,
>> Longpeng(Mike)
>>
>>> vcpu->pre_pcpu = -1;
>>> }
>>> }
>>> @@ -11283,7 +11280,6 @@ static void __pi_post_block(struct kvm_vcpu *vcpu)
>>> */
>>> static int pi_pre_block(struct kvm_vcpu *vcpu)
>>> {
>>> - unsigned long flags;
>>> unsigned int dest;
>>> struct pi_desc old, new;
>>> struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
>>> @@ -11293,34 +11289,20 @@ static int pi_pre_block(struct kvm_vcpu *vcpu)
>>> !kvm_vcpu_apicv_active(vcpu))
>>> return 0;
>>>
>>> - vcpu->pre_pcpu = vcpu->cpu;
>>> - spin_lock_irqsave(&per_cpu(blocked_vcpu_on_cpu_lock,
>>> - vcpu->pre_pcpu), flags);
>>> - list_add_tail(&vcpu->blocked_vcpu_list,
>>> - &per_cpu(blocked_vcpu_on_cpu,
>>> - vcpu->pre_pcpu));
>>> - spin_unlock_irqrestore(&per_cpu(blocked_vcpu_on_cpu_lock,
>>> - vcpu->pre_pcpu), flags);
>>> + WARN_ON(irqs_disabled());
>>> + local_irq_disable();
>>> + if (!WARN_ON_ONCE(vcpu->pre_pcpu != -1)) {
>>> + vcpu->pre_pcpu = vcpu->cpu;
>>> + spin_lock(&per_cpu(blocked_vcpu_on_cpu_lock, vcpu->pre_pcpu));
>>> + list_add_tail(&vcpu->blocked_vcpu_list,
>>> + &per_cpu(blocked_vcpu_on_cpu,
>>> + vcpu->pre_pcpu));
>>> + spin_unlock(&per_cpu(blocked_vcpu_on_cpu_lock, vcpu->pre_pcpu));
>>> + }
>>>
>>> do {
>>> old.control = new.control = pi_desc->control;
>>>
>>> - /*
>>> - * We should not block the vCPU if
>>> - * an interrupt is posted for it.
>>> - */
>>> - if (pi_test_on(pi_desc) == 1) {
>>> - spin_lock_irqsave(&per_cpu(blocked_vcpu_on_cpu_lock,
>>> - vcpu->pre_pcpu), flags);
>>> - list_del(&vcpu->blocked_vcpu_list);
>>> - spin_unlock_irqrestore(
>>> - &per_cpu(blocked_vcpu_on_cpu_lock,
>>> - vcpu->pre_pcpu), flags);
>>> - vcpu->pre_pcpu = -1;
>>> -
>>> - return 1;
>>> - }
>>> -
>>> WARN((pi_desc->sn == 1),
>>> "Warning: SN field of posted-interrupts "
>>> "is set before blocking\n");
>>> @@ -11345,7 +11327,12 @@ static int pi_pre_block(struct kvm_vcpu *vcpu)
>>> } while (cmpxchg(&pi_desc->control, old.control,
>>> new.control) != old.control);
>>>
>>> - return 0;
>>> + /* We should not block the vCPU if an interrupt is posted for it. */
>>> + if (pi_test_on(pi_desc) == 1)
>>> + __pi_post_block(vcpu);
>>> +
>>> + local_irq_enable();
>>> + return (vcpu->pre_pcpu == -1);
>>> }
>>>
>>> static int vmx_pre_block(struct kvm_vcpu *vcpu)
>>> @@ -11361,12 +11348,13 @@ static int vmx_pre_block(struct kvm_vcpu *vcpu)
>>>
>>> static void pi_post_block(struct kvm_vcpu *vcpu)
>>> {
>>> - if (!kvm_arch_has_assigned_device(vcpu->kvm) ||
>>> - !irq_remapping_cap(IRQ_POSTING_CAP) ||
>>> - !kvm_vcpu_apicv_active(vcpu))
>>> + if (vcpu->pre_pcpu == -1)
>>> return;
>>>
>>> + WARN_ON(irqs_disabled());
>>> + local_irq_disable();
>>> __pi_post_block(vcpu);
>>> + local_irq_enable();
>>> }
>>>
>>> static void vmx_post_block(struct kvm_vcpu *vcpu)
>>
>>
>
> .
>
--
Regards,
Longpeng(Mike)
next prev parent reply other threads:[~2017-06-06 12:45 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-06-06 10:57 [PATCH CFT 0/4] VT-d PI fixes Paolo Bonzini
2017-06-06 10:57 ` [PATCH 1/4] KVM: VMX: extract __pi_post_block Paolo Bonzini
2017-06-06 21:27 ` kbuild test robot
2017-06-06 10:57 ` [PATCH 2/4] KVM: VMX: avoid double list add with VT-d posted interrupts Paolo Bonzini
2017-06-06 12:30 ` Longpeng (Mike)
2017-06-06 12:35 ` Paolo Bonzini
2017-06-06 12:45 ` Longpeng (Mike) [this message]
2017-06-06 21:49 ` kbuild test robot
2017-06-08 6:50 ` Peter Xu
2017-06-08 6:53 ` Peter Xu
2017-06-08 7:00 ` Paolo Bonzini
2017-06-08 9:16 ` Peter Xu
2017-06-08 11:24 ` Paolo Bonzini
2017-06-09 2:50 ` Peter Xu
2017-06-09 7:29 ` Paolo Bonzini
2017-06-09 7:41 ` Peter Xu
2017-07-28 2:31 ` Longpeng (Mike)
2017-07-28 6:28 ` Paolo Bonzini
2017-06-06 10:57 ` [PATCH 3/4] KVM: VMX: simplify and fix vmx_vcpu_pi_load Paolo Bonzini
2017-07-28 4:22 ` Longpeng (Mike)
2017-07-28 5:14 ` Longpeng (Mike)
2017-06-06 10:57 ` [PATCH 4/4] KVM: VMX: simplify cmpxchg of PI descriptor control field Paolo Bonzini
2017-06-07 9:33 ` [PATCH CFT 0/4] VT-d PI fixes Gonglei (Arei)
2017-06-07 14:32 ` Paolo Bonzini
2017-07-11 8:55 ` Paolo Bonzini
2017-07-11 9:16 ` Gonglei (Arei)
2017-09-21 8:23 ` Longpeng (Mike)
2017-09-21 9:42 ` Paolo Bonzini
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5936A3E2.1090509@huawei.com \
--to=longpeng2@huawei.com \
--cc=arei.gonglei@huawei.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=pbonzini@redhat.com \
--cc=rkrcmar@redhat.com \
--cc=wangxinxin.wang@huawei.com \
--cc=weidong.huang@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.