All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Yan Zhao <yan.y.zhao@intel.com>
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, pbonzini@redhat.com
Subject: Re: [PATCH] KVM: VMX: fix lockdep warning on posted intr wakeup
Date: Fri, 10 Mar 2023 09:00:00 -0800	[thread overview]
Message-ID: <ZAtiEO/DST05GRRN@google.com> (raw)
In-Reply-To: <20230310155955.29652-1-yan.y.zhao@intel.com>

On Fri, Mar 10, 2023, Yan Zhao wrote:
> Use rcu list to break the possible circular locking dependency reported
> by lockdep.
> 
> path 1, ``sysvec_kvm_posted_intr_wakeup_ipi()`` --> ``pi_wakeup_handler()``
>          -->  ``kvm_vcpu_wake_up()`` --> ``try_to_wake_up()``,
>          the lock sequence is
>          &per_cpu(wakeup_vcpus_on_cpu_lock, cpu) --> &p->pi_lock.

Heh, that's an unfortunate naming collision.  It took me a bit of staring to
realize pi_lock is a scheduler lock, not a posted interrupt lock.

> path 2, ``schedule()`` --> ``kvm_sched_out()`` --> ``vmx_vcpu_put()`` -->
>         ``vmx_vcpu_pi_put()`` --> ``pi_enable_wakeup_handler()``,
>          the lock sequence is
>          &rq->__lock --> &per_cpu(wakeup_vcpus_on_cpu_lock, cpu).
> 
> path 3, ``task_rq_lock()``,
>         the lock sequence is &p->pi_lock --> &rq->__lock
> 
> lockdep report:
>  Chain exists of:
>    &p->pi_lock --> &rq->__lock --> &per_cpu(wakeup_vcpus_on_cpu_lock, cpu)
> 
>   Possible unsafe locking scenario:
> 
>         CPU0                CPU1
>         ----                ----
>    lock(&per_cpu(wakeup_vcpus_on_cpu_lock, cpu));
>                             lock(&rq->__lock);
>                             lock(&per_cpu(wakeup_vcpus_on_cpu_lock, cpu));
>    lock(&p->pi_lock);
> 
>   *** DEADLOCK ***

I don't think there's a deadlock here.  pi_wakeup_handler() is called from IRQ
context, pi_enable_wakeup_handler() disable IRQs before acquiring
wakeup_vcpus_on_cpu_lock, and "cpu" in pi_enable_wakeup_handler() is guaranteed
to be the current CPU, i.e. the same CPU.  So CPU0 and CPU1 can't be contending
for the same wakeup_vcpus_on_cpu_lock in this scenario.

vmx_vcpu_pi_load() does do cross-CPU locking, but finish_task_switch() drops
rq->__lock before invoking the sched_in notifiers.

> Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> ---
>  arch/x86/kvm/vmx/posted_intr.c | 12 +++++-------
>  1 file changed, 5 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/posted_intr.c b/arch/x86/kvm/vmx/posted_intr.c
> index 94c38bea60e7..e3ffc45c0a7b 100644
> --- a/arch/x86/kvm/vmx/posted_intr.c
> +++ b/arch/x86/kvm/vmx/posted_intr.c
> @@ -90,7 +90,7 @@ void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu)
>  	 */
>  	if (pi_desc->nv == POSTED_INTR_WAKEUP_VECTOR) {
>  		raw_spin_lock(&per_cpu(wakeup_vcpus_on_cpu_lock, vcpu->cpu));
> -		list_del(&vmx->pi_wakeup_list);
> +		list_del_rcu(&vmx->pi_wakeup_list);
>  		raw_spin_unlock(&per_cpu(wakeup_vcpus_on_cpu_lock, vcpu->cpu));

_If_ there is indeed a possible deadlock, there technically needs to be an explicit 
synchonize_rcu() before freeing the vCPU.  In practice, there are probably multiple
synchonize_rcu() calls in the destruction path, not to mention that it would take a
minor miracle for pi_wakeup_handler() to get stalled long enough to achieve a
use-after-free.

>  	}
>  
> @@ -153,7 +153,7 @@ static void pi_enable_wakeup_handler(struct kvm_vcpu *vcpu)
>  	local_irq_save(flags);
>  
>  	raw_spin_lock(&per_cpu(wakeup_vcpus_on_cpu_lock, vcpu->cpu));
> -	list_add_tail(&vmx->pi_wakeup_list,
> +	list_add_tail_rcu(&vmx->pi_wakeup_list,
>  		      &per_cpu(wakeup_vcpus_on_cpu, vcpu->cpu));
>  	raw_spin_unlock(&per_cpu(wakeup_vcpus_on_cpu_lock, vcpu->cpu));


> @@ -219,16 +219,14 @@ void pi_wakeup_handler(void)
>  {
>  	int cpu = smp_processor_id();
>  	struct list_head *wakeup_list = &per_cpu(wakeup_vcpus_on_cpu, cpu);
> -	raw_spinlock_t *spinlock = &per_cpu(wakeup_vcpus_on_cpu_lock, cpu);
>  	struct vcpu_vmx *vmx;
>  
> -	raw_spin_lock(spinlock);
> -	list_for_each_entry(vmx, wakeup_list, pi_wakeup_list) {
> -
> +	rcu_read_lock();

This isn't strictly necessary, IRQs are disabled.

> +	list_for_each_entry_rcu(vmx, wakeup_list, pi_wakeup_list) {
>  		if (pi_test_on(&vmx->pi_desc))
>  			kvm_vcpu_wake_up(&vmx->vcpu);
>  	}
> -	raw_spin_unlock(spinlock);
> +	rcu_read_unlock();
>  }
>  
>  void __init pi_init_cpu(int cpu)
> 
> base-commit: 89400df96a7570b651404bbc3b7afe627c52a192
> -- 
> 2.17.1
> 

  reply	other threads:[~2023-03-10 17:02 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-10 15:59 [PATCH] KVM: VMX: fix lockdep warning on posted intr wakeup Yan Zhao
2023-03-10 17:00 ` Sean Christopherson [this message]
2023-03-13  8:15   ` Yan Zhao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZAtiEO/DST05GRRN@google.com \
    --to=seanjc@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=yan.y.zhao@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.