public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] KVM: irqfd: fix shutdown deadlock by moving SRCU sync outside resampler_lock
@ 2026-03-16  7:20 Sonam Sanju
  2026-03-17 16:27 ` Sonam Sanju
  0 siblings, 1 reply; 12+ messages in thread
From: Sonam Sanju @ 2026-03-16  7:20 UTC (permalink / raw)
  To: pbonzini; +Cc: kvm, linux-kernel, Sonam Sanju

irqfd_resampler_shutdown() calls synchronize_srcu_expedited() while
holding kvm->irqfds.resampler_lock.  This can deadlock when multiple
irqfd_shutdown workers run concurrently on the kvm-irqfd-cleanup
workqueue during VM teardown (e.g. crosvm shutdown on Android):

  CPU A (mutex holder)               CPU B/C/D (mutex waiters)
  irqfd_shutdown()                   irqfd_shutdown()
   irqfd_resampler_shutdown()         irqfd_resampler_shutdown()
    mutex_lock(resampler_lock)  <---- mutex_lock(resampler_lock)  // BLOCKED
    list_del_rcu(...)                     ...blocked...
    synchronize_srcu_expedited()      // Waiters block workqueue,
      // waits for SRCU grace            preventing SRCU grace
      // period which requires            period from completing
      // workqueue progress          --- DEADLOCK ---

The synchronize_srcu_expedited() in the else branch is called directly
within the mutex.  In the if-last branch, kvm_unregister_irq_ack_notifier()
also calls synchronize_srcu_expedited() internally.  Both paths can
block indefinitely because:

  1. synchronize_srcu_expedited() waits for an SRCU grace period
  2. SRCU grace period completion needs workqueue workers to run
  3. The blocked mutex waiters occupy workqueue slots, preventing
     progress
  4. The mutex holder never releases the lock -> deadlock

Fix by performing all list manipulations and the last-entry check under
the mutex, then releasing the mutex before the SRCU synchronization.
This is safe because:

  - list_del_rcu() removes the irqfd from resampler->list under the
    mutex, so no concurrent reader or writer can access it.
  - When last==true, list_del_rcu(&resampler->link) has already removed
    the resampler from kvm->irqfds.resampler_list under the mutex, so
    no other worker can find or operate on this resampler.
  - kvm_unregister_irq_ack_notifier() uses its own locking
    (kvm->irq_lock) and is safe to call without resampler_lock.
  - synchronize_srcu_expedited() does not require any KVM mutex.
  - kfree(resampler) is safe after SRCU sync guarantees all readers
    have finished.

Signed-off-by: Sonam Sanju <sonam.sanju@intel.com>
---
 virt/kvm/eventfd.c | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
index 0e8b8a2c5b79..27bcf2b1a81d 100644
--- a/virt/kvm/eventfd.c
+++ b/virt/kvm/eventfd.c
@@ -93,6 +93,7 @@ irqfd_resampler_shutdown(struct kvm_kernel_irqfd *irqfd)
 {
 	struct kvm_kernel_irqfd_resampler *resampler = irqfd->resampler;
 	struct kvm *kvm = resampler->kvm;
+	bool last = false;
 
 	mutex_lock(&kvm->irqfds.resampler_lock);
 
@@ -100,19 +101,27 @@ irqfd_resampler_shutdown(struct kvm_kernel_irqfd *irqfd)
 
 	if (list_empty(&resampler->list)) {
 		list_del_rcu(&resampler->link);
+		last = true;
+	}
+
+	mutex_unlock(&kvm->irqfds.resampler_lock);
+
+	/*
+	 * synchronize_srcu_expedited() (called explicitly below, or internally
+	 * by kvm_unregister_irq_ack_notifier()) must not be invoked under
+	 * resampler_lock.  Holding the mutex while waiting for an SRCU grace
+	 * period creates a deadlock: the blocked mutex waiters occupy workqueue
+	 * slots that the SRCU grace period machinery needs to make forward
+	 * progress.
+	 */
+	if (last) {
 		kvm_unregister_irq_ack_notifier(kvm, &resampler->notifier);
-		/*
-		 * synchronize_srcu_expedited(&kvm->irq_srcu) already called
-		 * in kvm_unregister_irq_ack_notifier().
-		 */
 		kvm_set_irq(kvm, KVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID,
 			    resampler->notifier.gsi, 0, false);
 		kfree(resampler);
 	} else {
 		synchronize_srcu_expedited(&kvm->irq_srcu);
 	}
-
-	mutex_unlock(&kvm->irqfds.resampler_lock);
 }
 
 /*
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2026-04-06 23:09 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-16  7:20 [PATCH] KVM: irqfd: fix shutdown deadlock by moving SRCU sync outside resampler_lock Sonam Sanju
2026-03-17 16:27 ` Sonam Sanju
2026-03-20 12:56   ` Vineeth Pillai (Google)
2026-03-23  5:33     ` Sonam Sanju
2026-03-23  6:42       ` [PATCH v2] KVM: irqfd: fix deadlock by moving synchronize_srcu out of resampler_lock Sonam Sanju
2026-03-31 18:17         ` Sean Christopherson
2026-03-31 20:51           ` Paul E. McKenney
2026-04-01  9:47             ` Sonam Sanju
2026-04-06 23:09             ` Paul E. McKenney
2026-04-01  9:34         ` Kunwu Chan
2026-04-01 14:24           ` Sonam Sanju
2026-04-06 14:20             ` Kunwu Chan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox