From: Sean Christopherson <seanjc@google.com>
To: "K. Y. Srinivasan" <kys@microsoft.com>,
Haiyang Zhang <haiyangz@microsoft.com>,
Wei Liu <wei.liu@kernel.org>, Dexuan Cui <decui@microsoft.com>,
Juergen Gross <jgross@suse.com>,
Stefano Stabellini <sstabellini@kernel.org>,
Paolo Bonzini <pbonzini@redhat.com>,
Ingo Molnar <mingo@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Juri Lelli <juri.lelli@redhat.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
Shuah Khan <shuah@kernel.org>, Marc Zyngier <maz@kernel.org>,
Oliver Upton <oliver.upton@linux.dev>,
Sean Christopherson <seanjc@google.com>
Cc: linux-kernel@vger.kernel.org, linux-hyperv@vger.kernel.org,
xen-devel@lists.xenproject.org, kvm@vger.kernel.org,
linux-kselftest@vger.kernel.org,
linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
K Prateek Nayak <kprateek.nayak@amd.com>,
David Matlack <dmatlack@google.com>
Subject: [PATCH v3 05/13] KVM: Add irqfd to eventfd's waitqueue while holding irqfds.lock
Date: Thu, 22 May 2025 16:52:15 -0700
Message-ID: <20250522235223.3178519-6-seanjc@google.com>
In-Reply-To: <20250522235223.3178519-1-seanjc@google.com>
Add an irqfd to its target eventfd's waitqueue while holding irqfds.lock,
which is mildly terrifying but functionally safe.  irqfds.lock is taken
inside the waitqueue's lock, but if and only if the eventfd is being
released, i.e. that path is mutually exclusive with registration as KVM
holds a reference to the eventfd (and obviously must do so to avoid UAF).

This will allow using the eventfd's waitqueue to enforce KVM's requirement
that an eventfd is assigned to at most one irqfd, without introducing races.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
virt/kvm/eventfd.c | 21 ++++++++++++++++++---
1 file changed, 18 insertions(+), 3 deletions(-)
diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
index 99274d60335d..04877b297267 100644
--- a/virt/kvm/eventfd.c
+++ b/virt/kvm/eventfd.c
@@ -204,6 +204,11 @@ irqfd_wakeup(wait_queue_entry_t *wait, unsigned mode, int sync, void *key)
int ret = 0;
if (flags & EPOLLIN) {
+ /*
+ * WARNING: Do NOT take irqfds.lock in any path except EPOLLHUP,
+ * as KVM holds irqfds.lock when registering the irqfd with the
+ * eventfd.
+ */
u64 cnt;
eventfd_ctx_do_read(irqfd->eventfd, &cnt);
@@ -225,6 +230,11 @@ irqfd_wakeup(wait_queue_entry_t *wait, unsigned mode, int sync, void *key)
/* The eventfd is closing, detach from KVM */
unsigned long iflags;
+ /*
+ * Taking irqfds.lock is safe here, as KVM holds a reference to
+ * the eventfd when registering the irqfd, i.e. this path can't
+ * be reached while kvm_irqfd_add() is running.
+ */
spin_lock_irqsave(&kvm->irqfds.lock, iflags);
/*
@@ -296,16 +306,21 @@ static void kvm_irqfd_register(struct file *file, wait_queue_head_t *wqh,
list_add_tail(&irqfd->list, &kvm->irqfds.items);
- spin_unlock_irq(&kvm->irqfds.lock);
-
/*
* Add the irqfd as a priority waiter on the eventfd, with a custom
* wake-up handler, so that KVM *and only KVM* is notified whenever the
- * underlying eventfd is signaled.
+ * underlying eventfd is signaled. Temporarily lie to lockdep about
+ * holding irqfds.lock to avoid a false positive regarding potential
+ * deadlock with irqfd_wakeup() (see irqfd_wakeup() for details).
*/
init_waitqueue_func_entry(&irqfd->wait, irqfd_wakeup);
+ spin_release(&kvm->irqfds.lock.dep_map, _RET_IP_);
add_wait_queue_priority(wqh, &irqfd->wait);
+ spin_acquire(&kvm->irqfds.lock.dep_map, 0, 0, _RET_IP_);
+
+ spin_unlock_irq(&kvm->irqfds.lock);
+
p->ret = 0;
}
--
2.49.0.1151.ga128411c76-goog
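
The locking argument in the changelog can be sketched as a small,
single-threaded userspace model. This is not kernel code and not part of
the patch: every name below (model_lock(), model_register(), model_put(),
run_demo(), ...) is hypothetical, the "locks" are plain flags, and the
refcount stands in for the reference KVM holds on the eventfd. The model
records which lock is taken while another is held, roughly the kind of
ordering information lockdep tracks, and shows that both orderings do
occur, but that the inverted one (waitqueue lock, then irqfds.lock) is only
reachable once the last reference is gone, i.e. never while registration
is running.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for kvm->irqfds.lock and the eventfd's wqh.lock. */
enum lock_id { IRQFDS_LOCK, WQH_LOCK, NR_LOCKS };

static bool held[NR_LOCKS];
/* order_seen[a][b] is set when lock b is taken while lock a is held. */
static bool order_seen[NR_LOCKS][NR_LOCKS];
static int nr_irqfds;

struct model_eventfd {
	int refcount;		/* references that keep the eventfd alive */
	bool has_irqfd;		/* an irqfd is on the waitqueue */
	unsigned long events;	/* signals delivered to the irqfd */
};

static void model_lock(enum lock_id id)
{
	assert(!held[id]);	/* taking a held lock would self-deadlock */
	for (int i = 0; i < NR_LOCKS; i++)
		if (held[i])
			order_seen[i][id] = true;
	held[id] = true;
}

static void model_unlock(enum lock_id id)
{
	assert(held[id]);
	held[id] = false;
}

/* Registration: the waitqueue lock nests inside irqfds.lock. */
static int model_register(struct model_eventfd *e)
{
	if (e->refcount <= 0)
		return -1;	/* no reference, the eventfd may be gone */

	model_lock(IRQFDS_LOCK);
	nr_irqfds++;
	model_lock(WQH_LOCK);	/* adding a waiter takes the waitqueue lock */
	e->has_irqfd = true;
	model_unlock(WQH_LOCK);
	model_unlock(IRQFDS_LOCK);
	return 0;
}

/* EPOLLIN path: runs under the waitqueue lock, never takes irqfds.lock. */
static void model_signal(struct model_eventfd *e)
{
	model_lock(WQH_LOCK);
	if (e->has_irqfd)
		e->events++;
	model_unlock(WQH_LOCK);
}

/* Dropping the last reference models release, i.e. the EPOLLHUP path. */
static void model_put(struct model_eventfd *e)
{
	if (--e->refcount > 0)
		return;

	model_lock(WQH_LOCK);
	if (e->has_irqfd) {
		model_lock(IRQFDS_LOCK);	/* inverted order, release only */
		nr_irqfds--;
		model_unlock(IRQFDS_LOCK);
		e->has_irqfd = false;
	}
	model_unlock(WQH_LOCK);
}

/* Returns 0 if the model behaves as the changelog describes. */
int run_demo(void)
{
	struct model_eventfd e = { .refcount = 1 };
	bool both_orders;

	if (model_register(&e))		/* caller holds a reference */
		return 1;
	model_signal(&e);
	model_put(&e);			/* last reference: release path */

	both_orders = order_seen[IRQFDS_LOCK][WQH_LOCK] &&
		      order_seen[WQH_LOCK][IRQFDS_LOCK];
	printf("both orderings seen: %d, events: %lu, irqfds: %d\n",
	       both_orders, e.events, nr_irqfds);

	/* Registration after release is refused: the paths never overlap. */
	return (both_orders && e.events == 1 && nr_irqfds == 0 &&
		model_register(&e) == -1) ? 0 : 1;
}
```

Calling run_demo() from a small main() is enough to exercise the model.
Because both orderings are recorded, a checker that only looks at lock
ordering would flag a potential inversion, which is presumably why the
patch temporarily hides irqfds.lock from lockdep around
add_wait_queue_priority(). The refcount check is what makes the inversion
unreachable during registration in this model.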