From: Sean Christopherson <seanjc@google.com>
To: Paolo Bonzini <pbonzini@redhat.com>,
Ingo Molnar <mingo@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Juri Lelli <juri.lelli@redhat.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
Marc Zyngier <maz@kernel.org>,
Oliver Upton <oliver.upton@linux.dev>,
Sean Christopherson <seanjc@google.com>
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
K Prateek Nayak <kprateek.nayak@amd.com>,
David Matlack <dmatlack@google.com>,
Juergen Gross <jgross@suse.com>,
Stefano Stabellini <sstabellini@kernel.org>,
Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Subject: [PATCH v2 00/12] KVM: Make irqfd registration globally unique
Date: Mon, 19 May 2025 11:55:02 -0700
Message-ID: <20250519185514.2678456-1-seanjc@google.com>
Ingo/Peter,
Any objection to taking this through the KVM tree? (6.17 or later)
Assuming no one objects to the KVM changes...
Rework KVM's irqfd registration to require that an eventfd is bound to at
most one irqfd throughout the entire system. KVM currently disallows
binding an eventfd to multiple irqfds for a single VM, but doesn't reject
attempts to bind an eventfd to multiple VMs.
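To make the difference between the two rules concrete, here is a minimal userspace sketch. It is a toy model, not KVM code: `struct binding`, `bind_irqfd()`, `reset_bindings()`, the flat table, and the choice of -EBUSY as the error are all hypothetical and exist only for illustration.

```c
#include <errno.h>

#define MAX_BINDINGS 16

/* Hypothetical stand-in for an irqfd: which VM bound which eventfd. */
struct binding {
	int vm_id;
	int eventfd_id;
};

static struct binding bindings[MAX_BINDINGS];
static int nr_bindings;

/* Forget all bindings so the two rules can be compared from a clean slate. */
static void reset_bindings(void)
{
	nr_bindings = 0;
}

/*
 * Bind eventfd_id to vm_id.
 *
 * global_unique == 0: reject a duplicate only within the same VM, i.e.
 *                     the per-VM rule described above.
 * global_unique != 0: reject any duplicate, no matter which VM owns the
 *                     existing binding.
 *
 * Returning -EBUSY for a duplicate is an assumption of this sketch.
 */
static int bind_irqfd(int vm_id, int eventfd_id, int global_unique)
{
	int i;

	for (i = 0; i < nr_bindings; i++) {
		if (bindings[i].eventfd_id != eventfd_id)
			continue;
		if (global_unique || bindings[i].vm_id == vm_id)
			return -EBUSY;
	}

	if (nr_bindings == MAX_BINDINGS)
		return -ENOSPC;

	bindings[nr_bindings].vm_id = vm_id;
	bindings[nr_bindings].eventfd_id = eventfd_id;
	nr_bindings++;
	return 0;
}
```

With the per-VM rule, binding the same eventfd to VM 1 and then VM 2 succeeds both times. With the global rule, the second bind is rejected.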
This is obviously an ABI change, but I'm fairly confident that it won't
break userspace, because binding an eventfd to multiple irqfds hasn't
truly worked since commit e8dbf19508a1 ("kvm/eventfd: Use priority waitqueue
to catch events before userspace"). A somewhat undocumented, and perhaps
even unintentional, side effect of suppressing eventfd notifications for
userspace is that the priority+exclusive behavior also suppresses eventfd
notifications for any subsequent waiters, even if they are priority waiters.
I.e. only the first VM with an irqfd+eventfd binding will get notifications.
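The suppression described above can be modelled in a few lines of userspace C. This is a toy, not the kernel's waitqueue implementation. The `toy_` names and flag values are hypothetical. The sketch assumes that priority waiters are queued ahead of normal waiters in arrival order, and that a wake-up walks the queue in order and stops once the requested number of exclusive waiters has been notified.

```c
#include <stddef.h>

/* Hypothetical flag values; they do not mirror the kernel's definitions. */
#define TOY_FLAG_EXCLUSIVE 0x1
#define TOY_FLAG_PRIORITY  0x2
#define TOY_MAX_WAITERS    8

struct toy_waiter {
	unsigned int flags;
	int notified;		/* number of wake-ups delivered to this waiter */
};

struct toy_waitqueue {
	struct toy_waiter *entries[TOY_MAX_WAITERS];
	size_t nr;
};

/* Append a normal waiter, e.g. userspace polling the eventfd. */
static int toy_add_wait(struct toy_waitqueue *wq, struct toy_waiter *w)
{
	if (wq->nr == TOY_MAX_WAITERS)
		return -1;
	w->flags = 0;
	w->notified = 0;
	wq->entries[wq->nr++] = w;
	return 0;
}

/*
 * Insert a priority+exclusive waiter after any existing priority waiters
 * but ahead of all normal waiters.
 */
static int toy_add_wait_priority(struct toy_waitqueue *wq, struct toy_waiter *w)
{
	size_t pos = 0, i;

	if (wq->nr == TOY_MAX_WAITERS)
		return -1;
	while (pos < wq->nr && (wq->entries[pos]->flags & TOY_FLAG_PRIORITY))
		pos++;
	for (i = wq->nr; i > pos; i--)
		wq->entries[i] = wq->entries[i - 1];
	w->flags = TOY_FLAG_PRIORITY | TOY_FLAG_EXCLUSIVE;
	w->notified = 0;
	wq->entries[pos] = w;
	wq->nr++;
	return 0;
}

/*
 * Notify waiters in queue order and stop once nr_exclusive exclusive
 * waiters have been notified.  Every waiter after that point never sees
 * the event, even if it is itself a priority waiter.
 */
static void toy_wake_up(struct toy_waitqueue *wq, int nr_exclusive)
{
	size_t i;

	for (i = 0; i < wq->nr; i++) {
		struct toy_waiter *w = wq->entries[i];

		w->notified++;
		if ((w->flags & TOY_FLAG_EXCLUSIVE) && !--nr_exclusive)
			break;
	}
}
```

Queue one normal waiter and two priority waiters, then call `toy_wake_up(&wq, 1)`: only the first priority waiter is notified. This matches the observation that only the first VM with an irqfd+eventfd binding gets notifications.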
And for IRQ bypass, a.k.a. device posted interrupts, globally unique
bindings are a hard requirement (at least on x86; I assume other archs are
the same). KVM and the IRQ bypass manager kinda sorta handle this, but in
the absolute worst way possible (IMO). Instead of surfacing an error to
userspace, KVM silently ignores IRQ bypass registration errors.
The motivation for this series is to harden against userspace goofs. AFAIK,
we (Google) have never actually had a bug where userspace tries to assign
an eventfd to multiple VMs, but the possibility has come up in more than one
bug investigation (our intra-host, a.k.a. copyless, migration scheme
transfers eventfds from the old to the new VM when updating the host VMM).
v2: Use guard(spinlock_irqsave). [Prateek]
v1: https://lore.kernel.org/all/20250401204425.904001-1-seanjc@google.com
Sean Christopherson (12):
KVM: Use a local struct to do the initial vfs_poll() on an irqfd
KVM: Acquire SCRU lock outside of irqfds.lock during assignment
KVM: Initialize irqfd waitqueue callback when adding to the queue
KVM: Add irqfd to KVM's list via the vfs_poll() callback
KVM: Add irqfd to eventfd's waitqueue while holding irqfds.lock
sched/wait: Add a waitqueue helper for fully exclusive priority
waiters
KVM: Disallow binding multiple irqfds to an eventfd with a priority
waiter
sched/wait: Drop WQ_FLAG_EXCLUSIVE from add_wait_queue_priority()
KVM: Drop sanity check that per-VM list of irqfds is unique
KVM: selftests: Assert that eventfd() succeeds in Xen shinfo test
KVM: selftests: Add utilities to create eventfds and do KVM_IRQFD
KVM: selftests: Add a KVM_IRQFD test to verify uniqueness requirements
include/linux/kvm_irqfd.h | 1 -
include/linux/wait.h | 2 +
kernel/sched/wait.c | 22 ++-
tools/testing/selftests/kvm/Makefile.kvm | 1 +
tools/testing/selftests/kvm/arm64/vgic_irq.c | 12 +-
.../testing/selftests/kvm/include/kvm_util.h | 40 ++++++
tools/testing/selftests/kvm/irqfd_test.c | 130 ++++++++++++++++++
.../selftests/kvm/x86/xen_shinfo_test.c | 21 +--
virt/kvm/eventfd.c | 130 +++++++++++++-----
9 files changed, 294 insertions(+), 65 deletions(-)
create mode 100644 tools/testing/selftests/kvm/irqfd_test.c
base-commit: 7ef51a41466bc846ad794d505e2e34ff97157f7f
--
2.49.0.1101.gccaa498523-goog