From: Nicolas Saenz Julienne <nsaenz@amazon.com>
To: Sean Christopherson <seanjc@google.com>
Cc: <kvm@vger.kernel.org>, <pbonzini@redhat.com>,
<linux-kernel@vger.kernel.org>, <graf@amazon.de>,
<dwmw2@infradead.org>, <fgriffo@amazon.com>, <anelkz@amazon.de>,
<peterz@infradead.org>
Subject: Re: [RFC] KVM: Allow polling vCPUs for events
Date: Thu, 5 Oct 2023 16:27:54 +0000 [thread overview]
Message-ID: <CW0NB512KP2E.2ZZD07F49LND3@amazon.com> (raw)
In-Reply-To: <ZR35gq1NICwhOUAS@google.com>
Hi Sean,
Thanks for taking the time to look at this.
On Wed Oct 4, 2023 at 11:47 PM UTC, Sean Christopherson wrote:
[...]
> > @@ -3996,6 +4002,39 @@ static int kvm_vcpu_mmap(struct file *file, struct vm_area_struct *vma)
> > return 0;
> > }
> >
> > +static __poll_t kvm_vcpu_poll(struct file *file, poll_table *wait)
> > +{
> > + struct kvm_vcpu *vcpu = file->private_data;
> > +
> > + if (!vcpu->poll_mask)
> > + return EPOLLERR;
> > +
> > + switch (READ_ONCE(vcpu->mode)) {
> > + case OUTSIDE_GUEST_MODE:
> > + /*
> > + * Make sure writes to vcpu->request are visible before the
> > + * mode changes.
> > + */
>
> Huh? There are no writes to vcpu->request anywhere in here.
My thinking was the vcpu->requests load below could've been speculated
ahead of vcpu->mode's store, this will miss events when first entering
poll().
Since you pointed this out, I thought about it further. There is still
room for a race with the code as is, we need read vcpu->requests only
after poll_wait() returns, so as to make sure concurrent
kvm_make_request()/kvm_vcpu_kick() either wake up poll, or are visible
through the vcpu->requests check that precedes sleeping.
[...]
> > + WRITE_ONCE(vcpu->mode, OUTSIDE_GUEST_MODE);
>
> This does not look remotely safe on multiple fronts. For starters, I don't see
> anything in the .poll() infrastructure that provides serialization, e.g. if there
> are multiple tasks polling then this will be "interesting".
Would allowing only one poller be acceptable?
> And there is zero chance this is race-free, e.g. nothing prevents the vCPU task
> itself from changing vcpu->mode from POLLING_FOR_EVENTS to something else.
>
> Why on earth is this mucking with vcpu->mode? Ignoring for the moment that using
> vcpu->requests as the poll source is never going to happen, there's zero reason
IIUC accessing vcpu->requests in the kvm_vcpu_poll() is out of the
question? Aren't we're forced to do so in order to avoid the race I
mention above.
> to write vcpu->mode. From a correctness perspective, AFAICT there's no need for
> any shenanigans at all, i.e. kvm_make_vcpu_request() could blindly and unconditionally
> call wake_up_interruptible().
I was fixated with the halt/vtl_return use-cases, where we're either
running the vCPU or polling, and it seemed a decent way to policy
whether calling wake_up_interruptible() is needed. Clearly not the case,
I'll get rid of all the vcpu->mode mucking. :)
> I suspect what you want is a fast way to track if there *may* be pollers. Keying
> off and *writing* vcpu->mode makes no sense to me.
>
> I think what you want is something like this, where kvm_vcpu_poll() could use
> atomic_fetch_or() and atomic_fetch_andnot() to manipulate vcpu->poll_mask.
> Or if we only want to support a single poller at a time, it could be a vanilla
> u64. I suspect getting the poll_mask manipulation correct for multiple pollers
> would be tricky, e.g. to avoid false negatives and leave a poller hanging.
I'll have a go at the multiple poller approach.
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 486800a7024b..5a260fb3b248 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -259,6 +259,14 @@ static inline bool kvm_kick_many_cpus(struct cpumask *cpus, bool wait)
> return true;
> }
>
> +static inline bool kvm_request_is_being_polled(struct kvm_vcpu *vcpu,
> + unsigned int req)
> +{
> + u32 poll_mask = kvm_request_to_poll_mask(req);
> +
> + return (atomic_read(vcpu->poll_mask) & poll_mask)
> +}
> +
> static void kvm_make_vcpu_request(struct kvm_vcpu *vcpu, unsigned int req,
> struct cpumask *tmp, int current_cpu)
> {
> @@ -285,6 +293,9 @@ static void kvm_make_vcpu_request(struct kvm_vcpu *vcpu, unsigned int req,
> if (cpu != -1 && cpu != current_cpu)
> __cpumask_set_cpu(cpu, tmp);
> }
> +
> + if (kvm_request_is_being_polled(vcpu, req))
> + wake_up_interruptible(...);
> }
>
> bool kvm_make_vcpus_request_mask(struct kvm *kvm, unsigned int req,
I'll use this approach.
So since we have to provide a proper uAPI, do you have anything against
having user-space set the polling mask through an ioctl? Also any
suggestions on how kvm_request_to_poll_mask() should look like. For ex.
VSM mostly cares for regular interrupts/timers, so mapping
KVM_REQ_UNBLOCK, KVM_REQ_HV_STIMER, KVM_REQ_EVENT, KVM_REQ_SMI,
KVM_REQ_NMI
to a KVM_POLL_INTERRUPTS_FLAG would work. We can then have ad-hoc flags
for async-pf, kvmclock updates, dirty logging, etc...
Nicolas
next prev parent reply other threads:[~2023-10-05 16:33 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-10-01 11:13 [RFC] KVM: Allow polling vCPUs for events Nicolas Saenz Julienne
2023-10-04 23:47 ` Sean Christopherson
2023-10-05 16:27 ` Nicolas Saenz Julienne [this message]
2023-10-06 1:16 ` Sean Christopherson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CW0NB512KP2E.2ZZD07F49LND3@amazon.com \
--to=nsaenz@amazon.com \
--cc=anelkz@amazon.de \
--cc=dwmw2@infradead.org \
--cc=fgriffo@amazon.com \
--cc=graf@amazon.de \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=pbonzini@redhat.com \
--cc=peterz@infradead.org \
--cc=seanjc@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox