kvm.vger.kernel.org archive mirror
From: Christoffer Dall <cdall@linaro.org>
To: Andrew Jones <drjones@redhat.com>
Cc: marc.zyngier@arm.com, pbonzini@redhat.com,
	kvmarm@lists.cs.columbia.edu, kvm@vger.kernel.org
Subject: Re: [PATCH v2 2/9] KVM: Add documentation for VCPU requests
Date: Tue, 4 Apr 2017 19:23:40 +0200
Message-ID: <20170404172340.GQ11752@cbox>
In-Reply-To: <20170404170600.w6snnecqoi4aqv4d@kamzik.brq.redhat.com>

On Tue, Apr 04, 2017 at 07:06:00PM +0200, Andrew Jones wrote:
> On Tue, Apr 04, 2017 at 05:24:03PM +0200, Christoffer Dall wrote:
> > Hi Drew,
> > 
> > On Fri, Mar 31, 2017 at 06:06:51PM +0200, Andrew Jones wrote:
> > > Signed-off-by: Andrew Jones <drjones@redhat.com>
> > > ---
> > >  Documentation/virtual/kvm/vcpu-requests.rst | 114 ++++++++++++++++++++++++++++
> > >  1 file changed, 114 insertions(+)
> > >  create mode 100644 Documentation/virtual/kvm/vcpu-requests.rst
> > > 
> > > diff --git a/Documentation/virtual/kvm/vcpu-requests.rst b/Documentation/virtual/kvm/vcpu-requests.rst
> > > new file mode 100644
> > > index 000000000000..ea4a966d5c8a
> > > --- /dev/null
> > > +++ b/Documentation/virtual/kvm/vcpu-requests.rst
> > > @@ -0,0 +1,114 @@
> > > +=================
> > > +KVM VCPU Requests
> > > +=================
> > > +
> > > +Overview
> > > +========
> > > +
> > > +KVM supports an internal API enabling threads to request a VCPU thread to
> > > +perform some activity.  For example, a thread may request a VCPU to flush
> > > +its TLB with a VCPU request.  The API consists of only four calls::
> > > +
> > > +  /* Check if VCPU @vcpu has request @req pending. Clears the request. */
> > > +  bool kvm_check_request(int req, struct kvm_vcpu *vcpu);
> > > +
> > > +  /* Check if any requests are pending for VCPU @vcpu. */
> > > +  bool kvm_request_pending(struct kvm_vcpu *vcpu);
> > > +
> > > +  /* Make request @req of VCPU @vcpu. */
> > > +  void kvm_make_request(int req, struct kvm_vcpu *vcpu);
> > > +
> > > +  /* Make request @req of all VCPUs of the VM with struct kvm @kvm. */
> > > +  bool kvm_make_all_cpus_request(struct kvm *kvm, unsigned int req);
> > > +
> > > +Typically a requester wants the VCPU to perform the activity as soon
> > > +as possible after making the request.  This means most requests,
> > > +kvm_make_request() calls, are followed by a call to kvm_vcpu_kick(),
> > > +and kvm_make_all_cpus_request() has the kicking of all VCPUs built
> > > +into it.
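
For illustration only, the request bitmap behind three of the calls quoted
above could be modelled in userspace C roughly as follows.  This is a sketch
under assumptions, not the kernel implementation: struct vcpu_model and the
model_* names are hypothetical, C11 atomics stand in for the kernel bitops,
and kicking is left out entirely.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct kvm_vcpu: only the request bitmap. */
struct vcpu_model {
	atomic_ulong requests;	/* one bit per request index */
};

/* Make request @req of @vcpu by setting its bit. */
static void model_make_request(int req, struct vcpu_model *vcpu)
{
	atomic_fetch_or(&vcpu->requests, 1UL << req);
}

/* Check if any request is pending for @vcpu. */
static bool model_request_pending(struct vcpu_model *vcpu)
{
	return atomic_load(&vcpu->requests) != 0;
}

/* Check if @req is pending; the bit is cleared as a side effect. */
static bool model_check_request(int req, struct vcpu_model *vcpu)
{
	unsigned long bit = 1UL << req;

	return (atomic_fetch_and(&vcpu->requests, ~bit) & bit) != 0;
}
```

A second model_check_request() for the same request returns false, which is
the "Clears the request" behaviour the comment in the patch describes.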
> > > +
> > > +VCPU Kicks
> > > +----------
> > > +
> > > +A VCPU kick does one of three things:
> > > +
> > > + 1) wakes a sleeping VCPU (which sleeps outside guest mode).
> > 
> > You could clarify this to say that a sleeping VCPU is a VCPU thread
> > which is not runnable and placed on a waitqueue, and waking it makes
> > the thread runnable again.
> > 
> > > + 2) sends an IPI to a VCPU currently in guest mode, in order to bring it
> > > +    out.
> > > + 3) nothing, when the VCPU is already outside guest mode and not sleeping.
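
The three outcomes listed above can be written down as a small decision
function.  This is only a sketch of the list as worded in the patch; the
struct kick_target fields and the kick_action names are hypothetical and do
not correspond to real kernel state.

```c
#include <stdbool.h>

/* Hypothetical outcomes, one per item in the list above. */
enum kick_action {
	KICK_WAKE,	/* 1) wake a sleeping VCPU */
	KICK_IPI,	/* 2) send an IPI to a VCPU in guest mode */
	KICK_NONE,	/* 3) nothing to do */
};

/* Hypothetical snapshot of the VCPU state the kicker looks at. */
struct kick_target {
	bool sleeping;		/* blocked outside guest mode */
	bool in_guest_mode;	/* currently executing guest code */
};

static enum kick_action model_kick(const struct kick_target *t)
{
	if (t->sleeping)
		return KICK_WAKE;
	if (t->in_guest_mode)
		return KICK_IPI;
	return KICK_NONE;
}
```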
> > > +
> > > +VCPU Request Internals
> > > +======================
> > > +
> > > +VCPU requests are simply bit indices of the vcpu->requests bitmap.  This
> > > +means general bitops[1], e.g. clear_bit(KVM_REQ_UNHALT, &vcpu->requests),
> > > +may also be used.  The first 8 bits are reserved for architecture
> > > +independent requests, all additional bits are available for architecture
> > > +dependent requests.
> > 
> > Should we explain the ones that are generically defined and how they're
> > supposed to be used?  For example, we don't use them on ARM, and I don't
> > think I understand why another thread would ever make a PENDING_TIMER
> > request on a vcpu?
> 
> Yes, I agree the general requests should be described.  I'll have to
> figure out how :-)  Describing KVM_REQ_UNHALT will likely lead to a
> subsection on kvm_vcpu_block(), as you bring up below.
> 
> > 
> > > +
> > > +VCPU Requests with Associated State
> > > +===================================
> > > +
> > > +Requesters that want the requested VCPU to handle new state need to ensure
> > > +the state is observable to the requested VCPU thread's CPU at the time the
> > 
> > nit: need to ensure that the newly written state is observable ... by
> > the time it observed the request.
> > 
> > > +CPU observes the request.  This means a write memory barrier should be
> >                                                                  ^^^
> > 							         must
> > 
> > > +insert between the preparation of the state and the write of the VCPU
> >     ^^^
> >    inserted
> > 
> > I would rephrase this as: '... after writing the new state to memory and
> > before setting the VCPU request bit.'
> > 
> > 
> > > +request bitmap.  Additionally, on the requested VCPU thread's side, a
> > > +corresponding read barrier should be issued after reading the request bit
> >                                 ^^^       ^^^
> > 			       must      inserted (for consistency)
> > 
> > 
> > 
> > > +and before proceeding to use the state associated with it.  See the kernel
> >                             ^^^    ^
> > 		           read    new
> > 
> > 
> > > +memory barrier documentation [2].
> > 
> > I think it would be great if this document explains if this is currently
> > taken care of by the API you explain above or if there are cases where
> > people have to explicitly insert these barriers, and in that case, which
> > barriers they should use (if we know at this point already).
> 
> Will do.  The current API does take care of it.  I'll state that.  I'd
> have to grep around to see if there are any non-API users that also need
> barriers, but as they could change, I probably wouldn't want to call them
> out in the doc.  So I guess I'll still just wave my hand at that type of
> use.
> 

Sounds good.
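
To make the ordering discussed above concrete, here is a userspace sketch of
"write the new state, barrier, set the request bit" on the requester side and
"read the request bit, barrier, read the new state" on the VCPU side.  All
names are hypothetical, and the C11 release/acquire fences are only stand-ins
for the write and read barriers: they are at least as strong, not equivalent.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model: a request bitmap plus state tied to a request. */
struct stateful_vcpu {
	int new_state;		/* payload the VCPU must see */
	atomic_ulong requests;
};

/* Requester: publish the state, then set the request bit. */
static void model_publish(struct stateful_vcpu *v, int req, int state)
{
	v->new_state = state;
	/* Stand-in for the write barrier before setting the bit. */
	atomic_thread_fence(memory_order_release);
	atomic_fetch_or_explicit(&v->requests, 1UL << req,
				 memory_order_relaxed);
}

/* VCPU: test and clear the bit, then read the state. */
static bool model_consume(struct stateful_vcpu *v, int req, int *state)
{
	unsigned long bit = 1UL << req;
	unsigned long old;

	old = atomic_fetch_and_explicit(&v->requests, ~bit,
					memory_order_relaxed);
	if (!(old & bit))
		return false;
	/* Stand-in for the read barrier after reading the bit. */
	atomic_thread_fence(memory_order_acquire);
	*state = v->new_state;
	return true;
}
```

With the fences in place, a VCPU that observes the bit also observes the
state written before it; without them the two accesses could be reordered on
a weakly ordered machine.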

> > 
> > > +
> > > +VCPU Requests and Guest Mode
> > > +============================
> > > +
> > 
> > I feel like an intro about the overall goal here is missing.  How about
> > something like this:
> > 
> >   When making requests to VCPUs, we want to avoid the receiving VCPU
> >   executing inside the guest for an arbitrarily long time without handling
> >   the request.  The way we prevent this from happening is by keeping
> >   track of when a VCPU is running and sending an IPI to the physical CPU
> >   running the VCPU when that is the case.  However, each architecture
> >   implementation of KVM must take great care to ensure that requests are
> >   not missed when a VCPU stops running at the same time when a request
> >   is received.
> > 
> > Also, I'm not sure what the semantics are with kvm_vcpu_block().  Is it
> > ok to send a request to a VCPU and then the VCPU blocks and goes to
> > sleep forever even though there are pending requests?
> > kvm_vcpu_check_block() doesn't seem to check vcpu->requests which would
> > indicate that this is the case, but maybe architectures that actually do
> > use requests implement something else themselves?
> 
> I'll add a kvm_vcpu_block() subsection as part of the KVM_REQ_UNHALT
> documentation.
> 
> > 
> > > +As long as the guest is either in guest mode, in which case it gets an IPI
> > 
> > guest is in guest mode?
> 
> oops, s/guest/vcpu/
> 
> > 
> > Perhaps this could be more clearly written as:
> > 
> > As long as the VCPU is running, it is marked as having vcpu->mode ==
> > IN_GUEST_MODE.  A requesting thread observing IN_GUEST_MODE will send an
> > IPI to the CPU running the VCPU thread.  On the other hand, when a
> > requesting thread observes vcpu->mode == OUTSIDE_GUEST_MODE, it will not send
> > any IPIs, but will simply set the request bit, and the VCPU thread will be
> > able to check the requests before running the VCPU again.  However, the
> > transition...
> > 
> > > +and will definitely see the request, or is outside guest mode, but has yet
> > > +to do its final request check, and therefore when it does, it will see the
> > > +request, then things will work.  However, the transition from outside to
> > > +inside guest mode, after the last request check has been made, opens a
> > > +window where a request could be made, but the VCPU would not see it until it
> > > +exits guest mode some time later.  See the table below.
> > 
> > This text, and the table below, only deals with the details of entering
> > the guest.  Should we talk about kvm_vcpu_exiting_guest_mode() and
> > anything related to exiting the guest?
> 
> I think all !IN_GUEST_MODE should behave the same, so I was avoiding
> the use of EXITING_GUEST_MODE and OUTSIDE_GUEST_MODE, which wouldn't be
> hard to address, but then I'd also have to address
> READING_SHADOW_PAGE_TABLES, which may complicate the document more than
> necessary.  I'm not sure we need to address a VCPU exiting guest mode,
> other than making sure it's clear that a VCPU that exits must check
> requests before it enters again.

But the problem is that kvm_make_all_cpus_request() only sends IPIs to
CPUs where the mode was different from OUTSIDE_GUEST_MODE, so there it's
about !OUTSIDE_GUEST_MODE rather than !IN_GUEST_MODE, so there's some
subtlety here which I feel is dangerous to paper over.
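
To spell out the difference, here are the two IPI conditions side by side as
a userspace sketch.  The enum reuses the mode names mentioned in this thread;
the two predicate names are hypothetical, and the first predicate only
reflects the "kick sends IPI" column of the table in the patch.

```c
#include <stdbool.h>

/* Mode names as used in this thread; the values are arbitrary here. */
enum vcpu_mode_model {
	OUTSIDE_GUEST_MODE,
	IN_GUEST_MODE,
	EXITING_GUEST_MODE,
	READING_SHADOW_PAGE_TABLES,
};

/* Condition as described in the patch text: IPI only when in guest mode. */
static bool ipi_if_in_guest(enum vcpu_mode_model mode)
{
	return mode == IN_GUEST_MODE;
}

/* Condition described above for kvm_make_all_cpus_request(). */
static bool ipi_if_not_outside(enum vcpu_mode_model mode)
{
	return mode != OUTSIDE_GUEST_MODE;
}
```

The two predicates agree for OUTSIDE_GUEST_MODE and IN_GUEST_MODE, but differ
for EXITING_GUEST_MODE and READING_SHADOW_PAGE_TABLES, which is exactly the
case a "!IN_GUEST_MODE" row in the table would hide.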

> 
> > 
> > > +
> > > ++------------------+-----------------+----------------+--------------+
> > > +| vcpu->mode       | done last check | kick sends IPI | request seen |
> > > ++==================+=================+================+==============+
> > > +| IN_GUEST_MODE    |      N/A        |      YES       |     YES      |
> > > ++------------------+-----------------+----------------+--------------+
> > > +| !IN_GUEST_MODE   |      NO         |      NO        |     YES      |
> > > ++------------------+-----------------+----------------+--------------+
> > > +| !IN_GUEST_MODE   |      YES        |      NO        |     NO       |
> > > ++------------------+-----------------+----------------+--------------+
> > > +
> > > +To ensure the third scenario shown in the table above cannot happen, we
> > > +need to ensure the VCPU's mode change is observable by all CPUs prior to
> > > +its final request check and that a requester's request is observable by
> > > +the requested VCPU prior to the kick.  To do that we need general memory
> > > +barriers between each pair of operations involving mode and requests, i.e.
> > > +
> > > +  CPU_i                                  CPU_j
> > > +-------------------------------------------------------------------------
> > > +  vcpu->mode = IN_GUEST_MODE;            kvm_make_request(REQ, vcpu);
> > > +  smp_mb();                              smp_mb();
> > > +  if (kvm_request_pending(vcpu))         if (vcpu->mode == IN_GUEST_MODE)
> > > +      handle_requests();                     send_IPI(vcpu->cpu);
> > > +
> > > +Whether explicit barriers are needed, or reliance on implicit barriers is
> > > +sufficient, is architecture dependent.  Alternatively, an architecture may
> > > +choose to just always send the IPI, as not sending it, when it's not
> > > +necessary, is just an optimization.
> > 
> > Is this universally true?  This is certainly true on ARM, because we
> > disable interrupts before doing all this, so the IPI remains pending and
> > causes an immediate exit, but if any of the above is done with
> > interrupts enabled, just sending an IPI does nothing to ensure the
> > request is observed.  Perhaps this is not a case we should care about.
> 
> I'll try to make this less generic, as some architectures may not work
> this way.  Indeed, s390 doesn't seem to have kvm_vcpu_kick(), so I guess
> things don't work this way for them.
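
The CPU_i / CPU_j sequence from the patch text above (write one variable,
full barrier, then read the other) can be modelled in userspace as below.
This is a sketch under assumptions: the names are hypothetical, a C11
sequentially consistent fence stands in for smp_mb(), and resetting the mode
when a request is found is my own simplification of "handle_requests()".

```c
#include <stdatomic.h>
#include <stdbool.h>

enum { MODEL_OUTSIDE_GUEST_MODE, MODEL_IN_GUEST_MODE };

/* Hypothetical model of the two variables involved in the race. */
struct entry_model {
	atomic_int mode;
	atomic_ulong requests;
};

/* CPU_i: returns true if the VCPU may enter the guest. */
static bool model_enter_guest(struct entry_model *v)
{
	atomic_store_explicit(&v->mode, MODEL_IN_GUEST_MODE,
			      memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* smp_mb() stand-in */
	if (atomic_load_explicit(&v->requests, memory_order_relaxed)) {
		/* Assumption: back out so the caller can handle requests. */
		atomic_store_explicit(&v->mode, MODEL_OUTSIDE_GUEST_MODE,
				      memory_order_relaxed);
		return false;
	}
	return true;
}

/* CPU_j: returns true if an IPI has to be sent. */
static bool model_request_and_kick(struct entry_model *v, int req)
{
	atomic_fetch_or_explicit(&v->requests, 1UL << req,
				 memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* smp_mb() stand-in */
	return atomic_load_explicit(&v->mode, memory_order_relaxed) ==
	       MODEL_IN_GUEST_MODE;
}
```

The point of the two full barriers is that at least one side notices the
other: either the VCPU sees the request before entering, or the requester
sees IN_GUEST_MODE and sends the IPI.  Whether the IPI then forces an exit
still depends on the interrupt state discussed above.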
> 
> > 
> > > +
> > > +Additionally, the error prone third scenario described above also exhibits
> > > +why a request-less VCPU kick is almost never correct.  Without the
> > > +assurance that a non-IPI generating kick will still result in an action by
> > > +the requested VCPU, as the final kvm_request_pending() check does, then
> > > +the kick may not initiate anything useful at all.  If, for instance, a
> > > +request-less kick was made to a VCPU that was just about to set its mode
> > > +to IN_GUEST_MODE, meaning no IPI is sent, then the VCPU may continue its
> > > +entry without actually having done whatever it was the kick was meant to
> > > +initiate.
> > 
> > Indeed.
> > 
> > 
> > > +
> > > +References
> > > +==========
> > > +
> > > +[1] Documentation/core-api/atomic_ops.rst
> > > +[2] Documentation/memory-barriers.txt
> > > -- 
> > > 2.9.3
> > > 
> > 
> > This is a great writeup!  I enjoyed reading it and it made me think more
> > carefully about a number of things, so I definitely think we should
> > merge this.
> >
> 
> Thanks Christoffer!  I'll take all your suggestions above and try to
> answer your questions for v2.
> 

Awesome, I hope Radim finds this useful for his series and the rework
later on.

Thanks,
-Christoffer
